diff --git a/docs/_posts/DevinTDHa/2023-12-02-zero_shot_classifier_clip_vit_base_patch32_en.md b/docs/_posts/DevinTDHa/2023-12-02-zero_shot_classifier_clip_vit_base_patch32_en.md new file mode 100644 index 000000000000..0cd9cc2246ea --- /dev/null +++ b/docs/_posts/DevinTDHa/2023-12-02-zero_shot_classifier_clip_vit_base_patch32_en.md @@ -0,0 +1,149 @@ +--- +layout: model +title: Image Zero Shot Classification with CLIP +author: John Snow Labs +name: zero_shot_classifier_clip_vit_base_patch32 +date: 2023-12-02 +tags: [classification, image, en, zero_shot, open_source, onnx] +task: Zero-Shot Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: CLIPForZeroShotClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on image +and text pairs. It can classify images without being trained on a fixed set of +labels. This makes it very flexible, as candidate labels can be provided at inference time. This is +similar to the zero-shot capabilities of the GPT-2 and GPT-3 models. + +This model was imported from Hugging Face Transformers: +https://huggingface.co/openai/clip-vit-base-patch32 + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/zero_shot_classifier_clip_vit_base_patch32_en_5.2.0_3.0_1701541274927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/zero_shot_classifier_clip_vit_base_patch32_en_5.2.0_3.0_1701541274927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +imageDF = spark.read \ + .format("image") \ + .option("dropInvalid", value = True) \ + .load("src/test/resources/image/") + +imageAssembler: ImageAssembler = ImageAssembler() \ + .setInputCol("image") \ + .setOutputCol("image_assembler") + +candidateLabels = [ + "a photo of a bird", + "a photo of a cat", + "a photo of a dog", + "a photo of a hen", + "a photo of a hippo", + "a photo of a room", + "a photo of a tractor", + "a photo of an ostrich", + "a photo of an ox"] + +imageClassifier = CLIPForZeroShotClassification \ + .pretrained() \ + .setInputCols(["image_assembler"]) \ + .setOutputCol("label") \ + .setCandidateLabels(candidateLabels) + +pipeline = Pipeline().setStages([imageAssembler, imageClassifier]) +pipelineDF = pipeline.fit(imageDF).transform(imageDF) +pipelineDF \ + .selectExpr("reverse(split(image.origin, '/'))[0] as image_name", "label.result") \ + .show(truncate=False) +``` +```scala +import com.johnsnowlabs.nlp.ImageAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +val imageDF = ResourceHelper.spark.read + .format("image") + .option("dropInvalid", value = true) + .load("src/test/resources/image/") +val imageAssembler: ImageAssembler = new ImageAssembler() + .setInputCol("image") + .setOutputCol("image_assembler") +val candidateLabels = Array( + "a photo of a bird", + "a photo of a cat", + "a photo of a dog", + "a photo of a hen", + "a photo of a hippo", + "a photo of a room", + "a photo of a tractor", + "a photo of an ostrich", + "a photo of an ox") +val imageClassifier = CLIPForZeroShotClassification + .pretrained() + .setInputCols("image_assembler") + .setOutputCol("label") + .setCandidateLabels(candidateLabels) +val pipeline = + new Pipeline().setStages(Array(imageAssembler, imageClassifier)).fit(imageDF).transform(imageDF) 
+pipeline + .selectExpr("reverse(split(image.origin, '/'))[0] as image_name", "label.result") + .show(truncate = false) +``` +
+ +## Results + +```bash ++-----------------+-----------------------+ +|image_name |result | ++-----------------+-----------------------+ +|palace.JPEG |[a photo of a room] | +|egyptian_cat.jpeg|[a photo of a cat] | +|hippopotamus.JPEG|[a photo of a hippo] | +|hen.JPEG |[a photo of a hen] | +|ostrich.JPEG |[a photo of an ostrich]| +|junco.JPEG |[a photo of a bird] | +|bluetick.jpg |[a photo of a dog] | +|chihuahua.jpg |[a photo of a dog] | +|tractor.JPEG |[a photo of a tractor] | +|ox.JPEG |[a photo of an ox] | ++-----------------+-----------------------+ +``` + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|zero_shot_classifier_clip_vit_base_patch32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[image_assembler]| +|Output Labels:|[classification]| +|Language:|en| +|Size:|392.8 MB| diff --git a/docs/_posts/ahmedlone127/2023-09-15-hf_distilbert_imdb_mlm_cosine_en.md b/docs/_posts/ahmedlone127/2023-09-15-google_Job_data_tuned_trial_2_11_2_2022_en.md similarity index 67% rename from docs/_posts/ahmedlone127/2023-09-15-hf_distilbert_imdb_mlm_cosine_en.md rename to docs/_posts/ahmedlone127/2023-09-15-google_Job_data_tuned_trial_2_11_2_2022_en.md index 321a26b95758..09bd2241a5d4 100644 --- a/docs/_posts/ahmedlone127/2023-09-15-hf_distilbert_imdb_mlm_cosine_en.md +++ b/docs/_posts/ahmedlone127/2023-09-15-google_Job_data_tuned_trial_2_11_2_2022_en.md @@ -1,8 +1,8 @@ --- layout: model -title: English hf_distilbert_imdb_mlm_cosine DistilBertEmbeddings from nos1de +title: English google_job_data_tuned_trial_2_11_2_2022 DistilBertEmbeddings from EslamAhmed author: John Snow Labs -name: hf_distilbert_imdb_mlm_cosine +name: google_job_data_tuned_trial_2_11_2_2022 date: 2023-09-15 tags: [distilbert, en, open_source, fill_mask, onnx] task: Embeddings @@ -19,13 +19,13 @@ use_language_switcher: "Python-Scala-Java" ## Description -Pretrained DistilBertEmbeddings model, adapted from Hugging Face and 
curated to provide scalability and production-readiness using Spark NLP.`hf_distilbert_imdb_mlm_cosine` is a English model originally trained by nos1de. +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `google_job_data_tuned_trial_2_11_2_2022` is an English model originally trained by EslamAhmed. {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hf_distilbert_imdb_mlm_cosine_en_5.1.2_3.0_1694769976827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} -[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hf_distilbert_imdb_mlm_cosine_en_5.1.2_3.0_1694769976827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_job_data_tuned_trial_2_11_2_2022_en_5.1.2_3.0_1694772782812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_job_data_tuned_trial_2_11_2_2022_en_5.1.2_3.0_1694772782812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -41,7 +41,7 @@ document_assembler = DocumentAssembler() \ .setOutputCol("documents") -embeddings =DistilBertEmbeddings.pretrained("hf_distilbert_imdb_mlm_cosine","en") \ +embeddings =DistilBertEmbeddings.pretrained("google_job_data_tuned_trial_2_11_2_2022","en") \ .setInputCols(["documents","token"]) \ .setOutputCol("embeddings") @@ -60,7 +60,7 @@ val document_assembler = new DocumentAssembler() .setOutputCol("embeddings") val embeddings = DistilBertEmbeddings - .pretrained("hf_distilbert_imdb_mlm_cosine", "en") + .pretrained("google_job_data_tuned_trial_2_11_2_2022", "en") .setInputCols(Array("documents","token")) .setOutputCol("embeddings") @@ -79,15 +79,15 @@ val pipelineDF = pipelineModel.transform(data) {:.table-model} |---|---| -|Model 
Name:|hf_distilbert_imdb_mlm_cosine| +|Model Name:|google_job_data_tuned_trial_2_11_2_2022| |Compatibility:|Spark NLP 5.1.2+| |License:|Open Source| |Edition:|Official| |Input Labels:|[documents, token]| |Output Labels:|[embeddings]| |Language:|en| -|Size:|247.2 MB| +|Size:|402.3 MB| ## References -https://huggingface.co/nos1de/hf-distilbert-imdb-mlm-cosine \ No newline at end of file +https://huggingface.co/EslamAhmed/google_Job_data_tuned_trial_2_11-2-2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-09-15-google_job_data_tuned_trial_2_11_2_2022_en.md b/docs/_posts/ahmedlone127/2023-09-15-google_job_data_tuned_trial_2_11_2_2022_en.md new file mode 100644 index 000000000000..09bd2241a5d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-09-15-google_job_data_tuned_trial_2_11_2_2022_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English google_job_data_tuned_trial_2_11_2_2022 DistilBertEmbeddings from EslamAhmed +author: John Snow Labs +name: google_job_data_tuned_trial_2_11_2_2022 +date: 2023-09-15 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.2 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `google_job_data_tuned_trial_2_11_2_2022` is an English model originally trained by EslamAhmed. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_job_data_tuned_trial_2_11_2_2022_en_5.1.2_3.0_1694772782812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_job_data_tuned_trial_2_11_2_2022_en_5.1.2_3.0_1694772782812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = DistilBertEmbeddings.pretrained("google_job_data_tuned_trial_2_11_2_2022","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = DistilBertEmbeddings + .pretrained("google_job_data_tuned_trial_2_11_2_2022", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|google_job_data_tuned_trial_2_11_2_2022| +|Compatibility:|Spark NLP 5.1.2+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.3 MB| + +## References + +https://huggingface.co/EslamAhmed/google_Job_data_tuned_trial_2_11-2-2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_0_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_0_en.md new file mode 100644 index 000000000000..63ff6f4d9abf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_0 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_0 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_0` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_0_en_5.1.4_3.4_1698355801234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_0_en_5.1.4_3.4_1698355801234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_0| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_1_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_1_en.md new file mode 100644 index 000000000000..b1d8f49419a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_1 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_1` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_1_en_5.1.4_3.4_1698356568456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_1_en_5.1.4_3.4_1698356568456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_2_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_2_en.md new file mode 100644 index 000000000000..f56b70df04fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_2 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_2` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_2_en_5.1.4_3.4_1698357439791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_2_en_5.1.4_3.4_1698357439791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_3_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_3_en.md new file mode 100644 index 000000000000..c439ff051c32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_3 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_3 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_3` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_3_en_5.1.4_3.4_1698358205047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_3_en_5.1.4_3.4_1698358205047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_54_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_54_en.md new file mode 100644 index 000000000000..013518226c1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_54_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_54 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_54 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_54` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_54_en_5.1.4_3.4_1698359049867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_54_en_5.1.4_3.4_1698359049867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_54","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_54","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_54| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-54 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_55_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_55_en.md new file mode 100644 index 000000000000..b9dba8b98144 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_55_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_55 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_55 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_55` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_55_en_5.1.4_3.4_1698359796671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_55_en_5.1.4_3.4_1698359796671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_55","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_55","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_55| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-55 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_64_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_64_en.md new file mode 100644 index 000000000000..4c00ff2586f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_64_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_64 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_64 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_64` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_64_en_5.1.4_3.4_1698360796049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_64_en_5.1.4_3.4_1698360796049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_64","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_64","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_64| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-64 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_65_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_65_en.md new file mode 100644 index 000000000000..6bb1ad92ff78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_65_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_65 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_65 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_65` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_65_en_5.1.4_3.4_1698361659384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_65_en_5.1.4_3.4_1698361659384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_65","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_65","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_65| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-65 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_66_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_66_en.md new file mode 100644 index 000000000000..08d4bbc52e6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_66_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_66 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_66 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_66` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_66_en_5.1.4_3.4_1698362656911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_66_en_5.1.4_3.4_1698362656911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_66","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_66","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_66| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-66 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_67_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_67_en.md new file mode 100644 index 000000000000..e5bad791da79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_67_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_67 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_67 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_67` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_67_en_5.1.4_3.4_1698363638009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_67_en_5.1.4_3.4_1698363638009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_67","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_67","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_67| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-67 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_68_en.md b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_68_en.md new file mode 100644 index 000000000000..b054b5fa3d1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-512seq_len_6ep_bert_ft_cola_68_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_68 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_68 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_68` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_68_en_5.1.4_3.4_1698364742932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_68_en_5.1.4_3.4_1698364742932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_68","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_68","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_68| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-68 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_48_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_48_en.md index 2ff46a75a015..8f7ad02ce844 100644 --- a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_48_en.md +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_48_en.md @@ -21,11 +21,15 @@ use_language_switcher: "Python-Scala-Java" Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`6ep_bert_ft_cola_48` is a English model originally trained by Jeevesh8. +## Predicted Entities + + + {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_48_en_5.1.4_3.4_1698294005792.zip){:.button.button-orange.button-orange-trans.arr.button-icon} -[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_48_en_5.1.4_3.4_1698294005792.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_48_en_5.1.4_3.4_1698311945088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_48_en_5.1.4_3.4_1698311945088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -34,7 +38,6 @@ Pretrained BertForSequenceClassification model, adapted from Hugging Face and cu
{% include programmingLanguageSelectScalaPythonNLU.html %} ```python - document_assembler = DocumentAssembler()\ .setInputCol("text")\ .setOutputCol("document") @@ -52,10 +55,8 @@ pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifi data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") result = pipeline.fit(data).transform(data) - ``` ```scala - val document_assembler = new DocumentAssembler() .setInputCol("text") .setOutputCol("document") @@ -73,8 +74,6 @@ val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequ val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") val result = pipeline.fit(data).transform(data) - - ```
@@ -94,4 +93,6 @@ val result = pipeline.fit(data).transform(data) ## References +References + https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-48 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_49_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_49_en.md new file mode 100644 index 000000000000..af5eda62b533 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_49_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_49 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_49 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_49` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_49_en_5.1.4_3.4_1698312158484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_49_en_5.1.4_3.4_1698312158484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_49","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_49","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_49| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-49 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_50_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_50_en.md new file mode 100644 index 000000000000..1903979fa129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_50_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_50 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_50 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_50` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_50_en_5.1.4_3.4_1698312332331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_50_en_5.1.4_3.4_1698312332331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_50","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_50","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_50| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_51_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_51_en.md new file mode 100644 index 000000000000..23ab27cd1637 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_51_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_51 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_51 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_51` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_51_en_5.1.4_3.4_1698312537974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_51_en_5.1.4_3.4_1698312537974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_51","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_51","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_51| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-51 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_52_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_52_en.md new file mode 100644 index 000000000000..15999a30f912 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_52_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_52 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_52 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_52` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_52_en_5.1.4_3.4_1698312778407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_52_en_5.1.4_3.4_1698312778407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_52","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_52","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_52| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-52 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_53_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_53_en.md new file mode 100644 index 000000000000..72e221a7b98b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_53_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_53 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_53 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_53` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_53_en_5.1.4_3.4_1698313021670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_53_en_5.1.4_3.4_1698313021670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_53","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_53","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_53| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-53 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_54_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_54_en.md new file mode 100644 index 000000000000..9f92cffef313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_54_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_54 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_54 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_54` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_54_en_5.1.4_3.4_1698313233806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_54_en_5.1.4_3.4_1698313233806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_54","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_54","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_54| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-54 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_55_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_55_en.md new file mode 100644 index 000000000000..15d2243b971b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_55_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_55 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_55 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_55` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_55_en_5.1.4_3.4_1698313460889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_55_en_5.1.4_3.4_1698313460889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_55","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_55","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_55| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-55 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_56_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_56_en.md new file mode 100644 index 000000000000..d3a3cb04170a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_56_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_56 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_56 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_56` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_56_en_5.1.4_3.4_1698313683387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_56_en_5.1.4_3.4_1698313683387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_56","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_56","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_56| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-56 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_57_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_57_en.md new file mode 100644 index 000000000000..68d23cc5aae0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_57_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_57 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_57 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_57` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_57_en_5.1.4_3.4_1698314637898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_57_en_5.1.4_3.4_1698314637898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_57","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_57","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_57| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-57 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_58_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_58_en.md new file mode 100644 index 000000000000..544355faa848 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_58_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_58 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_58 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_58` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_58_en_5.1.4_3.4_1698315284546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_58_en_5.1.4_3.4_1698315284546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_58","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_58","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_58| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-58 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_59_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_59_en.md new file mode 100644 index 000000000000..ef2a29baa27c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_59_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_59 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_59 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_59` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_59_en_5.1.4_3.4_1698315992641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_59_en_5.1.4_3.4_1698315992641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_59","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_59","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_59| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-59 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_60_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_60_en.md new file mode 100644 index 000000000000..a42b385b876d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_60_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_60 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_60 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_60` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_60_en_5.1.4_3.4_1698316947252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_60_en_5.1.4_3.4_1698316947252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_60","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_60","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_60| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-60 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_61_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_61_en.md new file mode 100644 index 000000000000..377a1df5de77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_61_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_61 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_61 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_61` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_61_en_5.1.4_3.4_1698317640443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_61_en_5.1.4_3.4_1698317640443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_61","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_61","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_61| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-61 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_62_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_62_en.md new file mode 100644 index 000000000000..8970ca279c1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_62_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_62 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_62 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_62` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_62_en_5.1.4_3.4_1698318467583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_62_en_5.1.4_3.4_1698318467583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_62","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_62","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_62| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-62 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_63_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_63_en.md new file mode 100644 index 000000000000..41245ff8b2c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_63_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_63 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_63 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_63` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_63_en_5.1.4_3.4_1698319516149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_63_en_5.1.4_3.4_1698319516149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_63","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_63","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_63| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-63 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_64_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_64_en.md new file mode 100644 index 000000000000..3fe1b23d441d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_64_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_64 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_64 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_64` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_64_en_5.1.4_3.4_1698320245612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_64_en_5.1.4_3.4_1698320245612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_64","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_64","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_64| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-64 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_65_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_65_en.md new file mode 100644 index 000000000000..a09285c50b8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_65_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_65 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_65 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_65` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_65_en_5.1.4_3.4_1698321070755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_65_en_5.1.4_3.4_1698321070755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_65","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_65","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_65| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-65 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_66_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_66_en.md new file mode 100644 index 000000000000..028fd52555a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_66_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_66 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_66 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_66` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_66_en_5.1.4_3.4_1698322077532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_66_en_5.1.4_3.4_1698322077532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_66","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_66","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_66| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-66 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_67_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_67_en.md new file mode 100644 index 000000000000..65d027f8f472 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_67_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_67 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_67 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_67` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_67_en_5.1.4_3.4_1698322772084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_67_en_5.1.4_3.4_1698322772084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_67","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_67","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_67| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-67 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_68_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_68_en.md new file mode 100644 index 000000000000..5020978989ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_68_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_68 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_68 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_68` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_68_en_5.1.4_3.4_1698323719081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_68_en_5.1.4_3.4_1698323719081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_68","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_68","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_68| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-68 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_69_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_69_en.md new file mode 100644 index 000000000000..5746f7cddef8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_69_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_69 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_69 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_69` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_69_en_5.1.4_3.4_1698324511674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_69_en_5.1.4_3.4_1698324511674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_69","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_69","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_69| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-69 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_70_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_70_en.md new file mode 100644 index 000000000000..8e21337c8055 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_70_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_70 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_70 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_70` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_70_en_5.1.4_3.4_1698325486183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_70_en_5.1.4_3.4_1698325486183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_70","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_70","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_70| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-70 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_71_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_71_en.md new file mode 100644 index 000000000000..79e8b47d9fff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_71_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_71 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_71 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_71` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_71_en_5.1.4_3.4_1698326381065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_71_en_5.1.4_3.4_1698326381065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_71","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_71","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_71| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-71 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_72_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_72_en.md new file mode 100644 index 000000000000..38cf42017ac5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_72_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_72 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_72 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_72` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_72_en_5.1.4_3.4_1698327397059.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_72_en_5.1.4_3.4_1698327397059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_72","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_72","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_72| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-72 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_73_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_73_en.md new file mode 100644 index 000000000000..0320755a87de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_73_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_73 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_73 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_73` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_73_en_5.1.4_3.4_1698328228027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_73_en_5.1.4_3.4_1698328228027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_73","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_73","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_73| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-73 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_74_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_74_en.md new file mode 100644 index 000000000000..939e9b3c04f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_74_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_74 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_74 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_74` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_74_en_5.1.4_3.4_1698329243536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_74_en_5.1.4_3.4_1698329243536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_74","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_74","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_74| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-74 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_75_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_75_en.md new file mode 100644 index 000000000000..70bd726eb9ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_75_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_75 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_75 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_75` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_75_en_5.1.4_3.4_1698330116575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_75_en_5.1.4_3.4_1698330116575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_75","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_75","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_75| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-75 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_76_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_76_en.md new file mode 100644 index 000000000000..a7bdc617358f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_76_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_76 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_76 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_76` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_76_en_5.1.4_3.4_1698330966136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_76_en_5.1.4_3.4_1698330966136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_76","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_76","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_76| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-76 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_77_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_77_en.md new file mode 100644 index 000000000000..2dc9bb709ad7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_77_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_77 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_77 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_77` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_77_en_5.1.4_3.4_1698339508325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_77_en_5.1.4_3.4_1698339508325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_77","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_77","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_77| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_78_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_78_en.md new file mode 100644 index 000000000000..ecd411cb2a3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_78_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_78 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_78 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_78` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_78_en_5.1.4_3.4_1698339691653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_78_en_5.1.4_3.4_1698339691653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_78","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_78","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_78| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-78 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_79_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_79_en.md new file mode 100644 index 000000000000..f390a17c44ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_79_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_79 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_79 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_79` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_79_en_5.1.4_3.4_1698339914291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_79_en_5.1.4_3.4_1698339914291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_79","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_79","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_79| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-79 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_80_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_80_en.md new file mode 100644 index 000000000000..4bf02ceedacc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_80_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_80 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_80 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_80` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_80_en_5.1.4_3.4_1698340113056.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_80_en_5.1.4_3.4_1698340113056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_80","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_80","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_80| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-80 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_81_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_81_en.md new file mode 100644 index 000000000000..d9c09d79fb31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_81_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_81 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_81 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_81` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_81_en_5.1.4_3.4_1698340303061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_81_en_5.1.4_3.4_1698340303061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_81","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_81","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_81| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-81 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_82_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_82_en.md new file mode 100644 index 000000000000..9492c030d23e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_82_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_82 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_82 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_82` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_82_en_5.1.4_3.4_1698340490465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_82_en_5.1.4_3.4_1698340490465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_82","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_82","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_82| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-82 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_83_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_83_en.md new file mode 100644 index 000000000000..ee496aae6aa0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_83_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_83 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_83 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_83` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_83_en_5.1.4_3.4_1698340650965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_83_en_5.1.4_3.4_1698340650965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_83","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_83","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_83| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-83 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_84_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_84_en.md new file mode 100644 index 000000000000..43199e88eb83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_84_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_84 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_84 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_84` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_84_en_5.1.4_3.4_1698340818024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_84_en_5.1.4_3.4_1698340818024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_84","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_84","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_84| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-84 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_85_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_85_en.md new file mode 100644 index 000000000000..9b41ed785de2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_85_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_85 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_85 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_85` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_85_en_5.1.4_3.4_1698340987075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_85_en_5.1.4_3.4_1698340987075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_85","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_85","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_85| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-85 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_86_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_86_en.md new file mode 100644 index 000000000000..a52775a50b94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_86_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_86 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_86 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_86` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_86_en_5.1.4_3.4_1698341175773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_86_en_5.1.4_3.4_1698341175773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_86","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_86","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_86| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-86 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_87_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_87_en.md new file mode 100644 index 000000000000..2a827624fe90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_87_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_87 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_87 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_87` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_87_en_5.1.4_3.4_1698341373677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_87_en_5.1.4_3.4_1698341373677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_87","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_87","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_87|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-87
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_88_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_88_en.md
new file mode 100644
index 000000000000..b78b85376b0f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_88_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_88 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_88
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_88` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_88_en_5.1.4_3.4_1698341564396.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_88_en_5.1.4_3.4_1698341564396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_88","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_88","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_88|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-88
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_89_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_89_en.md
new file mode 100644
index 000000000000..bd5d59040884
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_89_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_89 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_89
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_89` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_89_en_5.1.4_3.4_1698342377586.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_89_en_5.1.4_3.4_1698342377586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_89","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_89","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_89|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-89
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_90_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_90_en.md
new file mode 100644
index 000000000000..d6d6d0946b04
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_90_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_90 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_90
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_90` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_90_en_5.1.4_3.4_1698343548621.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_90_en_5.1.4_3.4_1698343548621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_90","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_90","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_90|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-90
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_91_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_91_en.md
new file mode 100644
index 000000000000..2291778457c6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_91_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_91 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_91
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_91` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_91_en_5.1.4_3.4_1698344286817.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_91_en_5.1.4_3.4_1698344286817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_91","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_91","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_91|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-91
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_92_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_92_en.md
new file mode 100644
index 000000000000..8b3bc8e1a8c0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_92_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_92 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_92
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_92` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_92_en_5.1.4_3.4_1698345383731.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_92_en_5.1.4_3.4_1698345383731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_92","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_92","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_92|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-92
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_93_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_93_en.md
new file mode 100644
index 000000000000..1ddce8a6bb65
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_93_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_93 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_93
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_93` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_93_en_5.1.4_3.4_1698346402594.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_93_en_5.1.4_3.4_1698346402594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_93","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_93","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_93|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-93
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_94_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_94_en.md
new file mode 100644
index 000000000000..719be73e8daa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_94_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_94 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_94
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_94` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_94_en_5.1.4_3.4_1698347297858.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_94_en_5.1.4_3.4_1698347297858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_94","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_94","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_94|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-94
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_95_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_95_en.md
new file mode 100644
index 000000000000..47a737573f2c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_95_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_95 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_95
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_95` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_95_en_5.1.4_3.4_1698348321692.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_95_en_5.1.4_3.4_1698348321692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_95","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_95","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_95|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-95
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_96_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_96_en.md
new file mode 100644
index 000000000000..09065f7b8011
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_96_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_96 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_96
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_96` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_96_en_5.1.4_3.4_1698349137775.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_96_en_5.1.4_3.4_1698349137775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_96","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_96","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_96|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-96
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_97_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_97_en.md
new file mode 100644
index 000000000000..f6977027ebf3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_97_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_97 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_97
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_97` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_97_en_5.1.4_3.4_1698350218361.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_97_en_5.1.4_3.4_1698350218361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_97","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_97","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|6ep_bert_ft_cola_97|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-97
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_98_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_98_en.md
new file mode 100644
index 000000000000..4041f4e7a655
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_98_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English 6ep_bert_ft_cola_98 BertForSequenceClassification from Jeevesh8
+author: John Snow Labs
+name: 6ep_bert_ft_cola_98
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_98` is an English model originally trained by Jeevesh8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_98_en_5.1.4_3.4_1698351147127.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_98_en_5.1.4_3.4_1698351147127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_98","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_98","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_98| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-98 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_99_en.md b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_99_en.md new file mode 100644 index 000000000000..b2b61e4ff6b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-6ep_bert_ft_cola_99_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 6ep_bert_ft_cola_99 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 6ep_bert_ft_cola_99 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `6ep_bert_ft_cola_99` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_99_en_5.1.4_3.4_1698351964261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/6ep_bert_ft_cola_99_en_5.1.4_3.4_1698351964261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_99","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("6ep_bert_ft_cola_99","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|6ep_bert_ft_cola_99| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/6ep_bert_ft_cola-99 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-augmented_bert_en.md b/docs/_posts/ahmedlone127/2023-10-26-augmented_bert_en.md new file mode 100644 index 000000000000..be15254dad9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-augmented_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English augmented_bert BertForSequenceClassification from noob123 +author: John Snow Labs +name: augmented_bert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augmented_bert` is an English model originally trained by noob123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augmented_bert_en_5.1.4_3.4_1698325486375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augmented_bert_en_5.1.4_3.4_1698325486375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("augmented_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("augmented_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|augmented_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/noob123/augmented_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bangla_bert_base_vitd_bn.md b/docs/_posts/ahmedlone127/2023-10-26-bangla_bert_base_vitd_bn.md new file mode 100644 index 000000000000..4b89e894d2e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bangla_bert_base_vitd_bn.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Bengali bangla_bert_base_vitd BertForSequenceClassification from ka05ar +author: John Snow Labs +name: bangla_bert_base_vitd +date: 2023-10-26 +tags: [bert, bn, open_source, sequence_classification, onnx] +task: Text Classification +language: bn +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bangla_bert_base_vitd` is a Bengali model originally trained by ka05ar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bangla_bert_base_vitd_bn_5.1.4_3.4_1698324751110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bangla_bert_base_vitd_bn_5.1.4_3.4_1698324751110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bangla_bert_base_vitd","bn")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bangla_bert_base_vitd","bn") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bangla_bert_base_vitd| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|bn| +|Size:|616.9 MB| + +## References + +https://huggingface.co/ka05ar/bangla-bert-base-VITD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-beep_kcbert_base_hate_en.md b/docs/_posts/ahmedlone127/2023-10-26-beep_kcbert_base_hate_en.md new file mode 100644 index 000000000000..9c566694e9d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-beep_kcbert_base_hate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English beep_kcbert_base_hate BertForSequenceClassification from beomi +author: John Snow Labs +name: beep_kcbert_base_hate +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beep_kcbert_base_hate` is an English model originally trained by beomi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beep_kcbert_base_hate_en_5.1.4_3.4_1698341347704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beep_kcbert_base_hate_en_5.1.4_3.4_1698341347704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("beep_kcbert_base_hate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beep_kcbert_base_hate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beep_kcbert_base_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.4 MB| + +## References + +https://huggingface.co/beomi/beep-kcbert-base-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_action_romanian_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_action_romanian_en.md new file mode 100644 index 000000000000..e6a93157ec51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_action_romanian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_action_romanian BertForSequenceClassification from LibrAI +author: John Snow Labs +name: bert_action_romanian +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_action_romanian` is an English model originally trained by LibrAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_action_romanian_en_5.1.4_3.4_1698351900116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_action_romanian_en_5.1.4_3.4_1698351900116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_action_romanian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_action_romanian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_action_romanian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/LibrAI/bert-action-ro \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_aig_flvs_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_aig_flvs_en.md new file mode 100644 index 000000000000..3fabf676568e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_aig_flvs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_aig_flvs BertForSequenceClassification from wallacenpj +author: John Snow Labs +name: bert_aig_flvs +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_aig_flvs` is an English model originally trained by wallacenpj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_aig_flvs_en_5.1.4_3.4_1698313666989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_aig_flvs_en_5.1.4_3.4_1698313666989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_aig_flvs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_aig_flvs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_aig_flvs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/wallacenpj/bert_aig_flvs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_egel_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_egel_en.md new file mode 100644 index 000000000000..6c5c2abbb474 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_egel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_banking77_pt2_egel BertForSequenceClassification from Egel +author: John Snow Labs +name: bert_base_banking77_pt2_egel +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_egel` is an English model originally trained by Egel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_egel_en_5.1.4_3.4_1698323789337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_egel_en_5.1.4_3.4_1698323789337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_egel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_egel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_egel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/Egel/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_philschmid_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_philschmid_en.md new file mode 100644 index 000000000000..3f9a0f154cf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_banking77_pt2_philschmid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_banking77_pt2_philschmid BertForSequenceClassification from philschmid +author: John Snow Labs +name: bert_base_banking77_pt2_philschmid +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_philschmid` is an English model originally trained by philschmid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_philschmid_en_5.1.4_3.4_1698355466992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_philschmid_en_5.1.4_3.4_1698355466992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_philschmid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_philschmid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_philschmid| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/philschmid/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_caption_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_caption_classifier_en.md new file mode 100644 index 000000000000..4db421da1474 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_caption_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_caption_classifier BertForSequenceClassification from nielsr +author: John Snow Labs +name: bert_base_caption_classifier +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_caption_classifier` is an English model originally trained by nielsr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_caption_classifier_en_5.1.4_3.4_1698316747852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_caption_classifier_en_5.1.4_3.4_1698316747852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_caption_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_caption_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_caption_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nielsr/bert-base-caption-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_finetuned_mrpc_yangdechuan_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_finetuned_mrpc_yangdechuan_en.md new file mode 100644 index 000000000000..2f7c193ede32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_finetuned_mrpc_yangdechuan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_finetuned_mrpc_yangdechuan BertForSequenceClassification from yangdechuan +author: John Snow Labs +name: bert_base_cased_finetuned_mrpc_yangdechuan +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_mrpc_yangdechuan` is an English model originally trained by yangdechuan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_mrpc_yangdechuan_en_5.1.4_3.4_1698361422710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_mrpc_yangdechuan_en_5.1.4_3.4_1698361422710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_mrpc_yangdechuan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_mrpc_yangdechuan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_mrpc_yangdechuan| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/yangdechuan/bert-base-cased-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_zeyu2000_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_zeyu2000_en.md new file mode 100644 index 000000000000..2bc6aa931e6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_cased_zeyu2000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_zeyu2000 BertForSequenceClassification from Zeyu2000 +author: John Snow Labs +name: bert_base_cased_zeyu2000 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_zeyu2000` is an English model originally trained by Zeyu2000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_zeyu2000_en_5.1.4_3.4_1698341138734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_zeyu2000_en_5.1.4_3.4_1698341138734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_zeyu2000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_zeyu2000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_zeyu2000| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Zeyu2000/bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_dutch_cased_hebban_reviews5_nl.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_dutch_cased_hebban_reviews5_nl.md new file mode 100644 index 000000000000..17e64aaada5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_dutch_cased_hebban_reviews5_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish bert_base_dutch_cased_hebban_reviews5 BertForSequenceClassification from BramVanroy +author: John Snow Labs +name: bert_base_dutch_cased_hebban_reviews5 +date: 2023-10-26 +tags: [bert, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dutch_cased_hebban_reviews5` is a Dutch, Flemish model originally trained by BramVanroy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_hebban_reviews5_nl_5.1.4_3.4_1698326263409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_hebban_reviews5_nl_5.1.4_3.4_1698326263409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_dutch_cased_hebban_reviews5","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_dutch_cased_hebban_reviews5","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_dutch_cased_hebban_reviews5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.0 MB| + +## References + +https://huggingface.co/BramVanroy/bert-base-dutch-cased-hebban-reviews5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_emotion_en.md index 6cf358f69ff9..e92bf461d6eb 100644 --- a/docs/_posts/ahmedlone127/2023-10-26-bert_base_emotion_en.md +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_emotion_en.md @@ -21,11 +21,15 @@ use_language_switcher: "Python-Scala-Java" Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_emotion` is a English model originally trained by Anonymous1111. +## Predicted Entities + + + {:.btn-box} -[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_emotion_en_5.1.4_3.4_1698294005799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} -[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_emotion_en_5.1.4_3.4_1698294005799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_emotion_en_5.1.4_3.4_1698311943722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_emotion_en_5.1.4_3.4_1698311943722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} ## How to use @@ -34,7 +38,6 @@ Pretrained BertForSequenceClassification model, adapted from Hugging Face and cu
{% include programmingLanguageSelectScalaPythonNLU.html %} ```python - document_assembler = DocumentAssembler()\ .setInputCol("text")\ .setOutputCol("document") @@ -52,10 +55,8 @@ pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifi data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") result = pipeline.fit(data).transform(data) - ``` ```scala - val document_assembler = new DocumentAssembler() .setInputCol("text") .setOutputCol("document") @@ -73,8 +74,6 @@ val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequ val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") val result = pipeline.fit(data).transform(data) - - ```
@@ -94,4 +93,6 @@ val result = pipeline.fit(data).transform(data) ## References +References + https://huggingface.co/Anonymous1111/bert-base-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_brcps12_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_brcps12_en.md new file mode 100644 index 000000000000..d9f37febd1dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_brcps12_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_finetuned_sts_brcps12 BertForSequenceClassification from brcps12 +author: John Snow Labs +name: bert_base_finetuned_sts_brcps12 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_brcps12` is an English model originally trained by brcps12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_brcps12_en_5.1.4_3.4_1698356608256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_brcps12_en_5.1.4_3.4_1698356608256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_brcps12","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_brcps12","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_brcps12| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/brcps12/bert-base-finetuned-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_wisejiyoon_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_wisejiyoon_en.md new file mode 100644 index 000000000000..efd4efd38e6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_sts_wisejiyoon_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_finetuned_sts_wisejiyoon BertForSequenceClassification from wisejiyoon +author: John Snow Labs +name: bert_base_finetuned_sts_wisejiyoon +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sts_wisejiyoon` is an English model originally trained by wisejiyoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_wisejiyoon_en_5.1.4_3.4_1698318991323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sts_wisejiyoon_en_5.1.4_3.4_1698318991323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_wisejiyoon","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_sts_wisejiyoon","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sts_wisejiyoon| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/wisejiyoon/bert-base-finetuned-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_toxic_comment_classification_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_toxic_comment_classification_en.md new file mode 100644 index 000000000000..37c99959db33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_toxic_comment_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_finetuned_toxic_comment_classification BertForSequenceClassification from ZiruiXiong +author: John Snow Labs +name: bert_base_finetuned_toxic_comment_classification +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_toxic_comment_classification` is an English model originally trained by ZiruiXiong.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_toxic_comment_classification_en_5.1.4_3.4_1698318912214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_toxic_comment_classification_en_5.1.4_3.4_1698318912214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_toxic_comment_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_toxic_comment_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_toxic_comment_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ZiruiXiong/bert-base-finetuned-toxic-comment-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_ynat_bash1130_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_ynat_bash1130_en.md new file mode 100644 index 000000000000..7ba2f721fbf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_finetuned_ynat_bash1130_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_finetuned_ynat_bash1130 BertForSequenceClassification from bash1130 +author: John Snow Labs +name: bert_base_finetuned_ynat_bash1130 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_ynat_bash1130` is an English model originally trained by bash1130. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_bash1130_en_5.1.4_3.4_1698360227120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ynat_bash1130_en_5.1.4_3.4_1698360227120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_ynat_bash1130","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_finetuned_ynat_bash1130","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_ynat_bash1130| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/bash1130/bert-base-finetuned-ynat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_bak_rus_similarity_xx.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_bak_rus_similarity_xx.md new file mode 100644 index 000000000000..708152cbb6fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_bak_rus_similarity_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_bak_rus_similarity BertForSequenceClassification from slone +author: John Snow Labs +name: bert_base_multilingual_cased_bak_rus_similarity +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_bak_rus_similarity` is a Multilingual model originally trained by slone.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_bak_rus_similarity_xx_5.1.4_3.4_1698362086778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_bak_rus_similarity_xx_5.1.4_3.4_1698362086778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_bak_rus_similarity","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_bak_rus_similarity","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_bak_rus_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/slone/bert-base-multilingual-cased-bak-rus-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_fine_tuned_intent_classification_xx.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_fine_tuned_intent_classification_xx.md new file mode 100644 index 000000000000..2677c040c593 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_fine_tuned_intent_classification_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_fine_tuned_intent_classification BertForSequenceClassification from Geo +author: John Snow Labs +name: bert_base_multilingual_cased_fine_tuned_intent_classification +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_fine_tuned_intent_classification` is a Multilingual model originally trained by Geo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_fine_tuned_intent_classification_xx_5.1.4_3.4_1698348512669.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_fine_tuned_intent_classification_xx_5.1.4_3.4_1698348512669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_fine_tuned_intent_classification","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_fine_tuned_intent_classification","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_fine_tuned_intent_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Geo/bert-base-multilingual-cased-fine-tuned-intent-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_finetuned_nli_xx.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_finetuned_nli_xx.md new file mode 100644 index 000000000000..b54867af587f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_finetuned_nli_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_nli BertForSequenceClassification from MayaGalvez +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_nli +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_nli` is a Multilingual model originally trained by MayaGalvez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_nli_xx_5.1.4_3.4_1698358462400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_nli_xx_5.1.4_3.4_1698358462400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_nli","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_nli","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_nli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/MayaGalvez/bert-base-multilingual-cased-finetuned-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_hebban_reviews5_xx.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_hebban_reviews5_xx.md new file mode 100644 index 000000000000..fe88b3266f76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_hebban_reviews5_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_hebban_reviews5 BertForSequenceClassification from BramVanroy +author: John Snow Labs +name: bert_base_multilingual_cased_hebban_reviews5 +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_hebban_reviews5` is a Multilingual model originally trained by BramVanroy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_hebban_reviews5_xx_5.1.4_3.4_1698327437700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_hebban_reviews5_xx_5.1.4_3.4_1698327437700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_hebban_reviews5","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_hebban_reviews5","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_hebban_reviews5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/BramVanroy/bert-base-multilingual-cased-hebban-reviews5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_vitd_xx.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_vitd_xx.md new file mode 100644 index 000000000000..a36abc1ef97e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_multilingual_cased_vitd_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_vitd BertForSequenceClassification from ka05ar +author: John Snow Labs +name: bert_base_multilingual_cased_vitd +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_vitd` is a multilingual model originally trained by ka05ar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vitd_xx_5.1.4_3.4_1698325714264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_vitd_xx_5.1.4_3.4_1698325714264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vitd","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_vitd","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_vitd| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/ka05ar/bert-base-multilingual-cased-VITD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_classifieronly_henryscheible_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_classifieronly_henryscheible_en.md new file mode 100644 index 000000000000..8aa138780a40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_classifieronly_henryscheible_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_crows_pairs_classifieronly_henryscheible BertForSequenceClassification from henryscheible +author: John Snow Labs +name: bert_base_uncased_crows_pairs_classifieronly_henryscheible +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_crows_pairs_classifieronly_henryscheible` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_crows_pairs_classifieronly_henryscheible_en_5.1.4_3.4_1698321086344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_crows_pairs_classifieronly_henryscheible_en_5.1.4_3.4_1698321086344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_crows_pairs_classifieronly_henryscheible","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_crows_pairs_classifieronly_henryscheible","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_crows_pairs_classifieronly_henryscheible| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/bert-base-uncased_crows_pairs_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_finetuned_en.md new file mode 100644 index 000000000000..801c49989491 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_crows_pairs_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_crows_pairs_finetuned BertForSequenceClassification from henryscheible +author: John Snow Labs +name: bert_base_uncased_crows_pairs_finetuned +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_crows_pairs_finetuned` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_crows_pairs_finetuned_en_5.1.4_3.4_1698322773265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_crows_pairs_finetuned_en_5.1.4_3.4_1698322773265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_crows_pairs_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_crows_pairs_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_crows_pairs_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/bert-base-uncased_crows_pairs_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_dailydialog_turn_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_dailydialog_turn_classifier_en.md new file mode 100644 index 000000000000..5ab7dae508cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_dailydialog_turn_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_dailydialog_turn_classifier BertForSequenceClassification from benjaminbeilharz +author: John Snow Labs +name: bert_base_uncased_dailydialog_turn_classifier +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_dailydialog_turn_classifier` is an English model originally trained by benjaminbeilharz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dailydialog_turn_classifier_en_5.1.4_3.4_1698340341576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_dailydialog_turn_classifier_en_5.1.4_3.4_1698340341576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_dailydialog_turn_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_dailydialog_turn_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_dailydialog_turn_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benjaminbeilharz/bert-base-uncased-dailydialog-turn-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_emotion_honours_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_emotion_honours_en.md new file mode 100644 index 000000000000..ec2beeb86440 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_emotion_honours_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_emotion_honours BertForSequenceClassification from L-40408203 +author: John Snow Labs +name: bert_base_uncased_emotion_honours +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_honours` is an English model originally trained by L-40408203. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_honours_en_5.1.4_3.4_1698359158419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_honours_en_5.1.4_3.4_1698359158419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_honours","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_honours","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_honours| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/L-40408203/bert-base-uncased-emotion-honours \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_empatheticdialogues_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_empatheticdialogues_sentiment_classifier_en.md new file mode 100644 index 000000000000..a1be6f143fd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_empatheticdialogues_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_empatheticdialogues_sentiment_classifier BertForSequenceClassification from benjaminbeilharz +author: John Snow Labs +name: bert_base_uncased_empatheticdialogues_sentiment_classifier +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_empatheticdialogues_sentiment_classifier` is an English model originally trained by benjaminbeilharz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_empatheticdialogues_sentiment_classifier_en_5.1.4_3.4_1698340532992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_empatheticdialogues_sentiment_classifier_en_5.1.4_3.4_1698340532992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_empatheticdialogues_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_empatheticdialogues_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_empatheticdialogues_sentiment_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benjaminbeilharz/bert-base-uncased-empatheticdialogues-sentiment-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_ag_news_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_ag_news_en.md new file mode 100644 index 000000000000..60e872139c39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_ag_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_ag_news BertForSequenceClassification from 202k +author: John Snow Labs +name: bert_base_uncased_finetuned_ag_news +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_ag_news` is an English model originally trained by 202k. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ag_news_en_5.1.4_3.4_1698358204993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ag_news_en_5.1.4_3.4_1698358204993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_ag_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_ag_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_ag_news| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/202k/bert-base-uncased-finetuned-ag_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_md_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_md_en.md new file mode 100644 index 000000000000..35a171c34dc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_md_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_md BertForSequenceClassification from caioamb +author: John Snow Labs +name: bert_base_uncased_finetuned_md +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_md` is an English model originally trained by caioamb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_md_en_5.1.4_3.4_1698358974527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_md_en_5.1.4_3.4_1698358974527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_md","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_md","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_md| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/caioamb/bert-base-uncased-finetuned-md \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_review_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_review_sentiment_analysis_en.md new file mode 100644 index 000000000000..fde2866a7129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_review_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_review_sentiment_analysis BertForSequenceClassification from DataMonke +author: John Snow Labs +name: bert_base_uncased_finetuned_review_sentiment_analysis +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_review_sentiment_analysis` is an English model originally trained by DataMonke. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_review_sentiment_analysis_en_5.1.4_3.4_1698341604035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_review_sentiment_analysis_en_5.1.4_3.4_1698341604035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_review_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_review_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_review_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/DataMonke/bert-base-uncased-finetuned-review-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_smsspam_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_smsspam_en.md new file mode 100644 index 000000000000..00d0a69f60e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_smsspam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_smsspam BertForSequenceClassification from shre-db +author: John Snow Labs +name: bert_base_uncased_finetuned_smsspam +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_smsspam` is an English model originally trained by shre-db. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_smsspam_en_5.1.4_3.4_1698340774498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_smsspam_en_5.1.4_3.4_1698340774498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_smsspam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_smsspam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_smsspam| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/shre-db/bert-base-uncased-finetuned-smsspam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_wnli_jinghan_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_wnli_jinghan_en.md new file mode 100644 index 000000000000..2ccb7fc46bd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_finetuned_wnli_jinghan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_wnli_jinghan BertForSequenceClassification from jinghan +author: John Snow Labs +name: bert_base_uncased_finetuned_wnli_jinghan +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_wnli_jinghan` is an English model originally trained by jinghan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wnli_jinghan_en_5.1.4_3.4_1698324635067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_wnli_jinghan_en_5.1.4_3.4_1698324635067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_wnli_jinghan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_wnli_jinghan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_wnli_jinghan| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jinghan/bert-base-uncased-finetuned-wnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_def_v1_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_def_v1_en.md new file mode 100644 index 000000000000..87f64247d3ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_def_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_hoax_classifier_def_v1 BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_base_uncased_hoax_classifier_def_v1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hoax_classifier_def_v1` is an English model originally trained by research-dump. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_def_v1_en_5.1.4_3.4_1698343785308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_def_v1_en_5.1.4_3.4_1698343785308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_def_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_def_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hoax_classifier_def_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/research-dump/bert-base-uncased_hoax_classifier_def_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_tsonga_v1_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_tsonga_v1_en.md new file mode 100644 index 000000000000..0a2eca289965 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_hoax_classifier_tsonga_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_hoax_classifier_tsonga_v1 BertForSequenceClassification from research-dump +author: John Snow Labs +name: bert_base_uncased_hoax_classifier_tsonga_v1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_hoax_classifier_tsonga_v1` is an English model originally trained by research-dump. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_tsonga_v1_en_5.1.4_3.4_1698349545741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_hoax_classifier_tsonga_v1_en_5.1.4_3.4_1698349545741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_tsonga_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_hoax_classifier_tsonga_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_hoax_classifier_tsonga_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/research-dump/bert-base-uncased_hoax_classifier_ts_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology_en.md new file mode 100644 index 000000000000..694e3b835693 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology` is an English model originally trained by jonas-luehrs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology_en_5.1.4_3.4_1698312550574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology_en_5.1.4_3.4_1698312550574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mlm_scirepeval_fos_chemistry_textcls_rheology| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-MLM-scirepeval_fos_chemistry-textCLS-RHEOLOGY \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology_en.md new file mode 100644 index 000000000000..96186a00b388 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology` is an English model originally trained by jonas-luehrs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology_en_5.1.4_3.4_1698340158748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology_en_5.1.4_3.4_1698340158748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mlp_scirepeval_chemistry_large_textcls_rheology| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-MLP-scirepeval-chemistry-LARGE-textCLS-RHEOLOGY \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mnli_v1_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mnli_v1_en.md new file mode 100644 index 000000000000..a6a73dcb8df4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_mnli_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mnli_v1 BertForSequenceClassification from blackbird +author: John Snow Labs +name: bert_base_uncased_mnli_v1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_v1` is an English model originally trained by blackbird. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_v1_en_5.1.4_3.4_1698344775544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_v1_en_5.1.4_3.4_1698344775544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/blackbird/bert-base-uncased-MNLI-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_next_turn_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_next_turn_classifier_en.md new file mode 100644 index 000000000000..4b69ad9e829d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_next_turn_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_next_turn_classifier BertForSequenceClassification from benjaminbeilharz +author: John Snow Labs +name: bert_base_uncased_next_turn_classifier +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_next_turn_classifier` is an English model originally trained by benjaminbeilharz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_next_turn_classifier_en_5.1.4_3.4_1698340706122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_next_turn_classifier_en_5.1.4_3.4_1698340706122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_next_turn_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_next_turn_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_next_turn_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/benjaminbeilharz/bert-base-uncased-next-turn-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sentiment_classifier_en.md new file mode 100644 index 000000000000..9723a553efb8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sentiment_classifier BertForSequenceClassification from benjaminbeilharz +author: John Snow Labs +name: bert_base_uncased_sentiment_classifier +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sentiment_classifier` is an English model originally trained by benjaminbeilharz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sentiment_classifier_en_5.1.4_3.4_1698340893522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sentiment_classifier_en_5.1.4_3.4_1698340893522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sentiment_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/benjaminbeilharz/bert-base-uncased-sentiment-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sst2_aviator_neural_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sst2_aviator_neural_en.md new file mode 100644 index 000000000000..a56b26cd49b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_sst2_aviator_neural_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_aviator_neural BertForSequenceClassification from aviator-neural +author: John Snow Labs +name: bert_base_uncased_sst2_aviator_neural +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_aviator_neural` is an English model originally trained by aviator-neural. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_aviator_neural_en_5.1.4_3.4_1698312280148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_aviator_neural_en_5.1.4_3.4_1698312280148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_aviator_neural","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_aviator_neural","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_aviator_neural| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/aviator-neural/bert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_textcls_rheology_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_textcls_rheology_en.md new file mode 100644 index 000000000000..350bdef1d5f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_textcls_rheology_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_textcls_rheology BertForSequenceClassification from jonas-luehrs +author: John Snow Labs +name: bert_base_uncased_textcls_rheology +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_textcls_rheology` is an English model originally trained by jonas-luehrs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_en_5.1.4_3.4_1698340342229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_textcls_rheology_en_5.1.4_3.4_1698340342229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_textcls_rheology","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_textcls_rheology| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas-luehrs/bert-base-uncased-textCLS-RHEOLOGY \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_classifieronly_henryscheible_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_classifieronly_henryscheible_en.md new file mode 100644 index 000000000000..71fce56ddc26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_classifieronly_henryscheible_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_winobias_classifieronly_henryscheible BertForSequenceClassification from henryscheible +author: John Snow Labs +name: bert_base_uncased_winobias_classifieronly_henryscheible +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_winobias_classifieronly_henryscheible` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_classifieronly_henryscheible_en_5.1.4_3.4_1698322077587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_classifieronly_henryscheible_en_5.1.4_3.4_1698322077587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_classifieronly_henryscheible","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_classifieronly_henryscheible","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_winobias_classifieronly_henryscheible| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/bert-base-uncased_winobias_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_finetuned_en.md new file mode 100644 index 000000000000..a9603562db6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_base_uncased_winobias_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_winobias_finetuned BertForSequenceClassification from henryscheible +author: John Snow Labs +name: bert_base_uncased_winobias_finetuned +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_winobias_finetuned` is an English model originally trained by henryscheible. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_finetuned_en_5.1.4_3.4_1698323667058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_winobias_finetuned_en_5.1.4_3.4_1698323667058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_winobias_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_winobias_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/henryscheible/bert-base-uncased_winobias_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_based_uncased_imdb_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_based_uncased_imdb_en.md new file mode 100644 index 000000000000..001d2e710fe4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_based_uncased_imdb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_based_uncased_imdb BertForSequenceClassification from car13mesquita +author: John Snow Labs +name: bert_based_uncased_imdb +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_uncased_imdb` is an English model originally trained by car13mesquita. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_uncased_imdb_en_5.1.4_3.4_1698340942907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_uncased_imdb_en_5.1.4_3.4_1698340942907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_based_uncased_imdb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_based_uncased_imdb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_uncased_imdb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/car13mesquita/bert-based-uncased-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_2_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_2_en.md new file mode 100644 index 000000000000..11769ecaa0be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_cased_exist_2 BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: bert_cased_exist_2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cased_exist_2` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cased_exist_2_en_5.1.4_3.4_1698351900122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cased_exist_2_en_5.1.4_3.4_1698351900122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_cased_exist_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_cased_exist_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cased_exist_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nouman-10/bert-cased-exist-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_5_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_5_en.md new file mode 100644 index 000000000000..8ca396d58556 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_cased_exist_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_cased_exist_5 BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: bert_cased_exist_5 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cased_exist_5` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cased_exist_5_en_5.1.4_3.4_1698351142148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cased_exist_5_en_5.1.4_3.4_1698351142148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_cased_exist_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_cased_exist_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cased_exist_5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nouman-10/bert-cased-exist-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_chinese_ainews_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_chinese_ainews_en.md new file mode 100644 index 000000000000..59629cc61419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_chinese_ainews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_chinese_ainews BertForSequenceClassification from AllenMai +author: John Snow Labs +name: bert_chinese_ainews +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_chinese_ainews` is an English model originally trained by AllenMai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_chinese_ainews_en_5.1.4_3.4_1698312776524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_chinese_ainews_en_5.1.4_3.4_1698312776524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_chinese_ainews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_chinese_ainews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_chinese_ainews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/AllenMai/bert-chinese-ainews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_random_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_random_en.md new file mode 100644 index 000000000000..6820fb5e9685 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_random_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_base_gpt2detector_random +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-gpt2detector-random` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_random_en_5.1.4_3.4_1698322884373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_random_en_5.1.4_3.4_1698322884373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_random","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_random","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base_random.by_baykenney").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_gpt2detector_random| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-base-gpt2detector-random \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topk40_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topk40_en.md new file mode 100644 index 000000000000..54bc4310b935 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topk40_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_base_gpt2detector_topk40 +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-gpt2detector-topk40` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topk40_en_5.1.4_3.4_1698324093316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topk40_en_5.1.4_3.4_1698324093316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topk40","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topk40","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base_topk40.by_baykenney").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_gpt2detector_topk40| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-base-gpt2detector-topk40 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp92_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp92_en.md new file mode 100644 index 000000000000..a3f22f45fee8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp92_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_base_gpt2detector_topp92 +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-gpt2detector-topp92` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topp92_en_5.1.4_3.4_1698325313708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topp92_en_5.1.4_3.4_1698325313708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topp92","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topp92","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_gpt2detector_topp92| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-base-gpt2detector-topp92 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp96_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp96_en.md new file mode 100644 index 000000000000..84bbb07c35bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_gpt2detector_topp96_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_base_gpt2detector_topp96 +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-gpt2detector-topp96` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topp96_en_5.1.4_3.4_1698326381270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_gpt2detector_topp96_en_5.1.4_3.4_1698326381270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topp96","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_gpt2detector_topp96","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base_topp96.by_baykenney").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_gpt2detector_topp96| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-base-gpt2detector-topp96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_en.md new file mode 100644 index 000000000000..aadc41ff2da6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from hazrulakmal) +author: John Snow Labs +name: bert_classifier_base_uncased_finetuned +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned` is an English model originally trained by `hazrulakmal`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_en_5.1.4_3.4_1698363742576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_en_5.1.4_3.4_1698363742576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base_finetuned.by_hazrulakmal").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/hazrulakmal/bert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_plutchik_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_plutchik_emotion_en.md new file mode 100644 index 000000000000..8485d9e454db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_base_uncased_finetuned_plutchik_emotion_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from Yuetian) +author: John Snow Labs +name: bert_classifier_base_uncased_finetuned_plutchik_emotion +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-plutchik-emotion` is an English model originally trained by `Yuetian`. 
+ +## Predicted Entities + +`sadness`, `anger`, `disgust`, `fear`, `joy`, `anticipation`, `surprise`, `trust` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_plutchik_emotion_en_5.1.4_3.4_1698318227944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_plutchik_emotion_en_5.1.4_3.4_1698318227944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_plutchik_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_plutchik_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base_finetuned.by_yuetian").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_finetuned_plutchik_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Yuetian/bert-base-uncased-finetuned-plutchik-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_cased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_cased_abstract_en.md new file mode 100644 index 000000000000..7af98e6bc225 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_cased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batterybert_cased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batterybert-cased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batterybert_cased_abstract_en_5.1.4_3.4_1698313901730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batterybert_cased_abstract_en_5.1.4_3.4_1698313901730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batterybert_cased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batterybert_cased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.battery.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batterybert_cased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batterybert-cased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_uncased_abstract_en.md new file mode 100644 index 000000000000..128a02f635e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batterybert_uncased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Uncased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batterybert_uncased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batterybert-uncased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batterybert_uncased_abstract_en_5.1.4_3.4_1698315174780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batterybert_uncased_abstract_en_5.1.4_3.4_1698315174780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batterybert_uncased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batterybert_uncased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.battery.uncased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batterybert_uncased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batterybert-uncased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_cased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_cased_abstract_en.md new file mode 100644 index 000000000000..3209a603f050 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_cased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batteryonlybert_cased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batteryonlybert-cased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryonlybert_cased_abstract_en_5.1.4_3.4_1698316346664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryonlybert_cased_abstract_en_5.1.4_3.4_1698316346664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryonlybert_cased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryonlybert_cased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.battery.cased.by_batterydata").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batteryonlybert_cased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batteryonlybert-cased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_uncased_abstract_en.md new file mode 100644 index 000000000000..e5b5217e9c00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryonlybert_uncased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Uncased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batteryonlybert_uncased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batteryonlybert-uncased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryonlybert_uncased_abstract_en_5.1.4_3.4_1698317556459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryonlybert_uncased_abstract_en_5.1.4_3.4_1698317556459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryonlybert_uncased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryonlybert_uncased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.battery.uncased.by_batterydata").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batteryonlybert_uncased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batteryonlybert-uncased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_cased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_cased_abstract_en.md new file mode 100644 index 000000000000..f6ed65504358 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_cased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batteryscibert_cased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batteryscibert-cased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryscibert_cased_abstract_en_5.1.4_3.4_1698318567781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryscibert_cased_abstract_en_5.1.4_3.4_1698318567781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryscibert_cased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryscibert_cased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.scibert.battery_scibert.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batteryscibert_cased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batteryscibert-cased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_uncased_abstract_en.md new file mode 100644 index 000000000000..a9a38c98881d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_batteryscibert_uncased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Uncased model (from batterydata) +author: John Snow Labs +name: bert_classifier_batteryscibert_uncased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batteryscibert-uncased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryscibert_uncased_abstract_en_5.1.4_3.4_1698319667047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_batteryscibert_uncased_abstract_en_5.1.4_3.4_1698319667047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryscibert_uncased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_batteryscibert_uncased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.scibert.battery_scibert.uncased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_batteryscibert_uncased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/batteryscibert-uncased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_beep_kc_base_bias_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_beep_kc_base_bias_en.md new file mode 100644 index 000000000000..996508454c8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_beep_kc_base_bias_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from beomi) +author: John Snow Labs +name: bert_classifier_beep_kc_base_bias +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beep-kcbert-base-bias` is an English model originally trained by `beomi`. 
+ +## Predicted Entities + +`others`, `none`, `gender` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_beep_kc_base_bias_en_5.1.4_3.4_1698341139349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_beep_kc_base_bias_en_5.1.4_3.4_1698341139349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beep_kc_base_bias","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beep_kc_base_bias","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base.by_beomi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_beep_kc_base_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/beomi/beep-kcbert-base-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_abstract_en.md new file mode 100644 index 000000000000..0360a30148a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from batterydata) +author: John Snow Labs +name: bert_classifier_bert_base_cased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`battery`, `non-battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_abstract_en_5.1.4_3.4_1698320487661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_abstract_en_5.1.4_3.4_1698320487661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_cased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/bert-base-cased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_trec_coarse_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_trec_coarse_en.md new file mode 100644 index 000000000000..edab1d7e4f4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_cased_trec_coarse_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from aychang) +author: John Snow Labs +name: bert_classifier_bert_base_cased_trec_coarse +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-trec-coarse` is an English model originally trained by `aychang`. 
+ +## Predicted Entities + +`HUM`, `ABBR`, `DESC`, `LOC`, `ENTY`, `NUM` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_trec_coarse_en_5.1.4_3.4_1698313080244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_trec_coarse_en_5.1.4_3.4_1698313080244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_trec_coarse","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_trec_coarse","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cased_base.by_aychang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_cased_trec_coarse| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aychang/bert-base-cased-trec-coarse \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_dutch_cased_hebban_reviews_nl.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_dutch_cased_hebban_reviews_nl.md new file mode 100644 index 000000000000..710a15334b03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_dutch_cased_hebban_reviews_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Base Cased model (from BramVanroy) +author: John Snow Labs +name: bert_classifier_bert_base_dutch_cased_hebban_reviews +date: 2023-10-26 +tags: [nl, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-dutch-cased-hebban-reviews` is a Dutch model originally trained by `BramVanroy`. 
+ +## Predicted Entities + +`negative`, `neutral`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_dutch_cased_hebban_reviews_nl_5.1.4_3.4_1698322929081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_dutch_cased_hebban_reviews_nl_5.1.4_3.4_1698322929081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_dutch_cased_hebban_reviews","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_dutch_cased_hebban_reviews","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_dutch_cased_hebban_reviews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BramVanroy/bert-base-dutch-cased-hebban-reviews +- https://paperswithcode.com/sota?task=sentiment+analysis&dataset=BramVanroy%2Fhebban-reviews+-+filtered_sentiment+-+2.0.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_multilingual_cased_hebban_reviews_nl.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_multilingual_cased_hebban_reviews_nl.md new file mode 100644 index 000000000000..55cfac34d89a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_multilingual_cased_hebban_reviews_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Base Cased model (from BramVanroy) +author: John Snow Labs +name: bert_classifier_bert_base_multilingual_cased_hebban_reviews +date: 2023-10-26 +tags: [nl, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-hebban-reviews` is a Dutch model originally trained by `BramVanroy`. 
+ +## Predicted Entities + +`negative`, `neutral`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_multilingual_cased_hebban_reviews_nl_5.1.4_3.4_1698321872824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_multilingual_cased_hebban_reviews_nl_5.1.4_3.4_1698321872824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_multilingual_cased_hebban_reviews","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_multilingual_cased_hebban_reviews","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.cased_multilingual_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_multilingual_cased_hebban_reviews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BramVanroy/bert-base-multilingual-cased-hebban-reviews +- https://paperswithcode.com/sota?task=sentiment+analysis&dataset=BramVanroy%2Fhebban-reviews+-+filtered_sentiment+-+2.0.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..eb68bf25530d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bert_base_uncased_abstract_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from batterydata) +author: John Snow Labs +name: bert_classifier_bert_base_uncased_abstract +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-abstract` is an English model originally trained by `batterydata`. 
+ +## Predicted Entities + +`non-battery`, `battery` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_uncased_abstract_en_5.1.4_3.4_1698321660501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_uncased_abstract_en_5.1.4_3.4_1698321660501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_uncased_abstract","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_uncased_abstract","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/batterydata/bert-base-uncased-abstract +- https://github.com/ShuHuang/batterybert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bhadresh_savani_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bhadresh_savani_base_uncased_emotion_en.md new file mode 100644 index 000000000000..e00d024e49e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_bhadresh_savani_base_uncased_emotion_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from bhadresh-savani) +author: John Snow Labs +name: bert_classifier_bhadresh_savani_base_uncased_emotion +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-emotion` is an English model originally trained by `bhadresh-savani`. 
+ +## Predicted Entities + +`anger`, `surprise`, `joy`, `love`, `sadness`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bhadresh_savani_base_uncased_emotion_en_5.1.4_3.4_1698343785484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bhadresh_savani_base_uncased_emotion_en_5.1.4_3.4_1698343785484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bhadresh_savani_base_uncased_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bhadresh_savani_base_uncased_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bhadresh_savani_base_uncased_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bhadresh-savani/bert-base-uncased-emotion +- https://arxiv.org/abs/1810.04805 +- https://github.com/bhadreshpsavani/ExploringSentimentalAnalysis/blob/main/SentimentalAnalysisWithDistilbert.ipynb +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_clinical_assertion_negation_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_clinical_assertion_negation_en.md new file mode 100644 index 000000000000..d9a0a880b129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_clinical_assertion_negation_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bvanaken) +author: John Snow Labs +name: bert_classifier_clinical_assertion_negation +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinical-assertion-negation-bert` is an English model originally trained by `bvanaken`. 
+ +## Predicted Entities + +`POSSIBLE`, `ABSENT`, `PRESENT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_clinical_assertion_negation_en_5.1.4_3.4_1698357855554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_clinical_assertion_negation_en_5.1.4_3.4_1698357855554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_clinical_assertion_negation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_clinical_assertion_negation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.clinical.by_bvanaken").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_clinical_assertion_negation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bvanaken/clinical-assertion-negation-bert +- https://aclanthology.org/2021.nlpmc-1.5/ +- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_e21_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_e21_en.md new file mode 100644 index 000000000000..6630fc08a120 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_e21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_e21 BertForSequenceClassification from arthurbittencourt +author: John Snow Labs +name: bert_classifier_e21 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_e21` is an English model originally trained by arthurbittencourt. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_e21_en_5.1.4_3.4_1698360547768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_e21_en_5.1.4_3.4_1698360547768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_e21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_e21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_e21| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/arthurbittencourt/bert_classifier_e21 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_finbert_fls_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_finbert_fls_en.md new file mode 100644 index 000000000000..43aeb5894beb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_finbert_fls_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from FinanceInc) +author: John Snow Labs +name: bert_classifier_finbert_fls +date: 2023-10-26 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_fls` is an English model originally trained by `FinanceInc`. + +## Predicted Entities + +`Specific FLS`, `Non-specific FLS`, `Not FLS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finbert_fls_en_5.1.4_3.4_1698315175596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finbert_fls_en_5.1.4_3.4_1698315175596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_finbert_fls","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_finbert_fls","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_financeinc").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finbert_fls| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/FinanceInc/finbert_fls +- https://finbert.ai/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_ilana_tiny_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_ilana_tiny_sst2_distilled_en.md new file mode 100644 index 000000000000..cf52c9c9f355 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_ilana_tiny_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from ilana) +author: John Snow Labs +name: bert_classifier_ilana_tiny_sst2_distilled +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-distilled` is an English model originally trained by `ilana`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_ilana_tiny_sst2_distilled_en_5.1.4_3.4_1698315946007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_ilana_tiny_sst2_distilled_en_5.1.4_3.4_1698315946007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ilana_tiny_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ilana_tiny_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.distilled_tiny.by_ilana").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_ilana_tiny_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ilana/tiny-bert-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_random_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_random_en.md new file mode 100644 index 000000000000..42ed47a09b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_random_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_large_gpt2detector_random +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-gpt2detector-random` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_large_gpt2detector_random_en_5.1.4_3.4_1698328205720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_large_gpt2detector_random_en_5.1.4_3.4_1698328205720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_gpt2detector_random","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_gpt2detector_random","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.large.random.by_baykenney").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_large_gpt2detector_random| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-large-gpt2detector-random \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_topk40_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_topk40_en.md new file mode 100644 index 000000000000..6f250d89f353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_large_gpt2detector_topk40_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from baykenney) +author: John Snow Labs +name: bert_classifier_large_gpt2detector_topk40 +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-gpt2detector-topk40` is an English model originally trained by `baykenney`. 
+ +## Predicted Entities + +`Machine`, `Human` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_large_gpt2detector_topk40_en_5.1.4_3.4_1698330092409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_large_gpt2detector_topk40_en_5.1.4_3.4_1698330092409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_gpt2detector_topk40","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_gpt2detector_topk40","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.large.topk40.by_baykenney").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_large_gpt2detector_topk40| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/baykenney/bert-large-gpt2detector-topk40 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_platzi_base_mrpc_glue_omar_espejel_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_platzi_base_mrpc_glue_omar_espejel_en.md new file mode 100644 index 000000000000..907d73a38ba8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_platzi_base_mrpc_glue_omar_espejel_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from platzi) +author: John Snow Labs +name: bert_classifier_platzi_base_mrpc_glue_omar_espejel +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi-bert-base-mrpc-glue-omar-espejel` is an English model originally trained by `platzi`. 
+ +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_platzi_base_mrpc_glue_omar_espejel_en_5.1.4_3.4_1698328324573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_platzi_base_mrpc_glue_omar_espejel_en_5.1.4_3.4_1698328324573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_platzi_base_mrpc_glue_omar_espejel","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_platzi_base_mrpc_glue_omar_espejel","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_platzi_base_mrpc_glue_omar_espejel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/platzi/platzi-bert-base-mrpc-glue-omar-espejel +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_cased_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_cased_en.md new file mode 100644 index 000000000000..b10bc18486e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_cased_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from boychaboy) +author: John Snow Labs +name: bert_classifier_snli_base_cased +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SNLI_bert-base-cased` is an English model originally trained by `boychaboy`. 
+ +## Predicted Entities + +`contradiction`, `entailment`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_snli_base_cased_en_5.1.4_3.4_1698353946745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_snli_base_cased_en_5.1.4_3.4_1698353946745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_snli_base_cased","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_snli_base_cased","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.snli.bert.cased_base.by_boychaboy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_snli_base_cased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/boychaboy/SNLI_bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_uncased_en.md new file mode 100644 index 000000000000..814a58a3800f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_snli_base_uncased_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from boychaboy) +author: John Snow Labs +name: bert_classifier_snli_base_uncased +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SNLI_bert-base-uncased` is an English model originally trained by `boychaboy`. 
+ +## Predicted Entities + +`contradiction`, `entailment`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_snli_base_uncased_en_5.1.4_3.4_1698355389860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_snli_base_uncased_en_5.1.4_3.4_1698355389860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_snli_base_uncased","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_snli_base_uncased","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base.by_boychaboy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_snli_base_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/boychaboy/SNLI_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_swtx_erlangshen_roberta_110m_similarity_zh.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_swtx_erlangshen_roberta_110m_similarity_zh.md new file mode 100644 index 000000000000..db2e6edc4479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_swtx_erlangshen_roberta_110m_similarity_zh.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from swtx) +author: John Snow Labs +name: bert_classifier_swtx_erlangshen_roberta_110m_similarity +date: 2023-10-26 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-110M-Similarity` is a Chinese model originally trained by `swtx`. 
+ +## Predicted Entities + +`similar`, `not similar` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_swtx_erlangshen_roberta_110m_similarity_zh_5.1.4_3.4_1698320193845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_swtx_erlangshen_roberta_110m_similarity_zh_5.1.4_3.4_1698320193845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_swtx_erlangshen_roberta_110m_similarity","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_swtx_erlangshen_roberta_110m_similarity","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.lang_110m.by_swtx").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_swtx_erlangshen_roberta_110m_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/swtx/Erlangshen-Roberta-110M-Similarity +- https://github.com/IDEA-CCNL/Fengshenbang-LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_tiny_finetuned_glue_rte_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_tiny_finetuned_glue_rte_en.md new file mode 100644 index 000000000000..22d3cf26ec12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_classifier_tiny_finetuned_glue_rte_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from muhtasham) +author: John Snow Labs +name: bert_classifier_tiny_finetuned_glue_rte +date: 2023-10-26 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-glue-rte` is an English model originally trained by `muhtasham`. 
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_finetuned_glue_rte_en_5.1.4_3.4_1698352916130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_finetuned_glue_rte_en_5.1.4_3.4_1698352916130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_finetuned_glue_rte","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_finetuned_glue_rte","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_finetuned_glue_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/muhtasham/bert-tiny-finetuned-glue-rte +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_emo_classifier_vasanth_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_emo_classifier_vasanth_en.md new file mode 100644 index 000000000000..852824dcdf11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_emo_classifier_vasanth_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emo_classifier_vasanth BertForSequenceClassification from Vasanth +author: John Snow Labs +name: bert_emo_classifier_vasanth +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emo_classifier_vasanth` is an English model originally trained by Vasanth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emo_classifier_vasanth_en_5.1.4_3.4_1698342650586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emo_classifier_vasanth_en_5.1.4_3.4_1698342650586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_emo_classifier_vasanth","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_emo_classifier_vasanth","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emo_classifier_vasanth| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Vasanth/bert_emo_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_fake_news_classification_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_fake_news_classification_fine_tuned_en.md new file mode 100644 index 000000000000..341880020cae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_fake_news_classification_fine_tuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fake_news_classification_fine_tuned BertForSequenceClassification from h-pal +author: John Snow Labs +name: bert_fake_news_classification_fine_tuned +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fake_news_classification_fine_tuned` is an English model originally trained by h-pal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fake_news_classification_fine_tuned_en_5.1.4_3.4_1698363021397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fake_news_classification_fine_tuned_en_5.1.4_3.4_1698363021397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fake_news_classification_fine_tuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fake_news_classification_fine_tuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fake_news_classification_fine_tuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/h-pal/bert-fake-news-classification-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_fe2plus_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_fe2plus_en.md new file mode 100644 index 000000000000..3f38cfdb3b63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_fe2plus_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fine_tuned_cola_fe2plus BertForSequenceClassification from fe2plus +author: John Snow Labs +name: bert_fine_tuned_cola_fe2plus +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_cola_fe2plus` is an English model originally trained by fe2plus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_fe2plus_en_5.1.4_3.4_1698325861556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_fe2plus_en_5.1.4_3.4_1698325861556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_fe2plus","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_fe2plus","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_cola_fe2plus| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/fe2plus/bert-fine-tuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_lagyamfi_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_lagyamfi_en.md new file mode 100644 index 000000000000..6c2d0647c529 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_lagyamfi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fine_tuned_cola_lagyamfi BertForSequenceClassification from Lagyamfi +author: John Snow Labs +name: bert_fine_tuned_cola_lagyamfi +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_cola_lagyamfi` is an English model originally trained by Lagyamfi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_lagyamfi_en_5.1.4_3.4_1698356548469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_lagyamfi_en_5.1.4_3.4_1698356548469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_lagyamfi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_lagyamfi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_cola_lagyamfi| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Lagyamfi/bert-fine-tuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_phamvanlinh143_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_phamvanlinh143_en.md new file mode 100644 index 000000000000..52a3d4ca9e36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_fine_tuned_cola_phamvanlinh143_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fine_tuned_cola_phamvanlinh143 BertForSequenceClassification from phamvanlinh143 +author: John Snow Labs +name: bert_fine_tuned_cola_phamvanlinh143 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_cola_phamvanlinh143` is an English model originally trained by phamvanlinh143. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_phamvanlinh143_en_5.1.4_3.4_1698317263916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_phamvanlinh143_en_5.1.4_3.4_1698317263916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_phamvanlinh143","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_phamvanlinh143","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_cola_phamvanlinh143| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/phamvanlinh143/bert-fine-tuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_finetuned_cryptos_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuned_cryptos_en.md new file mode 100644 index 000000000000..44968b4aa468 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuned_cryptos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuned_cryptos BertForSequenceClassification from flowfree +author: John Snow Labs +name: bert_finetuned_cryptos +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_cryptos` is an English model originally trained by flowfree. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_cryptos_en_5.1.4_3.4_1698346957105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_cryptos_en_5.1.4_3.4_1698346957105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_cryptos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_cryptos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_cryptos| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/flowfree/bert-finetuned-cryptos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_baihaisheng_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_baihaisheng_en.md new file mode 100644 index 000000000000..199d7669e752 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_baihaisheng_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuning_test_baihaisheng BertForSequenceClassification from baihaisheng +author: John Snow Labs +name: bert_finetuning_test_baihaisheng +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_baihaisheng` is an English model originally trained by baihaisheng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_baihaisheng_en_5.1.4_3.4_1698313285686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_baihaisheng_en_5.1.4_3.4_1698313285686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_baihaisheng","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_baihaisheng","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_baihaisheng| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/baihaisheng/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_bella_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_bella_en.md new file mode 100644 index 000000000000..8b29381f5a11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_finetuning_test_bella_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuning_test_bella BertForSequenceClassification from bella +author: John Snow Labs +name: bert_finetuning_test_bella +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_bella` is an English model originally trained by bella. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_bella_en_5.1.4_3.4_1698340139648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_bella_en_5.1.4_3.4_1698340139648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_bella","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_bella","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_bella| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/bella/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_harmful_romanian_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_harmful_romanian_en.md new file mode 100644 index 000000000000..d31af2071373 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_harmful_romanian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_harmful_romanian BertForSequenceClassification from LibrAI +author: John Snow Labs +name: bert_harmful_romanian +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_harmful_romanian` is an English model originally trained by LibrAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_harmful_romanian_en_5.1.4_3.4_1698357514500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_harmful_romanian_en_5.1.4_3.4_1698357514500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_harmful_romanian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_harmful_romanian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_harmful_romanian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/LibrAI/bert-harmful-ro \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_ielts_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_ielts_en.md new file mode 100644 index 000000000000..25b53198f728 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_ielts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_ielts BertForSequenceClassification from karanzrk +author: John Snow Labs +name: bert_ielts +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ielts` is an English model originally trained by karanzrk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ielts_en_5.1.4_3.4_1698329243603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ielts_en_5.1.4_3.4_1698329243603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_ielts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_ielts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ielts| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/karanzrk/bert-IELTS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp92_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp92_en.md new file mode 100644 index 000000000000..f6501c22f32c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp92_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_gpt2detector_topp92 BertForSequenceClassification from baykenney +author: John Snow Labs +name: bert_large_gpt2detector_topp92 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_gpt2detector_topp92` is an English model originally trained by baykenney. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_gpt2detector_topp92_en_5.1.4_3.4_1698339554145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_gpt2detector_topp92_en_5.1.4_3.4_1698339554145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_gpt2detector_topp92","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_gpt2detector_topp92","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_gpt2detector_topp92| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/baykenney/bert-large-gpt2detector-topp92 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp96_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp96_en.md new file mode 100644 index 000000000000..c236dee49a18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_gpt2detector_topp96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_gpt2detector_topp96 BertForSequenceClassification from baykenney +author: John Snow Labs +name: bert_large_gpt2detector_topp96 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_gpt2detector_topp96` is an English model originally trained by baykenney. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_gpt2detector_topp96_en_5.1.4_3.4_1698339921681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_gpt2detector_topp96_en_5.1.4_3.4_1698339921681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_gpt2detector_topp96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_gpt2detector_topp96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_gpt2detector_topp96|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/baykenney/bert-large-gpt2detector-topp96
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_10000_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_10000_en.md
new file mode 100644
index 000000000000..1c2becddf1a2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_10000_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_fine_tune_winogrande_8_1e_10000 BertForSequenceClassification from Stupendousabhi
+author: John Snow Labs
+name: bert_large_uncased_fine_tune_winogrande_8_1e_10000
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_fine_tune_winogrande_8_1e_10000` is an English model originally trained by Stupendousabhi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_10000_en_5.1.4_3.4_1698312050378.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_10000_en_5.1.4_3.4_1698312050378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_10000","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_10000","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_fine_tune_winogrande_8_1e_10000|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Stupendousabhi/bert-large-uncased-fine-tune-winogrande-8-1e-10000
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8_en.md
new file mode 100644
index 000000000000..04d97b3d44fd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8 BertForSequenceClassification from Stupendousabhi
+author: John Snow Labs
+name: bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8` is an English model originally trained by Stupendousabhi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8_en_5.1.4_3.4_1698313456029.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8_en_5.1.4_3.4_1698313456029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_fine_tune_winogrande_8_1e_11262_bs8|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Stupendousabhi/bert-large-uncased-fine-tune-winogrande-8-1e-11262-bs8
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_20000_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_20000_en.md
new file mode 100644
index 000000000000..7de1de79cf77
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_8_1e_20000_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_fine_tune_winogrande_8_1e_20000 BertForSequenceClassification from Stupendousabhi
+author: John Snow Labs
+name: bert_large_uncased_fine_tune_winogrande_8_1e_20000
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_fine_tune_winogrande_8_1e_20000` is an English model originally trained by Stupendousabhi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_20000_en_5.1.4_3.4_1698313105749.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_8_1e_20000_en_5.1.4_3.4_1698313105749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_20000","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_8_1e_20000","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_fine_tune_winogrande_8_1e_20000|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Stupendousabhi/bert-large-uncased-fine-tune-winogrande-8-1e-20000
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_11262_bs16_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_11262_bs16_en.md
new file mode 100644
index 000000000000..eff07769f90e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_11262_bs16_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_fine_tune_winogrande_ep_11262_bs16 BertForSequenceClassification from Stupendousabhi
+author: John Snow Labs
+name: bert_large_uncased_fine_tune_winogrande_ep_11262_bs16
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_fine_tune_winogrande_ep_11262_bs16` is an English model originally trained by Stupendousabhi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_ep_11262_bs16_en_5.1.4_3.4_1698316640612.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_ep_11262_bs16_en_5.1.4_3.4_1698316640612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_ep_11262_bs16","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_ep_11262_bs16","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_fine_tune_winogrande_ep_11262_bs16|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Stupendousabhi/bert-large-uncased-fine-tune-winogrande-ep-11262_bs16
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16_en.md
new file mode 100644
index 000000000000..f3c4875681f7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16 BertForSequenceClassification from Stupendousabhi
+author: John Snow Labs
+name: bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16` is an English model originally trained by Stupendousabhi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16_en_5.1.4_3.4_1698315174118.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16_en_5.1.4_3.4_1698315174118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_fine_tune_winogrande_ep_8_11262_bs16|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/Stupendousabhi/bert-large-uncased-fine-tune-winogrande-ep-8-11262-bs16
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_hoax_classifier_def_v1_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_hoax_classifier_def_v1_en.md
new file mode 100644
index 000000000000..ea6b47ea54f0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_hoax_classifier_def_v1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_hoax_classifier_def_v1 BertForSequenceClassification from research-dump
+author: John Snow Labs
+name: bert_large_uncased_hoax_classifier_def_v1
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_hoax_classifier_def_v1` is an English model originally trained by research-dump.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hoax_classifier_def_v1_en_5.1.4_3.4_1698346233225.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_hoax_classifier_def_v1_en_5.1.4_3.4_1698346233225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_hoax_classifier_def_v1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_hoax_classifier_def_v1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_hoax_classifier_def_v1|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/research-dump/bert-large-uncased_hoax_classifier_def_v1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_sst2_assemblyai_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_sst2_assemblyai_en.md
new file mode 100644
index 000000000000..41696eb92f44
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_large_uncased_sst2_assemblyai_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_large_uncased_sst2_assemblyai BertForSequenceClassification from assemblyai
+author: John Snow Labs
+name: bert_large_uncased_sst2_assemblyai
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_sst2_assemblyai` is an English model originally trained by assemblyai.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sst2_assemblyai_en_5.1.4_3.4_1698312050580.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_sst2_assemblyai_en_5.1.4_3.4_1698312050580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_sst2_assemblyai","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_sst2_assemblyai","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_large_uncased_sst2_assemblyai|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/assemblyai/bert-large-uncased-sst2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_playground_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_playground_en.md
new file mode 100644
index 000000000000..97dbd3a7c719
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_playground_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_playground BertForSequenceClassification from antoineross
+author: John Snow Labs
+name: bert_playground
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_playground` is an English model originally trained by antoineross.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_playground_en_5.1.4_3.4_1698350365614.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_playground_en_5.1.4_3.4_1698350365614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_playground","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_playground","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_playground|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/antoineross/bert-playground
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_final_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_final_en.md
new file mode 100644
index 000000000000..75e9e11ad339
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_final_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_product_classifier_final BertForSequenceClassification from sianbru
+author: John Snow Labs
+name: bert_product_classifier_final
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_product_classifier_final` is an English model originally trained by sianbru.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_product_classifier_final_en_5.1.4_3.4_1698355192966.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_product_classifier_final_en_5.1.4_3.4_1698355192966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_product_classifier_final","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_product_classifier_final","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_product_classifier_final|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|627.7 MB|
+
+## References
+
+https://huggingface.co/sianbru/bert_product_classifier_final
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_name_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_name_en.md
new file mode 100644
index 000000000000..f777b74fd3c0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_product_classifier_name_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_product_classifier_name BertForSequenceClassification from sianbru
+author: John Snow Labs
+name: bert_product_classifier_name
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_product_classifier_name` is an English model originally trained by sianbru.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_product_classifier_name_en_5.1.4_3.4_1698352910293.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_product_classifier_name_en_5.1.4_3.4_1698352910293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_product_classifier_name","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_product_classifier_name","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_product_classifier_name| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/sianbru/bert_product_classifier_name \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long2_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long2_en.md new file mode 100644 index 000000000000..178f301efc0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_semeval_long2 BertForSequenceClassification from Babak-Behkamkia +author: John Snow Labs +name: bert_semeval_long2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_semeval_long2` is an English model originally trained by Babak-Behkamkia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_semeval_long2_en_5.1.4_3.4_1698328170592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_semeval_long2_en_5.1.4_3.4_1698328170592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_semeval_long2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_semeval_long2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_semeval_long2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Babak-Behkamkia/bert_SEMEVAL_long2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long_en.md new file mode 100644 index 000000000000..ec92d416bbc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_semeval_long_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_semeval_long BertForSequenceClassification from Babak-Behkamkia +author: John Snow Labs +name: bert_semeval_long +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_semeval_long` is an English model originally trained by Babak-Behkamkia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_semeval_long_en_5.1.4_3.4_1698327198252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_semeval_long_en_5.1.4_3.4_1698327198252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_semeval_long","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_semeval_long","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_semeval_long| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Babak-Behkamkia/bert_SEMEVAL_long \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_sentiment_analysis_sst_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_sentiment_analysis_sst_en.md new file mode 100644 index 000000000000..60e305f57cc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_sentiment_analysis_sst_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sentiment_analysis_sst BertForSequenceClassification from barissayil +author: John Snow Labs +name: bert_sentiment_analysis_sst +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sentiment_analysis_sst` is an English model originally trained by barissayil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sentiment_analysis_sst_en_5.1.4_3.4_1698313473570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sentiment_analysis_sst_en_5.1.4_3.4_1698313473570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sentiment_analysis_sst","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sentiment_analysis_sst","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sentiment_analysis_sst| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/barissayil/bert-sentiment-analysis-sst \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_id.md b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_id.md new file mode 100644 index 000000000000..4f862faf2de0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_id.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Indonesian BertForSequenceClassification Base Cased model (from ayameRushia) +author: John Snow Labs +name: bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa +date: 2023-10-26 +tags: [id, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-indonesian-1.5G-sentiment-analysis-smsa` is an Indonesian model originally trained by `ayameRushia`. 
+ +## Predicted Entities + +`Neutral`, `Positive`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_id_5.1.4_3.4_1698312586345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_id_5.1.4_3.4_1698312586345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_indonesian_1.5g_sentiment_analysis_smsa| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|414.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ayameRushia/bert-base-indonesian-1.5G-sentiment-analysis-smsa +- https://paperswithcode.com/sota?task=Text+Classification&dataset=indonlu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_med_ru.md b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_med_ru.md new file mode 100644 index 000000000000..152ed413c4fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_med_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_sequence_classifier_russian_base_cased_sentiment_med BertForSequenceClassification from blanchefort +author: John Snow Labs +name: bert_sequence_classifier_russian_base_cased_sentiment_med +date: 2023-10-26 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_russian_base_cased_sentiment_med` is a Russian model originally trained by blanchefort. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_med_ru_5.1.4_3.4_1698345887002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_med_ru_5.1.4_3.4_1698345887002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_med","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_med","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_russian_base_cased_sentiment_med| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/blanchefort/rubert-base-cased-sentiment-med \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rurewiews_ru.md b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rurewiews_ru.md new file mode 100644 index 000000000000..0133df345086 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rurewiews_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_sequence_classifier_russian_base_cased_sentiment_rurewiews BertForSequenceClassification from blanchefort +author: John Snow Labs +name: bert_sequence_classifier_russian_base_cased_sentiment_rurewiews +date: 2023-10-26 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_russian_base_cased_sentiment_rurewiews` is a Russian model originally trained by blanchefort. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_rurewiews_ru_5.1.4_3.4_1698347926163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_rurewiews_ru_5.1.4_3.4_1698347926163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_rurewiews","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_rurewiews","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_russian_base_cased_sentiment_rurewiews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/blanchefort/rubert-base-cased-sentiment-rurewiews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rusentiment_ru.md b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rusentiment_ru.md new file mode 100644 index 000000000000..593ae3a263ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_sequence_classifier_russian_base_cased_sentiment_rusentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_sequence_classifier_russian_base_cased_sentiment_rusentiment BertForSequenceClassification from blanchefort +author: John Snow Labs +name: bert_sequence_classifier_russian_base_cased_sentiment_rusentiment +date: 2023-10-26 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_russian_base_cased_sentiment_rusentiment` is a Russian model originally trained by blanchefort. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_rusentiment_ru_5.1.4_3.4_1698348956934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_sentiment_rusentiment_ru_5.1.4_3.4_1698348956934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_rusentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_sentiment_rusentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_russian_base_cased_sentiment_rusentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/blanchefort/rubert-base-cased-sentiment-rusentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_test_0803_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_test_0803_en.md new file mode 100644 index 000000000000..b907d79ed4ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_test_0803_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_test_0803 BertForSequenceClassification from tingzhou +author: John Snow Labs +name: bert_test_0803 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_test_0803` is an English model originally trained by tingzhou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_test_0803_en_5.1.4_3.4_1698357009915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_test_0803_en_5.1.4_3.4_1698357009915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_test_0803","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_test_0803","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_test_0803| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tingzhou/bert_test_0803 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_fake_news_detection_en.md new file mode 100644 index 000000000000..560025c3c347 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_fake_news_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_fake_news_detection BertForSequenceClassification from ErfanMoosaviMonazzah +author: John Snow Labs +name: bert_tiny_fake_news_detection +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_fake_news_detection` is an English model originally trained by ErfanMoosaviMonazzah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_fake_news_detection_en_5.1.4_3.4_1698348568332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_fake_news_detection_en_5.1.4_3.4_1698348568332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_fake_news_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_fake_news_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_fake_news_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/ErfanMoosaviMonazzah/bert-tiny-fake-news-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_downstream_alt_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_downstream_alt_en.md new file mode 100644 index 000000000000..27202426745a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_downstream_alt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_finetuned_legal_definitions_downstream_alt BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_legal_definitions_downstream_alt +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_legal_definitions_downstream_alt` is an English model originally trained by muhtasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_legal_definitions_downstream_alt_en_5.1.4_3.4_1698361780403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_legal_definitions_downstream_alt_en_5.1.4_3.4_1698361780403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_legal_definitions_downstream_alt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_legal_definitions_downstream_alt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_tiny_finetuned_legal_definitions_downstream_alt|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|16.7 MB|
+
+## References
+
+https://huggingface.co/muhtasham/bert-tiny-finetuned-legal-definitions-downstream-alt
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_longer_downstream_alt_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_longer_downstream_alt_en.md
new file mode 100644
index 000000000000..0a3cab42e534
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_tiny_finetuned_legal_definitions_longer_downstream_alt_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_tiny_finetuned_legal_definitions_longer_downstream_alt BertForSequenceClassification from muhtasham
+author: John Snow Labs
+name: bert_tiny_finetuned_legal_definitions_longer_downstream_alt
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_legal_definitions_longer_downstream_alt` is an English model originally trained by muhtasham.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_legal_definitions_longer_downstream_alt_en_5.1.4_3.4_1698362214023.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_legal_definitions_longer_downstream_alt_en_5.1.4_3.4_1698362214023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_legal_definitions_longer_downstream_alt","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_legal_definitions_longer_downstream_alt","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_tiny_finetuned_legal_definitions_longer_downstream_alt|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|16.7 MB|
+
+## References
+
+https://huggingface.co/muhtasham/bert-tiny-finetuned-legal-definitions-longer-downstream-alt
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_toxic_comment_classification_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_toxic_comment_classification_en.md
new file mode 100644
index 000000000000..66d367e12b42
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_toxic_comment_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_toxic_comment_classification BertForSequenceClassification from JungleLee
+author: John Snow Labs
+name: bert_toxic_comment_classification
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_toxic_comment_classification` is an English model originally trained by JungleLee.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_toxic_comment_classification_en_5.1.4_3.4_1698317556311.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_toxic_comment_classification_en_5.1.4_3.4_1698317556311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_toxic_comment_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_toxic_comment_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_toxic_comment_classification|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/JungleLee/bert-toxic-comment-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_fine_tuned_zero_shot_baseline_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_fine_tuned_zero_shot_baseline_mnli_en.md
new file mode 100644
index 000000000000..2958dea6a163
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_fine_tuned_zero_shot_baseline_mnli_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_uncased_fine_tuned_zero_shot_baseline_mnli BertForSequenceClassification from jcbao77
+author: John Snow Labs
+name: bert_uncased_fine_tuned_zero_shot_baseline_mnli
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_fine_tuned_zero_shot_baseline_mnli` is an English model originally trained by jcbao77.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_fine_tuned_zero_shot_baseline_mnli_en_5.1.4_3.4_1698330092651.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_fine_tuned_zero_shot_baseline_mnli_en_5.1.4_3.4_1698330092651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_fine_tuned_zero_shot_baseline_mnli","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_fine_tuned_zero_shot_baseline_mnli","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_uncased_fine_tuned_zero_shot_baseline_mnli|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/jcbao77/bert-uncased-fine-tuned-zero-shot-baseline-mnli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_dynamic_weights_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_dynamic_weights_mnli_en.md
new file mode 100644
index 000000000000..6f75246a036e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_dynamic_weights_mnli_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_uncased_multi_task_dynamic_weights_mnli BertForSequenceClassification from jcbao77
+author: John Snow Labs
+name: bert_uncased_multi_task_dynamic_weights_mnli
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_multi_task_dynamic_weights_mnli` is an English model originally trained by jcbao77.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_multi_task_dynamic_weights_mnli_en_5.1.4_3.4_1698353435459.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_multi_task_dynamic_weights_mnli_en_5.1.4_3.4_1698353435459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_multi_task_dynamic_weights_mnli","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_multi_task_dynamic_weights_mnli","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_uncased_multi_task_dynamic_weights_mnli|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/jcbao77/bert-uncased-multi-task-dynamic-weights-mnli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_fixed_weights_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_fixed_weights_mnli_en.md
new file mode 100644
index 000000000000..c091728a2b1a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_uncased_multi_task_fixed_weights_mnli_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_uncased_multi_task_fixed_weights_mnli BertForSequenceClassification from jcbao77
+author: John Snow Labs
+name: bert_uncased_multi_task_fixed_weights_mnli
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_multi_task_fixed_weights_mnli` is an English model originally trained by jcbao77.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_multi_task_fixed_weights_mnli_en_5.1.4_3.4_1698352759719.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_multi_task_fixed_weights_mnli_en_5.1.4_3.4_1698352759719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_multi_task_fixed_weights_mnli","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_multi_task_fixed_weights_mnli","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_uncased_multi_task_fixed_weights_mnli|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/jcbao77/bert-uncased-multi-task-fixed-weights-mnli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long2_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long2_en.md
new file mode 100644
index 000000000000..c7dcc6284f31
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_vast_long2 BertForSequenceClassification from Babak-Behkamkia
+author: John Snow Labs
+name: bert_vast_long2
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_vast_long2` is an English model originally trained by Babak-Behkamkia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_vast_long2_en_5.1.4_3.4_1698339488784.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_vast_long2_en_5.1.4_3.4_1698339488784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_vast_long2|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/Babak-Behkamkia/bert_VAST_long2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long_en.md b/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long_en.md
new file mode 100644
index 000000000000..defcdba062a6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bert_vast_long_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_vast_long BertForSequenceClassification from Babak-Behkamkia
+author: John Snow Labs
+name: bert_vast_long
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_vast_long` is an English model originally trained by Babak-Behkamkia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_vast_long_en_5.1.4_3.4_1698330383883.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_vast_long_en_5.1.4_3.4_1698330383883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_vast_long|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/Babak-Behkamkia/bert_VAST_long
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-bertimbau_finetune_breton_news_en.md b/docs/_posts/ahmedlone127/2023-10-26-bertimbau_finetune_breton_news_en.md
new file mode 100644
index 000000000000..b14d1928f731
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-bertimbau_finetune_breton_news_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bertimbau_finetune_breton_news BertForSequenceClassification from Caesarcc
+author: John Snow Labs
+name: bertimbau_finetune_breton_news
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau_finetune_breton_news` is an English model originally trained by Caesarcc.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_finetune_breton_news_en_5.1.4_3.4_1698352958930.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_finetune_breton_news_en_5.1.4_3.4_1698352958930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_finetune_breton_news","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_finetune_breton_news","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bertimbau_finetune_breton_news|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/Caesarcc/bertimbau-finetune-br-news
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-biobert_2_en.md b/docs/_posts/ahmedlone127/2023-10-26-biobert_2_en.md
new file mode 100644
index 000000000000..8968f2a9ddc0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-biobert_2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English biobert_2 BertForSequenceClassification from hagara
+author: John Snow Labs
+name: biobert_2
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_2` is an English model originally trained by hagara.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_2_en_5.1.4_3.4_1698320193216.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_2_en_5.1.4_3.4_1698320193216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("biobert_2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biobert_2|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.3 MB|
+
+## References
+
+https://huggingface.co/hagara/biobert-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-biobert_base_cased_v1_1_finetuned_pubmedqa_en.md b/docs/_posts/ahmedlone127/2023-10-26-biobert_base_cased_v1_1_finetuned_pubmedqa_en.md
new file mode 100644
index 000000000000..3a919d662161
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-biobert_base_cased_v1_1_finetuned_pubmedqa_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English biobert_base_cased_v1_1_finetuned_pubmedqa BertForSequenceClassification from blizrys
+author: John Snow Labs
+name: biobert_base_cased_v1_1_finetuned_pubmedqa
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1_1_finetuned_pubmedqa` is an English model originally trained by blizrys.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_1_finetuned_pubmedqa_en_5.1.4_3.4_1698349612505.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_1_finetuned_pubmedqa_en_5.1.4_3.4_1698349612505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForSequenceClassification
+from pyspark.ml import Pipeline
+
+# Start (or reuse) a Spark session with Spark NLP loaded
+spark = sparknlp.start()
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols(["document"])\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("biobert_base_cased_v1_1_finetuned_pubmedqa","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes an active SparkSession named `spark`; required for .toDS
+import spark.implicits._
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_base_cased_v1_1_finetuned_pubmedqa","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biobert_base_cased_v1_1_finetuned_pubmedqa|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.3 MB|
+
+## References
+
+https://huggingface.co/blizrys/biobert-base-cased-v1.1-finetuned-pubmedqa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-26-biobert_v1_1_finetuned_pubmedqa_en.md b/docs/_posts/ahmedlone127/2023-10-26-biobert_v1_1_finetuned_pubmedqa_en.md
new file mode 100644
index 000000000000..4d19126eab0a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-26-biobert_v1_1_finetuned_pubmedqa_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English biobert_v1_1_finetuned_pubmedqa BertForSequenceClassification from blizrys
+author: John Snow Labs
+name: biobert_v1_1_finetuned_pubmedqa
+date: 2023-10-26
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_v1_1_finetuned_pubmedqa` is an English model originally trained by blizrys.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_v1_1_finetuned_pubmedqa_en_5.1.4_3.4_1698350382228.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_v1_1_finetuned_pubmedqa_en_5.1.4_3.4_1698350382228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biobert_v1_1_finetuned_pubmedqa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_v1_1_finetuned_pubmedqa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_v1_1_finetuned_pubmedqa| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| + +## References + +https://huggingface.co/blizrys/biobert-v1.1-finetuned-pubmedqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-burmese_legal_bert_small_en.md b/docs/_posts/ahmedlone127/2023-10-26-burmese_legal_bert_small_en.md new file mode 100644 index 000000000000..78d07ccc94ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-burmese_legal_bert_small_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_legal_bert_small BertForSequenceClassification from wiorz +author: John Snow Labs +name: burmese_legal_bert_small +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_legal_bert_small` is an English model originally trained by wiorz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_legal_bert_small_en_5.1.4_3.4_1698318113342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_legal_bert_small_en_5.1.4_3.4_1698318113342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("burmese_legal_bert_small","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("burmese_legal_bert_small","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_legal_bert_small| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|131.6 MB| + +## References + +https://huggingface.co/wiorz/my_legal_bert_small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-chinese_roberta_wwm_ext_finetuned2_en.md b/docs/_posts/ahmedlone127/2023-10-26-chinese_roberta_wwm_ext_finetuned2_en.md new file mode 100644 index 000000000000..c12a65ca90c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-chinese_roberta_wwm_ext_finetuned2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_finetuned2 BertForSequenceClassification from zhiguoxu +author: John Snow Labs +name: chinese_roberta_wwm_ext_finetuned2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_wwm_ext_finetuned2` is an English model originally trained by zhiguoxu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_finetuned2_en_5.1.4_3.4_1698356247100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_finetuned2_en_5.1.4_3.4_1698356247100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_finetuned2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_finetuned2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_finetuned2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/zhiguoxu/chinese-roberta-wwm-ext-finetuned2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_en.md b/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_en.md new file mode 100644 index 000000000000..0b1ed006f003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_twitter_bert_v2_norwegian_description_stance_loss_hyp BertForSequenceClassification from sumba +author: John Snow Labs +name: covid_twitter_bert_v2_norwegian_description_stance_loss_hyp +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_twitter_bert_v2_norwegian_description_stance_loss_hyp` is an English model originally trained by sumba.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_en_5.1.4_3.4_1698350210246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_en_5.1.4_3.4_1698350210246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_twitter_bert_v2_norwegian_description_stance_loss_hyp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/sumba/covid-twitter-bert-v2-no_description-stance-loss-hyp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess_en.md b/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess_en.md new file mode 100644 index 000000000000..44d4e7c4749f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess BertForSequenceClassification from sumba +author: John Snow Labs +name: covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess` is an English model originally trained by sumba.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess_en_5.1.4_3.4_1698355385022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess_en_5.1.4_3.4_1698355385022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/sumba/covid-twitter-bert-v2-no_description-stance-loss-hyp-unprocess \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-distilbert_base_uncased_phuongthanhnguyen_en.md b/docs/_posts/ahmedlone127/2023-10-26-distilbert_base_uncased_phuongthanhnguyen_en.md new file mode 100644 index 000000000000..d2a2bcf6ad71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-distilbert_base_uncased_phuongthanhnguyen_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_phuongthanhnguyen BertForSequenceClassification from phuongthanhnguyen +author: John Snow Labs +name: distilbert_base_uncased_phuongthanhnguyen +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_phuongthanhnguyen` is an English model originally trained by phuongthanhnguyen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_phuongthanhnguyen_en_5.1.4_3.4_1698351142335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_phuongthanhnguyen_en_5.1.4_3.4_1698351142335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_phuongthanhnguyen","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_phuongthanhnguyen","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_phuongthanhnguyen| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/phuongthanhnguyen/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-dk_emotion_bert_in_class_sadiksha_en.md b/docs/_posts/ahmedlone127/2023-10-26-dk_emotion_bert_in_class_sadiksha_en.md new file mode 100644 index 000000000000..e7d095d51fa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-dk_emotion_bert_in_class_sadiksha_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dk_emotion_bert_in_class_sadiksha BertForSequenceClassification from Sadiksha +author: John Snow Labs +name: dk_emotion_bert_in_class_sadiksha +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dk_emotion_bert_in_class_sadiksha` is an English model originally trained by Sadiksha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dk_emotion_bert_in_class_sadiksha_en_5.1.4_3.4_1698324526539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dk_emotion_bert_in_class_sadiksha_en_5.1.4_3.4_1698324526539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dk_emotion_bert_in_class_sadiksha","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dk_emotion_bert_in_class_sadiksha","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dk_emotion_bert_in_class_sadiksha| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/Sadiksha/dk_emotion_bert_in_class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-equality_bert_en.md b/docs/_posts/ahmedlone127/2023-10-26-equality_bert_en.md new file mode 100644 index 000000000000..36dc769fce88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-equality_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English equality_bert BertForSequenceClassification from tiya1012 +author: John Snow Labs +name: equality_bert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `equality_bert` is an English model originally trained by tiya1012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/equality_bert_en_5.1.4_3.4_1698354089127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/equality_bert_en_5.1.4_3.4_1698354089127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("equality_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("equality_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|equality_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tiya1012/equality_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_all_data_v2_1808_en.md b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_all_data_v2_1808_en.md new file mode 100644 index 000000000000..9b9fdce2cb6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_all_data_v2_1808_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English esg_classification_bert_all_data_v2_1808 BertForSequenceClassification from dsmsb +author: John Snow Labs +name: esg_classification_bert_all_data_v2_1808 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `esg_classification_bert_all_data_v2_1808` is an English model originally trained by dsmsb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/esg_classification_bert_all_data_v2_1808_en_5.1.4_3.4_1698322877577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/esg_classification_bert_all_data_v2_1808_en_5.1.4_3.4_1698322877577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_bert_all_data_v2_1808","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_bert_all_data_v2_1808","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|esg_classification_bert_all_data_v2_1808| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/dsmsb/esg-classification_bert_all_data_v2_1808 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_v1_1808_en.md b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_v1_1808_en.md new file mode 100644 index 000000000000..c6f3ba459b8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_bert_v1_1808_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English esg_classification_bert_v1_1808 BertForSequenceClassification from dsmsb +author: John Snow Labs +name: esg_classification_bert_v1_1808 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `esg_classification_bert_v1_1808` is an English model originally trained by dsmsb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/esg_classification_bert_v1_1808_en_5.1.4_3.4_1698322077739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/esg_classification_bert_v1_1808_en_5.1.4_3.4_1698322077739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_bert_v1_1808","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_bert_v1_1808","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|esg_classification_bert_v1_1808| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/dsmsb/esg-classification_bert_v1_1808 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-esg_classification_distilbert_bert_base_multilingual_cased_v4_xx.md b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_distilbert_bert_base_multilingual_cased_v4_xx.md new file mode 100644 index 000000000000..2c93a7314a1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-esg_classification_distilbert_bert_base_multilingual_cased_v4_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual esg_classification_distilbert_bert_base_multilingual_cased_v4 BertForSequenceClassification from dsmsb +author: John Snow Labs +name: esg_classification_distilbert_bert_base_multilingual_cased_v4 +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `esg_classification_distilbert_bert_base_multilingual_cased_v4` is a multilingual model originally trained by dsmsb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/esg_classification_distilbert_bert_base_multilingual_cased_v4_xx_5.1.4_3.4_1698317739607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/esg_classification_distilbert_bert_base_multilingual_cased_v4_xx_5.1.4_3.4_1698317739607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_distilbert_bert_base_multilingual_cased_v4","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("esg_classification_distilbert_bert_base_multilingual_cased_v4","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|esg_classification_distilbert_bert_base_multilingual_cased_v4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/dsmsb/esg-classification_distilbert-bert-base-multilingual-cased_v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fake_news_small_bert_en.md b/docs/_posts/ahmedlone127/2023-10-26-fake_news_small_bert_en.md new file mode 100644 index 000000000000..fcf10b652b16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fake_news_small_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_small_bert BertForSequenceClassification from safikhan +author: John Snow Labs +name: fake_news_small_bert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_small_bert` is an English model originally trained by safikhan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_small_bert_en_5.1.4_3.4_1698359545754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_small_bert_en_5.1.4_3.4_1698359545754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fake_news_small_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fake_news_small_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_small_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|65.8 MB| + +## References + +https://huggingface.co/safikhan/fake-news-small-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-finbert2_en.md b/docs/_posts/ahmedlone127/2023-10-26-finbert2_en.md new file mode 100644 index 000000000000..174faa1753e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-finbert2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finbert2 BertForSequenceClassification from Narsil +author: John Snow Labs +name: finbert2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert2` is an English model originally trained by Narsil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert2_en_5.1.4_3.4_1698346084668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert2_en_5.1.4_3.4_1698346084668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finbert2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbert2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Narsil/finbert2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_fine_grained_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_fine_grained_en.md new file mode 100644 index 000000000000..85d2194f1630 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_fine_grained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_combined_fine_grained BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_bert_combined_fine_grained +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_combined_fine_grained` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_combined_fine_grained_en_5.1.4_3.4_1698350313101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_combined_fine_grained_en_5.1.4_3.4_1698350313101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_combined_fine_grained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_combined_fine_grained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_combined_fine_grained| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-bert-combined-fine-grained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_mlm_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_mlm_en.md new file mode 100644 index 000000000000..cd514ca28605 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_combined_mlm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_combined_mlm BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_bert_combined_mlm +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_combined_mlm` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_combined_mlm_en_5.1.4_3.4_1698349543206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_combined_mlm_en_5.1.4_3.4_1698349543206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_combined_mlm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_combined_mlm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_combined_mlm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-bert-combined-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_fine_grained_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_fine_grained_en.md new file mode 100644 index 000000000000..9b06dc6435c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_fine_grained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_exist_fine_grained BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_bert_exist_fine_grained +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_exist_fine_grained` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_exist_fine_grained_en_5.1.4_3.4_1698348044997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_exist_fine_grained_en_5.1.4_3.4_1698348044997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_exist_fine_grained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_exist_fine_grained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_exist_fine_grained| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-bert-exist-fine-grained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_mlm_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_mlm_en.md new file mode 100644 index 000000000000..c3a7d0f743de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_exist_mlm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_exist_mlm BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_bert_exist_mlm +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_exist_mlm` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_exist_mlm_en_5.1.4_3.4_1698345178117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_exist_mlm_en_5.1.4_3.4_1698345178117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_exist_mlm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_exist_mlm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_exist_mlm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-bert-exist-mlm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_semitic_languages_exist_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_semitic_languages_exist_en.md new file mode 100644 index 000000000000..87c864d70868 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_bert_semitic_languages_exist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_semitic_languages_exist BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_bert_semitic_languages_exist +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_semitic_languages_exist` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_semitic_languages_exist_en_5.1.4_3.4_1698339504027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_semitic_languages_exist_en_5.1.4_3.4_1698339504027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_semitic_languages_exist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_semitic_languages_exist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_semitic_languages_exist| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-bert-sem-exist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-fine_tune_mbert_semitic_languages_exist_en.md b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_mbert_semitic_languages_exist_en.md new file mode 100644 index 000000000000..ac880c22f4bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-fine_tune_mbert_semitic_languages_exist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_mbert_semitic_languages_exist BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: fine_tune_mbert_semitic_languages_exist +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_mbert_semitic_languages_exist` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_mbert_semitic_languages_exist_en_5.1.4_3.4_1698339826651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_mbert_semitic_languages_exist_en_5.1.4_3.4_1698339826651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_mbert_semitic_languages_exist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_mbert_semitic_languages_exist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_mbert_semitic_languages_exist| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/nouman-10/fine-tune-mbert-sem-exist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-finetuning_bert_case_intent_class_en.md b/docs/_posts/ahmedlone127/2023-10-26-finetuning_bert_case_intent_class_en.md new file mode 100644 index 000000000000..3f84c7b6181c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-finetuning_bert_case_intent_class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_bert_case_intent_class BertForSequenceClassification from Geo +author: John Snow Labs +name: finetuning_bert_case_intent_class +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_case_intent_class` is an English model originally trained by Geo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_case_intent_class_en_5.1.4_3.4_1698347408295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_case_intent_class_en_5.1.4_3.4_1698347408295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_case_intent_class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_bert_case_intent_class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_bert_case_intent_class| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Geo/finetuning-bert-case-intent-class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-finetuning_esg_sentiment_model_bert_nepal_bhasa_data_en.md b/docs/_posts/ahmedlone127/2023-10-26-finetuning_esg_sentiment_model_bert_nepal_bhasa_data_en.md new file mode 100644 index 000000000000..87bb2d0940d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-finetuning_esg_sentiment_model_bert_nepal_bhasa_data_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_esg_sentiment_model_bert_nepal_bhasa_data BertForSequenceClassification from Bennet1996 +author: John Snow Labs +name: finetuning_esg_sentiment_model_bert_nepal_bhasa_data +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_esg_sentiment_model_bert_nepal_bhasa_data` is an English model originally trained by Bennet1996.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_esg_sentiment_model_bert_nepal_bhasa_data_en_5.1.4_3.4_1698341111207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_esg_sentiment_model_bert_nepal_bhasa_data_en_5.1.4_3.4_1698341111207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_esg_sentiment_model_bert_nepal_bhasa_data","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_esg_sentiment_model_bert_nepal_bhasa_data","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_esg_sentiment_model_bert_nepal_bhasa_data| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Bennet1996/finetuning-ESG-sentiment-model-bert_new_data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-gbert_multi_class_german_hate_en.md b/docs/_posts/ahmedlone127/2023-10-26-gbert_multi_class_german_hate_en.md new file mode 100644 index 000000000000..41027ed24a45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-gbert_multi_class_german_hate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gbert_multi_class_german_hate BertForSequenceClassification from chrisrtt +author: John Snow Labs +name: gbert_multi_class_german_hate +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_multi_class_german_hate` is an English model originally trained by chrisrtt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_multi_class_german_hate_en_5.1.4_3.4_1698358312208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_multi_class_german_hate_en_5.1.4_3.4_1698358312208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("gbert_multi_class_german_hate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("gbert_multi_class_german_hate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_multi_class_german_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.0 MB| + +## References + +https://huggingface.co/chrisrtt/gbert-multi-class-german-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-hate_speech_bert_unggi_en.md b/docs/_posts/ahmedlone127/2023-10-26-hate_speech_bert_unggi_en.md new file mode 100644 index 000000000000..a2daa5e1c760 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-hate_speech_bert_unggi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_bert_unggi BertForSequenceClassification from Unggi +author: John Snow Labs +name: hate_speech_bert_unggi +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_bert_unggi` is an English model originally trained by Unggi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_bert_unggi_en_5.1.4_3.4_1698329243315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_bert_unggi_en_5.1.4_3.4_1698329243315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_bert_unggi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_bert_unggi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_bert_unggi| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|345.1 MB| + +## References + +https://huggingface.co/Unggi/hate_speech_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-indobert_base_uncased_finetuned_indonlu_smsa_id.md b/docs/_posts/ahmedlone127/2023-10-26-indobert_base_uncased_finetuned_indonlu_smsa_id.md new file mode 100644 index 000000000000..e9629da587ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-indobert_base_uncased_finetuned_indonlu_smsa_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian indobert_base_uncased_finetuned_indonlu_smsa BertForSequenceClassification from ayameRushia +author: John Snow Labs +name: indobert_base_uncased_finetuned_indonlu_smsa +date: 2023-10-26 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_base_uncased_finetuned_indonlu_smsa` is an Indonesian model originally trained by ayameRushia.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_base_uncased_finetuned_indonlu_smsa_id_5.1.4_3.4_1698312802788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_base_uncased_finetuned_indonlu_smsa_id_5.1.4_3.4_1698312802788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_base_uncased_finetuned_indonlu_smsa","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_base_uncased_finetuned_indonlu_smsa","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_base_uncased_finetuned_indonlu_smsa| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|413.9 MB| + +## References + +https://huggingface.co/ayameRushia/indobert-base-uncased-finetuned-indonlu-smsa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-indobert_hoax_classification_en.md b/docs/_posts/ahmedlone127/2023-10-26-indobert_hoax_classification_en.md new file mode 100644 index 000000000000..ae45bb86ee0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-indobert_hoax_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobert_hoax_classification BertForSequenceClassification from Rifky +author: John Snow Labs +name: indobert_hoax_classification +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_hoax_classification` is an English model originally trained by Rifky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_hoax_classification_en_5.1.4_3.4_1698353797562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_hoax_classification_en_5.1.4_3.4_1698353797562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_hoax_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_hoax_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_hoax_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.4 MB| + +## References + +https://huggingface.co/Rifky/indobert-hoax-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-indobertweet_sentiment2_en.md b/docs/_posts/ahmedlone127/2023-10-26-indobertweet_sentiment2_en.md new file mode 100644 index 000000000000..8c7f33b3b10a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-indobertweet_sentiment2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobertweet_sentiment2 BertForSequenceClassification from candra +author: John Snow Labs +name: indobertweet_sentiment2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobertweet_sentiment2` is an English model originally trained by candra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobertweet_sentiment2_en_5.1.4_3.4_1698344775485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobertweet_sentiment2_en_5.1.4_3.4_1698344775485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobertweet_sentiment2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobertweet_sentiment2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobertweet_sentiment2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/candra/indobertweet-sentiment2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-indobertweet_ulasan_beauty_products_id.md b/docs/_posts/ahmedlone127/2023-10-26-indobertweet_ulasan_beauty_products_id.md new file mode 100644 index 000000000000..5db151aeb343 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-indobertweet_ulasan_beauty_products_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian indobertweet_ulasan_beauty_products BertForSequenceClassification from sekarmulyani +author: John Snow Labs +name: indobertweet_ulasan_beauty_products +date: 2023-10-26 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobertweet_ulasan_beauty_products` is an Indonesian model originally trained by sekarmulyani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobertweet_ulasan_beauty_products_id_5.1.4_3.4_1698360552443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobertweet_ulasan_beauty_products_id_5.1.4_3.4_1698360552443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobertweet_ulasan_beauty_products","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobertweet_ulasan_beauty_products","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobertweet_ulasan_beauty_products| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|414.0 MB| + +## References + +https://huggingface.co/sekarmulyani/indobertweet-ulasan-beauty-products \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-jobbert_base_cased_compdecs_en.md b/docs/_posts/ahmedlone127/2023-10-26-jobbert_base_cased_compdecs_en.md new file mode 100644 index 000000000000..d1dff79ee5a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-jobbert_base_cased_compdecs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jobbert_base_cased_compdecs BertForSequenceClassification from nestauk +author: John Snow Labs +name: jobbert_base_cased_compdecs +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_base_cased_compdecs` is an English model originally trained by nestauk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_compdecs_en_5.1.4_3.4_1698359131105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_compdecs_en_5.1.4_3.4_1698359131105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("jobbert_base_cased_compdecs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("jobbert_base_cased_compdecs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_base_cased_compdecs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|404.4 MB| + +## References + +https://huggingface.co/nestauk/jobbert-base-cased-compdecs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-kd_roberta_1lbert_mixed_trial1_en.md b/docs/_posts/ahmedlone127/2023-10-26-kd_roberta_1lbert_mixed_trial1_en.md new file mode 100644 index 000000000000..39ff39cc3115 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-kd_roberta_1lbert_mixed_trial1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English kd_roberta_1lbert_mixed_trial1 BertForSequenceClassification from Youssef320 +author: John Snow Labs +name: kd_roberta_1lbert_mixed_trial1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kd_roberta_1lbert_mixed_trial1` is an English model originally trained by Youssef320. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kd_roberta_1lbert_mixed_trial1_en_5.1.4_3.4_1698341379413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kd_roberta_1lbert_mixed_trial1_en_5.1.4_3.4_1698341379413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("kd_roberta_1lbert_mixed_trial1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kd_roberta_1lbert_mixed_trial1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kd_roberta_1lbert_mixed_trial1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|43.9 MB| + +## References + +https://huggingface.co/Youssef320/KD_Roberta_1LBERT_Mixed_trial1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-lc_2_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-10-26-lc_2_bert_base_uncased_en.md new file mode 100644 index 000000000000..bf5c509a3295 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-lc_2_bert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lc_2_bert_base_uncased BertForSequenceClassification from PiceTRP +author: John Snow Labs +name: lc_2_bert_base_uncased +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lc_2_bert_base_uncased` is an English model originally trained by PiceTRP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lc_2_bert_base_uncased_en_5.1.4_3.4_1698363973733.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lc_2_bert_base_uncased_en_5.1.4_3.4_1698363973733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("lc_2_bert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("lc_2_bert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lc_2_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/PiceTRP/lc_2_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-legal_bert_test2_en.md b/docs/_posts/ahmedlone127/2023-10-26-legal_bert_test2_en.md new file mode 100644 index 000000000000..2dad795ddf4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-legal_bert_test2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English legal_bert_test2 BertForSequenceClassification from wiorz +author: John Snow Labs +name: legal_bert_test2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_bert_test2` is an English model originally trained by wiorz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_test2_en_5.1.4_3.4_1698320290761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_test2_en_5.1.4_3.4_1698320290761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("legal_bert_test2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("legal_bert_test2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_bert_test2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|131.6 MB| + +## References + +https://huggingface.co/wiorz/legal_bert_test2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_2e_en.md b/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_2e_en.md new file mode 100644 index 000000000000..6f1aac12e9a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_2e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mediabiasdetector_bert_2e BertForSequenceClassification from jordankrishnayah +author: John Snow Labs +name: mediabiasdetector_bert_2e +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mediabiasdetector_bert_2e` is an English model originally trained by jordankrishnayah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mediabiasdetector_bert_2e_en_5.1.4_3.4_1698355958928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mediabiasdetector_bert_2e_en_5.1.4_3.4_1698355958928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mediabiasdetector_bert_2e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mediabiasdetector_bert_2e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mediabiasdetector_bert_2e| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jordankrishnayah/mediabiasdetector-bert-2e \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_3e_en.md b/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_3e_en.md new file mode 100644 index 000000000000..1c1a107cfc01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mediabiasdetector_bert_3e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mediabiasdetector_bert_3e BertForSequenceClassification from jordankrishnayah +author: John Snow Labs +name: mediabiasdetector_bert_3e +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mediabiasdetector_bert_3e` is an English model originally trained by jordankrishnayah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mediabiasdetector_bert_3e_en_5.1.4_3.4_1698356838066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mediabiasdetector_bert_3e_en_5.1.4_3.4_1698356838066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mediabiasdetector_bert_3e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mediabiasdetector_bert_3e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mediabiasdetector_bert_3e| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jordankrishnayah/mediabiasdetector-bert-3e \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10_en.md new file mode 100644 index 000000000000..bc047a0de2a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10_en_5.1.4_3.4_1698315973534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10_en_5.1.4_3.4_1698315973534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4_en.md new file mode 100644 index 000000000000..2679c519a618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4_en_5.1.4_3.4_1698311940600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4_en_5.1.4_3.4_1698311940600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5_en.md new file mode 100644 index 000000000000..5760328d922a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5_en_5.1.4_3.4_1698312160553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5_en_5.1.4_3.4_1698312160553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6_en.md new file mode 100644 index 000000000000..25374c14e1f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6_en_5.1.4_3.4_1698313516445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6_en_5.1.4_3.4_1698313516445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v6| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7_en.md new file mode 100644 index 000000000000..dda7c8066fec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7_en_5.1.4_3.4_1698313713892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7_en_5.1.4_3.4_1698313713892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v7| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8_en.md new file mode 100644 index 000000000000..54b6e3fe63f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8_en_5.1.4_3.4_1698314535252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8_en_5.1.4_3.4_1698314535252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v8| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9_en.md b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9_en.md new file mode 100644 index 000000000000..a5a9a1b88c3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9_en_5.1.4_3.4_1698315232836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9_en_5.1.4_3.4_1698315232836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mengzi_bert_base_fin_wallstreetcn_morning_news_market_overview_ssec_v9| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.1 MB| + +## References + +https://huggingface.co/hw2942/mengzi-bert-base-fin-wallstreetcn-morning-news-market-overview-SSEC-v9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mental_bert_base_uncased_masked_finetuned_0517_en.md b/docs/_posts/ahmedlone127/2023-10-26-mental_bert_base_uncased_masked_finetuned_0517_en.md new file mode 100644 index 000000000000..091fecd326d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mental_bert_base_uncased_masked_finetuned_0517_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mental_bert_base_uncased_masked_finetuned_0517 BertForSequenceClassification from YeRyeongLee +author: John Snow Labs +name: mental_bert_base_uncased_masked_finetuned_0517 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_bert_base_uncased_masked_finetuned_0517` is an English model originally trained by YeRyeongLee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_bert_base_uncased_masked_finetuned_0517_en_5.1.4_3.4_1698353803382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_bert_base_uncased_masked_finetuned_0517_en_5.1.4_3.4_1698353803382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mental_bert_base_uncased_masked_finetuned_0517","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mental_bert_base_uncased_masked_finetuned_0517","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_bert_base_uncased_masked_finetuned_0517| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.8 MB| + +## References + +https://huggingface.co/YeRyeongLee/mental-bert-base-uncased-masked_finetuned-0517 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_2_en.md b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_2_en.md new file mode 100644 index 000000000000..eba442af3b8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mnli_bert_base_cased_2 BertForSequenceClassification from boychaboy +author: John Snow Labs +name: mnli_bert_base_cased_2 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mnli_bert_base_cased_2` is an English model originally trained by boychaboy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mnli_bert_base_cased_2_en_5.1.4_3.4_1698351977762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mnli_bert_base_cased_2_en_5.1.4_3.4_1698351977762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_cased_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_cased_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mnli_bert_base_cased_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/boychaboy/MNLI_bert-base-cased_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_en.md new file mode 100644 index 000000000000..ee80cf946a0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mnli_bert_base_cased BertForSequenceClassification from boychaboy +author: John Snow Labs +name: mnli_bert_base_cased +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mnli_bert_base_cased` is an English model originally trained by boychaboy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mnli_bert_base_cased_en_5.1.4_3.4_1698351191247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mnli_bert_base_cased_en_5.1.4_3.4_1698351191247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mnli_bert_base_cased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/boychaboy/MNLI_bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_uncased_en.md new file mode 100644 index 000000000000..95e6db63c103 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mnli_bert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mnli_bert_base_uncased BertForSequenceClassification from boychaboy +author: John Snow Labs +name: mnli_bert_base_uncased +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mnli_bert_base_uncased` is an English model originally trained by boychaboy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mnli_bert_base_uncased_en_5.1.4_3.4_1698352907236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mnli_bert_base_uncased_en_5.1.4_3.4_1698352907236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mnli_bert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mnli_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/boychaboy/MNLI_bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-monobert_large_msmarco_finetune_only_en.md b/docs/_posts/ahmedlone127/2023-10-26-monobert_large_msmarco_finetune_only_en.md new file mode 100644 index 000000000000..d5db3e3488c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-monobert_large_msmarco_finetune_only_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English monobert_large_msmarco_finetune_only BertForSequenceClassification from castorini +author: John Snow Labs +name: monobert_large_msmarco_finetune_only +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `monobert_large_msmarco_finetune_only` is an English model originally trained by castorini. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/monobert_large_msmarco_finetune_only_en_5.1.4_3.4_1698362521396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/monobert_large_msmarco_finetune_only_en_5.1.4_3.4_1698362521396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("monobert_large_msmarco_finetune_only","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("monobert_large_msmarco_finetune_only","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|monobert_large_msmarco_finetune_only| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/castorini/monobert-large-msmarco-finetune-only \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mrpc_bert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-10-26-mrpc_bert_base_cased_en.md new file mode 100644 index 000000000000..a47860449c16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mrpc_bert_base_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mrpc_bert_base_cased BertForSequenceClassification from hf-internal-testing +author: John Snow Labs +name: mrpc_bert_base_cased +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mrpc_bert_base_cased` is an English model originally trained by hf-internal-testing. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mrpc_bert_base_cased_en_5.1.4_3.4_1698351037660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mrpc_bert_base_cased_en_5.1.4_3.4_1698351037660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mrpc_bert_base_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mrpc_bert_base_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mrpc_bert_base_cased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/hf-internal-testing/mrpc-bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-mulfakenews_mbert_en.md b/docs/_posts/ahmedlone127/2023-10-26-mulfakenews_mbert_en.md new file mode 100644 index 000000000000..06dcdc5c18c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-mulfakenews_mbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mulfakenews_mbert BertForSequenceClassification from tiya1012 +author: John Snow Labs +name: mulfakenews_mbert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mulfakenews_mbert` is an English model originally trained by tiya1012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mulfakenews_mbert_en_5.1.4_3.4_1698319837731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mulfakenews_mbert_en_5.1.4_3.4_1698319837731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mulfakenews_mbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mulfakenews_mbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mulfakenews_mbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/tiya1012/mulfakenews_mbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-nepal_bhasa_bert_url_clasification_en.md b/docs/_posts/ahmedlone127/2023-10-26-nepal_bhasa_bert_url_clasification_en.md new file mode 100644 index 000000000000..b9db1fb60dfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-nepal_bhasa_bert_url_clasification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nepal_bhasa_bert_url_clasification BertForSequenceClassification from priyabrat +author: John Snow Labs +name: nepal_bhasa_bert_url_clasification +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepal_bhasa_bert_url_clasification` is an English model originally trained by priyabrat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_bert_url_clasification_en_5.1.4_3.4_1698327198231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_bert_url_clasification_en_5.1.4_3.4_1698327198231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nepal_bhasa_bert_url_clasification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nepal_bhasa_bert_url_clasification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_bert_url_clasification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/priyabrat/new_bert_url_clasification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_2_test_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_2_test_en.md new file mode 100644 index 000000000000..7fea58cf10b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_2_test_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_english_gpu_3000_rader_2_test BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_english_gpu_3000_rader_2_test +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_english_gpu_3000_rader_2_test` is an English model originally trained by NTCAL. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_3000_rader_2_test_en_5.1.4_3.4_1698340187330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_3000_rader_2_test_en_5.1.4_3.4_1698340187330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_3000_rader_2_test","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_3000_rader_2_test","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_english_gpu_3000_rader_2_test| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_en_gpu_3000_rader_2_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_3_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_3_en.md new file mode 100644 index 000000000000..ff81430dc74c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_3000_rader_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_english_gpu_3000_rader_3 BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_english_gpu_3000_rader_3 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_english_gpu_3000_rader_3` is an English model originally trained by NTCAL. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_3000_rader_3_en_5.1.4_3.4_1698340554688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_3000_rader_3_en_5.1.4_3.4_1698340554688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_3000_rader_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_3000_rader_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_english_gpu_3000_rader_3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_en_gpu_3000_rader_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_9_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_9_en.md new file mode 100644 index 000000000000..20c133b3ba44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_english_gpu_500_rader_9 BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_english_gpu_500_rader_9 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_english_gpu_500_rader_9` is an English model originally trained by NTCAL. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_9_en_5.1.4_3.4_1698341450446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_9_en_5.1.4_3.4_1698341450446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_english_gpu_500_rader_9| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_en_gpu_500_rader_9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_1_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_1_en.md new file mode 100644 index 000000000000..a5a2bb5b1933 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_english_gpu_500_rader_max_1 BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_english_gpu_500_rader_max_1 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_english_gpu_500_rader_max_1` is an English model originally trained by NTCAL.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_max_1_en_5.1.4_3.4_1698342448549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_max_1_en_5.1.4_3.4_1698342448549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_max_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_max_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_english_gpu_500_rader_max_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_en_gpu_500_rader_max_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task_en.md new file mode 100644 index 000000000000..4e4012e7c74a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task` is an English model originally trained by NTCAL.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task_en_5.1.4_3.4_1698343873004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task_en_5.1.4_3.4_1698343873004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_english_gpu_500_rader_max_noder_task| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_en_gpu_500_rader_max_noder_task \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8_en.md b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8_en.md new file mode 100644 index 000000000000..48ae58c75955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8 BertForSequenceClassification from NTCAL +author: John Snow Labs +name: norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8` is an English model originally trained by NTCAL.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8_en_5.1.4_3.4_1698340916723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8_en_5.1.4_3.4_1698340916723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert2_sentiment_norec_tonga_tonga_islands_gpu_500_rader_8| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/NTCAL/norbert2_sentiment_norec_to_gpu_500_rader_8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-otus_fw_bert_en.md b/docs/_posts/ahmedlone127/2023-10-26-otus_fw_bert_en.md new file mode 100644 index 000000000000..aaab28e81015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-otus_fw_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English otus_fw_bert BertForSequenceClassification from Dezzpil +author: John Snow Labs +name: otus_fw_bert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `otus_fw_bert` is an English model originally trained by Dezzpil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/otus_fw_bert_en_5.1.4_3.4_1698363973670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/otus_fw_bert_en_5.1.4_3.4_1698363973670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("otus_fw_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("otus_fw_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|otus_fw_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/Dezzpil/otus-fw-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-rubert_base_cased_sentiment_mokoron_ru.md b/docs/_posts/ahmedlone127/2023-10-26-rubert_base_cased_sentiment_mokoron_ru.md new file mode 100644 index 000000000000..99fdeffc530d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-rubert_base_cased_sentiment_mokoron_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_cased_sentiment_mokoron BertForSequenceClassification from blanchefort +author: John Snow Labs +name: rubert_base_cased_sentiment_mokoron +date: 2023-10-26 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_cased_sentiment_mokoron` is a Russian model originally trained by blanchefort. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentiment_mokoron_ru_5.1.4_3.4_1698346902055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentiment_mokoron_ru_5.1.4_3.4_1698346902055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentiment_mokoron","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentiment_mokoron","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_sentiment_mokoron| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/blanchefort/rubert-base-cased-sentiment-mokoron \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-sentiment_bert_imdb_10_en.md b/docs/_posts/ahmedlone127/2023-10-26-sentiment_bert_imdb_10_en.md new file mode 100644 index 000000000000..eaec92802487 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-sentiment_bert_imdb_10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_bert_imdb_10 BertForSequenceClassification from pachequinho +author: John Snow Labs +name: sentiment_bert_imdb_10 +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_bert_imdb_10` is an English model originally trained by pachequinho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_bert_imdb_10_en_5.1.4_3.4_1698342876890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_bert_imdb_10_en_5.1.4_3.4_1698342876890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_imdb_10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_imdb_10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_bert_imdb_10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/pachequinho/sentiment_bert_imdb_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-strong_password_checker_bert_en.md b/docs/_posts/ahmedlone127/2023-10-26-strong_password_checker_bert_en.md new file mode 100644 index 000000000000..2163524e3bf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-strong_password_checker_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English strong_password_checker_bert BertForSequenceClassification from dima806 +author: John Snow Labs +name: strong_password_checker_bert +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `strong_password_checker_bert` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/strong_password_checker_bert_en_5.1.4_3.4_1698353885197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/strong_password_checker_bert_en_5.1.4_3.4_1698353885197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("strong_password_checker_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("strong_password_checker_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|strong_password_checker_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/dima806/strong-password-checker-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-test_bert_base_multilingual_uncased_sentiment_xx.md b/docs/_posts/ahmedlone127/2023-10-26-test_bert_base_multilingual_uncased_sentiment_xx.md new file mode 100644 index 000000000000..aa05e4101c48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-test_bert_base_multilingual_uncased_sentiment_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual test_bert_base_multilingual_uncased_sentiment BertForSequenceClassification from kkkzzzkkk +author: John Snow Labs +name: test_bert_base_multilingual_uncased_sentiment +date: 2023-10-26 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_base_multilingual_uncased_sentiment` is a Multilingual model originally trained by kkkzzzkkk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_base_multilingual_uncased_sentiment_xx_5.1.4_3.4_1698340580870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_base_multilingual_uncased_sentiment_xx_5.1.4_3.4_1698340580870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("test_bert_base_multilingual_uncased_sentiment","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_bert_base_multilingual_uncased_sentiment","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_base_multilingual_uncased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/kkkzzzkkk/test_bert-base-multilingual-uncased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_base_en.md b/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_base_en.md new file mode 100644 index 000000000000..cddd2d53aa45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English trans_encoder_cross_simcse_bert_base BertForSequenceClassification from cambridgeltl +author: John Snow Labs +name: trans_encoder_cross_simcse_bert_base +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trans_encoder_cross_simcse_bert_base` is an English model originally trained by cambridgeltl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trans_encoder_cross_simcse_bert_base_en_5.1.4_3.4_1698359783414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trans_encoder_cross_simcse_bert_base_en_5.1.4_3.4_1698359783414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("trans_encoder_cross_simcse_bert_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trans_encoder_cross_simcse_bert_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trans_encoder_cross_simcse_bert_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/cambridgeltl/trans-encoder-cross-simcse-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_large_en.md b/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_large_en.md new file mode 100644 index 000000000000..5af81390dc43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-trans_encoder_cross_simcse_bert_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English trans_encoder_cross_simcse_bert_large BertForSequenceClassification from cambridgeltl +author: John Snow Labs +name: trans_encoder_cross_simcse_bert_large +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trans_encoder_cross_simcse_bert_large` is an English model originally trained by cambridgeltl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trans_encoder_cross_simcse_bert_large_en_5.1.4_3.4_1698361126331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trans_encoder_cross_simcse_bert_large_en_5.1.4_3.4_1698361126331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("trans_encoder_cross_simcse_bert_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("trans_encoder_cross_simcse_bert_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trans_encoder_cross_simcse_bert_large| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/cambridgeltl/trans-encoder-cross-simcse-bert-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-26-umit_42000news_bert_turkish_en.md b/docs/_posts/ahmedlone127/2023-10-26-umit_42000news_bert_turkish_en.md new file mode 100644 index 000000000000..0500996f5e01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-26-umit_42000news_bert_turkish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English umit_42000news_bert_turkish BertForSequenceClassification from uisikdag +author: John Snow Labs +name: umit_42000news_bert_turkish +date: 2023-10-26 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `umit_42000news_bert_turkish` is an English model originally trained by uisikdag. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/umit_42000news_bert_turkish_en_5.1.4_3.4_1698328170532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/umit_42000news_bert_turkish_en_5.1.4_3.4_1698328170532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("umit_42000news_bert_turkish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("umit_42000news_bert_turkish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|umit_42000news_bert_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/uisikdag/umit_42000news_bert_turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-4_way_detection_prop_16_bert_en.md b/docs/_posts/ahmedlone127/2023-10-27-4_way_detection_prop_16_bert_en.md new file mode 100644 index 000000000000..9797758f3d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-4_way_detection_prop_16_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 4_way_detection_prop_16_bert BertForSequenceClassification from ultra-coder54732 +author: John Snow Labs +name: 4_way_detection_prop_16_bert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `4_way_detection_prop_16_bert` is an English model originally trained by ultra-coder54732. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_bert_en_5.1.4_3.4_1698394743938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_bert_en_5.1.4_3.4_1698394743938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("4_way_detection_prop_16_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("4_way_detection_prop_16_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|4_way_detection_prop_16_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ultra-coder54732/4-way-detection-prop-16-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_69_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_69_en.md new file mode 100644 index 000000000000..fd67f4c17551 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_69_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_69 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_69 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_69` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_69_en_5.1.4_3.4_1698365538986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_69_en_5.1.4_3.4_1698365538986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_69","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_69","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_69| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-69 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_70_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_70_en.md new file mode 100644 index 000000000000..11d0d90aa79c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_70_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_70 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_70 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_70` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_70_en_5.1.4_3.4_1698366363571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_70_en_5.1.4_3.4_1698366363571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_70","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_70","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_70| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-70 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_71_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_71_en.md new file mode 100644 index 000000000000..4d99ac6f2d5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_71_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_71 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_71 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_71` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_71_en_5.1.4_3.4_1698367018717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_71_en_5.1.4_3.4_1698367018717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_71","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_71","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_71| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-71 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_72_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_72_en.md new file mode 100644 index 000000000000..99b6852d8453 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_72_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_72 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_72 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_72` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_72_en_5.1.4_3.4_1698367893284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_72_en_5.1.4_3.4_1698367893284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_72","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_72","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_72| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-72 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_73_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_73_en.md new file mode 100644 index 000000000000..324416c4ebf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_73_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_73 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_73 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_73` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_73_en_5.1.4_3.4_1698368747805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_73_en_5.1.4_3.4_1698368747805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_73","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_73","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_73| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-73 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_74_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_74_en.md new file mode 100644 index 000000000000..f0030f1a0731 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_74_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_74 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_74 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_74` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_74_en_5.1.4_3.4_1698369732590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_74_en_5.1.4_3.4_1698369732590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_74","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_74","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_74| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-74 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_75_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_75_en.md new file mode 100644 index 000000000000..34805b938ca3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_75_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_75 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_75 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_75` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_75_en_5.1.4_3.4_1698370380205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_75_en_5.1.4_3.4_1698370380205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_75","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_75","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_75| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-75 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_76_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_76_en.md new file mode 100644 index 000000000000..ac6712870ed0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_76_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_76 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_76 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_76` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_76_en_5.1.4_3.4_1698371134365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_76_en_5.1.4_3.4_1698371134365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_76","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_76","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_76| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-76 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_77_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_77_en.md new file mode 100644 index 000000000000..d474590d5e13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_77_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_77 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_77 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_77` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_77_en_5.1.4_3.4_1698372340127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_77_en_5.1.4_3.4_1698372340127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_77","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_77","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_77| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_78_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_78_en.md new file mode 100644 index 000000000000..3f9a125dd643 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_78_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_78 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_78 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_78` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_78_en_5.1.4_3.4_1698373304752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_78_en_5.1.4_3.4_1698373304752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_78","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_78","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_78| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-78 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_79_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_79_en.md new file mode 100644 index 000000000000..820da687fe95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_79_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_79 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_79 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_79` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_79_en_5.1.4_3.4_1698374359675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_79_en_5.1.4_3.4_1698374359675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_79","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_79","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_79| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-79 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_80_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_80_en.md new file mode 100644 index 000000000000..a4e46715fc76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_80_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_80 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_80 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_80` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_80_en_5.1.4_3.4_1698375363251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_80_en_5.1.4_3.4_1698375363251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_80","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_80","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_80| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-80 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_81_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_81_en.md new file mode 100644 index 000000000000..7b06bd32f2fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_81_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_81 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_81 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_81` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_81_en_5.1.4_3.4_1698376103299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_81_en_5.1.4_3.4_1698376103299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_81","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_81","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_81| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-81 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_82_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_82_en.md new file mode 100644 index 000000000000..3cef8f83709b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_82_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_82 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_82 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_82` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_82_en_5.1.4_3.4_1698377069045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_82_en_5.1.4_3.4_1698377069045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_82","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_82","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_82| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-82 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_83_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_83_en.md new file mode 100644 index 000000000000..0e82c43b50c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_83_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_83 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_83 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_83` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_83_en_5.1.4_3.4_1698378024384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_83_en_5.1.4_3.4_1698378024384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_83","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_83","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_83| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-83 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_84_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_84_en.md new file mode 100644 index 000000000000..8026c7aab8ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_84_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_84 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_84 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_84` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_84_en_5.1.4_3.4_1698378892690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_84_en_5.1.4_3.4_1698378892690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_84","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_84","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_84| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-84 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_85_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_85_en.md new file mode 100644 index 000000000000..066fd44afa3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_85_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_85 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_85 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_85` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_85_en_5.1.4_3.4_1698379701713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_85_en_5.1.4_3.4_1698379701713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_85","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_85","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_85| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-85 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_86_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_86_en.md new file mode 100644 index 000000000000..d4d745d128ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_86_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_86 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_86 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_86` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_86_en_5.1.4_3.4_1698380350317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_86_en_5.1.4_3.4_1698380350317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_86","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_86","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_86| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-86 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_87_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_87_en.md new file mode 100644 index 000000000000..3e3b489c0ddb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_87_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_87 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_87 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_87` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_87_en_5.1.4_3.4_1698381019932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_87_en_5.1.4_3.4_1698381019932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_87","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_87","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_87| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-87 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_88_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_88_en.md new file mode 100644 index 000000000000..143c2c363ffb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_88_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_88 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_88 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_88` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_88_en_5.1.4_3.4_1698381906561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_88_en_5.1.4_3.4_1698381906561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_88","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_88","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_88| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-88 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_89_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_89_en.md new file mode 100644 index 000000000000..4155e2c09ad4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_89_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_89 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_89 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_89` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_89_en_5.1.4_3.4_1698382661140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_89_en_5.1.4_3.4_1698382661140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_89","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_89","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_89| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-89 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_90_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_90_en.md new file mode 100644 index 000000000000..0bc65b9dbee4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_90_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_90 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_90 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_90` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_90_en_5.1.4_3.4_1698383440064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_90_en_5.1.4_3.4_1698383440064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_90","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_90","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_90| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-90 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_91_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_91_en.md new file mode 100644 index 000000000000..de314c92f483 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_91_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_91 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_91 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_91` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_91_en_5.1.4_3.4_1698384314595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_91_en_5.1.4_3.4_1698384314595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_91","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_91","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_91| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-91 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_96_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_96_en.md new file mode 100644 index 000000000000..50183aa1e964 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_96 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_96 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_96` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_96_en_5.1.4_3.4_1698385344044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_96_en_5.1.4_3.4_1698385344044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_96| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_97_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_97_en.md new file mode 100644 index 000000000000..00e8495a2768 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_97_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_97 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_97 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_97` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_97_en_5.1.4_3.4_1698386201061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_97_en_5.1.4_3.4_1698386201061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_97","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_97","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_97| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-97 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_98_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_98_en.md new file mode 100644 index 000000000000..3f03c60d6373 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_98_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_98 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_98 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_98` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_98_en_5.1.4_3.4_1698386943136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_98_en_5.1.4_3.4_1698386943136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_98","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_98","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_98| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-98 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_99_en.md b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_99_en.md new file mode 100644 index 000000000000..30424a820987 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-512seq_len_6ep_bert_ft_cola_99_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 512seq_len_6ep_bert_ft_cola_99 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: 512seq_len_6ep_bert_ft_cola_99 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `512seq_len_6ep_bert_ft_cola_99` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_99_en_5.1.4_3.4_1698387764931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/512seq_len_6ep_bert_ft_cola_99_en_5.1.4_3.4_1698387764931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_99","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("512seq_len_6ep_bert_ft_cola_99","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|512seq_len_6ep_bert_ft_cola_99| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/512seq_len_6ep_bert_ft_cola-99 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-autotrain_bert_base_xxl_uncased_ft_85992142947_en.md b/docs/_posts/ahmedlone127/2023-10-27-autotrain_bert_base_xxl_uncased_ft_85992142947_en.md new file mode 100644 index 000000000000..ce2599f773cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-autotrain_bert_base_xxl_uncased_ft_85992142947_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_bert_base_xxl_uncased_ft_85992142947 BertForSequenceClassification from giuseppemartino +author: John Snow Labs +name: autotrain_bert_base_xxl_uncased_ft_85992142947 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_bert_base_xxl_uncased_ft_85992142947` is an English model originally trained by giuseppemartino. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_bert_base_xxl_uncased_ft_85992142947_en_5.1.4_3.4_1698376306067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_bert_base_xxl_uncased_ft_85992142947_en_5.1.4_3.4_1698376306067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bert_base_xxl_uncased_ft_85992142947","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_bert_base_xxl_uncased_ft_85992142947","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_bert_base_xxl_uncased_ft_85992142947| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/giuseppemartino/autotrain-bert-base-xxl-uncased-ft-85992142947 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-beer_sentiment_bert_en.md b/docs/_posts/ahmedlone127/2023-10-27-beer_sentiment_bert_en.md new file mode 100644 index 000000000000..0f00192bc7bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-beer_sentiment_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English beer_sentiment_bert BertForSequenceClassification from GiRak +author: John Snow Labs +name: beer_sentiment_bert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beer_sentiment_bert` is an English model originally trained by GiRak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beer_sentiment_bert_en_5.1.4_3.4_1698391623868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beer_sentiment_bert_en_5.1.4_3.4_1698391623868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("beer_sentiment_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beer_sentiment_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beer_sentiment_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/GiRak/beer-sentiment-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_babe_2epochs_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_babe_2epochs_en.md new file mode 100644 index 000000000000..acbbdefef47d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_babe_2epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_babe_2epochs BertForSequenceClassification from jordankrishnayah +author: John Snow Labs +name: bert_babe_2epochs +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_babe_2epochs` is an English model originally trained by jordankrishnayah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_babe_2epochs_en_5.1.4_3.4_1698378459312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_babe_2epochs_en_5.1.4_3.4_1698378459312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_babe_2epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_babe_2epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_babe_2epochs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jordankrishnayah/bert-BABE-2epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_babe_3epochs_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_babe_3epochs_en.md new file mode 100644 index 000000000000..35d651758a8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_babe_3epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_babe_3epochs BertForSequenceClassification from jordankrishnayah +author: John Snow Labs +name: bert_babe_3epochs +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_babe_3epochs` is an English model originally trained by jordankrishnayah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_babe_3epochs_en_5.1.4_3.4_1698379460920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_babe_3epochs_en_5.1.4_3.4_1698379460920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_babe_3epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_babe_3epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_babe_3epochs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jordankrishnayah/bert-BABE-3epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_faviasono_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_faviasono_en.md new file mode 100644 index 000000000000..108204e4575c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_faviasono_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_banking77_pt2_faviasono BertForSequenceClassification from faviasono +author: John Snow Labs +name: bert_base_banking77_pt2_faviasono +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_faviasono` is an English model originally trained by faviasono. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_faviasono_en_5.1.4_3.4_1698378705526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_faviasono_en_5.1.4_3.4_1698378705526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_faviasono","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_faviasono","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_faviasono| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/faviasono/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_shreyasm_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_shreyasm_en.md new file mode 100644 index 000000000000..bea653e24cc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_banking77_pt2_shreyasm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_banking77_pt2_shreyasm BertForSequenceClassification from ShreyasM +author: John Snow Labs +name: bert_base_banking77_pt2_shreyasm +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_banking77_pt2_shreyasm` is an English model originally trained by ShreyasM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_shreyasm_en_5.1.4_3.4_1698367704237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_banking77_pt2_shreyasm_en_5.1.4_3.4_1698367704237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_shreyasm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_banking77_pt2_shreyasm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_banking77_pt2_shreyasm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/ShreyasM/bert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_best_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_best_en.md new file mode 100644 index 000000000000..10ad265072b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_best_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_best BertForSequenceClassification from edwardgowsmith +author: John Snow Labs +name: bert_base_cased_best +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_best` is an English model originally trained by edwardgowsmith. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_best_en_5.1.4_3.4_1698397074640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_best_en_5.1.4_3.4_1698397074640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_best","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_best","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_best| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/edwardgowsmith/bert-base-cased-best \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_sst2_charlescao2023_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_sst2_charlescao2023_en.md new file mode 100644 index 000000000000..79e4d4095f3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_cased_sst2_charlescao2023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_sst2_charlescao2023 BertForSequenceClassification from charlescao2023 +author: John Snow Labs +name: bert_base_cased_sst2_charlescao2023 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_sst2_charlescao2023` is an English model originally trained by charlescao2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_sst2_charlescao2023_en_5.1.4_3.4_1698384809101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_sst2_charlescao2023_en_5.1.4_3.4_1698384809101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_sst2_charlescao2023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_sst2_charlescao2023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_sst2_charlescao2023| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/charlescao2023/bert-base-cased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1_en.md new file mode 100644 index 000000000000..f3ae612149ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1_en_5.1.4_3.4_1698393668345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1_en_5.1.4_3.4_1698393668345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1_en.md new file mode 100644 index 000000000000..a46d8376dfca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1_en_5.1.4_3.4_1698394502427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1_en_5.1.4_3.4_1698394502427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2_en.md new file mode 100644 index 000000000000..51005559bcf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2_en_5.1.4_3.4_1698395126917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2_en_5.1.4_3.4_1698395126917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3_en.md new file mode 100644 index 000000000000..1d476e0469c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3_en_5.1.4_3.4_1698395887581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3_en_5.1.4_3.4_1698395887581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4_en.md new file mode 100644 index 000000000000..4307c6523061 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4_en_5.1.4_3.4_1698396592548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4_en_5.1.4_3.4_1698396592548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5_en.md new file mode 100644 index 000000000000..768d68b5052d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5` is an English model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5_en_5.1.4_3.4_1698397380576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5_en_5.1.4_3.4_1698397380576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_ssec_f1_v5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSEC-f1-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_dutch_cased_finetuned_snli_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_dutch_cased_finetuned_snli_en.md new file mode 100644 index 000000000000..5c6a04a51d68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_dutch_cased_finetuned_snli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_dutch_cased_finetuned_snli BertForSequenceClassification from LoicDL +author: John Snow Labs +name: bert_base_dutch_cased_finetuned_snli +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dutch_cased_finetuned_snli` is an English model originally trained by LoicDL.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_snli_en_5.1.4_3.4_1698380108206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_snli_en_5.1.4_3.4_1698380108206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_dutch_cased_finetuned_snli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_dutch_cased_finetuned_snli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_dutch_cased_finetuned_snli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.0 MB| + +## References + +https://huggingface.co/LoicDL/bert-base-dutch-cased-finetuned-snli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_ehddnr_ynat_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_ehddnr_ynat_en.md new file mode 100644 index 000000000000..9de21d2b18ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_ehddnr_ynat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_ehddnr_ynat BertForSequenceClassification from ehddnr301 +author: John Snow Labs +name: bert_base_ehddnr_ynat +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_ehddnr_ynat` is an English model originally trained by ehddnr301. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_ehddnr_ynat_en_5.1.4_3.4_1698397930103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_ehddnr_ynat_en_5.1.4_3.4_1698397930103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_ehddnr_ynat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_ehddnr_ynat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_ehddnr_ynat| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ehddnr301/bert-base-ehddnr-ynat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_cased_finetuned_news_headlines_xx.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_cased_finetuned_news_headlines_xx.md new file mode 100644 index 000000000000..60162212baa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_cased_finetuned_news_headlines_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_news_headlines BertForSequenceClassification from chrommium +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_news_headlines +date: 2023-10-27 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_news_headlines` is a Multilingual model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_news_headlines_xx_5.1.4_3.4_1698367435420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_news_headlines_xx_5.1.4_3.4_1698367435420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_news_headlines","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_news_headlines","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_news_headlines| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/chrommium/bert-base-multilingual-cased-finetuned-news-headlines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2_xx.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2_xx.md new file mode 100644 index 000000000000..2791b3c8a32e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2 BertForSequenceClassification from adnanakbr +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2 +date: 2023-10-27 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2` is a Multilingual model originally trained by adnanakbr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2_xx_5.1.4_3.4_1698382252086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2_xx_5.1.4_3.4_1698382252086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_fine_tuned_for_amazon_english_reviews_on_200k_review_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.8 MB| + +## References + +https://huggingface.co/adnanakbr/bert-base-multilingual-uncased-sentiment-fine_tuned_for_amazon_english_reviews_on_200K_review_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_turkish_cased_emotion_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_turkish_cased_emotion_analysis_en.md new file mode 100644 index 000000000000..e651901348d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_turkish_cased_emotion_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_turkish_cased_emotion_analysis BertForSequenceClassification from maymuni +author: John Snow Labs +name: bert_base_turkish_cased_emotion_analysis +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_cased_emotion_analysis` is an English model originally trained by maymuni.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_emotion_analysis_en_5.1.4_3.4_1698375165036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_emotion_analysis_en_5.1.4_3.4_1698375165036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_cased_emotion_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_cased_emotion_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_cased_emotion_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/maymuni/bert-base-turkish-cased-emotion-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_emotion_sabersol_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_emotion_sabersol_en.md new file mode 100644 index 000000000000..ab90e706838c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_emotion_sabersol_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_emotion_sabersol BertForSequenceClassification from sabersol +author: John Snow Labs +name: bert_base_uncased_emotion_sabersol +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_emotion_sabersol` is an English model originally trained by sabersol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_sabersol_en_5.1.4_3.4_1698389721705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_emotion_sabersol_en_5.1.4_3.4_1698389721705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_sabersol","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_emotion_sabersol","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_emotion_sabersol| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sabersol/bert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_eurlex_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_eurlex_en.md new file mode 100644 index 000000000000..a4697f0db7c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_eurlex_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English bert_base_uncased_eurlex BertEmbeddings from nlpaueb +author: John Snow Labs +name: bert_base_uncased_eurlex +date: 2023-10-27 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_eurlex` is an English model originally trained by nlpaueb. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_en_5.1.4_3.4_1698386885020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_eurlex_en_5.1.4_3.4_1698386885020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +## How to use + + + +
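+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# BertEmbeddings consumes token annotations, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_eurlex","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// BertEmbeddings consumes token annotations, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_eurlex", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +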
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_eurlex| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nlpaueb/bert-base-uncased-eurlex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_3d_sentiment_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_3d_sentiment_en.md new file mode 100644 index 000000000000..03f9ea997f01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_3d_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_3d_sentiment BertForSequenceClassification from venetis +author: John Snow Labs +name: bert_base_uncased_finetuned_3d_sentiment +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_3d_sentiment` is an English model originally trained by venetis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_3d_sentiment_en_5.1.4_3.4_1698389726355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_3d_sentiment_en_5.1.4_3.4_1698389726355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_3d_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_3d_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_3d_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/venetis/bert-base-uncased-finetuned-3d-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sdg_mar23_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sdg_mar23_en.md new file mode 100644 index 000000000000..2d08a0b568c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sdg_mar23_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sdg_mar23 BertForSequenceClassification from jonas +author: John Snow Labs +name: bert_base_uncased_finetuned_sdg_mar23 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sdg_mar23` is an English model originally trained by jonas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sdg_mar23_en_5.1.4_3.4_1698372272115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sdg_mar23_en_5.1.4_3.4_1698372272115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sdg_mar23","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sdg_mar23","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sdg_mar23| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/jonas/bert-base-uncased-finetuned-sdg-Mar23 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sentiment_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sentiment_en.md new file mode 100644 index 000000000000..a7c3490fc2d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sentiment BertForSequenceClassification from riddhi17pawar +author: John Snow Labs +name: bert_base_uncased_finetuned_sentiment +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sentiment` is an English model originally trained by riddhi17pawar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sentiment_en_5.1.4_3.4_1698383750184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sentiment_en_5.1.4_3.4_1698383750184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/riddhi17pawar/bert-base-uncased-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_202k_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_202k_en.md new file mode 100644 index 000000000000..90f2b2ae35e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_202k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_202k BertForSequenceClassification from 202k +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_202k +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_202k` is an English model originally trained by 202k. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_202k_en_5.1.4_3.4_1698376102993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_202k_en_5.1.4_3.4_1698376102993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_202k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_202k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_202k| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/202k/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_junwupark_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_junwupark_en.md new file mode 100644 index 000000000000..1976a232550f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_junwupark_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_junwupark BertForSequenceClassification from junwupark +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_junwupark +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_junwupark` is an English model originally trained by junwupark. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_junwupark_en_5.1.4_3.4_1698395672524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_junwupark_en_5.1.4_3.4_1698395672524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_junwupark","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_junwupark","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_junwupark| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/junwupark/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_khs05109_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_khs05109_en.md new file mode 100644 index 000000000000..74d74639ed44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_finetuned_sst2_khs05109_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sst2_khs05109 BertForSequenceClassification from khs05109 +author: John Snow Labs +name: bert_base_uncased_finetuned_sst2_khs05109 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sst2_khs05109` is an English model originally trained by khs05109. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_khs05109_en_5.1.4_3.4_1698380510756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sst2_khs05109_en_5.1.4_3.4_1698380510756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_khs05109","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sst2_khs05109","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sst2_khs05109| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/khs05109/bert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_poems_sentiment_jonathan0528_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_poems_sentiment_jonathan0528_en.md new file mode 100644 index 000000000000..1a1dbda3679b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_poems_sentiment_jonathan0528_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_poems_sentiment_jonathan0528 BertForSequenceClassification from Jonathan0528 +author: John Snow Labs +name: bert_base_uncased_poems_sentiment_jonathan0528 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_poems_sentiment_jonathan0528` is an English model originally trained by Jonathan0528. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_poems_sentiment_jonathan0528_en_5.1.4_3.4_1698373289776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_poems_sentiment_jonathan0528_en_5.1.4_3.4_1698373289776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_poems_sentiment_jonathan0528","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_poems_sentiment_jonathan0528","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_poems_sentiment_jonathan0528| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jonathan0528/bert-base-uncased-poems-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_qqp_f87_8_d36_hybrid_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_qqp_f87_8_d36_hybrid_en.md new file mode 100644 index 000000000000..68378307fba1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_qqp_f87_8_d36_hybrid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_qqp_f87_8_d36_hybrid BertForSequenceClassification from echarlaix +author: John Snow Labs +name: bert_base_uncased_qqp_f87_8_d36_hybrid +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qqp_f87_8_d36_hybrid` is an English model originally trained by echarlaix. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_f87_8_d36_hybrid_en_5.1.4_3.4_1698393520026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_f87_8_d36_hybrid_en_5.1.4_3.4_1698393520026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_f87_8_d36_hybrid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_f87_8_d36_hybrid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qqp_f87_8_d36_hybrid| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|218.2 MB| + +## References + +https://huggingface.co/echarlaix/bert-base-uncased-qqp-f87.8-d36-hybrid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_acc91_1_d37_hybrid_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_acc91_1_d37_hybrid_en.md new file mode 100644 index 000000000000..8005434b0855 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_acc91_1_d37_hybrid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_acc91_1_d37_hybrid BertForSequenceClassification from echarlaix +author: John Snow Labs +name: bert_base_uncased_sst2_acc91_1_d37_hybrid +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_acc91_1_d37_hybrid` is an English model originally trained by echarlaix. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_acc91_1_d37_hybrid_en_5.1.4_3.4_1698394240737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_acc91_1_d37_hybrid_en_5.1.4_3.4_1698394240737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_acc91_1_d37_hybrid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_acc91_1_d37_hybrid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_acc91_1_d37_hybrid| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|221.6 MB| + +## References + +https://huggingface.co/echarlaix/bert-base-uncased-sst2-acc91.1-d37-hybrid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_static_quant_test_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_static_quant_test_en.md new file mode 100644 index 000000000000..5d651e54fc1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst2_static_quant_test_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_static_quant_test BertForSequenceClassification from echarlaix +author: John Snow Labs +name: bert_base_uncased_sst2_static_quant_test +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_static_quant_test` is an English model originally trained by echarlaix. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_static_quant_test_en_5.1.4_3.4_1698394820936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_static_quant_test_en_5.1.4_3.4_1698394820936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_static_quant_test","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_static_quant_test","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_static_quant_test| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/echarlaix/bert-base-uncased-sst2-static-quant-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst_bin_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst_bin_en.md new file mode 100644 index 000000000000..c38c87154a47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_sst_bin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst_bin BertForSequenceClassification from jjezabek +author: John Snow Labs +name: bert_base_uncased_sst_bin +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst_bin` is an English model originally trained by jjezabek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_bin_en_5.1.4_3.4_1698391623464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_bin_en_5.1.4_3.4_1698391623464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst_bin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst_bin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst_bin| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jjezabek/bert-base-uncased-sst_bin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_title_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_title_fine_tuned_en.md new file mode 100644 index 000000000000..4f15ab57cc44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_title_fine_tuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_title_fine_tuned BertForSequenceClassification from Izarel +author: John Snow Labs +name: bert_base_uncased_title_fine_tuned +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_title_fine_tuned` is an English model originally trained by Izarel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_title_fine_tuned_en_5.1.4_3.4_1698365619911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_title_fine_tuned_en_5.1.4_3.4_1698365619911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_title_fine_tuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_title_fine_tuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_title_fine_tuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Izarel/bert-base-uncased_title_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_uhack_reviews_multilabel_clf_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_uhack_reviews_multilabel_clf_en.md new file mode 100644 index 000000000000..33e62f675849 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_uhack_reviews_multilabel_clf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_uhack_reviews_multilabel_clf BertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: bert_base_uncased_uhack_reviews_multilabel_clf +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_uhack_reviews_multilabel_clf` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_uhack_reviews_multilabel_clf_en_5.1.4_3.4_1698366363705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_uhack_reviews_multilabel_clf_en_5.1.4_3.4_1698366363705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_uhack_reviews_multilabel_clf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_uhack_reviews_multilabel_clf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_uhack_reviews_multilabel_clf| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/DunnBC22/bert-base-uncased-uHack_reviews_multilabel_clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_yelp_bin_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_yelp_bin_en.md new file mode 100644 index 000000000000..0fcd6268a58b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_base_uncased_yelp_bin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_yelp_bin BertForSequenceClassification from jjezabek +author: John Snow Labs +name: bert_base_uncased_yelp_bin +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_yelp_bin` is an English model originally trained by jjezabek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_bin_en_5.1.4_3.4_1698392659480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_bin_en_5.1.4_3.4_1698392659480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yelp_bin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yelp_bin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_yelp_bin| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jjezabek/bert-base-uncased-yelp_bin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classification_experience_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classification_experience_en.md new file mode 100644 index 000000000000..df5ad6823185 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classification_experience_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classification_experience BertForSequenceClassification from Donaldbassa +author: John Snow Labs +name: bert_classification_experience +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classification_experience` is an English model originally trained by Donaldbassa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classification_experience_en_5.1.4_3.4_1698377612459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classification_experience_en_5.1.4_3.4_1698377612459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_experience","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_experience","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classification_experience| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Donaldbassa/bert-classification-experience \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classification_text_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classification_text_en.md new file mode 100644 index 000000000000..2656174a3571 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classification_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classification_text BertForSequenceClassification from Donaldbassa +author: John Snow Labs +name: bert_classification_text +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classification_text` is an English model originally trained by Donaldbassa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classification_text_en_5.1.4_3.4_1698380939200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classification_text_en_5.1.4_3.4_1698380939200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classification_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classification_text| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Donaldbassa/bert-classification-text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_amitkayal_finetuned_semitic_languages_eval_english_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_amitkayal_finetuned_semitic_languages_eval_english_en.md new file mode 100644 index 000000000000..589d63bb2cb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_amitkayal_finetuned_semitic_languages_eval_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_amitkayal_finetuned_semitic_languages_eval_english BertForSequenceClassification from amitkayal +author: John Snow Labs +name: bert_classifier_amitkayal_finetuned_semitic_languages_eval_english +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_amitkayal_finetuned_semitic_languages_eval_english` is an English model originally trained by amitkayal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_amitkayal_finetuned_semitic_languages_eval_english_en_5.1.4_3.4_1698375858566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_amitkayal_finetuned_semitic_languages_eval_english_en_5.1.4_3.4_1698375858566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_amitkayal_finetuned_semitic_languages_eval_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_amitkayal_finetuned_semitic_languages_eval_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_amitkayal_finetuned_semitic_languages_eval_english| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/amitkayal/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_hatespeech_germeval18coarse_de.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_hatespeech_germeval18coarse_de.md new file mode 100644 index 000000000000..30eaee64b909 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_hatespeech_germeval18coarse_de.md @@ -0,0 +1,107 @@ +--- +layout: model +title: German BertForSequenceClassification Base Cased model (from deepset) +author: John Snow Labs +name: bert_classifier_base_german_cased_hatespeech_germeval18coarse +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, de, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-hatespeech-GermEval18Coarse` is a German model originally trained by `deepset`. 
+ +## Predicted Entities + +`OTHER`, `OFFENSE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_german_cased_hatespeech_germeval18coarse_de_5.1.4_3.4_1698382927007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_german_cased_hatespeech_germeval18coarse_de_5.1.4_3.4_1698382927007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_german_cased_hatespeech_germeval18coarse","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_german_cased_hatespeech_germeval18coarse","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.hate.cased_base").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_german_cased_hatespeech_germeval18coarse| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/deepset/bert-base-german-cased-hatespeech-GermEval18Coarse +- https://deepset.ai/german-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_sentiment_germeval17_de.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_sentiment_germeval17_de.md new file mode 100644 index 000000000000..744caa185c82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_base_german_cased_sentiment_germeval17_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Base Cased model (from deepset) +author: John Snow Labs +name: bert_classifier_base_german_cased_sentiment_germeval17 +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, de, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-sentiment-Germeval17` is a German model originally trained by `deepset`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_german_cased_sentiment_germeval17_de_5.1.4_3.4_1698384314655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_german_cased_sentiment_germeval17_de_5.1.4_3.4_1698384314655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_german_cased_sentiment_germeval17","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_german_cased_sentiment_germeval17","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.sentiment.cased_base.by_deepset").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_german_cased_sentiment_germeval17| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/deepset/bert-base-german-cased-sentiment-Germeval17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bcms_ic_frenk_hate_hr.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bcms_ic_frenk_hate_hr.md new file mode 100644 index 000000000000..65bd35917e57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bcms_ic_frenk_hate_hr.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Croatian BertForSequenceClassification Cased model (from classla) +author: John Snow Labs +name: bert_classifier_bcms_ic_frenk_hate +date: 2023-10-27 +tags: [hr, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: hr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bcms-bertic-frenk-hate` is a Croatian model originally trained by `classla`. 
+ +## Predicted Entities + +`Acceptable`, `Offensive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bcms_ic_frenk_hate_hr_5.1.4_3.4_1698377473154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bcms_ic_frenk_hate_hr_5.1.4_3.4_1698377473154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bcms_ic_frenk_hate","hr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bcms_ic_frenk_hate","hr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("hr.classify.bert.hate.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bcms_ic_frenk_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|hr| +|Size:|465.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/classla/bcms-bertic-frenk-hate +- https://www.clarin.si/repository/xmlui/handle/11356/1433 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_benchmark_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_benchmark_finetuned_en.md new file mode 100644 index 000000000000..5974dc6b97dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_benchmark_finetuned_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from hazrulakmal) +author: John Snow Labs +name: bert_classifier_benchmark_finetuned +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `benchmark-finetuned-bert` is an English model originally trained by `hazrulakmal`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_benchmark_finetuned_en_5.1.4_3.4_1698365071272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_benchmark_finetuned_en_5.1.4_3.4_1698365071272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_benchmark_finetuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_benchmark_finetuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.finetuned.by_hazrulakmal").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_benchmark_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/hazrulakmal/benchmark-finetuned-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bert_base_turkish_bullying_tr.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bert_base_turkish_bullying_tr.md new file mode 100644 index 000000000000..84cdf580a0f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_bert_base_turkish_bullying_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Base Cased model (from nanelimon) +author: John Snow Labs +name: bert_classifier_bert_base_turkish_bullying +date: 2023-10-27 +tags: [tr, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-bullying` is a Turkish model originally trained by `nanelimon`. 
+ +## Predicted Entities + +`Nötr`, `Kızdırma/Hakaret`, `Cinsiyetçi Zorbalık`, `Irkçılık` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_turkish_bullying_tr_5.1.4_3.4_1698379312870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_turkish_bullying_tr_5.1.4_3.4_1698379312870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_turkish_bullying","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_turkish_bullying","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_turkish_bullying| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nanelimon/bert-base-turkish-bullying \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola1_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola1_en.md new file mode 100644 index 000000000000..1cd383ecfd2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola1_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from VanHoan) +author: John Snow Labs +name: bert_classifier_fine_tuned_cola1 +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-fine-tuned-cola1` is an English model originally trained by `VanHoan`.
+ +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuned_cola1_en_5.1.4_3.4_1698392369384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuned_cola1_en_5.1.4_3.4_1698392369384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fine_tuned_cola1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fine_tuned_cola1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cola1.by_vanhoan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_fine_tuned_cola1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/VanHoan/bert-fine-tuned-cola1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola2_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola2_en.md new file mode 100644 index 000000000000..d70af3a6907e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_fine_tuned_cola2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from VanHoan) +author: John Snow Labs +name: bert_classifier_fine_tuned_cola2 +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-fine-tuned-cola2` is an English model originally trained by `VanHoan`.
+ +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuned_cola2_en_5.1.4_3.4_1698393571914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuned_cola2_en_5.1.4_3.4_1698393571914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fine_tuned_cola2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fine_tuned_cola2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cola2.by_vanhoan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_fine_tuned_cola2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/VanHoan/bert-fine-tuned-cola2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_finetuned_semantic_chinese_zh.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_finetuned_semantic_chinese_zh.md new file mode 100644 index 000000000000..9e590dbfe5ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_finetuned_semantic_chinese_zh.md @@ -0,0 +1,111 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from Ayazhankad) +author: John Snow Labs +name: bert_classifier_finetuned_semantic_chinese +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, zh, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-semantic-chinese` is a Chinese model originally trained by `Ayazhankad`. 
+ +## Predicted Entities + +`Star_1`, `Star_2`, `Star_3`, `Star_4`, `Star_5` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_semantic_chinese_zh_5.1.4_3.4_1698390010302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_semantic_chinese_zh_5.1.4_3.4_1698390010302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_semantic_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_semantic_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finetuned_semantic_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Ayazhankad/bert-finetuned-semantic-chinese +- https://www.kaggle.com/datasets/utmhikari/doubanmovieshortcomments +- https://www.kaggle.com +- https://en.wikipedia.org/wiki/Douban#:~:text=Douban.com%20(Chinese%3A%20%E8%B1%86%E7%93%A3,and%20activities%20in%20Chinese%20cities. +- https://www.kaggle.com/datasets/utmhikari/doubanmovieshortcomments +- https://www.kaggle.com \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_gbert_base_germandpr_reranking_de.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_gbert_base_germandpr_reranking_de.md new file mode 100644 index 000000000000..ce91ff6a20b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_gbert_base_germandpr_reranking_de.md @@ -0,0 +1,111 @@ +--- +layout: model +title: German BertForSequenceClassification Base Cased model (from deepset) +author: John Snow Labs +name: bert_classifier_gbert_base_germandpr_reranking +date: 2023-10-27 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert-base-germandpr-reranking` is a German model originally trained by `deepset`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_gbert_base_germandpr_reranking_de_5.1.4_3.4_1698385455839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_gbert_base_germandpr_reranking_de_5.1.4_3.4_1698385455839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_gbert_base_germandpr_reranking","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_gbert_base_germandpr_reranking","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_gbert_base_germandpr_reranking| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/deepset/gbert-base-germandpr-reranking +- https://github.com/deepset-ai/haystack/ +- https://deepset.ai/german-bert +- https://deepset.ai/germanquad +- https://github.com/deepset-ai/FARM +- https://github.com/deepset-ai/haystack/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_ibrahim2030_tiny_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_ibrahim2030_tiny_sst2_distilled_en.md new file mode 100644 index 000000000000..d317f4849853 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_ibrahim2030_tiny_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from ibrahim2030) +author: John Snow Labs +name: bert_classifier_ibrahim2030_tiny_sst2_distilled +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-distilled` is an English model originally trained by `ibrahim2030`.
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_ibrahim2030_tiny_sst2_distilled_en_5.1.4_3.4_1698388808734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_ibrahim2030_tiny_sst2_distilled_en_5.1.4_3.4_1698388808734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ibrahim2030_tiny_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ibrahim2030_tiny_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.distilled_tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_ibrahim2030_tiny_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ibrahim2030/tiny-bert-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_base_cased_dp_paraphrase_detection_ru.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_base_cased_dp_paraphrase_detection_ru.md new file mode 100644 index 000000000000..a6da6222951b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_base_cased_dp_paraphrase_detection_ru.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Russian BertForSequenceClassification Base Cased model (from cointegrated) +author: John Snow Labs +name: bert_classifier_rubert_base_cased_dp_paraphrase_detection +date: 2023-10-27 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-base-cased-dp-paraphrase-detection` is a Russian model originally trained by `cointegrated`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_cased_dp_paraphrase_detection_ru_5.1.4_3.4_1698379110552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_cased_dp_paraphrase_detection_ru_5.1.4_3.4_1698379110552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_cased_dp_paraphrase_detection","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_cased_dp_paraphrase_detection","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_base_cased_dp_paraphrase_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cointegrated/rubert-base-cased-dp-paraphrase-detection +- http://docs.deeppavlov.ai/en/master/features/overview.html#ranking-model-docs +- http://paraphraser.ru/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny2_cedr_emotion_detection_ru.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny2_cedr_emotion_detection_ru.md new file mode 100644 index 000000000000..51ac8f4cc759 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny2_cedr_emotion_detection_ru.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from cointegrated) +author: John Snow Labs +name: bert_classifier_rubert_tiny2_cedr_emotion_detection +date: 2023-10-27 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-tiny2-cedr-emotion-detection` is a Russian model originally trained by `cointegrated`. 
+ +## Predicted Entities + +`sadness`, `fear`, `surprise`, `anger`, `no_emotion`, `joy` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny2_cedr_emotion_detection_ru_5.1.4_3.4_1698380977825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny2_cedr_emotion_detection_ru_5.1.4_3.4_1698380977825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny2_cedr_emotion_detection","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny2_cedr_emotion_detection","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.emotion.bert.tiny.by_cointegrated").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_tiny2_cedr_emotion_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cointegrated/rubert-tiny2-cedr-emotion-detection +- https://doi.org/10.1016/j.procs.2021.06.075 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_sentiment_balanced_ru.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_sentiment_balanced_ru.md new file mode 100644 index 000000000000..6f8b10adf09f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_sentiment_balanced_ru.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from cointegrated) +author: John Snow Labs +name: bert_classifier_rubert_tiny_sentiment_balanced +date: 2023-10-27 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-tiny-sentiment-balanced` is a Russian model originally trained by `cointegrated`. 
+ +## Predicted Entities + +`negative`, `neutral`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny_sentiment_balanced_ru_5.1.4_3.4_1698379765590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny_sentiment_balanced_ru_5.1.4_3.4_1698379765590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny_sentiment_balanced","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny_sentiment_balanced","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.sentiment.tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_tiny_sentiment_balanced| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|44.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cointegrated/rubert-tiny-sentiment-balanced +- https://github.com/sismetanin/sentiment-analysis-in-russian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_toxicity_ru.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_toxicity_ru.md new file mode 100644 index 000000000000..0460aa41c736 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_rubert_tiny_toxicity_ru.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from cointegrated) +author: John Snow Labs +name: bert_classifier_rubert_tiny_toxicity +date: 2023-10-27 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-tiny-toxicity` is a Russian model originally trained by `cointegrated`. 
+ +## Predicted Entities + +`insult`, `dangerous`, `obscenity`, `non-toxic`, `threat` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny_toxicity_ru_5.1.4_3.4_1698380240122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny_toxicity_ru_5.1.4_3.4_1698380240122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny_toxicity","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny_toxicity","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.toxic.bert.tiny.by_cointegrated").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_tiny_toxicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|44.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cointegrated/rubert-tiny-toxicity +- https://cups.mail.ru/ru/tasks/1048 +- https://arxiv.org/abs/2103.05345 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_russian_toxic_ru.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_russian_toxic_ru.md new file mode 100644 index 000000000000..145b25438092 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_russian_toxic_ru.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from chgk13) +author: John Snow Labs +name: bert_classifier_tiny_russian_toxic +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, ru, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_russian_toxic_bert` is a Russian model originally trained by `chgk13`. 
+ +## Predicted Entities + +`toxic`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_russian_toxic_ru_5.1.4_3.4_1698365560351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_russian_toxic_ru_5.1.4_3.4_1698365560351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_russian_toxic","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Я люблю Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_russian_toxic","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Я люблю Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_russian_toxic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|44.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/chgk13/tiny_russian_toxic_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation_en.md new file mode 100644 index 000000000000..d4201a230dc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-1_mobilebert_2_bert_3_gold_labels-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation_en_5.1.4_3.4_1698397508959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation_en_5.1.4_3.4_1698397508959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue_gold_labels.distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_1_mobile_2_3_gold_labels_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-1_mobilebert_2_bert_3_gold_labels-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_distillation_en.md new file mode 100644 index 000000000000..86c4336d5526 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_1_mobile_2_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-1_mobilebert-2_bert-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_distillation_en_5.1.4_3.4_1698395142255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_distillation_en_5.1.4_3.4_1698395142255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.ssts2.mobile.distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_1_mobile_2_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-1_mobilebert-2_bert-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_only_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_only_distillation_en.md new file mode 100644 index 000000000000..38b2e7f7f8e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_2_only_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_1_mobile_2_only_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-1_mobilebert_2_bert-only-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_only_distillation_en_5.1.4_3.4_1698397053380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_2_only_distillation_en_5.1.4_3.4_1698397053380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_only_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_2_only_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.only_distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_1_mobile_2_only_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-1_mobilebert_2_bert-only-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation_en.md new file mode 100644 index 000000000000..953336899883 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-1_mobilebert_and_bert-multi-teacher-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation_en_5.1.4_3.4_1698396006155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation_en_5.1.4_3.4_1698396006155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.mobile_multi_teacher_distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_1_mobile_and_multi_teacher_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-1_mobilebert_and_bert-multi-teacher-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_only_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_only_distillation_en.md new file mode 100644 index 000000000000..ece1dea7f431 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_1_mobile_only_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_1_mobile_only_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-1_mobilebert-only-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_only_distillation_en_5.1.4_3.4_1698396520699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_1_mobile_only_distillation_en_5.1.4_3.4_1698396520699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_only_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_1_mobile_only_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.mobile_only_distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_1_mobile_only_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-1_mobilebert-only-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_mobile_distillation_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_mobile_distillation_en.md new file mode 100644 index 000000000000..f420880a3020 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_tiny_sst2_mobile_distillation_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_mobile_distillation +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-mobilebert-distillation` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_mobile_distillation_en_5.1.4_3.4_1698393977743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_mobile_distillation_en_5.1.4_3.4_1698393977743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_mobile_distillation","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_mobile_distillation","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.distilled_tiny_mobile.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_mobile_distillation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-mobilebert-distillation +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_vanhoan_fine_tuned_cola_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_vanhoan_fine_tuned_cola_en.md new file mode 100644 index 000000000000..774a12ab2314 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_classifier_vanhoan_fine_tuned_cola_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from VanHoan) +author: John Snow Labs +name: bert_classifier_vanhoan_fine_tuned_cola +date: 2023-10-27 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-fine-tuned-cola` is an English model originally trained by `VanHoan`. 
+ +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vanhoan_fine_tuned_cola_en_5.1.4_3.4_1698391153418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vanhoan_fine_tuned_cola_en_5.1.4_3.4_1698391153418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vanhoan_fine_tuned_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vanhoan_fine_tuned_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue_cola1.by_vanhoan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vanhoan_fine_tuned_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/VanHoan/bert-fine-tuned-cola +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_cn_finetuning_chihao_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_cn_finetuning_chihao_en.md new file mode 100644 index 000000000000..2124281d0d64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_cn_finetuning_chihao_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_cn_finetuning_chihao BertForSequenceClassification from chihao +author: John Snow Labs +name: bert_cn_finetuning_chihao +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cn_finetuning_chihao` is an English model originally trained by chihao. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cn_finetuning_chihao_en_5.1.4_3.4_1698366364796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cn_finetuning_chihao_en_5.1.4_3.4_1698366364796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_cn_finetuning_chihao","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_cn_finetuning_chihao","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cn_finetuning_chihao| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/chihao/bert_cn_finetuning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_cvs_estimation_years_experience_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_cvs_estimation_years_experience_en.md new file mode 100644 index 000000000000..1dfa6489b9bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_cvs_estimation_years_experience_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_cvs_estimation_years_experience BertForSequenceClassification from jhonparra18 +author: John Snow Labs +name: bert_cvs_estimation_years_experience +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cvs_estimation_years_experience` is an English model originally trained by jhonparra18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cvs_estimation_years_experience_en_5.1.4_3.4_1698373784156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cvs_estimation_years_experience_en_5.1.4_3.4_1698373784156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_cvs_estimation_years_experience","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_cvs_estimation_years_experience","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cvs_estimation_years_experience| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/jhonparra18/bert-cvs-estimation-years-experience \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_emo_classifier_manirathinam21_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_emo_classifier_manirathinam21_en.md new file mode 100644 index 000000000000..ac6b1cc29455 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_emo_classifier_manirathinam21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emo_classifier_manirathinam21 BertForSequenceClassification from Manirathinam21 +author: John Snow Labs +name: bert_emo_classifier_manirathinam21 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emo_classifier_manirathinam21` is an English model originally trained by Manirathinam21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emo_classifier_manirathinam21_en_5.1.4_3.4_1698379836709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emo_classifier_manirathinam21_en_5.1.4_3.4_1698379836709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_emo_classifier_manirathinam21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_emo_classifier_manirathinam21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emo_classifier_manirathinam21| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Manirathinam21/bert_emo_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_fine_tuned_cola_khaledab2023_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_fine_tuned_cola_khaledab2023_en.md new file mode 100644 index 000000000000..29f369baf3fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_fine_tuned_cola_khaledab2023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fine_tuned_cola_khaledab2023 BertForSequenceClassification from KhaledAB2023 +author: John Snow Labs +name: bert_fine_tuned_cola_khaledab2023 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_cola_khaledab2023` is an English model originally trained by KhaledAB2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_khaledab2023_en_5.1.4_3.4_1698376944769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_cola_khaledab2023_en_5.1.4_3.4_1698376944769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_khaledab2023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_cola_khaledab2023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_cola_khaledab2023| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/KhaledAB2023/bert-fine-tuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_fined_tunned_model_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_fined_tunned_model_en.md new file mode 100644 index 000000000000..0ce39ba095d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_fined_tunned_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fined_tunned_model BertForSequenceClassification from Dewa +author: John Snow Labs +name: bert_fined_tunned_model +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fined_tunned_model` is an English model originally trained by Dewa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fined_tunned_model_en_5.1.4_3.4_1698386884689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fined_tunned_model_en_5.1.4_3.4_1698386884689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fined_tunned_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fined_tunned_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fined_tunned_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Dewa/bert-fined-tunned-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_rottentomatoes_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_rottentomatoes_en.md new file mode 100644 index 000000000000..7fa9aa77ba12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_rottentomatoes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuned_rottentomatoes BertForSequenceClassification from flowfree +author: John Snow Labs +name: bert_finetuned_rottentomatoes +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_rottentomatoes` is an English model originally trained by flowfree. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_rottentomatoes_en_5.1.4_3.4_1698370759242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_rottentomatoes_en_5.1.4_3.4_1698370759242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_rottentomatoes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_rottentomatoes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_rottentomatoes| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/flowfree/bert-finetuned-rottentomatoes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_semitic_languages_eval_english_rajueee_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_semitic_languages_eval_english_rajueee_en.md new file mode 100644 index 000000000000..61142ca680c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuned_semitic_languages_eval_english_rajueee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuned_semitic_languages_eval_english_rajueee BertForSequenceClassification from RajuEEE +author: John Snow Labs +name: bert_finetuned_semitic_languages_eval_english_rajueee +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_semitic_languages_eval_english_rajueee` is an English model originally trained by RajuEEE. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_rajueee_en_5.1.4_3.4_1698365277340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_semitic_languages_eval_english_rajueee_en_5.1.4_3.4_1698365277340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_rajueee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_semitic_languages_eval_english_rajueee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_semitic_languages_eval_english_rajueee| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RajuEEE/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_finetuning_test_chenqian_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuning_test_chenqian_en.md new file mode 100644 index 000000000000..de0663a86649 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_finetuning_test_chenqian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuning_test_chenqian BertForSequenceClassification from chenqian +author: John Snow Labs +name: bert_finetuning_test_chenqian +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuning_test_chenqian` is an English model originally trained by chenqian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_chenqian_en_5.1.4_3.4_1698364900889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuning_test_chenqian_en_5.1.4_3.4_1698364900889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_chenqian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuning_test_chenqian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuning_test_chenqian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/chenqian/bert_finetuning_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_gptdataset_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_gptdataset_en.md new file mode 100644 index 000000000000..c6a1ed36227e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_gptdataset_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_gptdataset BertForSequenceClassification from Babak-Behkamkia +author: John Snow Labs +name: bert_gptdataset +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_gptdataset` is an English model originally trained by Babak-Behkamkia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_gptdataset_en_5.1.4_3.4_1698374359684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_gptdataset_en_5.1.4_3.4_1698374359684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_gptdataset","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_gptdataset","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_gptdataset| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Babak-Behkamkia/bert_gptdataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_large_finnish_cased_toxicity_fi.md b/docs/_posts/ahmedlone127/2023-10-27-bert_large_finnish_cased_toxicity_fi.md new file mode 100644 index 000000000000..71c7b245380d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_large_finnish_cased_toxicity_fi.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Finnish bert_large_finnish_cased_toxicity BertForSequenceClassification from TurkuNLP +author: John Snow Labs +name: bert_large_finnish_cased_toxicity +date: 2023-10-27 +tags: [bert, fi, open_source, sequence_classification, onnx] +task: Text Classification +language: fi +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_finnish_cased_toxicity` is a Finnish model originally trained by TurkuNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_finnish_cased_toxicity_fi_5.1.4_3.4_1698390029410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_finnish_cased_toxicity_fi_5.1.4_3.4_1698390029410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_finnish_cased_toxicity","fi")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_finnish_cased_toxicity","fi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_finnish_cased_toxicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fi| +|Size:|1.3 GB| + +## References + +https://huggingface.co/TurkuNLP/bert-large-finnish-cased-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_financial_phrasebank_allagree2_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_financial_phrasebank_allagree2_en.md new file mode 100644 index 000000000000..fc2b425a208b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_financial_phrasebank_allagree2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_financial_phrasebank_allagree2 BertForSequenceClassification from Farshid +author: John Snow Labs +name: bert_large_uncased_financial_phrasebank_allagree2 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_financial_phrasebank_allagree2` is an English model originally trained by Farshid.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_financial_phrasebank_allagree2_en_5.1.4_3.4_1698372649990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_financial_phrasebank_allagree2_en_5.1.4_3.4_1698372649990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_financial_phrasebank_allagree2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_financial_phrasebank_allagree2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_financial_phrasebank_allagree2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Farshid/bert-large-uncased-financial-phrasebank-allagree2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_whole_word_masking_finetuned_sst_2_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_whole_word_masking_finetuned_sst_2_en.md new file mode 100644 index 000000000000..65a02b712e91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_large_uncased_whole_word_masking_finetuned_sst_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_whole_word_masking_finetuned_sst_2 BertForSequenceClassification from echarlaix +author: John Snow Labs +name: bert_large_uncased_whole_word_masking_finetuned_sst_2 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_whole_word_masking_finetuned_sst_2` is an English model originally trained by echarlaix.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_sst_2_en_5.1.4_3.4_1698396187782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_whole_word_masking_finetuned_sst_2_en_5.1.4_3.4_1698396187782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_whole_word_masking_finetuned_sst_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_whole_word_masking_finetuned_sst_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_whole_word_masking_finetuned_sst_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/echarlaix/bert-large-uncased-whole-word-masking-finetuned-sst-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_lli_gptdetetor_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_lli_gptdetetor_en.md new file mode 100644 index 000000000000..cab3446aedf0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_lli_gptdetetor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_lli_gptdetetor BertForSequenceClassification from Nintw923 +author: John Snow Labs +name: bert_lli_gptdetetor +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_lli_gptdetetor` is an English model originally trained by Nintw923. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_lli_gptdetetor_en_5.1.4_3.4_1698390698398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_lli_gptdetetor_en_5.1.4_3.4_1698390698398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_lli_gptdetetor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_lli_gptdetetor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_lli_gptdetetor| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Nintw923/bert-lli-gptdetetor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_mini_sst2_distilled_sparse_90_1x4_block_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_mini_sst2_distilled_sparse_90_1x4_block_en.md new file mode 100644 index 000000000000..581ee1f8552b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_mini_sst2_distilled_sparse_90_1x4_block_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_mini_sst2_distilled_sparse_90_1x4_block BertForSequenceClassification from Intel +author: John Snow Labs +name: bert_mini_sst2_distilled_sparse_90_1x4_block +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mini_sst2_distilled_sparse_90_1x4_block` is an English model originally trained by Intel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mini_sst2_distilled_sparse_90_1x4_block_en_5.1.4_3.4_1698384313741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mini_sst2_distilled_sparse_90_1x4_block_en_5.1.4_3.4_1698384313741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_sst2_distilled_sparse_90_1x4_block","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_sst2_distilled_sparse_90_1x4_block","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mini_sst2_distilled_sparse_90_1x4_block| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|32.5 MB| + +## References + +https://huggingface.co/Intel/bert-mini-sst2-distilled-sparse-90-1X4-block \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_multilingual_passage_reranking_msmarco_fluid_ai_xx.md b/docs/_posts/ahmedlone127/2023-10-27-bert_multilingual_passage_reranking_msmarco_fluid_ai_xx.md new file mode 100644 index 000000000000..81f4a209fffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_multilingual_passage_reranking_msmarco_fluid_ai_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_multilingual_passage_reranking_msmarco_fluid_ai BertForSequenceClassification from fluid-ai +author: John Snow Labs +name: bert_multilingual_passage_reranking_msmarco_fluid_ai +date: 2023-10-27 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_passage_reranking_msmarco_fluid_ai` is a Multilingual model originally trained by fluid-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_passage_reranking_msmarco_fluid_ai_xx_5.1.4_3.4_1698375394207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_passage_reranking_msmarco_fluid_ai_xx_5.1.4_3.4_1698375394207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_passage_reranking_msmarco_fluid_ai","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_passage_reranking_msmarco_fluid_ai","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilingual_passage_reranking_msmarco_fluid_ai| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/fluid-ai/bert-multilingual-passage-reranking-msmarco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_mydataset_vast_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_mydataset_vast_en.md new file mode 100644 index 000000000000..fa2d990d543d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_mydataset_vast_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_mydataset_vast BertForSequenceClassification from Babak-Behkamkia +author: John Snow Labs +name: bert_mydataset_vast +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mydataset_vast` is an English model originally trained by Babak-Behkamkia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mydataset_vast_en_5.1.4_3.4_1698387764990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mydataset_vast_en_5.1.4_3.4_1698387764990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mydataset_vast","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mydataset_vast","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mydataset_vast| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Babak-Behkamkia/bert_mydataset_VAST \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_en.md new file mode 100644 index 000000000000..f9b942048cf0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_small_finetuned_eoir_privacy BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_eoir_privacy +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_eoir_privacy` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_en_5.1.4_3.4_1698380605530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_en_5.1.4_3.4_1698380605530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_finetuned_eoir_privacy| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| + +## References + +https://huggingface.co/muhtasham/bert-small-finetuned-eoir_privacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer10_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer10_en.md new file mode 100644 index 000000000000..48f8076f3835 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_small_finetuned_eoir_privacy_longer10 BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_eoir_privacy_longer10 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_eoir_privacy_longer10` is an English model originally trained by muhtasham.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer10_en_5.1.4_3.4_1698381602797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer10_en_5.1.4_3.4_1698381602797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_finetuned_eoir_privacy_longer10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| + +## References + +https://huggingface.co/muhtasham/bert-small-finetuned-eoir_privacy-longer10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer20_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer20_en.md new file mode 100644 index 000000000000..528d7f0863fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer20_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_small_finetuned_eoir_privacy_longer20 BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_eoir_privacy_longer20 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_eoir_privacy_longer20` is an English model originally trained by muhtasham.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer20_en_5.1.4_3.4_1698382251783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer20_en_5.1.4_3.4_1698382251783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer20","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer20","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_finetuned_eoir_privacy_longer20| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| + +## References + +https://huggingface.co/muhtasham/bert-small-finetuned-eoir_privacy-longer20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer30_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer30_en.md new file mode 100644 index 000000000000..88f66c06cfc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-bert_small_finetuned_eoir_privacy_longer30_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_small_finetuned_eoir_privacy_longer30 BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_eoir_privacy_longer30 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_eoir_privacy_longer30` is an English model originally trained by muhtasham.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer30_en_5.1.4_3.4_1698382833161.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_eoir_privacy_longer30_en_5.1.4_3.4_1698382833161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer30","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_small_finetuned_eoir_privacy_longer30","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_small_finetuned_eoir_privacy_longer30|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|107.9 MB|
+
+## References
+
+https://huggingface.co/muhtasham/bert-small-finetuned-eoir-privacy-longer30
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-bert_vast_long3_en.md b/docs/_posts/ahmedlone127/2023-10-27-bert_vast_long3_en.md
new file mode 100644
index 000000000000..ad76c280e5e3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-bert_vast_long3_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_vast_long3 BertForSequenceClassification from Babak-Behkamkia
+author: John Snow Labs
+name: bert_vast_long3
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_vast_long3` is an English model originally trained by Babak-Behkamkia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_vast_long3_en_5.1.4_3.4_1698367200495.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_vast_long3_en_5.1.4_3.4_1698367200495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long3","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bert_vast_long3","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_vast_long3|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/Babak-Behkamkia/bert_VAST_long3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-bertimbau_products_reviews_portuguese_breton_pt.md b/docs/_posts/ahmedlone127/2023-10-27-bertimbau_products_reviews_portuguese_breton_pt.md
new file mode 100644
index 000000000000..40b6c55d9724
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-bertimbau_products_reviews_portuguese_breton_pt.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Portuguese bertimbau_products_reviews_portuguese_breton BertForSequenceClassification from ramonmedeiro1
+author: John Snow Labs
+name: bertimbau_products_reviews_portuguese_breton
+date: 2023-10-27
+tags: [bert, pt, open_source, sequence_classification, onnx]
+task: Text Classification
+language: pt
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau_products_reviews_portuguese_breton` is a Portuguese model originally trained by ramonmedeiro1.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_products_reviews_portuguese_breton_pt_5.1.4_3.4_1698388598031.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_products_reviews_portuguese_breton_pt_5.1.4_3.4_1698388598031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_products_reviews_portuguese_breton","pt")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_products_reviews_portuguese_breton","pt")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bertimbau_products_reviews_portuguese_breton|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|pt|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/ramonmedeiro1/bertimbau-products-reviews-pt-br
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-biobert_large_cased_v1_1_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-27-biobert_large_cased_v1_1_mnli_en.md
new file mode 100644
index 000000000000..0c2d57a16559
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-biobert_large_cased_v1_1_mnli_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English biobert_large_cased_v1_1_mnli BertForSequenceClassification from dmis-lab
+author: John Snow Labs
+name: biobert_large_cased_v1_1_mnli
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_large_cased_v1_1_mnli` is an English model originally trained by dmis-lab.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_large_cased_v1_1_mnli_en_5.1.4_3.4_1698392025993.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_large_cased_v1_1_mnli_en_5.1.4_3.4_1698392025993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("biobert_large_cased_v1_1_mnli","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_large_cased_v1_1_mnli","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biobert_large_cased_v1_1_mnli|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/dmis-lab/biobert-large-cased-v1.1-mnli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-biolinkbert_large_mnli_resampled_en.md b/docs/_posts/ahmedlone127/2023-10-27-biolinkbert_large_mnli_resampled_en.md
new file mode 100644
index 000000000000..cfbc55361074
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-biolinkbert_large_mnli_resampled_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English biolinkbert_large_mnli_resampled BertForSequenceClassification from cnut1648
+author: John Snow Labs
+name: biolinkbert_large_mnli_resampled
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biolinkbert_large_mnli_resampled` is an English model originally trained by cnut1648.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biolinkbert_large_mnli_resampled_en_5.1.4_3.4_1698365457050.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biolinkbert_large_mnli_resampled_en_5.1.4_3.4_1698365457050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("biolinkbert_large_mnli_resampled","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("biolinkbert_large_mnli_resampled","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biolinkbert_large_mnli_resampled|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/cnut1648/biolinkbert-large-mnli-resampled
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm_en.md b/docs/_posts/ahmedlone127/2023-10-27-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm_en.md
new file mode 100644
index 000000000000..a6105e08d06a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm BertForSequenceClassification from jakub014
+author: John Snow Labs
+name: cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm` is an English model originally trained by jakub014.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm_en_5.1.4_3.4_1698388674044.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm_en_5.1.4_3.4_1698388674044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|cold_fusion_bert_base_uncased_itr23_seed0_finetuned_convincingness_ibm|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/jakub014/ColD-Fusion-bert-base-uncased-itr23-seed0-finetuned-convincingness-IBM
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-covid19_mbert_fine_tune_model_en.md b/docs/_posts/ahmedlone127/2023-10-27-covid19_mbert_fine_tune_model_en.md
new file mode 100644
index 000000000000..5e64750f637d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-covid19_mbert_fine_tune_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English covid19_mbert_fine_tune_model BertForSequenceClassification from AbdoMamdouh
+author: John Snow Labs
+name: covid19_mbert_fine_tune_model
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid19_mbert_fine_tune_model` is an English model originally trained by AbdoMamdouh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid19_mbert_fine_tune_model_en_5.1.4_3.4_1698366369748.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid19_mbert_fine_tune_model_en_5.1.4_3.4_1698366369748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("covid19_mbert_fine_tune_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("covid19_mbert_fine_tune_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|covid19_mbert_fine_tune_model|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|667.3 MB|
+
+## References
+
+https://huggingface.co/AbdoMamdouh/covid19_mbert_fine_tune_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-covid19_moh_bert_fine_tune_model_en.md b/docs/_posts/ahmedlone127/2023-10-27-covid19_moh_bert_fine_tune_model_en.md
new file mode 100644
index 000000000000..91598d86fe65
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-covid19_moh_bert_fine_tune_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English covid19_moh_bert_fine_tune_model BertForSequenceClassification from AbdoMamdouh
+author: John Snow Labs
+name: covid19_moh_bert_fine_tune_model
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid19_moh_bert_fine_tune_model` is an English model originally trained by AbdoMamdouh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid19_moh_bert_fine_tune_model_en_5.1.4_3.4_1698368088110.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid19_moh_bert_fine_tune_model_en_5.1.4_3.4_1698368088110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("covid19_moh_bert_fine_tune_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("covid19_moh_bert_fine_tune_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|covid19_moh_bert_fine_tune_model|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|507.2 MB|
+
+## References
+
+https://huggingface.co/AbdoMamdouh/covid19_moh_bert_fine_tune_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2_en.md b/docs/_posts/ahmedlone127/2023-10-27-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2_en.md
new file mode 100644
index 000000000000..6d8f6173a43d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2 BertForSequenceClassification from sumba
+author: John Snow Labs
+name: covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2` is an English model originally trained by sumba.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2_en_5.1.4_3.4_1698369359322.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2_en_5.1.4_3.4_1698369359322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|covid_twitter_bert_v2_norwegian_description_stance_loss_hyp_unprocess2|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/sumba/covid-twitter-bert-v2-no_description-stance-loss-hyp-unprocess2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-dbert_eth2_en.md b/docs/_posts/ahmedlone127/2023-10-27-dbert_eth2_en.md
new file mode 100644
index 000000000000..639f713e4682
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-dbert_eth2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English dbert_eth2 BertForSequenceClassification from baikal-nlp
+author: John Snow Labs
+name: dbert_eth2
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert_eth2` is an English model originally trained by baikal-nlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbert_eth2_en_5.1.4_3.4_1698387828550.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbert_eth2_en_5.1.4_3.4_1698387828550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("dbert_eth2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("dbert_eth2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dbert_eth2|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|423.4 MB|
+
+## References
+
+https://huggingface.co/baikal-nlp/dbert-eth2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-10-27-dbert_sentiment_en.md b/docs/_posts/ahmedlone127/2023-10-27-dbert_sentiment_en.md
new file mode 100644
index 000000000000..2f4836826438
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-10-27-dbert_sentiment_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English dbert_sentiment BertForSequenceClassification from baikal-nlp
+author: John Snow Labs
+name: dbert_sentiment
+date: 2023-10-27
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert_sentiment` is an English model originally trained by baikal-nlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbert_sentiment_en_5.1.4_3.4_1698388655495.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbert_sentiment_en_5.1.4_3.4_1698388655495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dbert_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dbert_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbert_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.4 MB| + +## References + +https://huggingface.co/baikal-nlp/dbert-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-dialog_sbert_base_en.md b/docs/_posts/ahmedlone127/2023-10-27-dialog_sbert_base_en.md new file mode 100644 index 000000000000..19d4ac30227e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-dialog_sbert_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dialog_sbert_base BertForSequenceClassification from digit82 +author: John Snow Labs +name: dialog_sbert_base +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialog_sbert_base` is an English model originally trained by digit82. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialog_sbert_base_en_5.1.4_3.4_1698389814293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialog_sbert_base_en_5.1.4_3.4_1698389814293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dialog_sbert_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dialog_sbert_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dialog_sbert_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/digit82/dialog-sbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-esci_us_bert_crossencoder_en.md b/docs/_posts/ahmedlone127/2023-10-27-esci_us_bert_crossencoder_en.md new file mode 100644 index 000000000000..5ef985844118 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-esci_us_bert_crossencoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English esci_us_bert_crossencoder BertForSequenceClassification from spacemanidol +author: John Snow Labs +name: esci_us_bert_crossencoder +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `esci_us_bert_crossencoder` is an English model originally trained by spacemanidol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/esci_us_bert_crossencoder_en_5.1.4_3.4_1698370307661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/esci_us_bert_crossencoder_en_5.1.4_3.4_1698370307661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("esci_us_bert_crossencoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("esci_us_bert_crossencoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|esci_us_bert_crossencoder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/spacemanidol/esci-us-bert-crossencoder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_1_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_1_connectivity_en.md new file mode 100644 index 000000000000..1a4c5d384458 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_1_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_1_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_1_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_1_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_1_connectivity_en_5.1.4_3.4_1698393870625.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_1_connectivity_en_5.1.4_3.4_1698393870625.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_1_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_1_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_1_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_2_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_2_connectivity_en.md new file mode 100644 index 000000000000..6365f22d95e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_2_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_2_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_2_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_2_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_2_connectivity_en_5.1.4_3.4_1698394689250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_2_connectivity_en_5.1.4_3.4_1698394689250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_2_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_2_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_2_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_3_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_3_connectivity_en.md new file mode 100644 index 000000000000..bcfeede9af20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_3_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_3_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_3_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_3_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_3_connectivity_en_5.1.4_3.4_1698395672309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_3_connectivity_en_5.1.4_3.4_1698395672309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_3_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_3_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_3_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_4_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_4_connectivity_en.md new file mode 100644 index 000000000000..1d69664fbec2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_4_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_4_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_4_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_4_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_4_connectivity_en_5.1.4_3.4_1698396507828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_4_connectivity_en_5.1.4_3.4_1698396507828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_4_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_4_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_4_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_5_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_5_connectivity_en.md new file mode 100644 index 000000000000..3c91b3c052d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_5_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_5_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_5_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_5_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_5_connectivity_en_5.1.4_3.4_1698397197635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_5_connectivity_en_5.1.4_3.4_1698397197635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_5_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_5_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_5_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-feather_berts_6_connectivity_en.md b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_6_connectivity_en.md new file mode 100644 index 000000000000..0b61b8fa7f05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-feather_berts_6_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_6_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_6_connectivity +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_6_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_6_connectivity_en_5.1.4_3.4_1698398003861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_6_connectivity_en_5.1.4_3.4_1698398003861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_6_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_6_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_6_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-fine_tune_bert_chinese_sent_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-27-fine_tune_bert_chinese_sent_analysis_en.md new file mode 100644 index 000000000000..ff02ebb987c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-fine_tune_bert_chinese_sent_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_bert_chinese_sent_analysis BertForSequenceClassification from Ayazhankad +author: John Snow Labs +name: fine_tune_bert_chinese_sent_analysis +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_bert_chinese_sent_analysis` is an English model originally trained by Ayazhankad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_bert_chinese_sent_analysis_en_5.1.4_3.4_1698388354679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_bert_chinese_sent_analysis_en_5.1.4_3.4_1698388354679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_chinese_sent_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_bert_chinese_sent_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_bert_chinese_sent_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/Ayazhankad/fine-tune-bert-chinese-sent-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-fine_tuned_bert_financial_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-27-fine_tuned_bert_financial_sentiment_analysis_en.md new file mode 100644 index 000000000000..b0648dcce47a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-fine_tuned_bert_financial_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tuned_bert_financial_sentiment_analysis BertForSequenceClassification from mstafam +author: John Snow Labs +name: fine_tuned_bert_financial_sentiment_analysis +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_financial_sentiment_analysis` is an English model originally trained by mstafam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_financial_sentiment_analysis_en_5.1.4_3.4_1698385344128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_financial_sentiment_analysis_en_5.1.4_3.4_1698385344128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_financial_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_bert_financial_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_financial_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mstafam/fine-tuned-bert-financial-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-finetuned_bert_injection_en.md b/docs/_posts/ahmedlone127/2023-10-27-finetuned_bert_injection_en.md new file mode 100644 index 000000000000..55e800719fa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-finetuned_bert_injection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_bert_injection BertForSequenceClassification from benediktpri +author: John Snow Labs +name: finetuned_bert_injection +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_injection` is an English model originally trained by benediktpri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_injection_en_5.1.4_3.4_1698383017889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_injection_en_5.1.4_3.4_1698383017889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_injection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_injection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_injection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/benediktpri/finetuned_bert_injection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-gbert_large_sts_de.md b/docs/_posts/ahmedlone127/2023-10-27-gbert_large_sts_de.md new file mode 100644 index 000000000000..5fcff8681779 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-gbert_large_sts_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German gbert_large_sts BertForSequenceClassification from deepset +author: John Snow Labs +name: gbert_large_sts +date: 2023-10-27 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_large_sts` is a German model originally trained by deepset. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_large_sts_de_5.1.4_3.4_1698386803587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_large_sts_de_5.1.4_3.4_1698386803587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("gbert_large_sts","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("gbert_large_sts","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gbert_large_sts| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|1.3 GB| + +## References + +https://huggingface.co/deepset/gbert-large-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-hubert_base_cc_finance_filter_en.md b/docs/_posts/ahmedlone127/2023-10-27-hubert_base_cc_finance_filter_en.md new file mode 100644 index 000000000000..49141d592608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-hubert_base_cc_finance_filter_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hubert_base_cc_finance_filter BertForSequenceClassification from papsebestyen +author: John Snow Labs +name: hubert_base_cc_finance_filter +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hubert_base_cc_finance_filter` is an English model originally trained by papsebestyen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_base_cc_finance_filter_en_5.1.4_3.4_1698388598046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_base_cc_finance_filter_en_5.1.4_3.4_1698388598046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hubert_base_cc_finance_filter","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hubert_base_cc_finance_filter","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_base_cc_finance_filter| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/papsebestyen/hubert-base-cc-finance-filter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-interview_ratings_bert_en.md b/docs/_posts/ahmedlone127/2023-10-27-interview_ratings_bert_en.md new file mode 100644 index 000000000000..24d76b8cf69d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-interview_ratings_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English interview_ratings_bert BertForSequenceClassification from csatapathy +author: John Snow Labs +name: interview_ratings_bert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `interview_ratings_bert` is an English model originally trained by csatapathy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/interview_ratings_bert_en_5.1.4_3.4_1698381906548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/interview_ratings_bert_en_5.1.4_3.4_1698381906548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("interview_ratings_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("interview_ratings_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|interview_ratings_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/csatapathy/interview-ratings-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_0_en.md b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_0_en.md new file mode 100644 index 000000000000..a3e0eb07e3e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English long_feather_bert_ft_mnli_0 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: long_feather_bert_ft_mnli_0 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `long_feather_bert_ft_mnli_0` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_0_en_5.1.4_3.4_1698367200920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_0_en_5.1.4_3.4_1698367200920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|long_feather_bert_ft_mnli_0| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/long_feather_bert_ft_mnli-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_1_en.md b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_1_en.md new file mode 100644 index 000000000000..8f4d7d81c6cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English long_feather_bert_ft_mnli_1 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: long_feather_bert_ft_mnli_1 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `long_feather_bert_ft_mnli_1` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_1_en_5.1.4_3.4_1698366417720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_1_en_5.1.4_3.4_1698366417720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|long_feather_bert_ft_mnli_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/long_feather_bert_ft_mnli-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_2_en.md b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_2_en.md new file mode 100644 index 000000000000..bac9459d26d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-long_feather_bert_ft_mnli_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English long_feather_bert_ft_mnli_2 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: long_feather_bert_ft_mnli_2 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `long_feather_bert_ft_mnli_2` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_2_en_5.1.4_3.4_1698368006797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/long_feather_bert_ft_mnli_2_en_5.1.4_3.4_1698368006797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("long_feather_bert_ft_mnli_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|long_feather_bert_ft_mnli_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/long_feather_bert_ft_mnli-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-matbert_bandgap_en.md b/docs/_posts/ahmedlone127/2023-10-27-matbert_bandgap_en.md new file mode 100644 index 000000000000..c26fc655cdb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-matbert_bandgap_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English matbert_bandgap BertForSequenceClassification from korolewadim +author: John Snow Labs +name: matbert_bandgap +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `matbert_bandgap` is an English model originally trained by korolewadim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/matbert_bandgap_en_5.1.4_3.4_1698369845877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/matbert_bandgap_en_5.1.4_3.4_1698369845877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("matbert_bandgap","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("matbert_bandgap","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|matbert_bandgap| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.5 MB| + +## References + +https://huggingface.co/korolewadim/matbert-bandgap \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_arabic_en.md b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_arabic_en.md new file mode 100644 index 000000000000..92bac208b193 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_arabic_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbert_base_tweet_sentiment_arabic BertForSequenceClassification from cardiffnlp +author: John Snow Labs +name: mbert_base_tweet_sentiment_arabic +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_base_tweet_sentiment_arabic` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_arabic_en_5.1.4_3.4_1698382866105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_arabic_en_5.1.4_3.4_1698382866105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_arabic","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_arabic","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_base_tweet_sentiment_arabic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/cardiffnlp/mbert-base-tweet-sentiment-ar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_french_en.md b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_french_en.md new file mode 100644 index 000000000000..507bac418be8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbert_base_tweet_sentiment_french BertForSequenceClassification from cardiffnlp +author: John Snow Labs +name: mbert_base_tweet_sentiment_french +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_base_tweet_sentiment_french` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_french_en_5.1.4_3.4_1698379699843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_french_en_5.1.4_3.4_1698379699843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_base_tweet_sentiment_french| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/cardiffnlp/mbert-base-tweet-sentiment-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_italian_en.md b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_italian_en.md new file mode 100644 index 000000000000..b47a02663713 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_italian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbert_base_tweet_sentiment_italian BertForSequenceClassification from cardiffnlp +author: John Snow Labs +name: mbert_base_tweet_sentiment_italian +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_base_tweet_sentiment_italian` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_italian_en_5.1.4_3.4_1698383973608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_italian_en_5.1.4_3.4_1698383973608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_italian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_italian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_base_tweet_sentiment_italian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/cardiffnlp/mbert-base-tweet-sentiment-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_portuguese_en.md b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_portuguese_en.md new file mode 100644 index 000000000000..2986ced6b7ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-mbert_base_tweet_sentiment_portuguese_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbert_base_tweet_sentiment_portuguese BertForSequenceClassification from cardiffnlp +author: John Snow Labs +name: mbert_base_tweet_sentiment_portuguese +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_base_tweet_sentiment_portuguese` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_portuguese_en_5.1.4_3.4_1698381906712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_base_tweet_sentiment_portuguese_en_5.1.4_3.4_1698381906712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_portuguese","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_base_tweet_sentiment_portuguese","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_base_tweet_sentiment_portuguese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/cardiffnlp/mbert-base-tweet-sentiment-pt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1_en.md b/docs/_posts/ahmedlone127/2023-10-27-mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1_en.md new file mode 100644 index 000000000000..20342349524b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1 BertForSequenceClassification from karolill +author: John Snow Labs +name: mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1` is an English model originally trained by karolill. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1_en_5.1.4_3.4_1698385786028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1_en_5.1.4_3.4_1698385786028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_lr3e_05_wr0_1_optimadamw_hf_wd0_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/karolill/mbert_LR3e-05_WR0.1_OPTIMadamw_hf_WD0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_crows_pairs_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_crows_pairs_classifieronly_en.md new file mode 100644 index 000000000000..c1a3df28fcce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_crows_pairs_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_0_crows_pairs_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_0_crows_pairs_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_0_crows_pairs_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_crows_pairs_classifieronly_en_5.1.4_3.4_1698392985989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_crows_pairs_classifieronly_en_5.1.4_3.4_1698392985989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_crows_pairs_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_crows_pairs_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_0_crows_pairs_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_0_crows_pairs_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_stereoset_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_stereoset_classifieronly_en.md new file mode 100644 index 000000000000..37661159df59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_stereoset_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_0_stereoset_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_0_stereoset_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_0_stereoset_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_stereoset_classifieronly_en_5.1.4_3.4_1698396508153.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_stereoset_classifieronly_en_5.1.4_3.4_1698396508153.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_stereoset_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_stereoset_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_0_stereoset_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_0_stereoset_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_winobias_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_winobias_classifieronly_en.md new file mode 100644 index 000000000000..9060cf543f5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_0_winobias_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_0_winobias_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_0_winobias_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_0_winobias_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_winobias_classifieronly_en_5.1.4_3.4_1698397201883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_0_winobias_classifieronly_en_5.1.4_3.4_1698397201883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_winobias_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_0_winobias_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_0_winobias_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_0_winobias_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_crows_pairs_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_crows_pairs_classifieronly_en.md new file mode 100644 index 000000000000..ac6fde9d12b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_crows_pairs_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_1_crows_pairs_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_1_crows_pairs_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_1_crows_pairs_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_crows_pairs_classifieronly_en_5.1.4_3.4_1698394690502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_crows_pairs_classifieronly_en_5.1.4_3.4_1698394690502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_crows_pairs_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_crows_pairs_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_1_crows_pairs_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_1_crows_pairs_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_stereoset_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_stereoset_classifieronly_en.md new file mode 100644 index 000000000000..17c63cac77e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_stereoset_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_1_stereoset_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_1_stereoset_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_1_stereoset_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_stereoset_classifieronly_en_5.1.4_3.4_1698392031329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_stereoset_classifieronly_en_5.1.4_3.4_1698392031329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_stereoset_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_stereoset_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_1_stereoset_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_1_stereoset_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_winobias_classifieronly_en.md b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_winobias_classifieronly_en.md new file mode 100644 index 000000000000..4351405e57a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-multiberts_seed_1_winobias_classifieronly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multiberts_seed_1_winobias_classifieronly BertForSequenceClassification from asun17904 +author: John Snow Labs +name: multiberts_seed_1_winobias_classifieronly +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multiberts_seed_1_winobias_classifieronly` is an English model originally trained by asun17904. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_winobias_classifieronly_en_5.1.4_3.4_1698393870919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multiberts_seed_1_winobias_classifieronly_en_5.1.4_3.4_1698393870919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_winobias_classifieronly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("multiberts_seed_1_winobias_classifieronly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multiberts_seed_1_winobias_classifieronly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/asun17904/multiberts-seed_1_winobias_classifieronly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-nlp_bert_emo_classifier_en.md b/docs/_posts/ahmedlone127/2023-10-27-nlp_bert_emo_classifier_en.md new file mode 100644 index 000000000000..a291ac2f8826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-nlp_bert_emo_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_bert_emo_classifier BertForSequenceClassification from Jateendra +author: John Snow Labs +name: nlp_bert_emo_classifier +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_bert_emo_classifier` is an English model originally trained by Jateendra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_bert_emo_classifier_en_5.1.4_3.4_1698374678183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_bert_emo_classifier_en_5.1.4_3.4_1698374678183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_bert_emo_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_bert_emo_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_bert_emo_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jateendra/nlp_bert_emo_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-norbert_lr5e_05_wr0_optimadamw_hf_wd0_01_en.md b/docs/_posts/ahmedlone127/2023-10-27-norbert_lr5e_05_wr0_optimadamw_hf_wd0_01_en.md new file mode 100644 index 000000000000..7af658842fb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-norbert_lr5e_05_wr0_optimadamw_hf_wd0_01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English norbert_lr5e_05_wr0_optimadamw_hf_wd0_01 BertForSequenceClassification from karolill +author: John Snow Labs +name: norbert_lr5e_05_wr0_optimadamw_hf_wd0_01 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `norbert_lr5e_05_wr0_optimadamw_hf_wd0_01` is an English model originally trained by karolill. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/norbert_lr5e_05_wr0_optimadamw_hf_wd0_01_en_5.1.4_3.4_1698373791974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/norbert_lr5e_05_wr0_optimadamw_hf_wd0_01_en_5.1.4_3.4_1698373791974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("norbert_lr5e_05_wr0_optimadamw_hf_wd0_01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("norbert_lr5e_05_wr0_optimadamw_hf_wd0_01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|norbert_lr5e_05_wr0_optimadamw_hf_wd0_01| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.4 MB| + +## References + +https://huggingface.co/karolill/norbert_LR5e-05_WR0_OPTIMadamw_hf_WD0.01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-qs_classifier_bert_en.md b/docs/_posts/ahmedlone127/2023-10-27-qs_classifier_bert_en.md new file mode 100644 index 000000000000..b772c40806a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-qs_classifier_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English qs_classifier_bert BertForSequenceClassification from asvs +author: John Snow Labs +name: qs_classifier_bert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qs_classifier_bert` is an English model originally trained by asvs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qs_classifier_bert_en_5.1.4_3.4_1698371134901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qs_classifier_bert_en_5.1.4_3.4_1698371134901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("qs_classifier_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("qs_classifier_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qs_classifier_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/asvs/qs-classifier-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_headlines_x_en.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_headlines_x_en.md new file mode 100644 index 000000000000..a98a6f253135 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_headlines_x_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_base_cased_sentence_finetuned_headlines_x BertForSequenceClassification from chrommium +author: John Snow Labs +name: rubert_base_cased_sentence_finetuned_headlines_x +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_cased_sentence_finetuned_headlines_x` is an English model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_headlines_x_en_5.1.4_3.4_1698368576830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_headlines_x_en_5.1.4_3.4_1698368576830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_headlines_x","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_headlines_x","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_sentence_finetuned_headlines_x| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|666.5 MB| + +## References + +https://huggingface.co/chrommium/rubert-base-cased-sentence-finetuned-headlines_X \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_news_sents_en.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_news_sents_en.md new file mode 100644 index 000000000000..301334c1da78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_news_sents_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_base_cased_sentence_finetuned_sent_in_news_sents BertForSequenceClassification from chrommium +author: John Snow Labs +name: rubert_base_cased_sentence_finetuned_sent_in_news_sents +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_cased_sentence_finetuned_sent_in_news_sents` is an English model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_sent_in_news_sents_en_5.1.4_3.4_1698369752189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_sent_in_news_sents_en_5.1.4_3.4_1698369752189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_sent_in_news_sents","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_sent_in_news_sents","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_sentence_finetuned_sent_in_news_sents| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|666.5 MB| + +## References + +https://huggingface.co/chrommium/rubert-base-cased-sentence-finetuned-sent_in_news_sents \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_russian_en.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_russian_en.md new file mode 100644 index 000000000000..ec4a76426d13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_base_cased_sentence_finetuned_sent_in_russian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_base_cased_sentence_finetuned_sent_in_russian BertForSequenceClassification from chrommium +author: John Snow Labs +name: rubert_base_cased_sentence_finetuned_sent_in_russian +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_cased_sentence_finetuned_sent_in_russian` is an English model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_sent_in_russian_en_5.1.4_3.4_1698371037937.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentence_finetuned_sent_in_russian_en_5.1.4_3.4_1698371037937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_sent_in_russian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentence_finetuned_sent_in_russian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_sentence_finetuned_sent_in_russian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|666.5 MB| + +## References + +https://huggingface.co/chrommium/rubert-base-cased-sentence-finetuned-sent_in_ru \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_conversational_sentiment_balanced_ru.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_conversational_sentiment_balanced_ru.md new file mode 100644 index 000000000000..4396bf3c3186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_conversational_sentiment_balanced_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_conversational_sentiment_balanced BertForSequenceClassification from sunny3 +author: John Snow Labs +name: rubert_conversational_sentiment_balanced +date: 2023-10-27 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_conversational_sentiment_balanced` is a Russian model originally trained by sunny3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_conversational_sentiment_balanced_ru_5.1.4_3.4_1698387904691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_conversational_sentiment_balanced_ru_5.1.4_3.4_1698387904691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_conversational_sentiment_balanced","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_conversational_sentiment_balanced","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_conversational_sentiment_balanced| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/sunny3/rubert-conversational-sentiment-balanced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_finetuned_emotion_experiment_en.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_finetuned_emotion_experiment_en.md new file mode 100644 index 000000000000..efa8ea61447e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_finetuned_emotion_experiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_tiny2_finetuned_emotion_experiment BertForSequenceClassification from mmillet +author: John Snow Labs +name: rubert_tiny2_finetuned_emotion_experiment +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_finetuned_emotion_experiment` is an English model originally trained by mmillet.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_emotion_experiment_en_5.1.4_3.4_1698390581874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_emotion_experiment_en_5.1.4_3.4_1698390581874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_finetuned_emotion_experiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_finetuned_emotion_experiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_finetuned_emotion_experiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/mmillet/rubert-tiny2_finetuned_emotion_experiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_insp_en.md b/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_insp_en.md new file mode 100644 index 000000000000..6c839468fa4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-rubert_tiny2_insp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_tiny2_insp BertForSequenceClassification from RegBel +author: John Snow Labs +name: rubert_tiny2_insp +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_insp` is an English model originally trained by RegBel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_insp_en_5.1.4_3.4_1698372062901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_insp_en_5.1.4_3.4_1698372062901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_insp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_insp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_insp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/RegBel/rubert-tiny2-insp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_3lab_en.md b/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_3lab_en.md new file mode 100644 index 000000000000..ccefb805c65a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_3lab_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sbert_large_finetuned_sent_in_news_sents_3lab BertForSequenceClassification from chrommium +author: John Snow Labs +name: sbert_large_finetuned_sent_in_news_sents_3lab +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbert_large_finetuned_sent_in_news_sents_3lab` is an English model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbert_large_finetuned_sent_in_news_sents_3lab_en_5.1.4_3.4_1698374429096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbert_large_finetuned_sent_in_news_sents_3lab_en_5.1.4_3.4_1698374429096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sbert_large_finetuned_sent_in_news_sents_3lab","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sbert_large_finetuned_sent_in_news_sents_3lab","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbert_large_finetuned_sent_in_news_sents_3lab| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/chrommium/sbert_large-finetuned-sent_in_news_sents_3lab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_en.md b/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_en.md new file mode 100644 index 000000000000..e50b527f95fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sbert_large_finetuned_sent_in_news_sents_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sbert_large_finetuned_sent_in_news_sents BertForSequenceClassification from chrommium +author: John Snow Labs +name: sbert_large_finetuned_sent_in_news_sents +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbert_large_finetuned_sent_in_news_sents` is an English model originally trained by chrommium.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbert_large_finetuned_sent_in_news_sents_en_5.1.4_3.4_1698372799251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbert_large_finetuned_sent_in_news_sents_en_5.1.4_3.4_1698372799251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sbert_large_finetuned_sent_in_news_sents","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sbert_large_finetuned_sent_in_news_sents","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbert_large_finetuned_sent_in_news_sents| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/chrommium/sbert_large-finetuned-sent_in_news_sents \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sentence_sentiments_analysis_bert_uholodala_en.md b/docs/_posts/ahmedlone127/2023-10-27-sentence_sentiments_analysis_bert_uholodala_en.md new file mode 100644 index 000000000000..2aff8ec253fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sentence_sentiments_analysis_bert_uholodala_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_sentiments_analysis_bert_uholodala BertForSequenceClassification from UholoDala +author: John Snow Labs +name: sentence_sentiments_analysis_bert_uholodala +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_sentiments_analysis_bert_uholodala` is an English model originally trained by UholoDala.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_sentiments_analysis_bert_uholodala_en_5.1.4_3.4_1698371192545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_sentiments_analysis_bert_uholodala_en_5.1.4_3.4_1698371192545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentence_sentiments_analysis_bert_uholodala","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentence_sentiments_analysis_bert_uholodala","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_sentiments_analysis_bert_uholodala| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/UholoDala/sentence_sentiments_analysis_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_restaurant_10_en.md b/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_restaurant_10_en.md new file mode 100644 index 000000000000..afe8c30049a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_restaurant_10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_bert_restaurant_10 BertForSequenceClassification from pachequinho +author: John Snow Labs +name: sentiment_bert_restaurant_10 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_bert_restaurant_10` is an English model originally trained by pachequinho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_bert_restaurant_10_en_5.1.4_3.4_1698369844717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_bert_restaurant_10_en_5.1.4_3.4_1698369844717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_restaurant_10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_restaurant_10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_bert_restaurant_10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/pachequinho/sentiment_bert_restaurant_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_twitter_airlines_10_en.md b/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_twitter_airlines_10_en.md new file mode 100644 index 000000000000..dabba576d25c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sentiment_bert_twitter_airlines_10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_bert_twitter_airlines_10 BertForSequenceClassification from pachequinho +author: John Snow Labs +name: sentiment_bert_twitter_airlines_10 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_bert_twitter_airlines_10` is an English model originally trained by pachequinho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_bert_twitter_airlines_10_en_5.1.4_3.4_1698368985918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_bert_twitter_airlines_10_en_5.1.4_3.4_1698368985918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_twitter_airlines_10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_bert_twitter_airlines_10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_bert_twitter_airlines_10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/pachequinho/sentiment_bert_twitter_airlines_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-spanish_bert_apoyo_en.md b/docs/_posts/ahmedlone127/2023-10-27-spanish_bert_apoyo_en.md new file mode 100644 index 000000000000..def4dbff9cd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-spanish_bert_apoyo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spanish_bert_apoyo BertForSequenceClassification from dpalominop +author: John Snow Labs +name: spanish_bert_apoyo +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish_bert_apoyo` is an English model originally trained by dpalominop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanish_bert_apoyo_en_5.1.4_3.4_1698392768802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanish_bert_apoyo_en_5.1.4_3.4_1698392768802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("spanish_bert_apoyo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("spanish_bert_apoyo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanish_bert_apoyo| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/dpalominop/spanish-bert-apoyo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e20_en.md b/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e20_en.md new file mode 100644 index 000000000000..76d7c7408877 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e20_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sts_bert_klue_e20 BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: sts_bert_klue_e20 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sts_bert_klue_e20` is an English model originally trained by ys7yoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sts_bert_klue_e20_en_5.1.4_3.4_1698391052220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sts_bert_klue_e20_en_5.1.4_3.4_1698391052220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sts_bert_klue_e20","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sts_bert_klue_e20","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sts_bert_klue_e20| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/sts_bert_klue_e20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e7_en.md b/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e7_en.md new file mode 100644 index 000000000000..87310df51bab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-sts_bert_klue_e7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sts_bert_klue_e7 BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: sts_bert_klue_e7 +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sts_bert_klue_e7` is an English model originally trained by ys7yoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sts_bert_klue_e7_en_5.1.4_3.4_1698398020211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sts_bert_klue_e7_en_5.1.4_3.4_1698398020211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sts_bert_klue_e7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sts_bert_klue_e7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sts_bert_klue_e7| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/sts_bert_klue_e7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-textattack_bert_base_mnli_fixed_en.md b/docs/_posts/ahmedlone127/2023-10-27-textattack_bert_base_mnli_fixed_en.md new file mode 100644 index 000000000000..a4fd764324fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-textattack_bert_base_mnli_fixed_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English textattack_bert_base_mnli_fixed BertForSequenceClassification from chromeNLP +author: John Snow Labs +name: textattack_bert_base_mnli_fixed +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `textattack_bert_base_mnli_fixed` is an English model originally trained by chromeNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/textattack_bert_base_mnli_fixed_en_5.1.4_3.4_1698384932693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/textattack_bert_base_mnli_fixed_en_5.1.4_3.4_1698384932693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("textattack_bert_base_mnli_fixed","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("textattack_bert_base_mnli_fixed","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|textattack_bert_base_mnli_fixed| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/chromeNLP/textattack_bert_base_MNLI_fixed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-two_step_finetuning_sbert_en.md b/docs/_posts/ahmedlone127/2023-10-27-two_step_finetuning_sbert_en.md new file mode 100644 index 000000000000..29fb4843049f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-two_step_finetuning_sbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English two_step_finetuning_sbert BertForSequenceClassification from chrommium +author: John Snow Labs +name: two_step_finetuning_sbert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `two_step_finetuning_sbert` is an English model originally trained by chrommium. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/two_step_finetuning_sbert_en_5.1.4_3.4_1698376047295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/two_step_finetuning_sbert_en_5.1.4_3.4_1698376047295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("two_step_finetuning_sbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("two_step_finetuning_sbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|two_step_finetuning_sbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/chrommium/two-step-finetuning-sbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-unsupervised_comb_fine_tune_bert_exist_en.md b/docs/_posts/ahmedlone127/2023-10-27-unsupervised_comb_fine_tune_bert_exist_en.md new file mode 100644 index 000000000000..6a5b1eaf173f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-unsupervised_comb_fine_tune_bert_exist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English unsupervised_comb_fine_tune_bert_exist BertForSequenceClassification from nouman-10 +author: John Snow Labs +name: unsupervised_comb_fine_tune_bert_exist +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unsupervised_comb_fine_tune_bert_exist` is an English model originally trained by nouman-10. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unsupervised_comb_fine_tune_bert_exist_en_5.1.4_3.4_1698368572633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unsupervised_comb_fine_tune_bert_exist_en_5.1.4_3.4_1698368572633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("unsupervised_comb_fine_tune_bert_exist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("unsupervised_comb_fine_tune_bert_exist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unsupervised_comb_fine_tune_bert_exist| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/nouman-10/unsupervised-comb-fine-tune-bert-exist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_bert_claimpremise_en.md b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_bert_claimpremise_en.md new file mode 100644 index 000000000000..2e76eb98a27e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_bert_claimpremise_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English vira_outputs_class_bert_claimpremise BertForSequenceClassification from spneshaei +author: John Snow Labs +name: vira_outputs_class_bert_claimpremise +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vira_outputs_class_bert_claimpremise` is an English model originally trained by spneshaei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vira_outputs_class_bert_claimpremise_en_5.1.4_3.4_1698386789133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vira_outputs_class_bert_claimpremise_en_5.1.4_3.4_1698386789133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_class_bert_claimpremise","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_class_bert_claimpremise","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vira_outputs_class_bert_claimpremise| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/spneshaei/vira_outputs-class-bert-claimpremise \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_classifier_bert_suggestion_en.md b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_classifier_bert_suggestion_en.md new file mode 100644 index 000000000000..6f8e34f600fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_class_classifier_bert_suggestion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English vira_outputs_class_classifier_bert_suggestion BertForSequenceClassification from spneshaei +author: John Snow Labs +name: vira_outputs_class_classifier_bert_suggestion +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vira_outputs_class_classifier_bert_suggestion` is an English model originally trained by spneshaei.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vira_outputs_class_classifier_bert_suggestion_en_5.1.4_3.4_1698385695032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vira_outputs_class_classifier_bert_suggestion_en_5.1.4_3.4_1698385695032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_class_classifier_bert_suggestion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_class_classifier_bert_suggestion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vira_outputs_class_classifier_bert_suggestion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/spneshaei/vira_outputs_class_classifier_bert_suggestion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_relation_bert_en.md b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_relation_bert_en.md new file mode 100644 index 000000000000..c9db7d45d6ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-vira_outputs_relation_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English vira_outputs_relation_bert BertForSequenceClassification from spneshaei +author: John Snow Labs +name: vira_outputs_relation_bert +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vira_outputs_relation_bert` is an English model originally trained by spneshaei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vira_outputs_relation_bert_en_5.1.4_3.4_1698387515868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vira_outputs_relation_bert_en_5.1.4_3.4_1698387515868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_relation_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("vira_outputs_relation_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vira_outputs_relation_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/spneshaei/vira_outputs-relation-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-27-yenning_der_bert_model_en.md b/docs/_posts/ahmedlone127/2023-10-27-yenning_der_bert_model_en.md new file mode 100644 index 000000000000..4ff2ea938fd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-27-yenning_der_bert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English yenning_der_bert_model BertForSequenceClassification from yyyynnnniiii +author: John Snow Labs +name: yenning_der_bert_model +date: 2023-10-27 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yenning_der_bert_model` is an English model originally trained by yyyynnnniiii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yenning_der_bert_model_en_5.1.4_3.4_1698377743055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yenning_der_bert_model_en_5.1.4_3.4_1698377743055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("yenning_der_bert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("yenning_der_bert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yenning_der_bert_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yyyynnnniiii/yenning-der-bert-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-016_microsoft_minilm_finetuned_yahoo_80_20_en.md b/docs/_posts/ahmedlone127/2023-10-31-016_microsoft_minilm_finetuned_yahoo_80_20_en.md new file mode 100644 index 000000000000..ca4ea11fa22c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-016_microsoft_minilm_finetuned_yahoo_80_20_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 016_microsoft_minilm_finetuned_yahoo_80_20 BertForSequenceClassification from diogopaes10 +author: John Snow Labs +name: 016_microsoft_minilm_finetuned_yahoo_80_20 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `016_microsoft_minilm_finetuned_yahoo_80_20` is an English model originally trained by diogopaes10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/016_microsoft_minilm_finetuned_yahoo_80_20_en_5.1.4_3.4_1698787110491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/016_microsoft_minilm_finetuned_yahoo_80_20_en_5.1.4_3.4_1698787110491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("016_microsoft_minilm_finetuned_yahoo_80_20","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("016_microsoft_minilm_finetuned_yahoo_80_20","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|016_microsoft_minilm_finetuned_yahoo_80_20| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|112.9 MB| + +## References + +https://huggingface.co/diogopaes10/016-microsoft-MiniLM-finetuned-yahoo-80_20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-arabert_medium_algerian_ar.md b/docs/_posts/ahmedlone127/2023-10-31-arabert_medium_algerian_ar.md new file mode 100644 index 000000000000..16174ae0d560 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-arabert_medium_algerian_ar.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Arabic arabert_medium_algerian BertForSequenceClassification from Abdou +author: John Snow Labs +name: arabert_medium_algerian +date: 2023-10-31 +tags: [bert, ar, open_source, sequence_classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_medium_algerian` is an Arabic model originally trained by Abdou. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_medium_algerian_ar_5.1.4_3.4_1698787991508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_medium_algerian_ar_5.1.4_3.4_1698787991508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("arabert_medium_algerian","ar")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("arabert_medium_algerian","ar") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_medium_algerian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|158.2 MB| + +## References + +https://huggingface.co/Abdou/arabert-medium-algerian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-base_model_indukurs_en.md b/docs/_posts/ahmedlone127/2023-10-31-base_model_indukurs_en.md new file mode 100644 index 000000000000..c6a826ddbbd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-base_model_indukurs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English base_model_indukurs BertForSequenceClassification from indukurs +author: John Snow Labs +name: base_model_indukurs +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_model_indukurs` is an English model originally trained by indukurs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_model_indukurs_en_5.1.4_3.4_1698781808639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_model_indukurs_en_5.1.4_3.4_1698781808639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("base_model_indukurs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("base_model_indukurs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_model_indukurs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/indukurs/base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9_en.md new file mode 100644 index 000000000000..9ea6ebf702ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9_en_5.1.4_3.4_1698785609152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9_en_5.1.4_3.4_1698785609152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wallstreetcn_morning_news_market_overview_sse50_v9| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-wallstreetcn-morning-news-market-overview-SSE50-v9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_alerts04142023_rsplit_3000_category1_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_alerts04142023_rsplit_3000_category1_en.md new file mode 100644 index 000000000000..8bed62b2f367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_alerts04142023_rsplit_3000_category1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_alerts04142023_rsplit_3000_category1 BertForSequenceClassification from slewis +author: John Snow Labs +name: bert_base_uncased_alerts04142023_rsplit_3000_category1 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_alerts04142023_rsplit_3000_category1` is an English model originally trained by slewis. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_alerts04142023_rsplit_3000_category1_en_5.1.4_3.4_1698788259931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_alerts04142023_rsplit_3000_category1_en_5.1.4_3.4_1698788259931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_alerts04142023_rsplit_3000_category1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_alerts04142023_rsplit_3000_category1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_alerts04142023_rsplit_3000_category1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/slewis/bert-base-uncased_alerts04142023_rsplit_3000_Category1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_glue_qnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_glue_qnli_en.md new file mode 100644 index 000000000000..ec307d9a07eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_base_uncased_glue_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_glue_qnli BertForSequenceClassification from corybalza +author: John Snow Labs +name: bert_base_uncased_glue_qnli +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_glue_qnli` is an English model originally trained by corybalza. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_qnli_en_5.1.4_3.4_1698783400543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_glue_qnli_en_5.1.4_3.4_1698783400543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_glue_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_glue_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/corybalza/bert-base-uncased-glue-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_2ch_text_classification_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_2ch_text_classification_en.md new file mode 100644 index 000000000000..aa4053148641 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_2ch_text_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from BraveOni) +author: John Snow Labs +name: bert_classifier_2ch_text_classification +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `2ch-text-classification` is an English model originally trained by `BraveOni`. + +## Predicted Entities + +`1.0`, `0.0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_2ch_text_classification_en_5.1.4_3.4_1698789047335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_2ch_text_classification_en_5.1.4_3.4_1698789047335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_2ch_text_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_2ch_text_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_braveoni").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_2ch_text_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BraveOni/2ch-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_amazon_review_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_amazon_review_sentiment_analysis_en.md new file mode 100644 index 000000000000..c02191f26f52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_amazon_review_sentiment_analysis_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from LiYuan) +author: John Snow Labs +name: bert_classifier_amazon_review_sentiment_analysis +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon-review-sentiment-analysis` is an English model originally trained by `LiYuan`. 
+ +## Predicted Entities + +`3 stars`, `4 stars`, `2 stars`, `5 stars`, `1 star` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_amazon_review_sentiment_analysis_en_5.1.4_3.4_1698789384771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_amazon_review_sentiment_analysis_en_5.1.4_3.4_1698789384771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_amazon_review_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_amazon_review_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.amazon_sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_amazon_review_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/LiYuan/amazon-review-sentiment-analysis +- https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset +- https://github.com/vanderbilt-data-science/bigdata/blob/main/06-fine-tune-BERT-on-our-dataset.ipynb +- https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_dialect_identification_city_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_dialect_identification_city_ar.md new file mode 100644 index 000000000000..ba1a152668e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_dialect_identification_city_ar.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from Ammar-alhaj-ali) +author: John Snow Labs +name: bert_classifier_arabic_marbert_dialect_identification_city +date: 2023-10-31 +tags: [ar, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic-MARBERT-dialect-identification-city` is an Arabic model originally trained by `Ammar-alhaj-ali`. 
+ +## Predicted Entities + +`Riyadh`, `Fes`, `Doha`, `Beirut`, `Baghdad`, `Basra`, `Benghazi`, `Tunis`, `Algiers`, `Alexandria`, `Rabat`, `Khartoum`, `Aleppo`, `Tripoli`, `Jerusalem`, `Mosul`, `MSA`, `Jeddah`, `Aswan`, `Amman`, `Muscat`, `Salt`, `Damascus`, `Cairo`, `Sanaa`, `Sfax` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_marbert_dialect_identification_city_ar_5.1.4_3.4_1698784441567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_marbert_dialect_identification_city_ar_5.1.4_3.4_1698784441567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_marbert_dialect_identification_city","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_marbert_dialect_identification_city","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.classify.bert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_arabic_marbert_dialect_identification_city| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|610.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Ammar-alhaj-ali/arabic-MARBERT-dialect-identification-city +- https://camel.abudhabi.nyu.edu/madar-shared-task-2019/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_sentiment_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_sentiment_ar.md new file mode 100644 index 000000000000..a237d5d3a9a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_marbert_sentiment_ar.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from Ammar-alhaj-ali) +author: John Snow Labs +name: bert_classifier_arabic_marbert_sentiment +date: 2023-10-31 +tags: [ar, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic-MARBERT-sentiment` is an Arabic model originally trained by `Ammar-alhaj-ali`. 
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_marbert_sentiment_ar_5.1.4_3.4_1698782428340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_marbert_sentiment_ar_5.1.4_3.4_1698782428340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_marbert_sentiment","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_marbert_sentiment","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.classify.bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_arabic_marbert_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|610.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Ammar-alhaj-ali/arabic-MARBERT-sentiment +- https://www.kaggle.com/competitions/arabic-sentiment-analysis-2021-kaust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_ner_ace_xx.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_ner_ace_xx.md new file mode 100644 index 000000000000..b962978956d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_ner_ace_xx.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from ychenNLP) +author: John Snow Labs +name: bert_classifier_arabic_ner_ace +date: 2023-10-31 +tags: [en, ar, open_source, bert, sequence_classification, classification, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic-ner-ace` is a Multilingual model originally trained by `ychenNLP`. 
+ +## Predicted Entities + +`VEH`, `ORG`, `PER`, `WEA`, `FAC`, `LOC`, `GPE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_ner_ace_xx_5.1.4_3.4_1698784811281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_ner_ace_xx_5.1.4_3.4_1698784811281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_ner_ace","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_ner_ace","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert").predict("""PUT YOUR STRING HERE""") +``` +
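
The Download and S3 links in these cards end in archive names that appear to follow a `<model name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip` pattern. That pattern is inferred from the URLs shown in these cards and is an assumption, not a documented contract. Under that assumption, splitting a link into its parts could be sketched as below; `ModelArchive` and `parse_model_archive` are hypothetical names used only for this illustration.

```python
import re
from typing import NamedTuple


class ModelArchive(NamedTuple):
    """Fields read from an archive file name (hypothetical type)."""
    name: str
    language: str
    sparknlp_version: str
    spark_version: str
    timestamp: int


# Assumed layout: <name>_<lang>_<x.y.z>_<x.y>_<digits>.zip
_ARCHIVE_NAME = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d+)\.zip$"
)


def parse_model_archive(url: str) -> ModelArchive:
    """Split the last path segment of a download URL into its parts."""
    filename = url.rsplit("/", 1)[-1]
    match = _ARCHIVE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised archive name: {filename!r}")
    return ModelArchive(
        name=match.group("name"),
        language=match.group("lang"),
        sparknlp_version=match.group("nlp"),
        spark_version=match.group("spark"),
        timestamp=int(match.group("ts")),
    )


archive = parse_model_archive(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "bert_classifier_arabic_ner_ace_xx_5.1.4_3.4_1698784811281.zip"
)
print(archive)
```

The parsed name and language are the two arguments the `pretrained(...)` calls above take, which is a quick way to cross-check a link against its card.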
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_arabic_ner_ace| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|466.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ychenNLP/arabic-ner-ace +- https://github.com/edchengg/GigaBERT +- https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-entities-guidelines-v6.6.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_3_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_3_ar.md new file mode 100644 index 000000000000..dccb8a3d830f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_3_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from Yah216) +author: John Snow Labs +name: bert_classifier_arabic_poem_meter_3 +date: 2023-10-31 +tags: [ar, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Arabic_poem_meter_3` is an Arabic model originally trained by `Yah216`. 
+ +## Predicted Entities + +`المنسرح`, `السلسلة`, `المضارع`, `موشح`, `البسيط`, `السريع`, `الرمل`, `المجتث`, `المتدارك`, `الطويل`, `المتقارب`, `الخفيف`, `عامي`, `المواليا`, `الهزج`, `الكامل`, `الوافر`, `شعر التفعيلة`, `شعر حر`, `المقتضب`, `الدوبيت`, `المديد`, `الرجز` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_poem_meter_3_ar_5.1.4_3.4_1698781235165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_poem_meter_3_ar_5.1.4_3.4_1698781235165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_poem_meter_3","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_poem_meter_3","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_arabic_poem_meter_3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|414.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Yah216/Arabic_poem_meter_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_classification_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_classification_ar.md new file mode 100644 index 000000000000..2502f5516af2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_arabic_poem_meter_classification_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from Yah216) +author: John Snow Labs +name: bert_classifier_arabic_poem_meter_classification +date: 2023-10-31 +tags: [ar, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Arabic_poem_meter_classification` is an Arabic model originally trained by `Yah216`. 
+ +## Predicted Entities + +`المنسرح`, `السلسلة`, `المضارع`, `موشح`, `البسيط`, `السريع`, `الرمل`, `المجتث`, `المتدارك`, `الطويل`, `المتقارب`, `الخفيف`, `عامي`, `المواليا`, `الهزج`, `الكامل`, `الوافر`, `شعر التفعيلة`, `شعر حر`, `المقتضب`, `الدوبيت`, `المديد`, `الرجز` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_poem_meter_classification_ar_5.1.4_3.4_1698782755190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_arabic_poem_meter_classification_ar_5.1.4_3.4_1698782755190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_poem_meter_classification","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_arabic_poem_meter_classification","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_arabic_poem_meter_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|506.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Yah216/Arabic_poem_meter_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_auditor_sentiment_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_auditor_sentiment_finetuned_en.md new file mode 100644 index 000000000000..386903916276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_auditor_sentiment_finetuned_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from FinanceInc) +author: John Snow Labs +name: bert_classifier_auditor_sentiment_finetuned +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `auditor_sentiment_finetuned` is an English model originally trained by `FinanceInc`. 
+ +## Predicted Entities + +`Positive`, `Neutral`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_auditor_sentiment_finetuned_en_5.1.4_3.4_1698785097339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_auditor_sentiment_finetuned_en_5.1.4_3.4_1698785097339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_auditor_sentiment_finetuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_auditor_sentiment_finetuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
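
This card lists its labels capitalised (`Positive`, `Neutral`, `Negative`), while the Arabic MARBERT sentiment card in this batch uses lowercase (`negative`, `positive`, `neutral`). When results from several such models are combined, normalising the labels first avoids case mismatches. A minimal sketch, assuming only those three label names; `sentiment_score` and its -1/0/+1 mapping are hypothetical choices made here for illustration, not something the model or Spark NLP defines.

```python
# Hypothetical mapping for illustration: the numeric values are an assumption,
# not something defined by the model or by Spark NLP.
_SENTIMENT_SCORES = {"negative": -1, "neutral": 0, "positive": 1}


def sentiment_score(label: str) -> int:
    """Map a sentiment label to -1, 0 or 1, ignoring case and outer spaces."""
    key = label.strip().lower()
    if key not in _SENTIMENT_SCORES:
        raise ValueError(f"unexpected label: {label!r}")
    return _SENTIMENT_SCORES[key]


# Mixed-case labels as they appear across the sentiment cards.
scores = [sentiment_score(label) for label in ["Positive", "Neutral", "negative"]]
print(scores)
```

Raising on unknown labels, rather than defaulting to neutral, keeps a mislabelled or unrelated model from silently skewing an aggregate.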
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_auditor_sentiment_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/FinanceInc/auditor_sentiment_finetuned +- https://paperswithcode.com/sota?task=Text+Classification&dataset=FinanceInc%2Fauditor_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_abbb_622117836_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_abbb_622117836_zh.md new file mode 100644 index 000000000000..2ac77cb37ec4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_abbb_622117836_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from kyleinincubated) +author: John Snow Labs +name: bert_classifier_autonlp_abbb_622117836 +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-abbb-622117836` is a Chinese model originally trained by `kyleinincubated`. 
+ +## Predicted Entities + +`传媒`, `计算机应用`, `化学制药`, `电力`, `汽车租赁`, `拍卖`, `电影`, `生态保护`, `水泥`, `园区开发`, `广播电视`, `港口`, `典当`, `电信`, `种植业`, `畜牧业辅助性活动`, `房地产租赁`, `保障性住房开发`, `燃气`, `小额贷款公司服务`, `保险代理`, `资产管理`, `包装印刷`, `专业公共卫生`, `监护设备`, `文化艺术产业`, `物流`, `保险`, `燃气设备`, `银行`, `商业养老金`, `融资性担保`, `水务`, `中药材种植`, `农产品初加工`, `铁路运输`, `新能源发电`, `中药生产`, `物流配送`, `商业健康保险`, `农产品加工`, `房屋建设`, `医疗人工智能`, `物联网`, `高等教育`, `环保工程`, `珠宝首饰`, `学前教育`, `高速公路服务区`, `娱乐`, `煤炭`, `住宿`, `商品住房开发`, `疾病预防`, `高速公路建设`, `集成电路`, `医疗服务`, `职业技能培训`, `医疗器械`, `贸易`, `燃气供应`, `葡萄酒`, `航空货运`, `证券`, `机场`, `污水处理`, `临床检验`, `中药`, `农村资金互助社服务`, `人身保险`, `锂`, `交通运输`, `水电`, `公用事业`, `纺织服装制造`, `网络安全监测`, `畜禽粪污处理`, `乘用车`, `畜禽养殖`, `航空运输`, `水利工程`, `铁路建设`, `物业管理`, `种子生产`, `保险经纪`, `塑料`, `广播`, `房地产`, `通信设备`, `文化`, `集成电路设计`, `再保险`, `工业园区开发`, `外贸`, `水产品`, `互联网安全服务`, `橡胶`, `互联网服务`, `证券交易`, `贷款公司`, `建筑材料`, `物业服务`, `兽药产品`, `健康体检`, `教育`, `钢铁`, `民宿`, `家具`, `基金`, `基层医疗卫生`, `餐饮`, `普通高中教育`, `建筑业`, `房屋建筑`, `卫星`, `疫苗`, `图书馆`, `航运`, `林业`, `特殊教育`, `体育场馆建筑`, `铝`, `渔业`, `农业`, `公交`, `信托`, `信息技术`, `中等职业学校教育`, `终端设备`, `森林防火`, `证券期货监管服务`, `小额贷款公司`, `互联网平台`, `供水`, `博物馆`, `通信`, `外卖`, `期货`, `公募基金`, `电源设备`, `铁路货物运输`, `房地产中介`, `电视`, `水产养殖`, `旅游综合`, `有色金属`, `传感器设计`, `建筑设计`, `内贸`, `技能培训`, `科技园区开发`, `物联网技术`, `饲料`, `体育`, `医院`, `高速公路`, `超市`, `运行维护服务` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_abbb_622117836_zh_5.1.4_3.4_1698789641652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_abbb_622117836_zh_5.1.4_3.4_1698789641652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_abbb_622117836","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_abbb_622117836","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_abbb_622117836| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kyleinincubated/autonlp-abbb-622117836 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_antisemitism_2_21194454_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_antisemitism_2_21194454_en.md new file mode 100644 index 000000000000..5e20d329ffc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_antisemitism_2_21194454_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from astarostap) +author: John Snow Labs +name: bert_classifier_autonlp_antisemitism_2_21194454 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-antisemitism-2-21194454` is an English model originally trained by `astarostap`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_antisemitism_2_21194454_en_5.1.4_3.4_1698789895155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_antisemitism_2_21194454_en_5.1.4_3.4_1698789895155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_antisemitism_2_21194454","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_antisemitism_2_21194454","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_astarostap").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_antisemitism_2_21194454| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/astarostap/autonlp-antisemitism-2-21194454 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bank_transaction_classification_5521155_it.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bank_transaction_classification_5521155_it.md new file mode 100644 index 000000000000..fb9d5f958416 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bank_transaction_classification_5521155_it.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Italian BertForSequenceClassification Cased model (from mgrella) +author: John Snow Labs +name: bert_classifier_autonlp_bank_transaction_classification_5521155 +date: 2023-10-31 +tags: [it, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-bank-transaction-classification-5521155` is an Italian model originally trained by `mgrella`. 
+ +## Predicted Entities + +`Category.PROFITS_PROFITS`, `Category.TRAVELS_TRANSPORTATION_TOLLS`, `Category.HEALTH_WELLNESS_WELLNESS_RELAX`, `Category.TRAVELS_TRANSPORTATION_HOTELS`, `Category.TAXES_SERVICES_PROFIT_DEDUCTION`, `Category.SHOPPING_OTHER`, `Category.HOUSING_FAMILY_VETERINARY`, `Category.WAGES_PROFESSIONAL_COMPENSATION`, `Category.TRAVELS_TRANSPORTATION_PARKING_URBAN_TRANSPORTS`, `Category.SHOPPING_HTECH`, `Category.EATING_OUT_OTHER`, `Category.TRAVELS_TRANSPORTATION_OTHER`, `Category.LEISURE_BOOKS`, `Category.LEISURE_CINEMA`, `Category.TAXES_SERVICES_BANK_FEES`, `Category.TAXES_SERVICES_DEFAULT_PAYMENTS`, `Category.TAXES_SERVICES_PROFESSIONAL_ACTIVITY`, `Category.SHOPPING_SPORT_ARTICLES`, `Category.HOUSING_FAMILY_OTHER`, `Category.BILLS_SUBSCRIPTIONS_OTHER`, `Category.MORTGAGES_LOANS_MORTGAGES`, `Category.TRAVELS_TRANSPORTATION_TRAVELS_HOLIDAYS`, `Category.LEISURE_SPORT_EVENTS`, `Category.HEALTH_WELLNESS_MEDICAL_EXPENSES`, `Category.BILLS_SUBSCRIPTIONS_BILLS`, `Category.HEALTH_WELLNESS_AID_EXPENSES`, `Category.TRAVELS_TRANSPORTATION_TAXIS`, `Category.TAXES_SERVICES_MONEY_ORDERS`, `Category.WAGES_PENSION`, `Category.HOUSING_FAMILY_GROCERIES`, `Category.CREDIT_CARDS_CREDIT_CARDS`, `Category.BILLS_SUBSCRIPTIONS_INTERNET_PHONE`, `Category.TRANSFERS_RENT_INCOMES`, `Category.TRAVELS_TRANSPORTATION_FUEL`, `Category.HOUSING_FAMILY_CHILDHOOD`, `Category.OTHER_CASH`, `Category.SHOPPING_ACCESSORIZE`, `Category.TRAVELS_TRANSPORTATION_BUSES`, `Category.EATING_OUT_COFFEE_SHOPS`, `Category.EATING_OUT_TAKEAWAY_RESTAURANTS`, `Category.WAGES_SALARY`, `Category.HEALTH_WELLNESS_DRUGS`, `Category.TRANSFERS_BANK_TRANSFERS`, `Category.HOUSING_FAMILY_RENTS`, `Category.TRAVELS_TRANSPORTATION_VEHICLE_MAINTENANCE`, `Category.HOUSING_FAMILY_APPLIANCES`, `Category.HOUSING_FAMILY_FURNITURE`, `Category.LEISURE_MAGAZINES_NEWSPAPERS`, `Category.BILLS_SUBSCRIPTIONS_SUBSCRIPTIONS`, `Category.HOUSING_FAMILY_MAINTENANCE_RENOVATION`, `Category.HOUSING_FAMILY_SERVANTS`, 
`Category.TRANSFERS_GIFTS_DONATIONS`, `Category.TRANSFERS_INVESTMENTS`, `Category.LEISURE_GAMBLING`, `Category.LEISURE_OTHER`, `Category.TRANSFERS_REFUNDS`, `Category.EATING_OUT_RESTAURANTS`, `Category.TRAVELS_TRANSPORTATION_FLIGHTS`, `Category.OTHER_OTHER`, `Category.LEISURE_CLUASSOCIATIONS`, `Category.MORTGAGES_LOANS_LOANS`, `Category.TRAVELS_TRANSPORTATION_TRAINS`, `Category.HEALTH_WELLNESS_OTHER`, `Category.TRANSFERS_SAVINGS`, `Category.TAXES_SERVICES_TAXES`, `Category.LEISURE_VIDEOGAMES`, `Category.TAXES_SERVICES_OTHER`, `Category.HEALTH_WELLNESS_GYMS`, `Category.OTHER_CHECKS`, `Category.TRANSFERS_OTHER`, `Category.SHOPPING_CLOTHING`, `Category.LEISURE_MOVIES_MUSICS`, `Category.TRAVELS_TRANSPORTATION_CAR_RENTAL`, `Category.LEISURE_THEATERS_CONCERTS`, `Category.SHOPPING_FOOTWEAR`, `Category.HOUSING_FAMILY_INSURANCES` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bank_transaction_classification_5521155_it_5.1.4_3.4_1698785362376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bank_transaction_classification_5521155_it_5.1.4_3.4_1698785362376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bank_transaction_classification_5521155","it") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bank_transaction_classification_5521155","it") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("it.classify.bert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_bank_transaction_classification_5521155| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mgrella/autonlp-bank-transaction-classification-5521155 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bbc_news_classification_37229289_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bbc_news_classification_37229289_en.md new file mode 100644 index 000000000000..7a14c76801d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bbc_news_classification_37229289_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: bert_classifier_autonlp_bbc_news_classification_37229289 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-bbc-news-classification-37229289` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`sport`, `business`, `tech`, `politics`, `entertainment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bbc_news_classification_37229289_en_5.1.4_3.4_1698785897130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bbc_news_classification_37229289_en_5.1.4_3.4_1698785897130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bbc_news_classification_37229289","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bbc_news_classification_37229289","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.news.by_abhishek").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_bbc_news_classification_37229289| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-bbc-news-classification-37229289 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bp_29016523_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bp_29016523_en.md new file mode 100644 index 000000000000..3e7f5be3badf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_bp_29016523_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bush) +author: John Snow Labs +name: bert_classifier_autonlp_bp_29016523 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-bp-29016523` is an English model originally trained by `bush`. 
+ +## Predicted Entities + +`greeting`, `information`, `question`, `command`, `other` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bp_29016523_en_5.1.4_3.4_1698781771855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_bp_29016523_en_5.1.4_3.4_1698781771855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bp_29016523","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_bp_29016523","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_bush").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_bp_29016523| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bush/autonlp-bp-29016523 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cat33_624317932_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cat33_624317932_zh.md new file mode 100644 index 000000000000..fd00802d65b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cat33_624317932_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from kyleinincubated) +author: John Snow Labs +name: bert_classifier_autonlp_cat33_624317932 +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-cat33-624317932` is a Chinese model originally trained by `kyleinincubated`. 
+ +## Predicted Entities + +`渔业`, `采矿业`, `公用事业`, `交通运输`, `农业`, `电子制造`, `休闲服务`, `文化`, `机械装备制造`, `商业贸易`, `畜牧业`, `林业`, `轻工制造`, `教育`, `国防军工`, `食品饮料`, `化工制造`, `非银金融`, `房地产`, `传媒`, `通信`, `家用电器`, `汽车制造`, `信息技术`, `有色金属`, `互联网服务`, `银行`, `纺织服装制造`, `医药生物`, `钢铁`, `建筑业`, `电气设备` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cat33_624317932_zh_5.1.4_3.4_1698782027226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cat33_624317932_zh_5.1.4_3.4_1698782027226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cat33_624317932","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cat33_624317932","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_cat33_624317932| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kyleinincubated/autonlp-cat33-624317932 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_9522090_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_9522090_en.md new file mode 100644 index 000000000000..6d78ec57a374 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_9522090_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bshlgrs) +author: John Snow Labs +name: bert_classifier_autonlp_classification_9522090 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-classification-9522090` is an English model originally trained by `bshlgrs`. 
+ +## Predicted Entities + +`No`, `Yes`, `Unsure` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_classification_9522090_en_5.1.4_3.4_1698790163777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_classification_9522090_en_5.1.4_3.4_1698790163777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_classification_9522090","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_classification_9522090","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v1.by_bshlgrs").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_classification_9522090| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bshlgrs/autonlp-classification-9522090 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_with_all_labellers_9532137_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_with_all_labellers_9532137_en.md new file mode 100644 index 000000000000..35064d2d3762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_classification_with_all_labellers_9532137_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bshlgrs) +author: John Snow Labs +name: bert_classifier_autonlp_classification_with_all_labellers_9532137 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-classification_with_all_labellers-9532137` is an English model originally trained by `bshlgrs`. 
+ +## Predicted Entities + +`No`, `Yes`, `Unsure` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_classification_with_all_labellers_9532137_en_5.1.4_3.4_1698783032480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_classification_with_all_labellers_9532137_en_5.1.4_3.4_1698783032480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_classification_with_all_labellers_9532137","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_classification_with_all_labellers_9532137","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v2.by_bshlgrs").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_classification_with_all_labellers_9532137| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bshlgrs/autonlp-classification_with_all_labellers-9532137 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cola_gram_208681_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cola_gram_208681_en.md new file mode 100644 index 000000000000..37a3b11c8d5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cola_gram_208681_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from kamivao) +author: John Snow Labs +name: bert_classifier_autonlp_cola_gram_208681 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-cola_gram-208681` is an English model originally trained by `kamivao`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cola_gram_208681_en_5.1.4_3.4_1698783344836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cola_gram_208681_en_5.1.4_3.4_1698783344836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cola_gram_208681","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cola_gram_208681","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cola1.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_cola_gram_208681| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kamivao/autonlp-cola_gram-208681 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417500_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417500_en.md new file mode 100644 index 000000000000..31bb064ece23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417500_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from billfrench) +author: John Snow Labs +name: bert_classifier_autonlp_cyberlandr_ai_4_614417500 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-cyberlandr-ai-4-614417500` is an English model originally trained by `billfrench`. 
+ +## Predicted Entities + +`close door`, `opaque windows`, `open door`, `clear windows` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cyberlandr_ai_4_614417500_en_5.1.4_3.4_1698782548435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cyberlandr_ai_4_614417500_en_5.1.4_3.4_1698782548435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cyberlandr_ai_4_614417500","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cyberlandr_ai_4_614417500","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.cyberlandr_ai.bert.v1.by_billfrench").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_cyberlandr_ai_4_614417500| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/billfrench/autonlp-cyberlandr-ai-4-614417500 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417501_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417501_en.md new file mode 100644 index 000000000000..285b51b433bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_cyberlandr_ai_4_614417501_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from billfrench) +author: John Snow Labs +name: bert_classifier_autonlp_cyberlandr_ai_4_614417501 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-cyberlandr-ai-4-614417501` is an English model originally trained by `billfrench`. 
+ +## Predicted Entities + +`close door`, `opaque windows`, `open door`, `clear windows` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cyberlandr_ai_4_614417501_en_5.1.4_3.4_1698790676711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_cyberlandr_ai_4_614417501_en_5.1.4_3.4_1698790676711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cyberlandr_ai_4_614417501","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_cyberlandr_ai_4_614417501","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.cyberlandr_ai.bert.v2.by_billfrench").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_cyberlandr_ai_4_614417501| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/billfrench/autonlp-cyberlandr-ai-4-614417501 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595545_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595545_de.md new file mode 100644 index 000000000000..1e5be25cad7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595545_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German bert_classifier_autonlp_doctor_german_24595545 BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_classifier_autonlp_doctor_german_24595545 +date: 2023-10-31 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_autonlp_doctor_german_24595545` is a German model originally trained by muhtasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_doctor_german_24595545_de_5.1.4_3.4_1698783573146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_doctor_german_24595545_de_5.1.4_3.4_1698783573146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_doctor_german_24595545","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_doctor_german_24595545","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_doctor_german_24595545| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.1 MB| + +## References + +https://huggingface.co/muhtasham/autonlp-Doctor_DE-24595545 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595546_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595546_de.md new file mode 100644 index 000000000000..cb8191aad2d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_doctor_german_24595546_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German bert_classifier_autonlp_doctor_german_24595546 BertForSequenceClassification from muhtasham +author: John Snow Labs +name: bert_classifier_autonlp_doctor_german_24595546 +date: 2023-10-31 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_autonlp_doctor_german_24595546` is a German model originally trained by muhtasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_doctor_german_24595546_de_5.1.4_3.4_1698782774671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_doctor_german_24595546_de_5.1.4_3.4_1698782774671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_doctor_german_24595546","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_doctor_german_24595546","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_doctor_german_24595546| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.1 MB| + +## References + +https://huggingface.co/muhtasham/autonlp-Doctor_DE-24595546 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_email_classification_657119381_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_email_classification_657119381_en.md new file mode 100644 index 000000000000..b3bb6ae6374b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_email_classification_657119381_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from esiebomajeremiah) +author: John Snow Labs +name: bert_classifier_autonlp_email_classification_657119381 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-email-classification-657119381` is an English model originally trained by `esiebomajeremiah`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_email_classification_657119381_en_5.1.4_3.4_1698791250777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_email_classification_657119381_en_5.1.4_3.4_1698791250777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_email_classification_657119381","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_email_classification_657119381","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_esiebomajeremiah").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_email_classification_657119381| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/esiebomajeremiah/autonlp-email-classification-657119381 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_imdb_test_21134442_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_imdb_test_21134442_en.md new file mode 100644 index 000000000000..526e2309145b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_imdb_test_21134442_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from mmcquade11) +author: John Snow Labs +name: bert_classifier_autonlp_imdb_test_21134442 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-test-21134442` is an English model originally trained by `mmcquade11`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_imdb_test_21134442_en_5.1.4_3.4_1698791879686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_imdb_test_21134442_en_5.1.4_3.4_1698791879686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_imdb_test_21134442","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_imdb_test_21134442","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.imdb.by_mmcquade11").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_imdb_test_21134442| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mmcquade11/autonlp-imdb-test-21134442 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_kaggledays_625717986_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_kaggledays_625717986_en.md new file mode 100644 index 000000000000..05eb11b48f05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_kaggledays_625717986_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Someshfengde) +author: John Snow Labs +name: bert_classifier_autonlp_kaggledays_625717986 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-kaggledays-625717986` is an English model originally trained by `Someshfengde`. 
+ +## Predicted Entities + +`unbiased`, `disagreement`, `association` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_kaggledays_625717986_en_5.1.4_3.4_1698783040647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_kaggledays_625717986_en_5.1.4_3.4_1698783040647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_kaggledays_625717986","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_kaggledays_625717986","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_someshfengde").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_kaggledays_625717986| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Someshfengde/autonlp-kaggledays-625717986 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_sentiment_detection_1781580_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_sentiment_detection_1781580_en.md new file mode 100644 index 000000000000..6f8226742683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_sentiment_detection_1781580_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from severo) +author: John Snow Labs +name: bert_classifier_autonlp_sentiment_detection_1781580 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-sentiment_detection-1781580` is an English model originally trained by `severo`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_sentiment_detection_1781580_en_5.1.4_3.4_1698792185896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_sentiment_detection_1781580_en_5.1.4_3.4_1698792185896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_sentiment_detection_1781580","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_sentiment_detection_1781580","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_sentiment_detection_1781580| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/severo/autonlp-sentiment_detection-1781580 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101779_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101779_en.md new file mode 100644 index 000000000000..38cb93457600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101779_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from clem) +author: John Snow Labs +name: bert_classifier_autonlp_test3_2101779 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-test3-2101779` is an English model originally trained by `clem`. 
+ +## Predicted Entities + +`not_urgent`, `urgent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_test3_2101779_en_5.1.4_3.4_1698786205588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_test3_2101779_en_5.1.4_3.4_1698786205588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_test3_2101779","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_test3_2101779","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v1.by_clem").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_test3_2101779| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/clem/autonlp-test3-2101779 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101782_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101782_en.md new file mode 100644 index 000000000000..6a856d80e7a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_test3_2101782_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from clem) +author: John Snow Labs +name: bert_classifier_autonlp_test3_2101782 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-test3-2101782` is an English model originally trained by `clem`. 
+ +## Predicted Entities + +`not_urgent`, `urgent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_test3_2101782_en_5.1.4_3.4_1698784128589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_test3_2101782_en_5.1.4_3.4_1698784128589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_test3_2101782","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_test3_2101782","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v2.by_clem").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_test3_2101782| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/clem/autonlp-test3-2101782 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_user_review_classification_536415182_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_user_review_classification_536415182_en.md new file mode 100644 index 000000000000..5723a462b9e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autonlp_user_review_classification_536415182_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from alperiox) +author: John Snow Labs +name: bert_classifier_autonlp_user_review_classification_536415182 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-user-review-classification-536415182` is an English model originally trained by `alperiox`. 
+ +## Predicted Entities + +`CONTENT`, `SUBSCRIPTION`, `INTERFACE`, `USER_EXPERIENCE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_user_review_classification_536415182_en_5.1.4_3.4_1698792435167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autonlp_user_review_classification_536415182_en_5.1.4_3.4_1698792435167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_user_review_classification_536415182","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autonlp_user_review_classification_536415182","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_alperiox").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autonlp_user_review_classification_536415182| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/alperiox/autonlp-user-review-classification-536415182 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apm2_1212245840_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apm2_1212245840_en.md new file mode 100644 index 000000000000..c074bb8e1a7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apm2_1212245840_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from BenWord) +author: John Snow Labs +name: bert_classifier_autotrain_apm2_1212245840 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-APM2-1212245840` is an English model originally trained by `BenWord`. 
+ +## Predicted Entities + +` MY_APPLICATIONS`, ` ALL_OTHER_QUERIES` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_apm2_1212245840_en_5.1.4_3.4_1698792972040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_apm2_1212245840_en_5.1.4_3.4_1698792972040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_apm2_1212245840","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_apm2_1212245840","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_benword").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_apm2_1212245840| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BenWord/autotrain-APM2-1212245840 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apmv2multiclass_1216046004_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apmv2multiclass_1216046004_en.md new file mode 100644 index 000000000000..03ec90865d34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_apmv2multiclass_1216046004_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from BenWord) +author: John Snow Labs +name: bert_classifier_autotrain_apmv2multiclass_1216046004 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-APMv2Multiclass-1216046004` is an English model originally trained by `BenWord`.
+ +## Predicted Entities + +` MY_APPLICATIONS`, ` ALL_OTHER_QUERIES` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_apmv2multiclass_1216046004_en_5.1.4_3.4_1698793533577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_apmv2multiclass_1216046004_en_5.1.4_3.4_1698793533577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_apmv2multiclass_1216046004","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_apmv2multiclass_1216046004","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v2_2m").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_apmv2multiclass_1216046004| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BenWord/autotrain-APMv2Multiclass-1216046004 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_argument_feedback_1154042511_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_argument_feedback_1154042511_en.md new file mode 100644 index 000000000000..107b728134e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_argument_feedback_1154042511_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from snap) +author: John Snow Labs +name: bert_classifier_autotrain_argument_feedback_1154042511 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-argument-feedback-1154042511` is an English model originally trained by `snap`.
+ +## Predicted Entities + +`Adequate`, `Effective`, `Ineffective` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_argument_feedback_1154042511_en_5.1.4_3.4_1698786686391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_argument_feedback_1154042511_en_5.1.4_3.4_1698786686391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_argument_feedback_1154042511","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_argument_feedback_1154042511","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_snap").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_argument_feedback_1154042511| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/snap/autotrain-argument-feedback-1154042511 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248996_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248996_en.md new file mode 100644 index 000000000000..11aead046e2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248996_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_autotrain_base_tweeteval_1281248996 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-BERTBase-TweetEval-1281248996` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248996_en_5.1.4_3.4_1698784425861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248996_en_5.1.4_3.4_1698784425861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248996","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248996","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tweet.base_128d").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_base_tweeteval_1281248996| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-BERTBase-TweetEval-1281248996 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248997_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248997_en.md new file mode 100644 index 000000000000..16ab0afeef2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248997_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_autotrain_base_tweeteval_1281248997 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-BERTBase-TweetEval-1281248997` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248997_en_5.1.4_3.4_1698786965696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248997_en_5.1.4_3.4_1698786965696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248997","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248997","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tweet.base_128d_v1.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_base_tweeteval_1281248997| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-BERTBase-TweetEval-1281248997 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248998_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248998_en.md new file mode 100644 index 000000000000..d838b67dcf27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248998_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_autotrain_base_tweeteval_1281248998 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-BERTBase-TweetEval-1281248998` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248998_en_5.1.4_3.4_1698793799127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248998_en_5.1.4_3.4_1698793799127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248998","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248998","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tweet.base_128d_v2.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_base_tweeteval_1281248998| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-BERTBase-TweetEval-1281248998 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248999_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248999_en.md new file mode 100644 index 000000000000..055ecbff8bc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281248999_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_autotrain_base_tweeteval_1281248999 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-BERTBase-TweetEval-1281248999` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248999_en_5.1.4_3.4_1698784809709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281248999_en_5.1.4_3.4_1698784809709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248999","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281248999","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tweet.base_128d_v3.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_base_tweeteval_1281248999| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-BERTBase-TweetEval-1281248999 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281249000_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281249000_en.md new file mode 100644 index 000000000000..af9d2f68d237 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_base_tweeteval_1281249000_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_autotrain_base_tweeteval_1281249000 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-BERTBase-TweetEval-1281249000` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281249000_en_5.1.4_3.4_1698785111164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_base_tweeteval_1281249000_en_5.1.4_3.4_1698785111164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281249000","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_base_tweeteval_1281249000","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tweet.base_128d_v4.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_base_tweeteval_1281249000| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-BERTBase-TweetEval-1281249000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chat_bot_responses_949231426_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chat_bot_responses_949231426_en.md new file mode 100644 index 000000000000..bd9a8b31ed03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chat_bot_responses_949231426_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from nitishkumargundapu793) +author: John Snow Labs +name: bert_classifier_autotrain_chat_bot_responses_949231426 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-chat-bot-responses-949231426` is an English model originally trained by `nitishkumargundapu793`.
+ +## Predicted Entities + +`goodbye`, `greeting`, `thanks`, `help` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_chat_bot_responses_949231426_en_5.1.4_3.4_1698783773749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_chat_bot_responses_949231426_en_5.1.4_3.4_1698783773749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_chat_bot_responses_949231426","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_chat_bot_responses_949231426","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_nitishkumargundapu793").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_chat_bot_responses_949231426| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nitishkumargundapu793/autotrain-chat-bot-responses-949231426 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chemprot_re_838426740_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chemprot_re_838426740_en.md new file mode 100644 index 000000000000..e6d5c12a63fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_chemprot_re_838426740_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from pier297) +author: John Snow Labs +name: bert_classifier_autotrain_chemprot_re_838426740 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-chemprot-re-838426740` is an English model originally trained by `pier297`.
+ +## Predicted Entities + +`AGONIST`, `INHIBITOR`, `INDIRECT-DOWNREGULATOR`, `AGONIST-ACTIVATOR`, `ACTIVATOR`, `DOWNREGULATOR`, `AGONIST-INHIBITOR`, `UPREGULATOR`, `SUBSTRATE`, `ANTAGONIST`, `PRODUCT-OF`, `INDIRECT-UPREGULATOR`, `SUBSTRATE_PRODUCT-OF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_chemprot_re_838426740_en_5.1.4_3.4_1698784341710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_chemprot_re_838426740_en_5.1.4_3.4_1698784341710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_chemprot_re_838426740","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_chemprot_re_838426740","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.chemical.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_chemprot_re_838426740| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pier297/autotrain-chemprot-re-838426740 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_dontknowwhatimdoing_980432459_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_dontknowwhatimdoing_980432459_en.md new file mode 100644 index 000000000000..f8f8de2b8e9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_dontknowwhatimdoing_980432459_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Jerimee) +author: John Snow Labs +name: bert_classifier_autotrain_dontknowwhatimdoing_980432459 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-dontknowwhatImdoing-980432459` is an English model originally trained by `Jerimee`.
+ +## Predicted Entities + +`Goblin`, `Mundane` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_dontknowwhatimdoing_980432459_en_5.1.4_3.4_1698784923417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_dontknowwhatimdoing_980432459_en_5.1.4_3.4_1698784923417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_dontknowwhatimdoing_980432459","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_dontknowwhatimdoing_980432459","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_jerimee").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_dontknowwhatimdoing_980432459| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jerimee/autotrain-dontknowwhatImdoing-980432459 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_financial_sentiment_765323474_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_financial_sentiment_765323474_en.md new file mode 100644 index 000000000000..d24aaf54e9cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_financial_sentiment_765323474_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ktangri) +author: John Snow Labs +name: bert_classifier_autotrain_financial_sentiment_765323474 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-financial-sentiment-765323474` is an English model originally trained by `ktangri`.
+ +## Predicted Entities + +`negative`, `neutral`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_financial_sentiment_765323474_en_5.1.4_3.4_1698787234109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_financial_sentiment_765323474_en_5.1.4_3.4_1698787234109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_financial_sentiment_765323474","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_financial_sentiment_765323474","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_ktangri").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_financial_sentiment_765323474| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ktangri/autotrain-financial-sentiment-765323474 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_gluefinetunedmodel_1013533786_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_gluefinetunedmodel_1013533786_en.md new file mode 100644 index 000000000000..34fc89d9e87a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_gluefinetunedmodel_1013533786_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from deepesh0x) +author: John Snow Labs +name: bert_classifier_autotrain_gluefinetunedmodel_1013533786 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-GlueFineTunedModel-1013533786` is an English model originally trained by `deepesh0x`.
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_gluefinetunedmodel_1013533786_en_5.1.4_3.4_1698787489622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_gluefinetunedmodel_1013533786_en_5.1.4_3.4_1698787489622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_gluefinetunedmodel_1013533786","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_gluefinetunedmodel_1013533786","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.finetuned.by_deepesh0x").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_gluefinetunedmodel_1013533786| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/deepesh0x/autotrain-GlueFineTunedModel-1013533786 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_imdbtestmodel_9215210_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_imdbtestmodel_9215210_en.md new file mode 100644 index 000000000000..a24ee8391e27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_imdbtestmodel_9215210_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: bert_classifier_autotrain_imdbtestmodel_9215210 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-imdbtestmodel-9215210` is an English model originally trained by `abhishek`.
+ +## Predicted Entities + +`neg`, `pos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_imdbtestmodel_9215210_en_5.1.4_3.4_1698794031933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_imdbtestmodel_9215210_en_5.1.4_3.4_1698794031933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_imdbtestmodel_9215210","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_imdbtestmodel_9215210","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.imdb.v2.by_abhishek").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_imdbtestmodel_9215210| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autotrain-imdbtestmodel-9215210 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_keywordextraction_882328335_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_keywordextraction_882328335_en.md new file mode 100644 index 000000000000..a787c3977fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_keywordextraction_882328335_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from priyamm) +author: John Snow Labs +name: bert_classifier_autotrain_keywordextraction_882328335 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-KeywordExtraction-882328335` is an English model originally trained by `priyamm`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_keywordextraction_882328335_en_5.1.4_3.4_1698787827650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_keywordextraction_882328335_en_5.1.4_3.4_1698787827650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_keywordextraction_882328335","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_autotrain_keywordextraction_882328335","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_priyamm").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_keywordextraction_882328335| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/priyamm/autotrain-KeywordExtraction-882328335 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_maysix_828926405_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_maysix_828926405_zh.md new file mode 100644 index 000000000000..372b2a86f91c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_maysix_828926405_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from EAST) +author: John Snow Labs +name: bert_classifier_autotrain_maysix_828926405 +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-maysix-828926405` is a Chinese model originally trained by `EAST`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_maysix_828926405_zh_5.1.4_3.4_1698785191770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_maysix_828926405_zh_5.1.4_3.4_1698785191770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_maysix_828926405","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_maysix_828926405","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.maysix.by_east").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_maysix_828926405| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/EAST/autotrain-maysix-828926405 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_rule_793324440_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_rule_793324440_zh.md new file mode 100644 index 000000000000..865d375706a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_autotrain_rule_793324440_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from EAST) +author: John Snow Labs +name: bert_classifier_autotrain_rule_793324440 +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-Rule-793324440` is a Chinese model originally trained by `EAST`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_rule_793324440_zh_5.1.4_3.4_1698785442313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_autotrain_rule_793324440_zh_5.1.4_3.4_1698785442313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_rule_793324440","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_autotrain_rule_793324440","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.rule.by_east").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_autotrain_rule_793324440| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/EAST/autotrain-Rule-793324440 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_gewerke_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_gewerke_de.md new file mode 100644 index 000000000000..27a21e5f00e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_gewerke_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Cased model (from cm-mueller) +author: John Snow Labs +name: bert_classifier_bacnet_klassifizierung_gewerke +date: 2023-10-31 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BACnet-Klassifizierung-Gewerke` is a German model originally trained by `cm-mueller`. 
+ +## Predicted Entities + +`Starkstromanlagen`, `Gebäudeautomation`, `Andere_Anlagen`, `Lufttechnische_Anlagen`, `Abwasser-Wasser-Gasnlagen`, `Kälteanlagen`, `Wärmeversorgungsanlagen` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_gewerke_de_5.1.4_3.4_1698788191261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_gewerke_de_5.1.4_3.4_1698788191261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_gewerke","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_gewerke","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.gewerke.bert.by_cm_mueller").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bacnet_klassifizierung_gewerke| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cm-mueller/BACnet-Klassifizierung-Gewerke \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_kaeltettechnik_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_kaeltettechnik_de.md new file mode 100644 index 000000000000..45fc60d5e911 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_kaeltettechnik_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Cased model (from cm-mueller) +author: John Snow Labs +name: bert_classifier_bacnet_klassifizierung_kaeltettechnik +date: 2023-10-31 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BACnet-Klassifizierung-Kaeltettechnik` is a German model originally trained by `cm-mueller`. 
+ +## Predicted Entities + +`Kältemaschine`, `Kälteanlage_Allgemein`, `Kältespeicher`, `Freie_Kühlung`, `Rückkühlwerk` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_kaeltettechnik_de_5.1.4_3.4_1698788487027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_kaeltettechnik_de_5.1.4_3.4_1698788487027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_kaeltettechnik","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_kaeltettechnik","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.kaeltettechnik.bert.by_cm_mueller").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bacnet_klassifizierung_kaeltettechnik| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cm-mueller/BACnet-Klassifizierung-Kaeltettechnik \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_sanitaertechnik_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_sanitaertechnik_de.md new file mode 100644 index 000000000000..0d454a278c58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bacnet_klassifizierung_sanitaertechnik_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Cased model (from cm-mueller) +author: John Snow Labs +name: bert_classifier_bacnet_klassifizierung_sanitaertechnik +date: 2023-10-31 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BACnet-Klassifizierung-Sanitaertechnik` is a German model originally trained by `cm-mueller`. 
+ +## Predicted Entities + +`Enthärtungsanlage`, `Druckerhöhungsanlage`, `Trinkwassererwärmungsanlage`, `Wasserzähler`, `Schmutzwasser`, `Andere`, `Hebeanlage`, `Sanitär_Allgemein` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_sanitaertechnik_de_5.1.4_3.4_1698785718393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bacnet_klassifizierung_sanitaertechnik_de_5.1.4_3.4_1698785718393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_sanitaertechnik","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bacnet_klassifizierung_sanitaertechnik","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.sanitaertechnik.bert.by_cm_mueller").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bacnet_klassifizierung_sanitaertechnik| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cm-mueller/BACnet-Klassifizierung-Sanitaertechnik \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf_ar.md new file mode 100644 index 000000000000..17a1902583af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Base Cased model (from Abdelrahman-Rezk) +author: John Snow Labs +name: bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, ar, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-arabic-camelbert-mix-poetry-finetuned-qawaf` is an Arabic model originally trained by `Abdelrahman-Rezk`. 
+ +## Predicted Entities + +`المضارع`, `المتقارب`, `المقتضب`, `الهزج`, `السلسلة`, `المجتث`, `الطويل`, `عامي`, `الرمل`, `الرجز`, `الوافر`, `المتدارك`, `المواليا`, `الدوبيت`, `الخفيف`, `شعر التفعيلة`, `المديد`, `السريع`, `المنسرح`, `شعر حر`, `البسيط`, `الكامل`, `موشح` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf_ar_5.1.4_3.4_1698794306775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf_ar_5.1.4_3.4_1698794306775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_arabic_camel_mix_poetry_finetuned_qawaf| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|408.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Abdelrahman-Rezk/bert-base-arabic-camelbert-mix-poetry-finetuned-qawaf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_clickbait_news_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_clickbait_news_en.md new file mode 100644 index 000000000000..fda9442c1156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_clickbait_news_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from elozano) +author: John Snow Labs +name: bert_classifier_base_cased_clickbait_news +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-clickbait-news` is an English model originally trained by `elozano`. 
+ +## Predicted Entities + +`Clickbait`, `Normal` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_clickbait_news_en_5.1.4_3.4_1698788733659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_clickbait_news_en_5.1.4_3.4_1698788733659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_clickbait_news","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_clickbait_news","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.news.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_cased_clickbait_news| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/elozano/bert-base-cased-clickbait-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_finetuned_sst2_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_finetuned_sst2_en.md new file mode 100644 index 000000000000..e46070a1fccb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_finetuned_sst2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from mfuntowicz) +author: John Snow Labs +name: bert_classifier_base_cased_finetuned_sst2 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-sst2` is an English model originally trained by `mfuntowicz`. 
+ +## Predicted Entities + +`NEGATIVE`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_finetuned_sst2_en_5.1.4_3.4_1698785999465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_finetuned_sst2_en_5.1.4_3.4_1698785999465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_finetuned_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_finetuned_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_cased_finetuned_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mfuntowicz/bert-base-cased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_news_category_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_news_category_en.md new file mode 100644 index 000000000000..dc167274a121 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_news_category_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from elozano) +author: John Snow Labs +name: bert_classifier_base_cased_news_category +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-news-category` is an English model originally trained by `elozano`. 
+ +## Predicted Entities + +`World`, `Politics`, `Science`, `Technology`, `Automobile`, `Sports`, `Entertainment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_news_category_en_5.1.4_3.4_1698789003433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_news_category_en_5.1.4_3.4_1698789003433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_news_category","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_news_category","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.news.cased_base.by_elozano").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_cased_news_category| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/elozano/bert-base-cased-news-category \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_tamil_mix_sentiment_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_tamil_mix_sentiment_en.md new file mode 100644 index 000000000000..842144e8c726 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_cased_tamil_mix_sentiment_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from vishnun) +author: John Snow Labs +name: bert_classifier_base_cased_tamil_mix_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-tamil-mix-sentiment` is an English model originally trained by `vishnun`. 
+ +## Predicted Entities + +`Positive`, `Mixed_feelings`, `unknown_state`, `not-Tamil`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_tamil_mix_sentiment_en_5.1.4_3.4_1698789242640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_cased_tamil_mix_sentiment_en_5.1.4_3.4_1698789242640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_tamil_mix_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_cased_tamil_mix_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_cased_tamil_mix_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/vishnun/bert-base-cased-tamil-mix-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_dutch_cased_finetuned_sentiment_nl.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_dutch_cased_finetuned_sentiment_nl.md new file mode 100644 index 000000000000..06fa39324c7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_dutch_cased_finetuned_sentiment_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_classifier_base_dutch_cased_finetuned_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-dutch-cased-finetuned-sentiment` is a Dutch model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_dutch_cased_finetuned_sentiment_nl_5.1.4_3.4_1698789529850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_dutch_cased_finetuned_sentiment_nl_5.1.4_3.4_1698789529850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_dutch_cased_finetuned_sentiment","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_dutch_cased_finetuned_sentiment","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.sentiment.cased_base_finetuned").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_dutch_cased_finetuned_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|408.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_finance_sentiment_noisy_search_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_finance_sentiment_noisy_search_en.md new file mode 100644 index 000000000000..0a3dee2a5bd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_finance_sentiment_noisy_search_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from oferweintraub) +author: John Snow Labs +name: bert_classifier_base_finance_sentiment_noisy_search +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-finance-sentiment-noisy-search` is an English model originally trained by `oferweintraub`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_finance_sentiment_noisy_search_en_5.1.4_3.4_1698789843150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_finance_sentiment_noisy_search_en_5.1.4_3.4_1698789843150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_finance_sentiment_noisy_search","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_finance_sentiment_noisy_search","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_finance_sentiment_noisy_search| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/oferweintraub/bert-base-finance-sentiment-noisy-search +- https://www.kaggle.com/ankurzing/sentiment-analysis-for-financial-news +- https://drive.google.com/file/d/1MI9gRdppactVZ_XvhCwvoaOV1aRfprrd/view?usp=sharing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_swedish_cased_sentiment_sv.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_swedish_cased_sentiment_sv.md new file mode 100644 index 000000000000..9ae9e2c71c1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_swedish_cased_sentiment_sv.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Swedish BertForSequenceClassification Base Cased model (from marma) +author: John Snow Labs +name: bert_classifier_base_swedish_cased_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, sv, onnx] +task: Text Classification +language: sv +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-sentiment` is a Swedish model originally trained by `marma`. 
+ +## Predicted Entities + +`NEGATIVE`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_swedish_cased_sentiment_sv_5.1.4_3.4_1698790144186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_swedish_cased_sentiment_sv_5.1.4_3.4_1698790144186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_swedish_cased_sentiment","sv") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_swedish_cased_sentiment","sv") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.classify.bert.sentiment.cased_base").predict("""Jag älskar Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_swedish_cased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|sv| +|Size:|467.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/marma/bert-base-swedish-cased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_clinc_en.md new file mode 100644 index 000000000000..4b063a8e12c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_clinc_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from transformersbook) +author: John Snow Labs +name: bert_classifier_base_uncased_finetuned_clinc +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-clinc` is an English model originally trained by `transformersbook`.
+ +## Predicted Entities + +`timezone`, `are_you_a_bot`, `improve_credit_score`, `taxes`, `no`, `todo_list_update`, `schedule_maintenance`, `fun_fact`, `make_call`, `insurance`, `payday`, `vaccines`, `routing`, `order_status`, `pto_request`, `where_are_you_from`, `do_you_have_pets`, `redeem_rewards`, `calendar_update`, `directions`, `smart_home`, `calculator`, `international_fees`, `mpg`, `credit_limit`, `goodbye`, `interest_rate`, `car_rental`, `calories`, `change_volume`, `change_language`, `next_song`, `weather`, `next_holiday`, `meaning_of_life`, `oos`, `spending_history`, `shopping_list_update`, `cancel`, `traffic`, `oil_change_how`, `reset_settings`, `ingredients_list`, `travel_notification`, `pto_used`, `international_visa`, `uber`, `date`, `carry_on`, `definition`, `report_lost_card`, `exchange_rate`, `last_maintenance`, `confirm_reservation`, `card_declined`, `what_is_your_name`, `plug_type`, `tell_joke`, `user_name`, `reminder`, `restaurant_reviews`, `account_blocked`, `recipe`, `damaged_card`, `time`, `alarm`, `cook_time`, `roll_dice`, `text`, `book_flight`, `rollover_401k`, `find_phone`, `replacement_card_duration`, `greeting`, `travel_suggestion`, `lost_luggage`, `order`, `ingredient_substitution`, `what_song`, `bill_balance`, `food_last`, `order_checks`, `measurement_conversion`, `shopping_list`, `nutrition_info`, `current_location`, `timer`, `yes`, `reminder_update`, `flip_coin`, `thank_you`, `min_payment`, `meal_suggestion`, `spelling`, `translate`, `who_made_you`, `balance`, `new_card`, `credit_limit_change`, `how_busy`, `oil_change_when`, `sync_device`, `restaurant_reservation`, `flight_status`, `change_ai_name`, `direct_deposit`, `travel_alert`, `w2`, `tire_pressure`, `change_user_name`, `calendar`, `pay_bill`, `who_do_you_work_for`, `repeat`, `restaurant_suggestion`, `cancel_reservation`, `distance`, `pto_request_status`, `income`, `how_old_are_you`, `report_fraud`, `transfer`, `bill_due`, `what_are_your_hobbies`, `accept_reservations`, 
`credit_score`, `change_speed`, `whisper_mode`, `book_hotel`, `pin_change`, `transactions`, `gas`, `meeting_schedule`, `gas_type`, `expiration_date`, `play_music`, `update_playlist`, `freeze_account`, `change_accent`, `jump_start`, `application_status`, `share_location`, `insurance_change`, `tire_change`, `rewards_balance`, `what_can_i_ask_you`, `pto_balance`, `apr`, `schedule_meeting`, `todo_list`, `maybe` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_clinc_en_5.1.4_3.4_1698794551606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_clinc_en_5.1.4_3.4_1698794551606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base_finetuned.by_transformersbook").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_finetuned_clinc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.8 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/transformersbook/bert-base-uncased-finetuned-clinc +- https://arxiv.org/abs/1909.02027 +- https://learning.oreilly.com/library/view/natural-language-processing/9781098103231/ +- https://github.com/nlp-with-transformers/notebooks/blob/main/08_model-compression.ipynb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_glue_cola_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_glue_cola_en.md new file mode 100644 index 000000000000..54488a338610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_finetuned_glue_cola_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from geckos) +author: John Snow Labs +name: bert_classifier_base_uncased_finetuned_glue_cola +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-glue-cola` is an English model originally trained by `geckos`.
+ +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_glue_cola_en_5.1.4_3.4_1698794802215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_finetuned_glue_cola_en_5.1.4_3.4_1698794802215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_glue_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_finetuned_glue_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue_cola1.uncased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_finetuned_glue_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/geckos/bert-base-uncased-finetuned-glue-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_offenseval2019_unbalanced_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_offenseval2019_unbalanced_en.md new file mode 100644 index 000000000000..1ae923afe548 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_offenseval2019_unbalanced_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_classifier_base_uncased_offenseval2019_unbalanced +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-offenseval2019-unbalanced` is an English model originally trained by `mohsenfayyaz`.
+ +## Predicted Entities + +`NOT`, `OFF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_offenseval2019_unbalanced_en_5.1.4_3.4_1698795022071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_offenseval2019_unbalanced_en_5.1.4_3.4_1698795022071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_offenseval2019_unbalanced","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_offenseval2019_unbalanced","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.offense..bert.uncased_base.by_mohsenfayyaz").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_offenseval2019_unbalanced| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-uncased-offenseval2019-unbalanced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4_en.md new file mode 100644 index 000000000000..51f4f8031506 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from Splend1dchan) +author: John Snow Labs +name: bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-slue-goldtrascription-e3-lr1e-4` is an English model originally trained by `Splend1dchan`.
+ +## Predicted Entities + +`Positive`, `Neutral`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4_en_5.1.4_3.4_1698795301581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4_en_5.1.4_3.4_1698795301581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base.by_splend1dchan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_slue_goldtrascription_e3_lr1e_4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Splend1dchan/bert-base-uncased-slue-goldtrascription-e3-lr1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_toxicity_a_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_toxicity_a_en.md new file mode 100644 index 000000000000..c5c083879a0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_base_uncased_toxicity_a_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_classifier_base_uncased_toxicity_a +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-toxicity-a` is an English model originally trained by `mohsenfayyaz`.
+ +## Predicted Entities + +`Toxic`, `Non-Toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_toxicity_a_en_5.1.4_3.4_1698785430220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_base_uncased_toxicity_a_en_5.1.4_3.4_1698785430220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_toxicity_a","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_base_uncased_toxicity_a","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.toxicity.bert.uncased_base.by_mohsenfayyaz").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_base_uncased_toxicity_a| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-uncased-toxicity-a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_cased_finetuned_wnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_cased_finetuned_wnli_en.md new file mode 100644 index 000000000000..6b0519fa2917 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_cased_finetuned_wnli_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_classifier_bert_base_cased_finetuned_wnli +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-wnli` is an English model originally trained by `gchhablani`.
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_finetuned_wnli_en_5.1.4_3.4_1698786254756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_cased_finetuned_wnli_en_5.1.4_3.4_1698786254756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_finetuned_wnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_cased_finetuned_wnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.wnli_glue.cased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_cased_finetuned_wnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+WNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_multilingual_uncased_sentiment_xx.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_multilingual_uncased_sentiment_xx.md new file mode 100644 index 000000000000..a64ebe8c04b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_base_multilingual_uncased_sentiment_xx.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Base Uncased model (from nlptown) +author: John Snow Labs +name: bert_classifier_bert_base_multilingual_uncased_sentiment +date: 2023-10-31 +tags: [en, de, fr, es, it, nl, open_source, bert, sequence_classification, classification, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-sentiment` is a Multilingual model originally trained by `nlptown`. 
+ +## Predicted Entities + +`5 stars`, `2 stars`, `1 star`, `4 stars`, `3 stars` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_multilingual_uncased_sentiment_xx_5.1.4_3.4_1698790471173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_base_multilingual_uncased_sentiment_xx_5.1.4_3.4_1698790471173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_multilingual_uncased_sentiment","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_base_multilingual_uncased_sentiment","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.sentiment.uncased_multilingual_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_base_multilingual_uncased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment +- https://rapidapi.com/nlp-town-nlp-town-default/api/multilingual-sentiment-analysis2/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_finetuned_mrpc_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_finetuned_mrpc_en.md new file mode 100644 index 000000000000..57f7f1c2b159 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_finetuned_mrpc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sgugger) +author: John Snow Labs +name: bert_classifier_bert_finetuned_mrpc +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mrpc` is an English model originally trained by `sgugger`.
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_finetuned_mrpc_en_5.1.4_3.4_1698781750764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_finetuned_mrpc_en_5.1.4_3.4_1698781750764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_finetuned_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_finetuned_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_finetuned_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgugger/bert-finetuned-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_imdb_1hidden_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_imdb_1hidden_en.md new file mode 100644 index 000000000000..dac45c30911a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_imdb_1hidden_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from lannelin) +author: John Snow Labs +name: bert_classifier_bert_imdb_1hidden +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-imdb-1hidden` is an English model originally trained by `lannelin`.
+ +## Predicted Entities + +`neg`, `pos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_imdb_1hidden_en_5.1.4_3.4_1698785825441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_imdb_1hidden_en_5.1.4_3.4_1698785825441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_imdb_1hidden","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_imdb_1hidden","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.imdb.1h").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_imdb_1hidden| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|117.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lannelin/bert-imdb-1hidden \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_mnli_en.md new file mode 100644 index 000000000000..8e190bb32cf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_mnli_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from TehranNLP-org) +author: John Snow Labs +name: bert_classifier_bert_large_mnli +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-mnli` is an English model originally trained by `TehranNLP-org`. 
+ +## Predicted Entities + +`neutral`, `contradiction`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_large_mnli_en_5.1.4_3.4_1698786297894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_large_mnli_en_5.1.4_3.4_1698786297894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_large_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_large_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_large_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/TehranNLP-org/bert-large-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_sst2_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_sst2_en.md new file mode 100644 index 000000000000..72cfe501f7f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_large_sst2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from TehranNLP-org) +author: John Snow Labs +name: bert_classifier_bert_large_sst2 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-sst2` is an English model originally trained by `TehranNLP-org`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_large_sst2_en_5.1.4_3.4_1698796033930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_large_sst2_en_5.1.4_3.4_1698796033930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_large_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_large_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.large.by_tehrannlp_org").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_large_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/TehranNLP-org/bert-large-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_farstail_fa.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_farstail_fa.md new file mode 100644 index 000000000000..9eac57e6652d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_farstail_fa.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Persian bert_classifier_bert_persian_farsi_base_uncased_farstail BertForSequenceClassification from m3hrdadfi +author: John Snow Labs +name: bert_classifier_bert_persian_farsi_base_uncased_farstail +date: 2023-10-31 +tags: [bert, fa, open_source, sequence_classification, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_bert_persian_farsi_base_uncased_farstail` is a Persian model originally trained by m3hrdadfi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_farstail_fa_5.1.4_3.4_1698781217102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_farstail_fa_5.1.4_3.4_1698781217102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_farstail","fa")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_farstail","fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_persian_farsi_base_uncased_farstail| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/m3hrdadfi/bert-fa-base-uncased-farstail \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli_fa.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli_fa.md new file mode 100644 index 000000000000..9be6f2bcbf69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli_fa.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Persian bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli BertForSequenceClassification from demoversion +author: John Snow Labs +name: bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli +date: 2023-10-31 +tags: [bert, fa, open_source, sequence_classification, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli` is a Persian model originally trained by demoversion. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli_fa_5.1.4_3.4_1698785654984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli_fa_5.1.4_3.4_1698785654984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli","fa")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli","fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_persian_farsi_base_uncased_haddad_wikinli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/demoversion/bert-fa-base-uncased-haddad-wikinli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_wikinli_fa.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_wikinli_fa.md new file mode 100644 index 000000000000..ecccd7d40c86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_persian_farsi_base_uncased_wikinli_fa.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Persian bert_classifier_bert_persian_farsi_base_uncased_wikinli BertForSequenceClassification from m3hrdadfi +author: John Snow Labs +name: bert_classifier_bert_persian_farsi_base_uncased_wikinli +date: 2023-10-31 +tags: [bert, fa, open_source, sequence_classification, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_bert_persian_farsi_base_uncased_wikinli` is a Persian model originally trained by m3hrdadfi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_wikinli_fa_5.1.4_3.4_1698781485654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_persian_farsi_base_uncased_wikinli_fa_5.1.4_3.4_1698781485654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_wikinli","fa")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bert_persian_farsi_base_uncased_wikinli","fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_persian_farsi_base_uncased_wikinli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.7 MB| + +## References + +https://huggingface.co/m3hrdadfi/bert-fa-base-uncased-wikinli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_sentiment_analysis_en.md new file mode 100644 index 000000000000..d9efb0ce7467 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Souvikcmsa) +author: John Snow Labs +name: bert_classifier_bert_sentiment_analysis +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERT_sentiment_analysis` is an English model originally trained by `Souvikcmsa`. 
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_sentiment_analysis_en_5.1.4_3.4_1698786475978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_sentiment_analysis_en_5.1.4_3.4_1698786475978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_souvikcmsa").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Souvikcmsa/BERT_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_swahili_news_classification_sw.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_swahili_news_classification_sw.md new file mode 100644 index 000000000000..51ae05dcb723 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bert_swahili_news_classification_sw.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Swahili BertForSequenceClassification Cased model (from flax-community) +author: John Snow Labs +name: bert_classifier_bert_swahili_news_classification +date: 2023-10-31 +tags: [sw, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: sw +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-swahili-news-classification` is a Swahili model originally trained by `flax-community`. 
+ +## Predicted Entities + +`kimataifa`, `burudani`, `kitaifa`, `afya`, `uchumi`, `michezo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_swahili_news_classification_sw_5.1.4_3.4_1698786765069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bert_swahili_news_classification_sw_5.1.4_3.4_1698786765069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_swahili_news_classification","sw") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_bert_swahili_news_classification","sw") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sw.classify.bert.news.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bert_swahili_news_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|sw| +|Size:|410.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/flax-community/bert-swahili-news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_berticelli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_berticelli_en.md new file mode 100644 index 000000000000..97e6565ba888 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_berticelli_en.md @@ -0,0 +1,111 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from patrickquick) +author: John Snow Labs +name: bert_classifier_berticelli +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERTicelli` is an English model originally trained by `patrickquick`. 
+ +## Predicted Entities + +`NOT`, `OFF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_berticelli_en_5.1.4_3.4_1698796282312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_berticelli_en_5.1.4_3.4_1698796282312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_berticelli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_berticelli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_patrickquick").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_berticelli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/patrickquick/BERTicelli +- https://github.com/MonaDT +- https://github.com/corvusMidnight +- https://github.com/patrickquick +- https://github.com/google-research/bert +- https://scholar.harvard.edu/malmasi/olid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_4d_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_4d_en.md new file mode 100644 index 000000000000..918c8a5f8112 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_4d_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ismaelardo) +author: John Snow Labs +name: bert_classifier_beto_4d +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BETO_4d` is an English model originally trained by `ismaelardo`. 
+ +## Predicted Entities + +`9411`, `2433`, `8322`, `1323`, `5414`, `9412`, `2413`, `3343`, `1212`, `2522`, `9621`, `4321`, `2242`, `4225`, `7212`, `3331`, `5249`, `8344`, `2351`, `2431`, `3411`, `2411`, `1330`, `3322`, `1345`, `7127`, `8332`, `5223`, `5242`, `9333`, `2221`, `3511`, `4416`, `2141`, `3251`, `2161`, `4226`, `3344`, `5230`, `1324`, `3111`, `1219`, `3311`, `3257`, `2423`, `3512`, `2519`, `4323`, `9112`, `2143`, `2310`, `3321`, `5244`, `2635`, `4110`, `2421`, `7412`, `3118`, `5222`, `8343`, `1221`, `3122`, `2521`, `3115`, `2330`, `2529`, `3313`, `1211`, `3112`, `3611`, `2341`, `3113`, `2243`, `2513`, `8321`, `2342`, `3323`, `2145`, `2151`, `7233`, `2512`, `4214`, `3221`, `2424`, `2166`, `4222`, `3432`, `2642`, `2144`, `1412`, `2511`, `5120`, `9334`, `7231`, `4211`, `9321`, `2142`, `3142`, `2634`, `3312`, `3114`, `4311`, `1420`, `3334` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_4d_en_5.1.4_3.4_1698790912100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_4d_en_5.1.4_3.4_1698790912100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beto_4d","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beto_4d","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.beto_bert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_beto_4d| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ismaelardo/BETO_4d \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_headlines_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_headlines_sentiment_analysis_en.md new file mode 100644 index 000000000000..3925355200d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_headlines_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from finiteautomata) +author: John Snow Labs +name: bert_classifier_beto_headlines_sentiment_analysis +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-headlines-sentiment-analysis` is an English model originally trained by `finiteautomata`. 
+ +## Predicted Entities + +`POS`, `NEG`, `NEU` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_headlines_sentiment_analysis_en_5.1.4_3.4_1698791212476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_headlines_sentiment_analysis_en_5.1.4_3.4_1698791212476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beto_headlines_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_beto_headlines_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.beto_bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_beto_headlines_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/finiteautomata/beto-headlines-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_sentiment_analysis_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_sentiment_analysis_es.md new file mode 100644 index 000000000000..2d8a9fc6041c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_beto_sentiment_analysis_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from finiteautomata) +author: John Snow Labs +name: bert_classifier_beto_sentiment_analysis +date: 2023-10-31 +tags: [es, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-sentiment-analysis` is a Spanish model originally trained by `finiteautomata`. 
+ +## Predicted Entities + +`POS`, `NEG`, `NEU` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_sentiment_analysis_es_5.1.4_3.4_1698782030527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_beto_sentiment_analysis_es_5.1.4_3.4_1698782030527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_beto_sentiment_analysis","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_beto_sentiment_analysis","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.beto_bert.sentiment.by_finiteautomata").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_beto_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.7 MB| + +## References + +References + +- https://huggingface.co/finiteautomata/beto-sentiment-analysis +- https://github.com/pysentimiento/pysentimiento/ +- https://github.com/dccuchile/beto +- http://tass.sepln.org/tass_data/download.php +- https://arxiv.org/abs/2106.09462 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_binary_classification_arabic_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_binary_classification_arabic_ar.md new file mode 100644 index 000000000000..a4bdf54eb1bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_binary_classification_arabic_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from M47Labs) +author: John Snow Labs +name: bert_classifier_binary_classification_arabic +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, ar, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_classification_arabic` is an Arabic model originally trained by `M47Labs`. 
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_binary_classification_arabic_ar_5.1.4_3.4_1698787292055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_binary_classification_arabic_ar_5.1.4_3.4_1698787292055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_binary_classification_arabic","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_binary_classification_arabic","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_binary_classification_arabic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|414.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/M47Labs/binary_classification_arabic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bio_pubmed200krct_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bio_pubmed200krct_en.md new file mode 100644 index 000000000000..73596406f08f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bio_pubmed200krct_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from pritamdeka) +author: John Snow Labs +name: bert_classifier_bio_pubmed200krct +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BioBert-PubMed200kRCT` is an English model originally trained by `pritamdeka`. 
+ +## Predicted Entities + +`METHODS`, `BACKGROUND`, `RESULTS`, `OBJECTIVE`, `CONCLUSIONS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bio_pubmed200krct_en_5.1.4_3.4_1698787527001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bio_pubmed200krct_en_5.1.4_3.4_1698787527001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bio_pubmed200krct","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bio_pubmed200krct","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.bio_pubmed.by_pritamdeka").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bio_pubmed200krct| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pritamdeka/BioBert-PubMed200kRCT +- https://github.com/Franck-Dernoncourt/pubmed-rct/tree/master/PubMed_200k_RCT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_mnli_en.md new file mode 100644 index 000000000000..1233a31a1367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_mnli_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bioformers) +author: John Snow Labs +name: bert_classifier_bioformer_cased_v1.0_mnli +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer-cased-v1.0-mnli` is an English model originally trained by `bioformers`. 
+ +## Predicted Entities + +`entailment`, `neutral`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bioformer_cased_v1.0_mnli_en_5.1.4_3.4_1698787728003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bioformer_cased_v1.0_mnli_en_5.1.4_3.4_1698787728003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bioformer_cased_v1.0_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bioformer_cased_v1.0_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bioformer.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bioformer_cased_v1.0_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|159.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bioformers/bioformer-cased-v1.0-mnli +- https://cims.nyu.edu/~sbowman/multinli/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_qnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_qnli_en.md new file mode 100644 index 000000000000..823d891a91ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bioformer_cased_v1.0_qnli_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bioformers) +author: John Snow Labs +name: bert_classifier_bioformer_cased_v1.0_qnli +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer-cased-v1.0-qnli` is an English model originally trained by `bioformers`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bioformer_cased_v1.0_qnli_en_5.1.4_3.4_1698796453918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bioformer_cased_v1.0_qnli_en_5.1.4_3.4_1698796453918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bioformer_cased_v1.0_qnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_bioformer_cased_v1.0_qnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bioformer.cased.by_bioformers").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bioformer_cased_v1.0_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|159.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bioformers/bioformer-cased-v1.0-qnli +- https://paperswithcode.com/dataset/qnli +- https://arxiv.org/abs/1804.07461 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section_en.md new file mode 100644 index 000000000000..bf857bba0704 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from ml4pubmed) +author: John Snow Labs +name: bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section` is an English model originally trained by `ml4pubmed`. 
+ +## Predicted Entities + +`METHODS`, `CONCLUSIONS`, `RESULTS`, `BACKGROUND`, `OBJECTIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section_en_5.1.4_3.4_1698787959071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section_en_5.1.4_3.4_1698787959071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.pubmed_bert.pubmed.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_biomednlp_pubmedbert_base_uncased_abstract_fulltext_pub_section| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bislama_classification_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bislama_classification_en.md new file mode 100644 index 000000000000..3e5e06500e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_bislama_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_bislama_classification BertForSequenceClassification from HCKLab +author: John Snow Labs +name: bert_classifier_bislama_classification +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_bislama_classification` is an English model originally trained by HCKLab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_bislama_classification_en_5.1.4_3.4_1698787013463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_bislama_classification_en_5.1.4_3.4_1698787013463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bislama_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_bislama_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_bislama_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/HCKLab/BiBert-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_c2_roberta_base_finetuned_dianping_chinese_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_c2_roberta_base_finetuned_dianping_chinese_zh.md new file mode 100644 index 000000000000..3e68cf1944ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_c2_roberta_base_finetuned_dianping_chinese_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from liam168) +author: John Snow Labs +name: bert_classifier_c2_roberta_base_finetuned_dianping_chinese +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `c2-roberta-base-finetuned-dianping-chinese` is a Chinese model originally trained by `liam168`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_c2_roberta_base_finetuned_dianping_chinese_zh_5.1.4_3.4_1698782277411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_c2_roberta_base_finetuned_dianping_chinese_zh_5.1.4_3.4_1698782277411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_c2_roberta_base_finetuned_dianping_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_c2_roberta_base_finetuned_dianping_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_c2_roberta_base_finetuned_dianping_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/liam168/c2-roberta-base-finetuned-dianping-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cbert_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cbert_en.md new file mode 100644 index 000000000000..21bcdecabca0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cbert_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from yonichi) +author: John Snow Labs +name: bert_classifier_cbert +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cbert` is an English model originally trained by `yonichi`.
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_cbert_en_5.1.4_3.4_1698788211570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_cbert_en_5.1.4_3.4_1698788211570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_cbert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_cbert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_yonichi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_cbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/yonichi/cbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_chinese_sentiment_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_chinese_sentiment_zh.md new file mode 100644 index 000000000000..deff1833220c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_chinese_sentiment_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from techthiyanes) +author: John Snow Labs +name: bert_classifier_chinese_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, zh, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_sentiment` is a Chinese model originally trained by `techthiyanes`. 
+ +## Predicted Entities + +`star 1`, `star 2`, `star 4`, `star 3`, `star 5` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_chinese_sentiment_zh_5.1.4_3.4_1698786571797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_chinese_sentiment_zh_5.1.4_3.4_1698786571797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_chinese_sentiment","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_chinese_sentiment","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_chinese_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/techthiyanes/chinese_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cl_1_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cl_1_en.md new file mode 100644 index 000000000000..0ab4742902b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_cl_1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from yuan1729) +author: John Snow Labs +name: bert_classifier_cl_1 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `CL_1` is an English model originally trained by `yuan1729`.
+ +## Predicted Entities + +`搶奪強盜及海盜罪`, `藏匿人犯及湮滅證據罪`, `賭博罪`, `侵占罪`, `遺棄罪`, `恐嚇及擄人勒贖罪`, `殺人罪`, `妨害秩序罪`, `偽證及誣告罪`, `妨害電腦使用罪`, `妨害風化罪`, `瀆職罪`, `妨害婚姻及家庭罪`, `竊盜罪`, `妨害名譽及信用罪`, `傷害罪`, `妨害性自主罪`, `贓物罪`, `妨害自由罪`, `妨害秘密罪`, `妨害公務罪`, `詐欺背信及重利罪`, `妨害投票罪`, `偽造文書印文罪`, `偽造有價證券罪`, `公共危險罪`, `毀棄損壞罪` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_cl_1_en_5.1.4_3.4_1698786858197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_cl_1_en_5.1.4_3.4_1698786858197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_cl_1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_cl_1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_cl_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/yuan1729/CL_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_contextualized_hate_speech_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_contextualized_hate_speech_es.md new file mode 100644 index 000000000000..176edb33518e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_contextualized_hate_speech_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from finiteautomata) +author: John Snow Labs +name: bert_classifier_contextualized_hate_speech +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, es, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-contextualized-hate-speech-es` is a Spanish model originally trained by `finiteautomata`. 
+ +## Predicted Entities + +`Hateful`, `Not hateful` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_contextualized_hate_speech_es_5.1.4_3.4_1698788492987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_contextualized_hate_speech_es_5.1.4_3.4_1698788492987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_contextualized_hate_speech","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_contextualized_hate_speech","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.hate_contextualized.bert.by_finiteautomata").predict("""Amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_contextualized_hate_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/finiteautomata/bert-contextualized-hate-speech-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_core_clinical_mortality_prediction_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_core_clinical_mortality_prediction_en.md new file mode 100644 index 000000000000..9a88592b519f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_core_clinical_mortality_prediction_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bvanaken) +author: John Snow Labs +name: bert_classifier_core_clinical_mortality_prediction +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `CORe-clinical-mortality-prediction` is an English model originally trained by `bvanaken`.
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_core_clinical_mortality_prediction_en_5.1.4_3.4_1698787209502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_core_clinical_mortality_prediction_en_5.1.4_3.4_1698787209502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_core_clinical_mortality_prediction","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_core_clinical_mortality_prediction","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.clinical.").predict("""PUT YOUR STRING HERE""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_core_clinical_mortality_prediction| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bvanaken/CORe-clinical-mortality-prediction +- https://www.aclweb.org/anthology/2021.eacl-main.75.pdf +- http://core.app.datexis.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_covid_misinfo_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_covid_misinfo_en.md new file mode 100644 index 000000000000..9cb9199b5f1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_covid_misinfo_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from liyijing024) +author: John Snow Labs +name: bert_classifier_covid_misinfo +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid-misinfo` is an English model originally trained by `liyijing024`.
+ +## Predicted Entities + +`entailment`, `neutral`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_covid_misinfo_en_5.1.4_3.4_1698787733260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_covid_misinfo_en_5.1.4_3.4_1698787733260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_covid_misinfo","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_covid_misinfo","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.covid.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_covid_misinfo| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/liyijing024/covid-misinfo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_curiosity_bio_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_curiosity_bio_en.md new file mode 100644 index 000000000000..6aeedcdf06e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_curiosity_bio_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from k-partha) +author: John Snow Labs +name: bert_classifier_curiosity_bio +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `curiosity_bert_bio` is an English model originally trained by `k-partha`. + +## Predicted Entities + +`Sensing`, `Intuitive` + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_curiosity_bio_en_5.1.4_3.4_1698782625096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_curiosity_bio_en_5.1.4_3.4_1698782625096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + +<!-- --> + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_curiosity_bio","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_curiosity_bio","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.curiosity_bio.bert.by_k_partha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_curiosity_bio| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/k-partha/curiosity_bert_bio +- https://en.wikipedia.org/wiki/Openness_to_experience +- https://arxiv.org/abs/2109.06402 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_binary_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_binary_da.md new file mode 100644 index 000000000000..5d43881f88fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_binary_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_emotion_binary BertForSequenceClassification from DaNLP +author: John Snow Labs +name: bert_classifier_danish_emotion_binary +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_emotion_binary` is a Danish model originally trained by DaNLP.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_emotion_binary_da_5.1.4_3.4_1698791520933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_emotion_binary_da_5.1.4_3.4_1698791520933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_emotion_binary","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_emotion_binary","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_emotion_binary| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-emotion-binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_classification_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_classification_da.md new file mode 100644 index 000000000000..db0fdc39a1c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_emotion_classification_da.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Danish BertForSequenceClassification Cased model (from NikolajMunch) +author: John Snow Labs +name: bert_classifier_danish_emotion_classification +date: 2023-10-31 +tags: [da, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish-emotion-classification` is a Danish model originally trained by `NikolajMunch`.
+ +## Predicted Entities + +`Afsky`, `Glæde`, `Frygt`, `Overraskelse`, `Vrede`, `Tristhed` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_emotion_classification_da_5.1.4_3.4_1698788027376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_emotion_classification_da_5.1.4_3.4_1698788027376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_danish_emotion_classification","da") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_danish_emotion_classification","da") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("da.classify.bert.by_nikolajmunch").predict("""PUT YOUR STRING HERE""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_emotion_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/NikolajMunch/danish-emotion-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_classification_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_classification_da.md new file mode 100644 index 000000000000..3155a1a75867 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_classification_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_hatespeech_classification BertForSequenceClassification from DaNLP +author: John Snow Labs +name: bert_classifier_danish_hatespeech_classification +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_hatespeech_classification` is a Danish model originally trained by DaNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hatespeech_classification_da_5.1.4_3.4_1698788216113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hatespeech_classification_da_5.1.4_3.4_1698788216113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hatespeech_classification","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hatespeech_classification","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_hatespeech_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-hatespeech-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_detection_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_detection_da.md new file mode 100644 index 000000000000..1079e33c80c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hatespeech_detection_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_hatespeech_detection BertForSequenceClassification from DaNLP +author: John Snow Labs +name: bert_classifier_danish_hatespeech_detection +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_hatespeech_detection` is a Danish model originally trained by DaNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hatespeech_detection_da_5.1.4_3.4_1698788402868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hatespeech_detection_da_5.1.4_3.4_1698788402868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hatespeech_detection","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hatespeech_detection","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_hatespeech_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-hatespeech-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hyggebert_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hyggebert_da.md new file mode 100644 index 000000000000..5137eb87fad1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_hyggebert_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_hyggebert BertForSequenceClassification from RJuro +author: John Snow Labs +name: bert_classifier_danish_hyggebert +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_hyggebert` is a Danish model originally trained by RJuro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hyggebert_da_5.1.4_3.4_1698788690065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_hyggebert_da_5.1.4_3.4_1698788690065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hyggebert","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_hyggebert","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_hyggebert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/RJuro/Da-HyggeBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_sentiment_polarity_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_sentiment_polarity_da.md new file mode 100644 index 000000000000..89daf60cf454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_sentiment_polarity_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_tone_sentiment_polarity BertForSequenceClassification from DaNLP +author: John Snow Labs +name: bert_classifier_danish_tone_sentiment_polarity +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_tone_sentiment_polarity` is a Danish model originally trained by DaNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_tone_sentiment_polarity_da_5.1.4_3.4_1698796642710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_tone_sentiment_polarity_da_5.1.4_3.4_1698796642710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_tone_sentiment_polarity","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_tone_sentiment_polarity","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_tone_sentiment_polarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-tone-sentiment-polarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_subjective_objective_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_subjective_objective_da.md new file mode 100644 index 000000000000..3936bce15913 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_danish_tone_subjective_objective_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish bert_classifier_danish_tone_subjective_objective BertForSequenceClassification from DaNLP +author: John Snow Labs +name: bert_classifier_danish_tone_subjective_objective +date: 2023-10-31 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_danish_tone_subjective_objective` is a Danish model originally trained by DaNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_tone_subjective_objective_da_5.1.4_3.4_1698788607047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_danish_tone_subjective_objective_da_5.1.4_3.4_1698788607047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_tone_subjective_objective","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_danish_tone_subjective_objective","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_danish_tone_subjective_objective| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-tone-subjective-objective \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_decision_style_bio_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_decision_style_bio_en.md new file mode 100644 index 000000000000..645cbc7cc6f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_decision_style_bio_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from k-partha) +author: John Snow Labs +name: bert_classifier_decision_style_bio +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `decision_style_bert_bio` is an English model originally trained by `k-partha`. 
+ +## Predicted Entities + +`Prospecting`, `Judging` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_decision_style_bio_en_5.1.4_3.4_1698782904182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_decision_style_bio_en_5.1.4_3.4_1698782904182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_decision_style_bio","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_decision_style_bio","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.decision_style_bio.bert.by_k_partha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_decision_style_bio| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/k-partha/decision_style_bert_bio +- https://en.wikipedia.org/wiki/Conscientiousness +- https://arxiv.org/abs/2109.06402 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos_es.md new file mode 100644 index 000000000000..f4bc76bf1e2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from anthonny) +author: John Snow Labs +name: bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, es, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dehatebert-mono-spanish-finetuned-sentiments_reviews_politicos` is a Spanish model originally trained by `anthonny`. 
+ +## Predicted Entities + +`NON_HATE`, `HATE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos_es_5.1.4_3.4_1698788973714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos_es_5.1.4_3.4_1698788973714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.bert.sentiment_hate.finetuned").predict("""Amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_dehate_mono_spanish_finetuned_sentiments_reviews_politicos| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|627.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/anthonny/dehatebert-mono-spanish-finetuned-sentiments_reviews_politicos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demo_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demo_en.md new file mode 100644 index 000000000000..7a3f06d8db7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demo_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from junzai) +author: John Snow Labs +name: bert_classifier_demo +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo` is an English model originally trained by `junzai`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_demo_en_5.1.4_3.4_1698788953635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_demo_en_5.1.4_3.4_1698788953635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_demo","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_demo","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.v1.by_junzai").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_demo| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/junzai/demo +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demotest_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demotest_en.md new file mode 100644 index 000000000000..24a0fb34bac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_demotest_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from junzai) +author: John Snow Labs +name: bert_classifier_demotest +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demotest` is an English model originally trained by `junzai`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_demotest_en_5.1.4_3.4_1698783137028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_demotest_en_5.1.4_3.4_1698783137028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_demotest","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_demotest","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.v2.by_junzai").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_demotest| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/junzai/demotest +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_drug_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_drug_sentiment_analysis_en.md new file mode 100644 index 000000000000..90468d8bf4a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_drug_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from shahidul034) +author: John Snow Labs +name: bert_classifier_drug_sentiment_analysis +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `drug_sentiment_analysis` is an English model originally trained by `shahidul034`. 
+ +## Predicted Entities + +`good`, `bad` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_drug_sentiment_analysis_en_5.1.4_3.4_1698789244919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_drug_sentiment_analysis_en_5.1.4_3.4_1698789244919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_drug_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_drug_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_shahidul034").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_drug_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shahidul034/drug_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dum_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dum_en.md new file mode 100644 index 000000000000..d51cbe6905f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dum_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from lysandre) +author: John Snow Labs +name: bert_classifier_dum +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dum` is an English model originally trained by `lysandre`. + +## Predicted Entities + +`NEGATIVE`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_dum_en_5.1.4_3.4_1698791888585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_dum_en_5.1.4_3.4_1698791888585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_dum","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_dum","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_lysandre").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_dum| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lysandre/dum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dutch_news_clf_finetuned_nl.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dutch_news_clf_finetuned_nl.md new file mode 100644 index 000000000000..927f0dca517d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dutch_news_clf_finetuned_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from RuudVelo) +author: John Snow Labs +name: bert_classifier_dutch_news_clf_finetuned +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dutch_news_clf_bert_finetuned` is a Dutch model originally trained by `RuudVelo`. 
+ +## Predicted Entities + +`Economie`, `Buitenland`, `Politiek`, `Regionaal nieuws`, `Tech`, `Koningshuis`, `Binnenland`, `Cultuur & Media`, `Opmerkelijk` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_dutch_news_clf_finetuned_nl_5.1.4_3.4_1698789540127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_dutch_news_clf_finetuned_nl_5.1.4_3.4_1698789540127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_dutch_news_clf_finetuned","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_dutch_news_clf_finetuned","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.news.finetuned").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_dutch_news_clf_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/RuudVelo/dutch_news_clf_bert_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dvs_f_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dvs_f_en.md new file mode 100644 index 000000000000..ee5fe3d6d83f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_dvs_f_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from crcb) +author: John Snow Labs +name: bert_classifier_dvs_f +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dvs_f` is an English model originally trained by `crcb`. + +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_dvs_f_en_5.1.4_3.4_1698789838384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_dvs_f_en_5.1.4_3.4_1698789838384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_dvs_f","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_dvs_f","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_crcb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_dvs_f| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/crcb/dvs_f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emo_nojoylove_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emo_nojoylove_en.md new file mode 100644 index 000000000000..0b99d0eeb220 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emo_nojoylove_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from crcb) +author: John Snow Labs +name: bert_classifier_emo_nojoylove +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emo_nojoylove` is an English model originally trained by `crcb`. + +## Predicted Entities + +`fear`, `sadness`, `anger`, `surprise` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_emo_nojoylove_en_5.1.4_3.4_1698783771552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_emo_nojoylove_en_5.1.4_3.4_1698783771552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_emo_nojoylove","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_emo_nojoylove","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.joy.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_emo_nojoylove| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/crcb/emo_nojoylove \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emojify_mvp_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emojify_mvp_en.md new file mode 100644 index 000000000000..2d4126935ab2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_emojify_mvp_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from jpreilly123) +author: John Snow Labs +name: bert_classifier_emojify_mvp +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emojify_mvp` is an English model originally trained by `jpreilly123`. 
+ +## Predicted Entities + +`👌`, `😭`, `🙏`, `🔥`, `💕`, `😩`, `😡`, `😂`, `😘`, `😊`, `😍`, `😔` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_emojify_mvp_en_5.1.4_3.4_1698789195325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_emojify_mvp_en_5.1.4_3.4_1698789195325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_emojify_mvp","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_emojify_mvp","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_emojify_mvp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jpreilly123/emojify_mvp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_english_yelp_sentiment_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_english_yelp_sentiment_en.md new file mode 100644 index 000000000000..587805b1399f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_english_yelp_sentiment_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from gilf) +author: John Snow Labs +name: bert_classifier_english_yelp_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english-yelp-sentiment` is an English model originally trained by `gilf`. 
+ +## Predicted Entities + +`3 stars`, `4 stars`, `2 stars`, `5 stars`, `1 star` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_english_yelp_sentiment_en_5.1.4_3.4_1698792196267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_english_yelp_sentiment_en_5.1.4_3.4_1698792196267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_english_yelp_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_english_yelp_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_gilf").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_english_yelp_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gilf/english-yelp-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_roberta_330m_similarity_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_roberta_330m_similarity_zh.md new file mode 100644 index 000000000000..1af3726b4f85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_roberta_330m_similarity_zh.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_classifier_erlangshen_roberta_330m_similarity +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-330M-Similarity` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`similar`, `not similar` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_330m_similarity_zh_5.1.4_3.4_1698789683943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_330m_similarity_zh_5.1.4_3.4_1698789683943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_330m_similarity","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_330m_similarity","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.lang_330m.by_idea_ccnl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_erlangshen_roberta_330m_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-330M-Similarity +- https://github.com/IDEA-CCNL/Fengshenbang-LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_sentiment_finetune_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_sentiment_finetune_zh.md new file mode 100644 index 000000000000..0a8fb84de626 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_erlangshen_sentiment_finetune_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from tezign) +author: John Snow Labs +name: bert_classifier_erlangshen_sentiment_finetune +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Sentiment-FineTune` is a Chinese model originally trained by `tezign`. 
+ +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_sentiment_finetune_zh_5.1.4_3.4_1698790197445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_sentiment_finetune_zh_5.1.4_3.4_1698790197445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_sentiment_finetune","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_sentiment_finetune","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.sentiment.lang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_erlangshen_sentiment_finetune| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tezign/Erlangshen-Sentiment-FineTune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_esg_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_esg_en.md new file mode 100644 index 000000000000..5918a341bd44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_esg_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from nbroad) +author: John Snow Labs +name: bert_classifier_esg +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ESG-BERT` is an English model originally trained by `nbroad`. 
+ +## Predicted Entities + +`Waste_And_Hazardous_Materials_Management`, `Management_Of_Legal_And_Regulatory_Framework`, `Air_Quality`, `GHG_Emissions`, `Business_Model_Resilience`, `Water_And_Wastewater_Management`, `Systemic_Risk_Management`, `Director_Removal`, `Data_Security`, `Employee_Engagement_Inclusion_And_Diversity`, `Access_And_Affordability`, `Competitive_Behavior`, `Ecological_Impacts`, `Employee_Health_And_Safety`, `Supply_Chain_Management`, `Critical_Incident_Risk_Management`, `Business_Ethics`, `Product_Design_And_Lifecycle_Management`, `Energy_Management`, `Labor_Practices`, `Physical_Impacts_Of_Climate_Change`, `Product_Quality_And_Safety`, `Human_Rights_And_Community_Relations`, `Customer_Welfare`, `Customer_Privacy` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_esg_en_5.1.4_3.4_1698790115877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_esg_en_5.1.4_3.4_1698790115877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_esg","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_esg","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_nbroad").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_esg| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nbroad/ESG-BERT +- https://github.com/mukut03/ESG-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_evidence_types_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_evidence_types_en.md new file mode 100644 index 000000000000..eb0762b7f6c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_evidence_types_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from marieke93) +author: John Snow Labs +name: bert_classifier_evidence_types +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERT-evidence-types` is an English model originally trained by `marieke93`. 
+ +## Predicted Entities + +`Assumption`, `Definition`, `Testimony`, `Anecdote`, `None`, `Other`, `Statistics/Study` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_evidence_types_en_5.1.4_3.4_1698784323349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_evidence_types_en_5.1.4_3.4_1698784323349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_evidence_types","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_evidence_types","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_marieke93").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_evidence_types| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/marieke93/BERT-evidence-types +- https://github.com/mariekevdh/BA-Thesis-Information-Science-Persuasion-Strategies \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_extreme_go_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_extreme_go_emotion_en.md new file mode 100644 index 000000000000..b5bbda678f1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_extreme_go_emotion_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from vaariis) +author: John Snow Labs +name: bert_classifier_extreme_go_emotion +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extreme-go-emotion` is an English model originally trained by `vaariis`. 
+ +## Predicted Entities + +`optimism 🤞`, `disappointment 😞`, `amusement 😂`, `desire 😍`, `nervousness 😬`, `sadness 😞`, `surprise 😲`, `love ❤️`, `approval 👍`, `confusion 😕`, `embarrassment 😳`, `curiosity 🤔`, `grief 😢`, `disapproval 👎`, `gratitude 🙏`, `disgust 🤮`, `excitement 🤩`, `realization 💡`, `anger 😡`, `remorse 😞`, `admiration 👏`, `caring 🤗`, `fear 😨`, `annoyance 😒`, `joy 😃`, `neutral 😐`, `relief 😅`, `pride 😌` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_extreme_go_emotion_en_5.1.4_3.4_1698792358587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_extreme_go_emotion_en_5.1.4_3.4_1698792358587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_extreme_go_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_extreme_go_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_extreme_go_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/vaariis/extreme-go-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fabriceyhc_base_uncased_imdb_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fabriceyhc_base_uncased_imdb_en.md new file mode 100644 index 000000000000..eaef32f846f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fabriceyhc_base_uncased_imdb_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from fabriceyhc) +author: John Snow Labs +name: bert_classifier_fabriceyhc_base_uncased_imdb +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-imdb` is an English model originally trained by `fabriceyhc`. 
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_fabriceyhc_base_uncased_imdb_en_5.1.4_3.4_1698790400021.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_fabriceyhc_base_uncased_imdb_en_5.1.4_3.4_1698790400021.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fabriceyhc_base_uncased_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_fabriceyhc_base_uncased_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.imdb.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_fabriceyhc_base_uncased_imdb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/fabriceyhc/bert-base-uncased-imdb +- https://paperswithcode.com/sota?task=Text+Classification&dataset=imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_financialbert_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_financialbert_sentiment_analysis_en.md new file mode 100644 index 000000000000..ef4c350dc835 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_financialbert_sentiment_analysis_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ahmedrachid) +author: John Snow Labs +name: bert_classifier_financialbert_sentiment_analysis +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `FinancialBERT-Sentiment-Analysis` is an English model originally trained by `ahmedrachid`. 
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_financialbert_sentiment_analysis_en_5.1.4_3.4_1698792619138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_financialbert_sentiment_analysis_en_5.1.4_3.4_1698792619138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_financialbert_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_financialbert_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_ahmedrachid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_financialbert_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ahmedrachid/FinancialBERT-Sentiment-Analysis +- https://www.researchgate.net/publication/358284785_FinancialBERT_-_A_Pretrained_Language_Model_for_Financial_Text_Mining +- https://www.researchgate.net/publication/251231364_FinancialPhraseBank-v10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finbert_finnsentiment_fi.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finbert_finnsentiment_fi.md new file mode 100644 index 000000000000..75b8801d096b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finbert_finnsentiment_fi.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Finnish BertForSequenceClassification Cased model (from fergusq) +author: John Snow Labs +name: bert_classifier_finbert_finnsentiment +date: 2023-10-31 +tags: [fi, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: fi +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert-finnsentiment` is a Finnish model originally trained by `fergusq`. 
+ +## Predicted Entities + +`NEUTRAL`, `NEGATIVE`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finbert_finnsentiment_fi_5.1.4_3.4_1698790679521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finbert_finnsentiment_fi_5.1.4_3.4_1698790679521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_finbert_finnsentiment","fi") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_finbert_finnsentiment","fi") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("fi.classify.bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finbert_finnsentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|fi| +|Size:|466.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/fergusq/finbert-finnsentiment +- https://arxiv.org/pdf/2012.02613.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fine_tuning_text_classification_model_habana_gaudi_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fine_tuning_text_classification_model_habana_gaudi_en.md new file mode 100644 index 000000000000..7ebc833affe1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_fine_tuning_text_classification_model_habana_gaudi_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from jxuhf) +author: John Snow Labs +name: bert_classifier_fine_tuning_text_classification_model_habana_gaudi +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Fine-tuning-text-classification-model-Habana-Gaudi` is an English model originally trained by `jxuhf`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuning_text_classification_model_habana_gaudi_en_5.1.4_3.4_1698784912187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_fine_tuning_text_classification_model_habana_gaudi_en_5.1.4_3.4_1698784912187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_fine_tuning_text_classification_model_habana_gaudi","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_fine_tuning_text_classification_model_habana_gaudi","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.by_jxuhf").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_fine_tuning_text_classification_model_habana_gaudi| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jxuhf/Fine-tuning-text-classification-model-Habana-Gaudi +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_emotion_en.md new file mode 100644 index 000000000000..c7136dc6693d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ericntay) +author: John Snow Labs +name: bert_classifier_finetuned_emotion +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-emotion` is an English model originally trained by `ericntay`. 
+ +## Predicted Entities + +`anger`, `sadness`, `fear`, `joy`, `love`, `surprise` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_emotion_en_5.1.4_3.4_1698790955245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_emotion_en_5.1.4_3.4_1698790955245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finetuned_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ericntay/bert-finetuned-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_resumes_sections_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_resumes_sections_en.md new file mode 100644 index 000000000000..087bcb706d4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_finetuned_resumes_sections_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from has-abi) +author: John Snow Labs +name: bert_classifier_finetuned_resumes_sections +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-resumes-sections` is an English model originally trained by `has-abi`. 
+ +## Predicted Entities + +`soft_skills`, `summary`, `languages`, `awards`, `professional_experiences`, `projects`, `skills`, `para`, `interests`, `contact/name/title`, `certificates`, `education` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_resumes_sections_en_5.1.4_3.4_1698791245446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_resumes_sections_en_5.1.4_3.4_1698791245446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_resumes_sections","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_resumes_sections","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.finetuned.by_has_abi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finetuned_resumes_sections| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/has-abi/bert-finetuned-resumes-sections \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_news_sentiment_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_news_sentiment_de.md new file mode 100644 index 000000000000..3663c93623c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_news_sentiment_de.md @@ -0,0 +1,108 @@ +--- +layout: model +title: German BertForSequenceClassification Cased model (from mdraw) +author: John Snow Labs +name: bert_classifier_german_news_sentiment +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, de, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german-news-sentiment-bert` is a German model originally trained by `mdraw`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_german_news_sentiment_de_5.1.4_3.4_1698790470981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_german_news_sentiment_de_5.1.4_3.4_1698790470981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_german_news_sentiment","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_german_news_sentiment","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.news_sentiment.").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_german_news_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|408.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mdraw/german-news-sentiment-bert +- https://github.com/text-analytics-20/news-sentiment-development +- https://github.com/text-analytics-20/news-sentiment-development/blob/main/sentiment_analysis/bert.py \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_sentiment_de.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_sentiment_de.md new file mode 100644 index 000000000000..7128996ae09e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_german_sentiment_de.md @@ -0,0 +1,114 @@ +--- +layout: model +title: German BertForSequenceClassification Cased model (from oliverguhr) +author: John Snow Labs +name: bert_classifier_german_sentiment +date: 2023-10-31 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german-sentiment-bert` is a German model originally trained by `oliverguhr`. 
+ +## Predicted Entities + +`neutral`, `positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_german_sentiment_de_5.1.4_3.4_1698790745478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_german_sentiment_de_5.1.4_3.4_1698790745478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_german_sentiment","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_german_sentiment","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.sentiment.by_oliverguhr").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_german_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|408.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/oliverguhr/german-sentiment-bert +- http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.202.pdf +- https://pypi.org/project/germansentiment/ +- https://github.com/oliverguhr/german-sentiment +- http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.202.pdf +- https://www.romanklinger.de/scare/ +- https://www.aclweb.org/anthology/L16-1181/ +- https://www.spinningbytes.com/resources/germansentiment/ +- https://wortschatz.uni-leipzig.de/de/download/german \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_glue_mrpc_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_glue_mrpc_en.md new file mode 100644 index 000000000000..4d6812ad6730 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_glue_mrpc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sgugger) +author: John Snow Labs +name: bert_classifier_glue_mrpc +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `glue-mrpc` is an English model originally trained by `sgugger`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_glue_mrpc_en_5.1.4_3.4_1698785269612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_glue_mrpc_en_5.1.4_3.4_1698785269612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_glue_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_glue_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.by_sgugger").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_glue_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgugger/glue-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_guns_relevant_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_guns_relevant_en.md new file mode 100644 index 000000000000..331e697eeffd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_guns_relevant_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from tcaputi) +author: John Snow Labs +name: bert_classifier_guns_relevant +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `guns-relevant` is an English model originally trained by `tcaputi`. 
+ +## Predicted Entities + +`yes`, `no` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_guns_relevant_en_5.1.4_3.4_1698791515843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_guns_relevant_en_5.1.4_3.4_1698791515843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_guns_relevant","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_guns_relevant","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_tcaputi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_guns_relevant| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tcaputi/guns-relevant \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hashtag_tonga_tonga_islands_hashtag_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hashtag_tonga_tonga_islands_hashtag_en.md new file mode 100644 index 000000000000..85d63760bd46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hashtag_tonga_tonga_islands_hashtag_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_hashtag_tonga_tonga_islands_hashtag BertForSequenceClassification from Bryan0123 +author: John Snow Labs +name: bert_classifier_hashtag_tonga_tonga_islands_hashtag +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_hashtag_tonga_tonga_islands_hashtag` is an English model originally trained by Bryan0123. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hashtag_tonga_tonga_islands_hashtag_en_5.1.4_3.4_1698792810910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hashtag_tonga_tonga_islands_hashtag_en_5.1.4_3.4_1698792810910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_hashtag_tonga_tonga_islands_hashtag","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_hashtag_tonga_tonga_islands_hashtag","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hashtag_tonga_tonga_islands_hashtag| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Bryan0123/bert-hashtag-to-hashtag \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hatescore_korean_hate_speech_ko.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hatescore_korean_hate_speech_ko.md new file mode 100644 index 000000000000..12b97f3803e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hatescore_korean_hate_speech_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean BertForSequenceClassification Cased model (from sgunderscore) +author: John Snow Labs +name: bert_classifier_hatescore_korean_hate_speech +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, ko, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hatescore-korean-hate-speech` is a Korean model originally trained by `sgunderscore`. 
+ +## Predicted Entities + +`종교`, `None`, `여성/가족`, `인종/국적`, `기타 혐오`, `단순 악플`, `성소수자`, `지역`, `연령`, `남성` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hatescore_korean_hate_speech_ko_5.1.4_3.4_1698791846435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hatescore_korean_hate_speech_ko_5.1.4_3.4_1698791846435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_hatescore_korean_hate_speech","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["나는 Spark NLP를 좋아합니다"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_hatescore_korean_hate_speech","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("나는 Spark NLP를 좋아합니다").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hatescore_korean_hate_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|408.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgunderscore/hatescore-korean-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hateval_re_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hateval_re_en.md new file mode 100644 index 000000000000..768c1ba0c8a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hateval_re_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from crcb) +author: John Snow Labs +name: bert_classifier_hateval_re +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hateval_re` is an English model originally trained by `crcb`. + +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hateval_re_en_5.1.4_3.4_1698785551464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hateval_re_en_5.1.4_3.4_1698785551464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_hateval_re","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_hateval_re","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.hate.by_crcb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hateval_re| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/crcb/hateval_re \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hebrew_modern_sentiment_analysis_he.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hebrew_modern_sentiment_analysis_he.md new file mode 100644 index 000000000000..62b69bf0bf75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hebrew_modern_sentiment_analysis_he.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hebrew bert_classifier_hebrew_modern_sentiment_analysis BertForSequenceClassification from avichr +author: John Snow Labs +name: bert_classifier_hebrew_modern_sentiment_analysis +date: 2023-10-31 +tags: [bert, he, open_source, sequence_classification, onnx] +task: Text Classification +language: he +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_hebrew_modern_sentiment_analysis` is a Hebrew model originally trained by avichr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hebrew_modern_sentiment_analysis_he_5.1.4_3.4_1698790932382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hebrew_modern_sentiment_analysis_he_5.1.4_3.4_1698790932382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_hebrew_modern_sentiment_analysis","he")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_hebrew_modern_sentiment_analysis","he") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hebrew_modern_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|he| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/heBERT_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish11k_sentiment_analysis_xx.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish11k_sentiment_analysis_xx.md new file mode 100644 index 000000000000..85fe873c5138 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish11k_sentiment_analysis_xx.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from yj2773) +author: John Snow Labs +name: bert_classifier_hinglish11k_sentiment_analysis +date: 2023-10-31 +tags: [en, hi, ur, open_source, bert, sequence_classification, classification, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hinglish11k-sentiment-analysis` is a Multilingual model originally trained by `yj2773`. 
+ +## Predicted Entities + +`Neutral`, `Positive`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hinglish11k_sentiment_analysis_xx_5.1.4_3.4_1698791283029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hinglish11k_sentiment_analysis_xx_5.1.4_3.4_1698791283029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_hinglish11k_sentiment_analysis","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_hinglish11k_sentiment_analysis","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hinglish11k_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/yj2773/hinglish11k-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish_class_xx.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish_class_xx.md new file mode 100644 index 000000000000..2420c0828b1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_hinglish_class_xx.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from meghanabhange) +author: John Snow Labs +name: bert_classifier_hinglish_class +date: 2023-10-31 +tags: [distilbert, sequence_classification, open_source, hi, en, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Hinglish-Bert-Class` is a Multilingual model originally trained by `meghanabhange`. 
+ +## Predicted Entities + +`NEGATIVE`, `NEUTRAL`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_hinglish_class_xx_5.1.4_3.4_1698792210339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_hinglish_class_xx_5.1.4_3.4_1698792210339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_hinglish_class","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_hinglish_class","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.by_meghanabhange").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_hinglish_class| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/meghanabhange/Hinglish-Bert-Class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_interpress_turkish_news_classification_tr.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_interpress_turkish_news_classification_tr.md new file mode 100644 index 000000000000..bf09d40a1d11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_interpress_turkish_news_classification_tr.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from serdarakyol) +author: John Snow Labs +name: bert_classifier_interpress_turkish_news_classification +date: 2023-10-31 +tags: [tr, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `interpress-turkish-news-classification` is a Turkish model originally trained by `serdarakyol`. 
+ +## Predicted Entities + +`Politics`, `Agenda`, `Technology`, `World`, `Education`, `Sport`, `Culture-Art`, `Economy`, `Magazine`, `Health` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_interpress_turkish_news_classification_tr_5.1.4_3.4_1698791546470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_interpress_turkish_news_classification_tr_5.1.4_3.4_1698791546470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_interpress_turkish_news_classification","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_interpress_turkish_news_classification","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.news.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_interpress_turkish_news_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/serdarakyol/interpress-turkish-news-classification +- https://github.com/serdarakyol \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_italian_iptc_it.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_italian_iptc_it.md new file mode 100644 index 000000000000..f1d6b18fad91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_italian_iptc_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_classifier_italian_iptc BertForSequenceClassification from M47Labs +author: John Snow Labs +name: bert_classifier_italian_iptc +date: 2023-10-31 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_italian_iptc` is an Italian model originally trained by M47Labs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_italian_iptc_it_5.1.4_3.4_1698793003345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_italian_iptc_it_5.1.4_3.4_1698793003345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_italian_iptc","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_italian_iptc","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_italian_iptc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|423.8 MB| + +## References + +https://huggingface.co/M47Labs/it_iptc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_jd_resume_model_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_jd_resume_model_en.md new file mode 100644 index 000000000000..0c8ce8c4e751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_jd_resume_model_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from nikunjbjj) +author: John Snow Labs +name: bert_classifier_jd_resume_model +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jd-resume-model` is an English model originally trained by `nikunjbjj`. + +## Predicted Entities + +`POS`, `NEG`, `NEU` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_jd_resume_model_en_5.1.4_3.4_1698793289034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_jd_resume_model_en_5.1.4_3.4_1698793289034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_jd_resume_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_jd_resume_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_nikunjbjj").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_jd_resume_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nikunjbjj/jd-resume-model +- https://github.com/finiteautomata/pysentimiento/ +- https://github.com/dccuchile/beto \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_joniponi_finetuned_semitic_languages_eval_english_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_joniponi_finetuned_semitic_languages_eval_english_en.md new file mode 100644 index 000000000000..a1fc54fbe456 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_joniponi_finetuned_semitic_languages_eval_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_joniponi_finetuned_semitic_languages_eval_english BertForSequenceClassification from joniponi +author: John Snow Labs +name: bert_classifier_joniponi_finetuned_semitic_languages_eval_english +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_joniponi_finetuned_semitic_languages_eval_english` is an English model originally trained by joniponi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_joniponi_finetuned_semitic_languages_eval_english_en_5.1.4_3.4_1698791724261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_joniponi_finetuned_semitic_languages_eval_english_en_5.1.4_3.4_1698791724261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_joniponi_finetuned_semitic_languages_eval_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_joniponi_finetuned_semitic_languages_eval_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_joniponi_finetuned_semitic_languages_eval_english| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| + +## References + +https://huggingface.co/joniponi/bert-finetuned-sem_eval-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_3i4k_base_cased_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_3i4k_base_cased_en.md new file mode 100644 index 000000000000..07a8e88a9a9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_3i4k_base_cased_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from seongju) +author: John Snow Labs +name: bert_classifier_kor_3i4k_base_cased +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kor-3i4k-bert-base-cased` is an English model originally trained by `seongju`. 
+ +## Predicted Entities + +`fragment`, `command`, `rhetorical question`, `intonation-depedent utterance`, `question`, `statement`, `rhetorical command` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_kor_3i4k_base_cased_en_5.1.4_3.4_1698793666327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_kor_3i4k_base_cased_en_5.1.4_3.4_1698793666327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_kor_3i4k_base_cased","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_kor_3i4k_base_cased","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.cased_base.by_seongju").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_kor_3i4k_base_cased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/seongju/kor-3i4k-bert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_unsmile_ko.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_unsmile_ko.md new file mode 100644 index 000000000000..9a387bd0e0a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_kor_unsmile_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean BertForSequenceClassification Cased model (from smilegate-ai) +author: John Snow Labs +name: bert_classifier_kor_unsmile +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, ko, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kor_unsmile` is a Korean model originally trained by `smilegate-ai`. 
+ +## Predicted Entities + +`종교`, `clean`, `여성/가족`, `악플/욕설`, `인종/국적`, `기타 혐오`, `성소수자`, `지역`, `연령`, `남성` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_kor_unsmile_ko_5.1.4_3.4_1698793957420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_kor_unsmile_ko_5.1.4_3.4_1698793957420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_kor_unsmile","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["나는 Spark NLP를 좋아합니다"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_kor_unsmile","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("나는 Spark NLP를 좋아합니다").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_kor_unsmile| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|408.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/smilegate-ai/kor_unsmile \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_krm_sa3_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_krm_sa3_en.md new file mode 100644 index 000000000000..4badd1438c1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_krm_sa3_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from erica) +author: John Snow Labs +name: bert_classifier_krm_sa3 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `krm_sa3` is an English model originally trained by `erica`. + +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_krm_sa3_en_5.1.4_3.4_1698785879297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_krm_sa3_en_5.1.4_3.4_1698785879297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_krm_sa3","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_krm_sa3","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_erica").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_krm_sa3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|380.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/erica/krm_sa3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_large_cased_whole_word_masking_sst2_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_large_cased_whole_word_masking_sst2_en.md new file mode 100644 index 000000000000..bc99226a602e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_large_cased_whole_word_masking_sst2_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from philschmid) +author: John Snow Labs +name: bert_classifier_large_cased_whole_word_masking_sst2 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-whole-word-masking-sst2` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_large_cased_whole_word_masking_sst2_en_5.1.4_3.4_1698794431673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_large_cased_whole_word_masking_sst2_en_5.1.4_3.4_1698794431673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_cased_whole_word_masking_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_large_cased_whole_word_masking_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.cased_large_whole_word_masking").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_large_cased_whole_word_masking_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/bert-large-cased-whole-word-masking-sst2 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_larskjeldgaard_senda_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_larskjeldgaard_senda_da.md new file mode 100644 index 000000000000..d8f4e3c66d2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_larskjeldgaard_senda_da.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Danish BertForSequenceClassification Cased model (from larskjeldgaard) +author: John Snow Labs +name: bert_classifier_larskjeldgaard_senda +date: 2023-10-31 +tags: [da, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `senda` is a Danish model originally trained by `larskjeldgaard`. 
+ +## Predicted Entities + +`positiv`, `neutral`, `negativ` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_larskjeldgaard_senda_da_5.1.4_3.4_1698786136740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_larskjeldgaard_senda_da_5.1.4_3.4_1698786136740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_larskjeldgaard_senda","da") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_larskjeldgaard_senda","da") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("da.classify.bert.by_larskjeldgaard").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_larskjeldgaard_senda| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/larskjeldgaard/senda +- https://github.com/alexandrainst \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lex_textclassification_turkish_uncased_tr.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lex_textclassification_turkish_uncased_tr.md new file mode 100644 index 000000000000..61033c25d8bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lex_textclassification_turkish_uncased_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Uncased model (from sfurkan) +author: John Snow Labs +name: bert_classifier_lex_textclassification_turkish_uncased +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, tr, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `LexBERT-textclassification-turkish-uncased` is a Turkish model originally trained by `sfurkan`. 
+ +## Predicted Entities + +`Genelge`, `Tüzük`, `Kanun Hükmünde Kararname`, `Yönetmelik`, `Özelge`, `Cumhurbaşkanlığı Kararnamesi`, `Kanun`, `Komisyon Raporu`, `Tebliğ`, `Resmi Gazete` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_lex_textclassification_turkish_uncased_tr_5.1.4_3.4_1698786403966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_lex_textclassification_turkish_uncased_tr_5.1.4_3.4_1698786403966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_lex_textclassification_turkish_uncased","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_lex_textclassification_turkish_uncased","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.uncased").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_lex_textclassification_turkish_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.7 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sfurkan/LexBERT-textclassification-turkish-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lupinlevorace_tiny_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lupinlevorace_tiny_sst2_distilled_en.md new file mode 100644 index 000000000000..714fb018a4c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_lupinlevorace_tiny_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from lupinlevorace) +author: John Snow Labs +name: bert_classifier_lupinlevorace_tiny_sst2_distilled +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-distilled` is an English model originally trained by `lupinlevorace`.
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_lupinlevorace_tiny_sst2_distilled_en_5.1.4_3.4_1698786533397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_lupinlevorace_tiny_sst2_distilled_en_5.1.4_3.4_1698786533397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_lupinlevorace_tiny_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_lupinlevorace_tiny_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.distilled_tiny.by_lupinlevorace").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_lupinlevorace_tiny_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lupinlevorace/tiny-bert-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_manglish_offensive_language_identification_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_manglish_offensive_language_identification_en.md new file mode 100644 index 000000000000..c5b324d785aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_manglish_offensive_language_identification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from seanbenhur) +author: John Snow Labs +name: bert_classifier_manglish_offensive_language_identification +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `manglish-offensive-language-identification` is an English model originally trained by `seanbenhur`.
+ +## Predicted Entities + +`OFFENSIVE`, `NOT-OFFENSIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_manglish_offensive_language_identification_en_5.1.4_3.4_1698794765847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_manglish_offensive_language_identification_en_5.1.4_3.4_1698794765847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_manglish_offensive_language_identification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_manglish_offensive_language_identification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.lang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_manglish_offensive_language_identification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/seanbenhur/manglish-offensive-language-identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_en.md new file mode 100644 index 000000000000..1599cb416bb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from lewtun) +author: John Snow Labs +name: bert_classifier_minilm_finetuned_emotion +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm-finetuned-emotion` is an English model originally trained by `lewtun`.
+ +## Predicted Entities + +`anger`, `sadness`, `fear`, `joy`, `love`, `surprise` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_finetuned_emotion_en_5.1.4_3.4_1698792370784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_finetuned_emotion_en_5.1.4_3.4_1698792370784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_finetuned_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_finetuned_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.mini_lm_mini_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minilm_finetuned_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|119.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/minilm-finetuned-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_nm_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_nm_en.md new file mode 100644 index 000000000000..ab4c4ebba4ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_finetuned_emotion_nm_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from nickmuchi) +author: John Snow Labs +name: bert_classifier_minilm_finetuned_emotion_nm +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm-finetuned-emotion_nm` is an English model originally trained by `nickmuchi`.
+ +## Predicted Entities + +`sadness`, `joy`, `love`, `anger`, `surprise`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_finetuned_emotion_nm_en_5.1.4_3.4_1698794923721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_finetuned_emotion_nm_en_5.1.4_3.4_1698794923721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_finetuned_emotion_nm","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_finetuned_emotion_nm","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.mini_lm_mini_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minilm_finetuned_emotion_nm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|119.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nickmuchi/minilm-finetuned-emotion_nm +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_mrpc_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_mrpc_en.md new file mode 100644 index 000000000000..702e6898b74c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_mrpc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Uncased model (from Intel) +author: John Snow Labs +name: bert_classifier_minilm_l12_h384_uncased_mrpc +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L12-H384-uncased-mrpc` is an English model originally trained by `Intel`.
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_l12_h384_uncased_mrpc_en_5.1.4_3.4_1698786679523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_l12_h384_uncased_mrpc_en_5.1.4_3.4_1698786679523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_minilm_l12_h384_uncased_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_minilm_l12_h384_uncased_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.uncased_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minilm_l12_h384_uncased_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|116.8 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Intel/MiniLM-L12-H384-uncased-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_sst2_all_train_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_sst2_all_train_en.md new file mode 100644 index 000000000000..090487a9b56c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minilm_l12_h384_uncased_sst2_all_train_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Uncased model (from SetFit) +author: John Snow Labs +name: bert_classifier_minilm_l12_h384_uncased_sst2_all_train +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L12-H384-uncased__sst2__all-train` is an English model originally trained by `SetFit`.
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_l12_h384_uncased_sst2_all_train_en_5.1.4_3.4_1698795118330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minilm_l12_h384_uncased_sst2_all_train_en_5.1.4_3.4_1698795118330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_l12_h384_uncased_sst2_all_train","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minilm_l12_h384_uncased_sst2_all_train","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minilm_l12_h384_uncased_sst2_all_train| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|118.0 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/SetFit/MiniLM-L12-H384-uncased__sst2__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew1_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew1_en.md new file mode 100644 index 000000000000..95cfe904c787 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew1_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Dinithi) +author: John Snow Labs +name: bert_classifier_minlm_finetuned_emotionnew1 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minlm-finetuned-emotionnew1` is an English model originally trained by `Dinithi`.
+ +## Predicted Entities + +`sadness`, `joy`, `love`, `anger`, `surprise`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minlm_finetuned_emotionnew1_en_5.1.4_3.4_1698792541734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minlm_finetuned_emotionnew1_en_5.1.4_3.4_1698792541734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minlm_finetuned_emotionnew1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minlm_finetuned_emotionnew1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.emotion.bert.minilm_v1.by_dinithi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minlm_finetuned_emotionnew1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|119.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Dinithi/minlm-finetuned-emotionnew1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew2_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew2_en.md new file mode 100644 index 000000000000..8984342739d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_minlm_finetuned_emotionnew2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Dinithi) +author: John Snow Labs +name: bert_classifier_minlm_finetuned_emotionnew2 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minlm-finetuned-emotionnew2` is an English model originally trained by `Dinithi`.
+ +## Predicted Entities + +`sadness`, `joy`, `love`, `anger`, `surprise`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_minlm_finetuned_emotionnew2_en_5.1.4_3.4_1698792693752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_minlm_finetuned_emotionnew2_en_5.1.4_3.4_1698792693752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minlm_finetuned_emotionnew2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_minlm_finetuned_emotionnew2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.emotion.bert.minilm_v2.by_dinithi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_minlm_finetuned_emotionnew2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|119.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Dinithi/minlm-finetuned-emotionnew2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_model1_test_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_model1_test_en.md new file mode 100644 index 000000000000..283473407e62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_model1_test_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ptro) +author: John Snow Labs +name: bert_classifier_model1_test +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model1_test` is an English model originally trained by `ptro`.
+ +## Predicted Entities + +`offensive`, `not offensive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_model1_test_en_5.1.4_3.4_1698795433682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_model1_test_en_5.1.4_3.4_1698795433682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_model1_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_model1_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_ptro").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_model1_test| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ptro/model1_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi2convai_quality_english_mbert_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi2convai_quality_english_mbert_en.md new file mode 100644 index 000000000000..6fbe76ddc09e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi2convai_quality_english_mbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_multi2convai_quality_english_mbert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_classifier_multi2convai_quality_english_mbert +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_multi2convai_quality_english_mbert` is an English model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_multi2convai_quality_english_mbert_en_5.1.4_3.4_1698792939427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_multi2convai_quality_english_mbert_en_5.1.4_3.4_1698792939427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_multi2convai_quality_english_mbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_multi2convai_quality_english_mbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_multi2convai_quality_english_mbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-en-mbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi_label_classification_of_pubmed_articles_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi_label_classification_of_pubmed_articles_en.md new file mode 100644 index 000000000000..a18bf7ddc02d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multi_label_classification_of_pubmed_articles_en.md @@ -0,0 +1,113 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from owaiskha9654) +author: John Snow Labs +name: bert_classifier_multi_label_classification_of_pubmed_articles +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Multi-Label-Classification-of-PubMed-Articles` is an English model originally trained by `owaiskha9654`. 
+ +## Predicted Entities + +`Phenomena and Processes [G]`, `Diseases [C]`, `Health Care [N]`, `Chemicals and Drugs [D]`, `Psychiatry and Psychology [F]`, `Anatomy [A]`, `Information Science [L]`, `Geographicals [Z]`, `Organisms [B]`, `Disciplines and Occupations [H]`, `Named Groups [M]` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_multi_label_classification_of_pubmed_articles_en_5.1.4_3.4_1698793224415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_multi_label_classification_of_pubmed_articles_en_5.1.4_3.4_1698793224415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_multi_label_classification_of_pubmed_articles","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_multi_label_classification_of_pubmed_articles","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.pubmed.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_multi_label_classification_of_pubmed_articles| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/owaiskha9654/Multi-Label-Classification-of-PubMed-Articles +- https://www.kaggle.com/datasets/owaiskhan9654/pubmed-multilabel-text-classification +- https://www.kaggle.com/code/owaiskhan9654/multi-label-classification-of-pubmed-articles +- https://www.kaggle.com/datasets/owaiskhan9654/pubmed-multilabel-text-classification +- https://arxiv.org/abs/1706.03762 +- https://arxiv.org/abs/1810.04805 +- https://github.com/google-research/bert +- https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html#torch.nn.BCEWithLogitsLoss \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multilabel_inpatient_comments_14labels_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multilabel_inpatient_comments_14labels_en.md new file mode 100644 index 000000000000..c6118a85fa28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_multilabel_inpatient_comments_14labels_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from joniponi) +author: John Snow Labs +name: bert_classifier_multilabel_inpatient_comments_14labels +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted 
from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilabel_inpatient_comments_14labels` is an English model originally trained by `joniponi`. + +## Predicted Entities + +`food`, `financial`, `communication`, `medical`, `doctor`, `nurse`, `condition`, `rude`, `clean`, `bathroom`, `treatment`, `administration`, `environmental`, `extra_nice` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_multilabel_inpatient_comments_14labels_en_5.1.4_3.4_1698792060518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_multilabel_inpatient_comments_14labels_en_5.1.4_3.4_1698792060518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_multilabel_inpatient_comments_14labels","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_multilabel_inpatient_comments_14labels","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_joniponi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_multilabel_inpatient_comments_14labels| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/joniponi/multilabel_inpatient_comments_14labels \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_nateraw_base_uncased_imdb_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_nateraw_base_uncased_imdb_en.md new file mode 100644 index 000000000000..eb1e2b2b01d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_nateraw_base_uncased_imdb_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from nateraw) +author: John Snow Labs +name: bert_classifier_nateraw_base_uncased_imdb +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-imdb` is an English model originally trained by `nateraw`. 
+ +## Predicted Entities + +`NEGATIVE`, `POSITIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_nateraw_base_uncased_imdb_en_5.1.4_3.4_1698786961814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_nateraw_base_uncased_imdb_en_5.1.4_3.4_1698786961814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_nateraw_base_uncased_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_nateraw_base_uncased_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.imdb.uncased_base.by_nateraw").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_nateraw_base_uncased_imdb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nateraw/bert-base-uncased-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ni_model_8_19_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ni_model_8_19_en.md new file mode 100644 index 000000000000..805fbe46e1ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ni_model_8_19_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from aujer) +author: John Snow Labs +name: bert_classifier_ni_model_8_19 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ni_model_8_19` is an English model originally trained by `aujer`. 
+ +## Predicted Entities + +`MISCLASSIFICATION`, `ROLE_FIT`, `VISA`, `COMPENSATION`, `TIMING`, `OTHER`, `REMOTE_POLICY`, `MAKING_REFERRAL`, `COMPANY_FIT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_ni_model_8_19_en_5.1.4_3.4_1698793754195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_ni_model_8_19_en_5.1.4_3.4_1698793754195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ni_model_8_19","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_ni_model_8_19","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_aujer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_ni_model_8_19| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aujer/ni_model_8_19 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_cl_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_cl_en.md new file mode 100644 index 000000000000..9e9c58f5e9ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_cl_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from yuan1729) +author: John Snow Labs +name: bert_classifier_non_cl +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Non_CL` is an English model originally trained by `yuan1729`. 
+ +## Predicted Entities + +`個人資料保護法`, `竊盜犯贓物犯保安處分條例`, `洗錢防制法`, `商業會計法`, `性騷擾防治法`, `臺灣地區與大陸地區人民關係條例`, `廢棄物清理法`, `著作權法`, `藥事法`, `證券交易法`, `稅捐稽徵法`, `道路交通管理處罰條例`, `電信法`, `中華民國九十六年罪犯減刑條例`, `商標法`, `道路交通安全規則`, `道路交通標誌標線號誌設置規則`, `律師法`, `貪污治罪條例`, `家庭暴力防治法`, `公設辯護人條例`, `通訊保障及監察法`, `轉讓毒品加重其刑之數量標準`, `毒品危害防制條例`, `中華民國憲法`, `就業服務法`, `公司法`, `陸海空軍刑法`, `兒童及少年福利與權益保障法`, `戶籍法`, `兒童及少年性剝削防制條例`, `森林法`, `妨害兵役治罪條例`, `管制藥品管理條例`, `組織犯罪防制條例`, `公職人員選舉罷免法`, `懲治走私條例`, `職業安全衛生法`, `性侵害犯罪防治法`, `水土保持法`, `槍砲彈藥刀械管制條例`, `入出國及移民法`, `罰金罰鍰提高標準條例`, `民法`, `電子遊戲場業管理條例`, `銀行法`, `軍事審判法`, `區域計畫法`, `政府採購法` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_non_cl_en_5.1.4_3.4_1698794011354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_non_cl_en_5.1.4_3.4_1698794011354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_non_cl","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_non_cl","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_non_cl| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/yuan1729/Non_CL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_contextualized_hate_speech_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_contextualized_hate_speech_es.md new file mode 100644 index 000000000000..4c9908efd0f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_non_contextualized_hate_speech_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from finiteautomata) +author: John Snow Labs +name: bert_classifier_non_contextualized_hate_speech +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, es, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-non-contextualized-hate-speech-es` is a Spanish model originally trained by `finiteautomata`. 
+ +## Predicted Entities + +`Hateful`, `Not hateful` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_non_contextualized_hate_speech_es_5.1.4_3.4_1698794290718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_non_contextualized_hate_speech_es_5.1.4_3.4_1698794290718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_non_contextualized_hate_speech","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_non_contextualized_hate_speech","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.hate_non_contextualized.bert.by_finiteautomata").predict("""Amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_non_contextualized_hate_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/finiteautomata/bert-non-contextualized-hate-speech-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ogbv_gder_hindi_mlkorra_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ogbv_gder_hindi_mlkorra_en.md new file mode 100644 index 000000000000..5492b3d0122a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_ogbv_gder_hindi_mlkorra_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classifier_ogbv_gder_hindi_mlkorra BertForSequenceClassification from mlkorra +author: John Snow Labs +name: bert_classifier_ogbv_gder_hindi_mlkorra +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_ogbv_gder_hindi_mlkorra` is an English model originally trained by mlkorra. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_ogbv_gder_hindi_mlkorra_en_5.1.4_3.4_1698787203957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_ogbv_gder_hindi_mlkorra_en_5.1.4_3.4_1698787203957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_ogbv_gder_hindi_mlkorra","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_ogbv_gder_hindi_mlkorra","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_ogbv_gder_hindi_mlkorra| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/mlkorra/OGBV-gender-bert-hi-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pathology_meningioma_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pathology_meningioma_en.md new file mode 100644 index 000000000000..250966b4d7a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pathology_meningioma_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Santarabantoosoo) +author: John Snow Labs +name: bert_classifier_pathology_meningioma +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `PathologyBERT-meningioma` is an English model originally trained by `Santarabantoosoo`. 
+ +## Predicted Entities + +`No recurrence`, `Recurrence` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_pathology_meningioma_en_5.1.4_3.4_1698792264764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_pathology_meningioma_en_5.1.4_3.4_1698792264764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pathology_meningioma","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pathology_meningioma","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_santarabantoosoo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_pathology_meningioma| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|359.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Santarabantoosoo/PathologyBERT-meningioma \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_philschmid_tiny_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_philschmid_tiny_sst2_distilled_en.md new file mode 100644 index 000000000000..a15727b0cf6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_philschmid_tiny_sst2_distilled_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from philschmid) +author: John Snow Labs +name: bert_classifier_philschmid_tiny_sst2_distilled +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-distilled` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_philschmid_tiny_sst2_distilled_en_5.1.4_3.4_1698795523184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_philschmid_tiny_sst2_distilled_en_5.1.4_3.4_1698795523184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_philschmid_tiny_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_philschmid_tiny_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.distilled_tiny.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_philschmid_tiny_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/tiny-bert-sst2-distilled +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pro_cell_expert_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pro_cell_expert_en.md new file mode 100644 index 000000000000..f511b18d27a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_pro_cell_expert_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Mim) +author: John Snow Labs +name: bert_classifier_pro_cell_expert +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pro-cell-expert` is an English model originally trained by `Mim`. 
+ +## Predicted Entities + +`accept`, `reject` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_pro_cell_expert_en_5.1.4_3.4_1698787585761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_pro_cell_expert_en_5.1.4_3.4_1698787585761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pro_cell_expert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pro_cell_expert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_mim").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_pro_cell_expert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Mim/pro-cell-expert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_qs_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_qs_en.md new file mode 100644 index 000000000000..dc01763a6140 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_qs_en.md @@ -0,0 +1,104 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from asvs) +author: John Snow Labs +name: bert_classifier_qs +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qs-classifier` is an English model originally trained by `asvs`. + +## Predicted Entities + +`Statement`, `Question` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_qs_en_5.1.4_3.4_1698792431524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_qs_en_5.1.4_3.4_1698792431524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_qs","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_qs","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_asvs").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_qs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +References + +- https://huggingface.co/asvs/qs-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_regardv3_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_regardv3_en.md new file mode 100644 index 000000000000..5a5fc07965df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_regardv3_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: bert_classifier_regardv3 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `regardv3` is an English model originally trained by `sasha`. + +## Predicted Entities + +`positive`, `other`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_regardv3_en_5.1.4_3.4_1698794521812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_regardv3_en_5.1.4_3.4_1698794521812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_regardv3","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_regardv3","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_regardv3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/regardv3 +- https://github.com/ewsheng/controllable-nlg-biases \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_quality_tiny_ru.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_quality_tiny_ru.md new file mode 100644 index 000000000000..1822c5a9f5ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_quality_tiny_ru.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from tinkoff-ai) +author: John Snow Labs +name: bert_classifier_response_quality_tiny +date: 2023-10-31 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `response-quality-classifier-tiny` is a Russian model originally trained by `tinkoff-ai`. 
+ +## Predicted Entities + +`relevance`, `specificity` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_response_quality_tiny_ru_5.1.4_3.4_1698794690965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_response_quality_tiny_ru_5.1.4_3.4_1698794690965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_response_quality_tiny","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_response_quality_tiny","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_response_quality_tiny| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tinkoff-ai/response-quality-classifier-tiny +- https://github.com/egoriyaa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_toxicity_base_ru.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_toxicity_base_ru.md new file mode 100644 index 000000000000..cfcc688612d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_response_toxicity_base_ru.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Russian BertForSequenceClassification Base Cased model (from tinkoff-ai) +author: John Snow Labs +name: bert_classifier_response_toxicity_base +date: 2023-10-31 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `response-toxicity-classifier-base` is a Russian model originally trained by `tinkoff-ai`. 
+ +## Predicted Entities + +`ok`, `toxic`, `risks`, `severe_toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_response_toxicity_base_ru_5.1.4_3.4_1698794987636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_response_toxicity_base_ru_5.1.4_3.4_1698794987636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_response_toxicity_base","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_response_toxicity_base","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.base.by_tinkoff_ai").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_response_toxicity_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|611.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tinkoff-ai/response-toxicity-classifier-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_roberta_base_finetuned_jd_binary_chinese_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_roberta_base_finetuned_jd_binary_chinese_zh.md new file mode 100644 index 000000000000..fd80c2a069f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_roberta_base_finetuned_jd_binary_chinese_zh.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from uer) +author: John Snow Labs +name: bert_classifier_roberta_base_finetuned_jd_binary_chinese +date: 2023-10-31 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-jd-binary-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`positive (stars 4 and 5)` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_jd_binary_chinese_zh_5.1.4_3.4_1698787852119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_jd_binary_chinese_zh_5.1.4_3.4_1698787852119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_jd_binary_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_jd_binary_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.base_finetuned_jd_binary_chinese.by_uer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_roberta_base_finetuned_jd_binary_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|382.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-jd-binary-chinese +- https://arxiv.org/abs/1909.05658 +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/zhangxiangxiao/glyph +- https://arxiv.org/abs/1708.02657 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rotten_tomatoes_finetuned_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rotten_tomatoes_finetuned_en.md new file mode 100644 index 000000000000..4f2685f6b7e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rotten_tomatoes_finetuned_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from jboomc) +author: John Snow Labs +name: bert_classifier_rotten_tomatoes_finetuned +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rotten_tomatoes_finetuned` is an English model originally trained by `jboomc`. 
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rotten_tomatoes_finetuned_en_5.1.4_3.4_1698795682298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rotten_tomatoes_finetuned_en_5.1.4_3.4_1698795682298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_rotten_tomatoes_finetuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_rotten_tomatoes_finetuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.finetuned.by_jboomc").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rotten_tomatoes_finetuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|119.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jboomc/rotten_tomatoes_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_cased_sentiment_nepal_bhasa_ru.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_cased_sentiment_nepal_bhasa_ru.md new file mode 100644 index 000000000000..29dcfcb9b7d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_cased_sentiment_nepal_bhasa_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_classifier_rubert_base_cased_sentiment_nepal_bhasa BertForSequenceClassification from Tatyana +author: John Snow Labs +name: bert_classifier_rubert_base_cased_sentiment_nepal_bhasa +date: 2023-10-31 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classifier_rubert_base_cased_sentiment_nepal_bhasa` is a Russian model originally trained by Tatyana. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_cased_sentiment_nepal_bhasa_ru_5.1.4_3.4_1698788096595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_cased_sentiment_nepal_bhasa_ru_5.1.4_3.4_1698788096595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_cased_sentiment_nepal_bhasa","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_cased_sentiment_nepal_bhasa","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_base_cased_sentiment_nepal_bhasa| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/Tatyana/rubert-base-cased-sentiment-new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_corruption_detector_ru.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_corruption_detector_ru.md new file mode 100644 index 000000000000..2e963d1fd5fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_base_corruption_detector_ru.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Russian BertForSequenceClassification Base Cased model (from SkolkovoInstitute) +author: John Snow Labs +name: bert_classifier_rubert_base_corruption_detector +date: 2023-10-31 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-base-corruption-detector` is a Russian model originally trained by `SkolkovoInstitute`. 
+ +## Predicted Entities + +`unnatural`, `natural` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_corruption_detector_ru_5.1.4_3.4_1698795321565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_base_corruption_detector_ru_5.1.4_3.4_1698795321565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_corruption_detector","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_base_corruption_detector","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.base.by_skolkovoinstitute").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_base_corruption_detector| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/SkolkovoInstitute/rubert-base-corruption-detector +- https://www.kaggle.com/alexandersemiletov/toxic-russian-comments +- https://www.kaggle.com/blackmoon/russian-language-toxic-comments \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_tiny2_russian_emotion_detection_ru.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_tiny2_russian_emotion_detection_ru.md new file mode 100644 index 000000000000..daeab167582e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rubert_tiny2_russian_emotion_detection_ru.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Russian BertForSequenceClassification Tiny Cased model (from Aniemore) +author: John Snow Labs +name: bert_classifier_rubert_tiny2_russian_emotion_detection +date: 2023-10-31 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-tiny2-russian-emotion-detection` is a Russian model originally trained by `Aniemore`. 
+ +## Predicted Entities + +`disgust`, `sadness`, `fear`, `enthusiasm`, `anger`, `neutral`, `happiness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny2_russian_emotion_detection_ru_5.1.4_3.4_1698792621891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rubert_tiny2_russian_emotion_detection_ru_5.1.4_3.4_1698792621891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny2_russian_emotion_detection","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rubert_tiny2_russian_emotion_detection","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.tiny.by_aniemore").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rubert_tiny2_russian_emotion_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Aniemore/rubert-tiny2-russian-emotion-detection +- https://github.com/aniemore/Aniemore +- https://paperswithcode.com/sota?task=Multilabel+Text+Classification&dataset=CEDR+M7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rumor_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rumor_en.md new file mode 100644 index 000000000000..9ee756c76899 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_rumor_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from tristantristantristan) +author: John Snow Labs +name: bert_classifier_rumor +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rumor` is an English model originally trained by `tristantristantristan`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_rumor_en_5.1.4_3.4_1698788616085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_rumor_en_5.1.4_3.4_1698788616085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rumor","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_rumor","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_tristantristantristan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_rumor| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tristantristantristan/rumor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_savasy_turkish_text_classification_tr.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_savasy_turkish_text_classification_tr.md new file mode 100644 index 000000000000..1f1d1b7441ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_savasy_turkish_text_classification_tr.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from savasy) +author: John Snow Labs +name: bert_classifier_savasy_turkish_text_classification +date: 2023-10-31 +tags: [tr, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-turkish-text-classification` is a Turkish model originally trained by `savasy`. 
+ +## Predicted Entities + +`economy`, `world`, `culture`, `sport`, `politics`, `technology`, `health` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_savasy_turkish_text_classification_tr_5.1.4_3.4_1698796217401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_savasy_turkish_text_classification_tr_5.1.4_3.4_1698796217401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_savasy_turkish_text_classification","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_savasy_turkish_text_classification","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.by_savasy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_savasy_turkish_text_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/savasy/bert-turkish-text-classification +- https://github.com/stefan-it/turkish-bert +- https://www.kaggle.com/savasy/ttc4900 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sci_uncased_topics_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sci_uncased_topics_en.md new file mode 100644 index 000000000000..31f206f14328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sci_uncased_topics_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Uncased model (from oliverqq) +author: John Snow Labs +name: bert_classifier_sci_uncased_topics +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert-uncased-topics` is an English model originally trained by `oliverqq`. 
+ +## Predicted Entities + +`Engineering`, `Mathematics`, `Medicine`, `Psychology`, `Sociology`, `Computer science`, `Artificial intelligence`, `Economics` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sci_uncased_topics_en_5.1.4_3.4_1698788923680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sci_uncased_topics_en_5.1.4_3.4_1698788923680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sci_uncased_topics","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sci_uncased_topics","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sci_uncased_topics| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/oliverqq/scibert-uncased-topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sent_chineses_zh.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sent_chineses_zh.md new file mode 100644 index 000000000000..81c7e5b6a1e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sent_chineses_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from lgodwangl) +author: John Snow Labs +name: bert_classifier_sent_chineses +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, zh, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_chineses` is a Chinese model originally trained by `lgodwangl`. 
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sent_chineses_zh_5.1.4_3.4_1698795710044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sent_chineses_zh_5.1.4_3.4_1698795710044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sent_chineses","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sent_chineses","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.by_lgodwangl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sent_chineses| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lgodwangl/sent_chineses \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sentiment_tweets_tr.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sentiment_tweets_tr.md new file mode 100644 index 000000000000..271315e3d841 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sentiment_tweets_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from kullackaan) +author: John Snow Labs +name: bert_classifier_sentiment_tweets +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, tr, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment-tweets` is a Turkish model originally trained by `kullackaan`. 
+ +## Predicted Entities + +`Notr`, `Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sentiment_tweets_tr_5.1.4_3.4_1698793287344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sentiment_tweets_tr_5.1.4_3.4_1698793287344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentiment_tweets","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentiment_tweets","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.tweet_sentiment.").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sentiment_tweets| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kullackaan/sentiment-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_small_finetuned_glue_rte_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_small_finetuned_glue_rte_en.md new file mode 100644 index 000000000000..7b849cc99cae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_small_finetuned_glue_rte_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Small Cased model (from muhtasham) +author: John Snow Labs +name: bert_classifier_small_finetuned_glue_rte +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-glue-rte` is an English model originally trained by `muhtasham`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_small_finetuned_glue_rte_en_5.1.4_3.4_1698793463379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_small_finetuned_glue_rte_en_5.1.4_3.4_1698793463379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_small_finetuned_glue_rte","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_small_finetuned_glue_rte","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.small_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_small_finetuned_glue_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|107.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/muhtasham/bert-small-finetuned-glue-rte +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sundanese_base_emotion_su.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sundanese_base_emotion_su.md new file mode 100644 index 000000000000..0c31a0b2b583 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_sundanese_base_emotion_su.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Sundanese BertForSequenceClassification Base Cased model (from w11wo) +author: John Snow Labs +name: bert_classifier_sundanese_base_emotion +date: 2023-10-31 +tags: [su, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: su +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sundanese-bert-base-emotion-classifier` is a Sundanese model originally trained by `w11wo`. 
+ +## Predicted Entities + +`fear`, `anger`, `joy`, `sadness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sundanese_base_emotion_su_5.1.4_3.4_1698789425032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sundanese_base_emotion_su_5.1.4_3.4_1698789425032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_sundanese_base_emotion","su") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_sundanese_base_emotion","su") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("su.classify.bert.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sundanese_base_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|su| +|Size:|377.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/w11wo/sundanese-bert-base-emotion-classifier +- https://arxiv.org/abs/1810.04805 +- https://github.com/virgantara/sundanese-twitter-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tanglish_offensive_language_identification_xx.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tanglish_offensive_language_identification_xx.md new file mode 100644 index 000000000000..5d2dc0faee5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tanglish_offensive_language_identification_xx.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from seanbenhur) +author: John Snow Labs +name: bert_classifier_tanglish_offensive_language_identification +date: 2023-10-31 +tags: [en, ta, open_source, bert, sequence_classification, classification, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tanglish-offensive-language-identification` is a Multilingual model originally trained by `seanbenhur`. 
+ +## Predicted Entities + +`NOT-OFFENSIVE`, `OFFENSIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tanglish_offensive_language_identification_xx_5.1.4_3.4_1698789867761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tanglish_offensive_language_identification_xx_5.1.4_3.4_1698789867761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_tanglish_offensive_language_identification","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_tanglish_offensive_language_identification","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.lang").predict("""PUT YOUR STRING HERE""") +``` +
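The archive name in the download link above appears to follow the pattern `<model name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. This convention is inferred from the links on these cards and is not documented here, so the sketch below is an assumption-laden illustration. `parse_archive_name` and `ARCHIVE_PATTERN` are hypothetical names, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed naming convention, inferred from the download links on these cards:
# <model name>_<language>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
ARCHIVE_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d+)\.zip$"
)


def parse_archive_name(url: str) -> dict:
    """Split a model archive URL into its (assumed) components."""
    filename = url.rsplit("/", 1)[-1]
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    info = match.groupdict()
    # Assumption: the trailing number is a Unix timestamp in milliseconds.
    info["uploaded"] = datetime.fromtimestamp(
        int(info["millis"]) / 1000, tz=timezone.utc
    )
    return info


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "bert_classifier_tanglish_offensive_language_identification"
    "_xx_5.1.4_3.4_1698789867761.zip"
)
print(info["name"], info["lang"], info["sparknlp"], info["spark"], info["uploaded"].date())
```

If the pattern holds, the parsed language and Spark NLP version should agree with the `Language` and `Compatibility` rows of the Model Information table on the same card.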
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tanglish_offensive_language_identification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|889.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/seanbenhur/tanglish-offensive-language-identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_dynamic_pipeline_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_dynamic_pipeline_en.md new file mode 100644 index 000000000000..d53316710110 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_dynamic_pipeline_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sgugger) +author: John Snow Labs +name: bert_classifier_test_dynamic_pipeline +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test-dynamic-pipeline` is an English model originally trained by `sgugger`. 
+ +## Predicted Entities + +`equivalent`, `not equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_test_dynamic_pipeline_en_5.1.4_3.4_1698790148676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_test_dynamic_pipeline_en_5.1.4_3.4_1698790148676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_test_dynamic_pipeline","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_test_dynamic_pipeline","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_sgugger").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_test_dynamic_pipeline| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgugger/test-dynamic-pipeline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_model_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_model_en.md new file mode 100644 index 000000000000..010b438ef544 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_test_model_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from pwz98) +author: John Snow Labs +name: bert_classifier_test_model +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_model` is an English model originally trained by `pwz98`. 
+ +## Predicted Entities + +`软件工程`, `计算机架构`, `算法与数据结构`, `数据库`, `数学知识`, `编程语言与编译器`, `操作系统`, `计算机网络`, `人工智能` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_test_model_en_5.1.4_3.4_1698796206309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_test_model_en_5.1.4_3.4_1698796206309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_test_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_test_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_test_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|144.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pwz98/test_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_aug_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_aug_sst2_distilled_en.md new file mode 100644 index 000000000000..5a6d1724149e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_aug_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from moshew) +author: John Snow Labs +name: bert_classifier_tiny_aug_sst2_distilled +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-aug-sst2-distilled` is an English model originally trained by `moshew`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_aug_sst2_distilled_en_5.1.4_3.4_1698790298283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_aug_sst2_distilled_en_5.1.4_3.4_1698790298283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_aug_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_aug_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.distilled_tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_aug_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/moshew/tiny-bert-aug-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_best_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_best_en.md new file mode 100644 index 000000000000..7288f66e0c6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_best_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from nbhimte) +author: John Snow Labs +name: bert_classifier_tiny_best +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-best` is an English model originally trained by `nbhimte`. 
+ +## Predicted Entities + +`entailment`, `neutral`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_best_en_5.1.4_3.4_1698793682318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_best_en_5.1.4_3.4_1698793682318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_best","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_best","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_best| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nbhimte/tiny-bert-best \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_sst2_distilled_model_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_sst2_distilled_model_en.md new file mode 100644 index 000000000000..001694ac4e6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_tiny_sst2_distilled_model_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from gokuls) +author: John Snow Labs +name: bert_classifier_tiny_sst2_distilled_model +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-sst2-distilled-model` is an English model originally trained by `gokuls`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_distilled_model_en_5.1.4_3.4_1698793799310.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_sst2_distilled_model_en_5.1.4_3.4_1698793799310.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_distilled_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_sst2_distilled_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.ssts2.distilled_tiny.by_gokuls").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_sst2_distilled_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gokuls/tiny-bert-sst2-distilled-model +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_topic_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_topic_en.md new file mode 100644 index 000000000000..4be7b4cff959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_topic_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from foundkim) +author: John Snow Labs +name: bert_classifier_topic +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_classifier` is an English model originally trained by `foundkim`. 
+ +## Predicted Entities + +`symptômes`, `chiffres`, `divers`, `opinions`, `mesures` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_topic_en_5.1.4_3.4_1698794035986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_topic_en_5.1.4_3.4_1698794035986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_topic","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_topic","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_foundkim").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_topic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/foundkim/topic_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_toxicity_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_toxicity_en.md new file mode 100644 index 000000000000..541b32eed9f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_toxicity_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_classifier_toxicity +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicity-classifier` is an English model originally trained by `mohsenfayyaz`. + +## Predicted Entities + +`Non-Toxic`, `Toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_toxicity_en_5.1.4_3.4_1698796510237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_toxicity_en_5.1.4_3.4_1698796510237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_toxicity","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_toxicity","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_mohsenfayyaz").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_toxicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/toxicity-classifier +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02_nl.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02_nl.md new file mode 100644 index 000000000000..f2a5e961473b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from Jeska) +author: John Snow Labs +name: bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02 +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialog02` is a Dutch model originally trained by `Jeska`. 
+ +## Predicted Entities + +`faq_ask_taxi`, `faq_ask_twijfel_ivm_vaccinatie`, `faq_ask_naaldangst`, `faq_ask_positieve_test_na_vaccin`, `faq_ask_experimenteel`, `faq_ask_risicopatient`, `faq_ask_geen_uitnodiging`, `faq_ask_beschermingspercentage`, `faq_ask_vaccin_doorgeven`, `faq_ask_curevac`, `faq_ask_waarom`, `nlu_fallback`, `faq_ask_bijwerking_moderna`, `faq_ask_risicopatient_kanker`, `faq_ask_verschillen`, `faq_ask_keuze`, `faq_ask_huisarts`, `faq_ask_wie_doet_inenting`, `chitchat_ask_hi`, `faq_ask_algemeen_info`, `faq_ask_tijd_tot_tweede_dosis`, `faq_ask_twijfel_ontwikkeling`, `faq_ask_eerst_weigeren`, `faq_ask_hoe_weet_overheid`, `faq_ask_wanneer_iedereen_gevaccineerd`, `faq_ask_jong_en_gezond`, `faq_ask_mondmasker`, `faq_ask_privacy`, `faq_ask_derde_prik`, `faq_ask_moderna`, `faq_ask_vaccine_covid_gehad`, `faq_ask_betrouwbaar`, `faq_ask_hersenziekte`, `faq_ask_waarom_niet_verplicht`, `faq_ask_bijwerking_pfizer`, `faq_ask_buitenlander`, `chitchat_ask_bye`, `faq_ask_wie_ben_ik`, `faq_ask_quarantaine`, `faq_ask_wie_nu`, `faq_ask_beschermen`, `faq_ask_mantelzorger`, `faq_ask_testen`, `faq_ask_borstvoeding`, `faq_ask_afspraak_afzeggen`, `faq_ask_twijfel_effectiviteit`, `faq_ask_betalen_voor_vaccin`, `faq_ask_welk_vaccin_krijg_ik`, `faq_ask_vaccinatiecentrum`, `faq_ask_logistiek_veilig`, `faq_ask_aantal_gevaccineerd`, `faq_ask_tweede_dosis_vervroegen`, `faq_ask_corona_vermijden`, `faq_ask_info_vaccins`, `faq_ask_risicopatient_immuunziekte`, `faq_ask_in_vaccin`, `test`, `faq_ask_geen_risicopatient`, `faq_ask_twijfel_inhoud`, `faq_ask_keuze_vaccinatiecentrum`, `faq_ask_nadelen`, `faq_ask_astrazeneca_prik_2`, `faq_ask_twijfel_vrijheid`, `faq_ask_bijwerking_AZ`, `faq_ask_contra_ind`, `faq_ask_gestockeerd`, `faq_ask_wanneer_algemene_bevolking`, `faq_ask_wat_is_vaccin`, `faq_ask_waarom_twijfel`, `faq_ask_veelgestelde_vragen`, `faq_ask_gezondheidstoestand_gekend`, `faq_ask_risicopatient_diabetes`, `faq_ask_vrijwilliger`, `faq_ask_wat_is_corona`, `faq_ask_iedereen`, 
`chitchat_ask_hi_fr`, `faq_ask_nuchter`, `faq_ask_wat_na_vaccinatie`, `faq_ask_alternatieve_medicatie`, `faq_ask_bijwerking_algemeen`, `faq_ask_begeleiding`, `faq_ask_duur_vaccinatie`, `faq_ask_janssen`, `faq_ask_hoeveel_dosissen`, `faq_ask_hartspierontsteking`, `faq_ask_bijwerking_lange_termijn`, `faq_ask_dna`, `faq_ask_gif_in_vaccin`, `faq_ask_planning_eerstelijnszorg`, `faq_ask_reproductiegetal`, `chitchat_ask_thanks`, `faq_ask_problemen_uitnodiging`, `faq_ask_covid_door_vaccin`, `faq_ask_combi`, `faq_ask_tweede_dosis_afspraak`, `faq_ask_kosjer_halal`, `get_started`, `faq_ask_vrijwillig_Janssen`, `faq_ask_groepsimmuniteit`, `faq_ask_smaakverlies`, `faq_ask_astrazeneca_bloedklonters`, `faq_ask_complottheorie_Bill_Gates`, `faq_ask_ontwikkeling`, `faq_ask_vaccin_immuunsysteem`, `faq_ask_magnetisch`, `faq_ask_mrna_vs_andere_vaccins`, `faq_ask_test_voor_vaccin`, `faq_ask_betrouwbare_bronnen`, `faq_ask_astrazeneca`, `faq_ask_man_vrouw_verschillen`, `faq_ask_twijfel_bijwerkingen`, `faq_ask_eerste_prik_buitenland`, `faq_ask_sneller_aan_de_beurt`, `faq_ask_complottheorie_5G`, `faq_ask_leveringen`, `faq_ask_essentieel_beroep`, `faq_ask_geen_antwoord`, `faq_ask_twijfel_vaccins_zelf`, `faq_ask_waarom_twee_prikken`, `faq_ask_andere_vaccins`, `faq_ask_beschermingsduur`, `faq_ask_complottheorie`, `faq_ask_uit_flacon`, `faq_ask_qvax_probleem`, `faq_ask_waar_en_wanneer`, `faq_ask_onvruchtbaar`, `faq_ask_janssen_een_dosis`, `chitchat_ask_hoe_gaat_het`, `faq_ask_probleem_registratie`, `faq_ask_kinderen`, `faq_ask_trage_start`, `faq_ask_timing_andere_vaccins`, `faq_ask_uitnodiging_na_vaccinatie`, `faq_ask_snel_ontwikkeld`, `faq_ask_vakantie`, `faq_ask_foetus`, `faq_ask_risicopatient_luchtwegaandoening`, `faq_ask_bijwerking_JJ`, `faq_ask_risicopatient_hartvaat`, `faq_ask_afspraak_gemist`, `faq_ask_meer_bijwerkingen_tweede_dosis`, `faq_ask_zwanger`, `faq_ask_pijnstiller`, `faq_ask_verplicht`, `faq_ask_autisme_na_vaccinatie`, `faq_ask_chronisch_ziek`, `faq_ask_wilsonbekwaam`, 
`faq_ask_vaccin_variant`, `faq_ask_auto-immuun`, `faq_ask_besmetten_na_vaccin`, `faq_ask_huisdieren`, `faq_ask_prioritaire_gropen`, `faq_ask_maximaal_een_dosis`, `faq_ask_goedkeuring`, `faq_ask_wie_is_risicopatient`, `faq_ask_pfizer`, `faq_ask_bijsluiter`, `faq_ask_corona_is_griep`, `faq_ask_welke_vaccin`, `faq_ask_vaccine_covid_gehad_effect`, `faq_ask_waarom_ouderen_eerst`, `faq_ask_vegan`, `faq_ask_bloed_geven`, `faq_ask_oplopen_vaccinatie`, `faq_ask_minder_mobiel`, `faq_ask_hoe_dodelijk`, `chitchat_ask_hi_en`, `faq_ask_logistiek`, `faq_ask_attest`, `chitchat_ask_hi_de`, `faq_ask_astrazeneca_bij_ouderen`, `faq_ask_planning_ouderen`, `faq_ask_motiveren`, `faq_ask_uitnodiging_afspraak_kwijt`, `chitchat_ask_name`, `faq_ask_phishing`, `faq_ask_twijfel_praktisch`, `faq_ask_wat_is_rna`, `faq_ask_aantal_gevaccineerd_wereldwijd`, `faq_ask_allergisch_na_vaccinatie`, `faq_ask_twijfel_noodzaak` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02_nl_5.1.4_3.4_1698790575435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02_nl_5.1.4_3.4_1698790575435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.frombertje2_dadialog02.by_jeska").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialog02| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jeska/VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialog02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly_nl.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly_nl.md new file mode 100644 index 000000000000..894e85171def --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from Jeska) +author: John Snow Labs +name: bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialogQonly` is a Dutch model originally trained by `Jeska`. 
+ +## Predicted Entities + +`faq_ask_taxi`, `faq_ask_twijfel_ivm_vaccinatie`, `faq_ask_naaldangst`, `faq_ask_positieve_test_na_vaccin`, `faq_ask_experimenteel`, `faq_ask_risicopatient`, `faq_ask_geen_uitnodiging`, `faq_ask_beschermingspercentage`, `faq_ask_vaccin_doorgeven`, `faq_ask_curevac`, `faq_ask_waarom`, `nlu_fallback`, `faq_ask_bijwerking_moderna`, `faq_ask_risicopatient_kanker`, `faq_ask_verschillen`, `faq_ask_keuze`, `faq_ask_huisarts`, `faq_ask_wie_doet_inenting`, `chitchat_ask_hi`, `faq_ask_algemeen_info`, `faq_ask_tijd_tot_tweede_dosis`, `faq_ask_twijfel_ontwikkeling`, `faq_ask_eerst_weigeren`, `faq_ask_hoe_weet_overheid`, `faq_ask_wanneer_iedereen_gevaccineerd`, `faq_ask_jong_en_gezond`, `faq_ask_mondmasker`, `faq_ask_privacy`, `faq_ask_derde_prik`, `faq_ask_moderna`, `faq_ask_vaccine_covid_gehad`, `faq_ask_betrouwbaar`, `faq_ask_hersenziekte`, `faq_ask_waarom_niet_verplicht`, `faq_ask_bijwerking_pfizer`, `faq_ask_buitenlander`, `chitchat_ask_bye`, `faq_ask_wie_ben_ik`, `faq_ask_quarantaine`, `faq_ask_wie_nu`, `faq_ask_beschermen`, `faq_ask_mantelzorger`, `faq_ask_testen`, `faq_ask_borstvoeding`, `faq_ask_afspraak_afzeggen`, `faq_ask_twijfel_effectiviteit`, `faq_ask_betalen_voor_vaccin`, `faq_ask_welk_vaccin_krijg_ik`, `faq_ask_vaccinatiecentrum`, `faq_ask_logistiek_veilig`, `faq_ask_aantal_gevaccineerd`, `faq_ask_tweede_dosis_vervroegen`, `faq_ask_corona_vermijden`, `faq_ask_info_vaccins`, `faq_ask_risicopatient_immuunziekte`, `faq_ask_in_vaccin`, `test`, `faq_ask_geen_risicopatient`, `faq_ask_twijfel_inhoud`, `faq_ask_keuze_vaccinatiecentrum`, `faq_ask_nadelen`, `faq_ask_astrazeneca_prik_2`, `faq_ask_twijfel_vrijheid`, `faq_ask_bijwerking_AZ`, `faq_ask_contra_ind`, `faq_ask_gestockeerd`, `faq_ask_wanneer_algemene_bevolking`, `faq_ask_wat_is_vaccin`, `faq_ask_waarom_twijfel`, `faq_ask_veelgestelde_vragen`, `faq_ask_gezondheidstoestand_gekend`, `faq_ask_risicopatient_diabetes`, `faq_ask_vrijwilliger`, `faq_ask_wat_is_corona`, `faq_ask_iedereen`, 
`chitchat_ask_hi_fr`, `faq_ask_nuchter`, `faq_ask_wat_na_vaccinatie`, `faq_ask_alternatieve_medicatie`, `faq_ask_bijwerking_algemeen`, `faq_ask_begeleiding`, `faq_ask_duur_vaccinatie`, `faq_ask_janssen`, `faq_ask_hoeveel_dosissen`, `faq_ask_hartspierontsteking`, `faq_ask_bijwerking_lange_termijn`, `faq_ask_dna`, `faq_ask_gif_in_vaccin`, `faq_ask_planning_eerstelijnszorg`, `faq_ask_reproductiegetal`, `chitchat_ask_thanks`, `faq_ask_problemen_uitnodiging`, `faq_ask_covid_door_vaccin`, `faq_ask_combi`, `faq_ask_tweede_dosis_afspraak`, `faq_ask_kosjer_halal`, `get_started`, `faq_ask_vrijwillig_Janssen`, `faq_ask_groepsimmuniteit`, `faq_ask_smaakverlies`, `faq_ask_astrazeneca_bloedklonters`, `faq_ask_complottheorie_Bill_Gates`, `faq_ask_ontwikkeling`, `faq_ask_vaccin_immuunsysteem`, `faq_ask_magnetisch`, `faq_ask_mrna_vs_andere_vaccins`, `faq_ask_test_voor_vaccin`, `faq_ask_betrouwbare_bronnen`, `faq_ask_astrazeneca`, `faq_ask_man_vrouw_verschillen`, `faq_ask_twijfel_bijwerkingen`, `faq_ask_eerste_prik_buitenland`, `faq_ask_sneller_aan_de_beurt`, `faq_ask_complottheorie_5G`, `faq_ask_leveringen`, `faq_ask_essentieel_beroep`, `faq_ask_geen_antwoord`, `faq_ask_twijfel_vaccins_zelf`, `faq_ask_waarom_twee_prikken`, `faq_ask_andere_vaccins`, `faq_ask_beschermingsduur`, `faq_ask_complottheorie`, `faq_ask_uit_flacon`, `faq_ask_qvax_probleem`, `faq_ask_waar_en_wanneer`, `faq_ask_onvruchtbaar`, `faq_ask_janssen_een_dosis`, `chitchat_ask_hoe_gaat_het`, `faq_ask_probleem_registratie`, `faq_ask_kinderen`, `faq_ask_trage_start`, `faq_ask_timing_andere_vaccins`, `faq_ask_uitnodiging_na_vaccinatie`, `faq_ask_snel_ontwikkeld`, `faq_ask_vakantie`, `faq_ask_foetus`, `faq_ask_risicopatient_luchtwegaandoening`, `faq_ask_bijwerking_JJ`, `faq_ask_risicopatient_hartvaat`, `faq_ask_afspraak_gemist`, `faq_ask_meer_bijwerkingen_tweede_dosis`, `faq_ask_zwanger`, `faq_ask_pijnstiller`, `faq_ask_verplicht`, `faq_ask_autisme_na_vaccinatie`, `faq_ask_chronisch_ziek`, `faq_ask_wilsonbekwaam`, 
`faq_ask_vaccin_variant`, `faq_ask_auto-immuun`, `faq_ask_besmetten_na_vaccin`, `faq_ask_huisdieren`, `faq_ask_prioritaire_gropen`, `faq_ask_maximaal_een_dosis`, `faq_ask_goedkeuring`, `faq_ask_wie_is_risicopatient`, `faq_ask_pfizer`, `faq_ask_bijsluiter`, `faq_ask_corona_is_griep`, `faq_ask_welke_vaccin`, `faq_ask_vaccine_covid_gehad_effect`, `faq_ask_waarom_ouderen_eerst`, `faq_ask_vegan`, `faq_ask_bloed_geven`, `faq_ask_oplopen_vaccinatie`, `faq_ask_minder_mobiel`, `faq_ask_hoe_dodelijk`, `chitchat_ask_hi_en`, `faq_ask_logistiek`, `faq_ask_attest`, `chitchat_ask_hi_de`, `faq_ask_astrazeneca_bij_ouderen`, `faq_ask_planning_ouderen`, `faq_ask_motiveren`, `faq_ask_uitnodiging_afspraak_kwijt`, `chitchat_ask_name`, `faq_ask_phishing`, `faq_ask_twijfel_praktisch`, `faq_ask_wat_is_rna`, `faq_ask_aantal_gevaccineerd_wereldwijd`, `faq_ask_allergisch_na_vaccinatie`, `faq_ask_twijfel_noodzaak` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly_nl_5.1.4_3.4_1698796783546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly_nl_5.1.4_3.4_1698796783546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.frombertje2_dadialogqonly.by_jeska").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jeska/VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialogQonly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vijaygoriya_test_trainer_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vijaygoriya_test_trainer_en.md new file mode 100644 index 000000000000..9f268f899bd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_vijaygoriya_test_trainer_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from vijaygoriya) +author: John Snow Labs +name: bert_classifier_vijaygoriya_test_trainer +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer` is an English model originally trained by `vijaygoriya`.
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vijaygoriya_test_trainer_en_5.1.4_3.4_1698790927775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vijaygoriya_test_trainer_en_5.1.4_3.4_1698790927775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vijaygoriya_test_trainer","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vijaygoriya_test_trainer","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_vijaygoriya").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vijaygoriya_test_trainer| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/vijaygoriya/test_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_xtremedistil_l6_h384_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_xtremedistil_l6_h384_emotion_en.md new file mode 100644 index 000000000000..8d27740939c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_classifier_xtremedistil_l6_h384_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bergum) +author: John Snow Labs +name: bert_classifier_xtremedistil_l6_h384_emotion +date: 2023-10-31 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h384-emotion` is an English model originally trained by `bergum`.
+ +## Predicted Entities + +`anger`, `sadness`, `fear`, `joy`, `love`, `surprise` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_l6_h384_emotion_en_5.1.4_3.4_1698794162521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_l6_h384_emotion_en_5.1.4_3.4_1698794162521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l6_h384_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l6_h384_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.xtremedistiled.by_bergum").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_xtremedistil_l6_h384_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bergum/xtremedistil-l6-h384-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_english_emotion_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_english_emotion_analysis_en.md new file mode 100644 index 000000000000..7fc2f42e55a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_english_emotion_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_english_emotion_analysis BertForSequenceClassification from mariogiordano +author: John Snow Labs +name: bert_english_emotion_analysis +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_english_emotion_analysis` is an English model originally trained by mariogiordano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_english_emotion_analysis_en_5.1.4_3.4_1698782860532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_english_emotion_analysis_en_5.1.4_3.4_1698782860532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_english_emotion_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_english_emotion_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_english_emotion_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/mariogiordano/Bert-english-emotion-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_large_uncased_finetuned_stsb_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_large_uncased_finetuned_stsb_en.md new file mode 100644 index 000000000000..ff38c61b2ccd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_large_uncased_finetuned_stsb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_finetuned_stsb BertForSequenceClassification from SarielSinLuo +author: John Snow Labs +name: bert_large_uncased_finetuned_stsb +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_finetuned_stsb` is an English model originally trained by SarielSinLuo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_stsb_en_5.1.4_3.4_1698785398041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_finetuned_stsb_en_5.1.4_3.4_1698785398041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_stsb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_finetuned_stsb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_finetuned_stsb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/SarielSinLuo/bert-large-uncased-finetuned-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_seq_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_seq_en.md new file mode 100644 index 000000000000..42a1b6097499 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_seq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_seq BertForSequenceClassification from shengqin +author: John Snow Labs +name: bert_seq +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_seq` is an English model originally trained by shengqin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_seq_en_5.1.4_3.4_1698782701750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_seq_en_5.1.4_3.4_1698782701750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_seq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_seq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_seq| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/shengqin/bert-seq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_analytical_da.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_analytical_da.md new file mode 100644 index 000000000000..1668739c02d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_analytical_da.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Danish BertForSequenceClassification Cased model (from pin) +author: John Snow Labs +name: bert_sequence_classifier_analytical +date: 2023-10-31 +tags: [da, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `analytical` is a Danish model originally trained by `pin`. + +## Predicted Entities + +`objektivt`, `subjektivt` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_analytical_da_5.1.4_3.4_1698791263558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_analytical_da_5.1.4_3.4_1698791263558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_analytical","da") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_analytical","da") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_analytical| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/pin/analytical +- https://github.com/alexandrainst +- https://github.com/ebanalyse/senda +- https://github.com/alexandrainst/danlp/blob/master/docs/docs/datasets.md#twitter-sentiment +- https://github.com/ebanalyse/senda \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717_en.md new file mode 100644 index 000000000000..f7bf4a414426 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from adrianmoses) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-auto-nlp-lyrics-classification-19333717` is an English model originally trained by `adrianmoses`.
+ +## Predicted Entities + +`Heavy Metal`, `Pop`, `Dance`, `Indie`, `Hip Hop`, `Rock` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717_en_5.1.4_3.4_1698791899714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717_en_5.1.4_3.4_1698791899714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_auto_nlp_lyrics_classification_19333717| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/adrianmoses/autonlp-auto-nlp-lyrics-classification-19333717 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_college_classification_164469_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_college_classification_164469_en.md new file mode 100644 index 000000000000..cb4bd53e89cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_college_classification_164469_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Shuvam) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_college_classification_164469 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-college_classification-164469` is an English model originally trained by `Shuvam`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_college_classification_164469_en_5.1.4_3.4_1698794470358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_college_classification_164469_en_5.1.4_3.4_1698794470358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_college_classification_164469","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_college_classification_164469","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_college_classification_164469| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Shuvam/autonlp-college_classification-164469 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_creator_classifications_4021083_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_creator_classifications_4021083_en.md new file mode 100644 index 000000000000..6be622746c4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_creator_classifications_4021083_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from idrimadrid) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_creator_classifications_4021083 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-creator_classifications-4021083` is an English model originally trained by `idrimadrid`. 
+ +## Predicted Entities + +`NBC - Heroes`, `Mattel`, `Icon Comics`, `South Park`, `Dreamworks`, `Shueisha`, `Star Trek`, `Disney`, `Microsoft`, `Image Comics`, `SyFy`, `Hanna-Barbera`, `Lego`, `George Lucas`, `Capcom`, `Ubisoft`, `HarperCollins`, `Namco`, `Hasbro`, `IDW Publishing`, `Cartoon Network`, `George R. R. Martin`, `Universal Studios`, `Sony Pictures`, `Marvel Comics`, `Sega`, `Wildstorm`, `Nintendo`, `DC Comics`, `Clive Barker`, `Matt Groening`, `ABC Studios`, `J. R. R. Tolkien`, `Stephen King`, `Ian Fleming`, `Konami`, `J. K. Rowling`, `Mortal Kombat`, `Dark Horse Comics`, `Blizzard Entertainment`, `Team Epic TV` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_creator_classifications_4021083_en_5.1.4_3.4_1698794748203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_creator_classifications_4021083_en_5.1.4_3.4_1698794748203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_creator_classifications_4021083","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_creator_classifications_4021083","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_creator_classifications_4021083| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/idrimadrid/autonlp-creator_classifications-4021083 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_emotion_14722565_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_emotion_14722565_en.md new file mode 100644 index 000000000000..2cbbee1962cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_emotion_14722565_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from serenay) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_emotion_14722565 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-Emotion-14722565` is an English model originally trained by `serenay`. 
+ +## Predicted Entities + +`joy`, `optimism`, `sadness`, `anger` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_emotion_14722565_en_5.1.4_3.4_1698792177657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_emotion_14722565_en_5.1.4_3.4_1698792177657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_emotion_14722565","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_emotion_14722565","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_emotion_14722565| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/serenay/autonlp-Emotion-14722565 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_classification_596216804_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_classification_596216804_en.md new file mode 100644 index 000000000000..c91a06bf47b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_classification_596216804_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sarahlmk) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_imdb_classification_596216804 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-classification-596216804` is an English model originally trained by `sarahlmk`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_classification_596216804_en_5.1.4_3.4_1698792701918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_classification_596216804_en_5.1.4_3.4_1698792701918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_classification_596216804","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_classification_596216804","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_imdb_classification_596216804| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sarahlmk/autonlp-imdb-classification-596216804 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_eval_71421_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_eval_71421_en.md new file mode 100644 index 000000000000..3ee7e3383fad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_imdb_eval_71421_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_imdb_eval_71421 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb_eval-71421` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_eval_71421_en_5.1.4_3.4_1698792990062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_eval_71421_en_5.1.4_3.4_1698792990062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_eval_71421","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_eval_71421","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_imdb_eval_71421| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-imdb_eval-71421 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_kpmg_nlp_18833547_ar.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_kpmg_nlp_18833547_ar.md new file mode 100644 index 000000000000..42ab4b7c59ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_kpmg_nlp_18833547_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from adelgasmi) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_kpmg_nlp_18833547 +date: 2023-10-31 +tags: [ar, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-kpmg_nlp-18833547` is an Arabic model originally trained by `adelgasmi`. 
+ +## Predicted Entities + +`0`, `4`, `2`, `3`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_kpmg_nlp_18833547_ar_5.1.4_3.4_1698793343628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_kpmg_nlp_18833547_ar_5.1.4_3.4_1698793343628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_kpmg_nlp_18833547","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_kpmg_nlp_18833547","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_kpmg_nlp_18833547| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|506.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/adelgasmi/autonlp-kpmg_nlp-18833547 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927_en.md new file mode 100644 index 000000000000..73a5f2d644ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from akilesh96) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-mrcooper_text_classification-529614927` is an English model originally trained by `akilesh96`. 
+ +## Predicted Entities + +`Heavy Emotion`, `Love`, `Animals`, `Compliment`, `Joke`, `Self`, `Education`, `Religion`, `Politics`, `Health`, `Science` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927_en_5.1.4_3.4_1698795122852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927_en_5.1.4_3.4_1698795122852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_mrcooper_text_classification_529614927| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/akilesh96/autonlp-mrcooper_text_classification-529614927 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_song_lyrics_18753417_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_song_lyrics_18753417_en.md new file mode 100644 index 000000000000..7e94abcac5bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_song_lyrics_18753417_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from juliensimon) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_song_lyrics_18753417 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-song-lyrics-18753417` is an English model originally trained by `juliensimon`. 
+ +## Predicted Entities + +`Heavy Metal`, `Pop`, `Dance`, `Indie`, `Hip Hop`, `Rock` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_song_lyrics_18753417_en_5.1.4_3.4_1698795400270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_song_lyrics_18753417_en_5.1.4_3.4_1698795400270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_song_lyrics_18753417","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_song_lyrics_18753417","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_song_lyrics_18753417| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/juliensimon/autonlp-song-lyrics-18753417 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_spanish_songs_202661_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_spanish_songs_202661_es.md new file mode 100644 index 000000000000..ce286ae88de4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autonlp_spanish_songs_202661_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from hectorcotelo) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_spanish_songs_202661 +date: 2023-10-31 +tags: [es, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-spanish_songs-202661` is a Spanish model originally trained by `hectorcotelo`. 
+ +## Predicted Entities + +`average`, `good`, `hit`, `bad`, `worst` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_spanish_songs_202661_es_5.1.4_3.4_1698793649824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_spanish_songs_202661_es_5.1.4_3.4_1698793649824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_spanish_songs_202661","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_spanish_songs_202661","es") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_spanish_songs_202661| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hectorcotelo/autonlp-spanish_songs-202661 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autotrain_formality_1026434913_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autotrain_formality_1026434913_en.md new file mode 100644 index 000000000000..b846ee13b4d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_autotrain_formality_1026434913_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from 404E) +author: John Snow Labs +name: bert_sequence_classifier_autotrain_formality_1026434913 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-formality-1026434913` is an English model originally trained by `404E`.
+ +## Predicted Entities + +`target` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_formality_1026434913_en_5.1.4_3.4_1698795703107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_formality_1026434913_en_5.1.4_3.4_1698795703107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_formality_1026434913","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_formality_1026434913","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autotrain_formality_1026434913| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/404E/autotrain-formality-1026434913 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_banking77_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_banking77_en.md new file mode 100644 index 000000000000..c6e3e6a0b35a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_banking77_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from philschmid) +author: John Snow Labs +name: bert_sequence_classifier_banking77 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERT-Banking77` is an English model originally trained by `philschmid`.
+ +## Predicted Entities + +`get_disposable_virtual_card`, `declined_card_payment`, `fiat_currency_support`, `apple_pay_or_google_pay`, `atm_support`, `failed_transfer`, `Refund_not_showing_up`, `wrong_amount_of_cash_received`, `getting_virtual_card`, `verify_my_identity`, `top_up_by_cash_or_cheque`, `top_up_by_bank_transfer_charge`, `balance_not_updated_after_cheque_or_cash_deposit`, `visa_or_mastercard`, `cash_withdrawal_charge`, `pending_top_up`, `country_support`, `contactless_not_working`, `transfer_not_received_by_recipient`, `card_arrival`, `top_up_failed`, `balance_not_updated_after_bank_transfer`, `topping_up_by_card`, `card_acceptance`, `order_physical_card`, `pending_card_payment`, `exchange_charge`, `extra_charge_on_statement`, `verify_top_up`, `card_swallowed`, `card_delivery_estimate`, `top_up_by_card_charge`, `exchange_rate`, `activate_my_card`, `card_payment_wrong_exchange_rate`, `passcode_forgotten`, `supported_cards_and_currencies`, `why_verify_identity`, `verify_source_of_funds`, `card_payment_fee_charged`, `change_pin`, `top_up_reverted`, `virtual_card_not_working`, `declined_cash_withdrawal`, `reverted_card_payment?`, `transfer_fee_charged`, `card_payment_not_recognised`, `card_not_working`, `beneficiary_not_allowed`, `exchange_via_app`, `automatic_top_up`, `lost_or_stolen_card`, `card_about_to_expire`, `pin_blocked`, `card_linking`, `direct_debit_payment_not_recognised`, `compromised_card`, `request_refund`, `wrong_exchange_rate_for_cash_withdrawal`, `transfer_into_account`, `declined_transfer`, `cash_withdrawal_not_recognised`, `get_physical_card`, `edit_personal_details`, `unable_to_verify_identity`, `terminate_account`, `transfer_timing`, `top_up_limits`, `pending_cash_withdrawal`, `disposable_card_limits`, `getting_spare_card`, `lost_or_stolen_phone`, `pending_transfer`, `receiving_money`, `cancel_transfer`, `age_limit`, `transaction_charged_twice` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_banking77_en_5.1.4_3.4_1698794067301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_banking77_en_5.1.4_3.4_1698794067301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_banking77","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_banking77","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_banking77| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/philschmid/BERT-Banking77 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=BANKING77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_mnli_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_mnli_en.md new file mode 100644 index 000000000000..9749ae300454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_mnli_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_mnli +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-mnli` is an English model originally trained by `gchhablani`.
+ +## Predicted Entities + +`neutral`, `contradiction`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_mnli_en_5.1.4_3.4_1698796023024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_mnli_en_5.1.4_3.4_1698796023024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_qqp_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_qqp_en.md new file mode 100644 index 000000000000..24d3a420437b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_cased_finetuned_qqp_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_qqp +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-qqp` is an English model originally trained by `gchhablani`.
+ +## Predicted Entities + +`duplicate`, `not_duplicate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_qqp_en_5.1.4_3.4_1698794368464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_qqp_en_5.1.4_3.4_1698794368464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_qqp","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_qqp","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_qqp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+QQP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_italian_cased_sentiment_it.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_italian_cased_sentiment_it.md new file mode 100644 index 000000000000..4d55d30f9cea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_italian_cased_sentiment_it.md @@ -0,0 +1,103 @@ +--- +layout: model +title: Italian BertForSequenceClassification Base Cased model (from neuraly) +author: John Snow Labs +name: bert_sequence_classifier_base_italian_cased_sentiment +date: 2023-10-31 +tags: [it, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-italian-cased-sentiment` is an Italian model originally trained by `neuraly`.
+ +## Predicted Entities + +`positive`, `negative`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_italian_cased_sentiment_it_5.1.4_3.4_1698794660664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_italian_cased_sentiment_it_5.1.4_3.4_1698794660664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_italian_cased_sentiment","it") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_italian_cased_sentiment","it") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_italian_cased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|414.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/neuraly/bert-base-italian-cased-sentiment +- http://www.di.unito.it/~tutreeb/sentipolc-evalita16/data.html +- https://neuraly.ai +- https://neuraly.ai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_uncased_mrpc_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_uncased_mrpc_en.md new file mode 100644 index 000000000000..aa29044ad670 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_base_uncased_mrpc_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from Intel) +author: John Snow Labs +name: bert_sequence_classifier_base_uncased_mrpc +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-mrpc` is an English model originally trained by `Intel`.
+ +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_mrpc_en_5.1.4_3.4_1698794922462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_mrpc_en_5.1.4_3.4_1698794922462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_uncased_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Intel/bert-base-uncased-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_beto_emotion_analysis_es.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_beto_emotion_analysis_es.md new file mode 100644 index 000000000000..8a92b22d3b57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_beto_emotion_analysis_es.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from finiteautomata) +author: John Snow Labs +name: bert_sequence_classifier_beto_emotion_analysis +date: 2023-10-31 +tags: [es, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-emotion-analysis` is a Spanish model originally trained by `finiteautomata`. 
+ +## Predicted Entities + +`anger`, `joy`, `surprise`, `disgust`, `sadness`, `others`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_beto_emotion_analysis_es_5.1.4_3.4_1698795194176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_beto_emotion_analysis_es_5.1.4_3.4_1698795194176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_beto_emotion_analysis","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_beto_emotion_analysis","es") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_beto_emotion_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/finiteautomata/beto-emotion-analysis +- https://github.com/finiteautomata/pysentimiento/ +- https://github.com/dccuchile/beto +- http://tass.sepln.org/tass_data/download.php +- https://arxiv.org/abs/2106.09462 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_fin_tone_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_fin_tone_en.md new file mode 100644 index 000000000000..4c6e9add286a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_fin_tone_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from yiyanghkust) +author: John Snow Labs +name: bert_sequence_classifier_fin_tone +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert-tone` is an English model originally trained by `yiyanghkust`.
+ +## Predicted Entities + +`Negative`, `Neutral`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_fin_tone_en_5.1.4_3.4_1698796557463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_fin_tone_en_5.1.4_3.4_1698796557463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_fin_tone","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_fin_tone","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_fin_tone| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yiyanghkust/finbert-tone +- https://github.com/yya518/FinBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_industry_classification_api_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_industry_classification_api_en.md new file mode 100644 index 000000000000..1837ba163e28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_industry_classification_api_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sampathkethineedi) +author: John Snow Labs +name: bert_sequence_classifier_industry_classification_api +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `industry-classification-api` is an English model originally trained by `sampathkethineedi`.
+ +## Predicted Entities + +`Electronic Equipment & Instruments`, `Commodity Chemicals`, `Leisure Products`, `Health Care Services`, `Regional Banks`, `Life Sciences Tools & Services`, `Industrial Machinery`, `Gold`, `Investment Banking & Brokerage`, `Restaurants`, `Packaged Foods & Meats`, `IT Consulting & Other Services`, `Oil & Gas Equipment & Services`, `Health Care Supplies`, `Aerospace & Defense`, `Human Resource & Employment Services`, `Application Software`, `Property & Casualty Insurance`, `Movies & Entertainment`, `Oil & Gas Storage & Transportation`, `Apparel Retail`, `Electrical Components & Equipment`, `Consumer Finance`, `Construction Machinery & Heavy Trucks`, `Advertising`, `Casinos & Gaming`, `Construction & Engineering`, `Systems Software`, `Auto Parts & Equipment`, `Data Processing & Outsourced Services`, `Specialty Stores`, `Research & Consulting Services`, `Oil & Gas Exploration & Production`, `Pharmaceuticals`, `Interactive Media & Services`, `Homebuilding`, `Building Products`, `Personal Products`, `Electric Utilities`, `Communications Equipment`, `Trading Companies & Distributors`, `Health Care Equipment`, `Semiconductors`, `Internet & Direct Marketing Retail`, `Environmental & Facilities Services`, `Thrifts & Mortgage Finance`, `Diversified Metals & Mining`, `Oil & Gas Refining & Marketing`, `Steel`, `Diversified Support Services`, `Technology Distributors`, `Health Care Facilities`, `Health Care Technology`, `Biotechnology`, `Integrated Telecommunication Services`, `Real Estate Operating Companies`, `Internet Services & Infrastructure`, `Asset Management & Custody Banks`, `Specialty Chemicals` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_industry_classification_api_en_5.1.4_3.4_1698795567583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 
URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_industry_classification_api_en_5.1.4_3.4_1698795567583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_industry_classification_api","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_industry_classification_api","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_industry_classification_api| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sampathkethineedi/industry-classification-api \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_lro_v1.0.0_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_lro_v1.0.0_en.md new file mode 100644 index 000000000000..a29227474f52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_lro_v1.0.0_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from PhucLe) +author: John Snow Labs +name: bert_sequence_classifier_lro_v1.0.0 +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `LRO_v1.0.0` is an English model originally trained by `PhucLe`. 
+ +## Predicted Entities + +`resident`, `lead`, `other` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_lro_v1.0.0_en_5.1.4_3.4_1698796117139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_lro_v1.0.0_en_5.1.4_3.4_1698796117139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_lro_v1.0.0","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_lro_v1.0.0","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_lro_v1.0.0| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/PhucLe/LRO_v1.0.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_binary_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_binary_en.md new file mode 100644 index 000000000000..3ccbdd488810 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_binary_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from MoritzLaurer) +author: John Snow Labs +name: bert_sequence_classifier_minilm_l6_mnli_binary +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L6-mnli-binary` is an English model originally trained by `MoritzLaurer`. 
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_binary_en_5.1.4_3.4_1698796298317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_binary_en_5.1.4_3.4_1698796298317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli_binary","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli_binary","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_minilm_l6_mnli_binary| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|84.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/MoritzLaurer/MiniLM-L6-mnli-binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c_en.md new file mode 100644 index 000000000000..ed1a2ffb943e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from MoritzLaurer) +author: John Snow Labs +name: bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c +date: 2023-10-31 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L6-mnli-fever-docnli-ling-2c` is an English model originally trained by `MoritzLaurer`. 
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c_en_5.1.4_3.4_1698796449266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c_en_5.1.4_3.4_1698796449266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_minilm_l6_mnli_fever_docnli_ling_2c| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|84.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/MoritzLaurer/MiniLM-L6-mnli-fever-docnli-ling-2c +- https://github.com/easonnie/combine-FEVER-NSMN/blob/master/other_resources/nli_fever.md +- https://arxiv.org/abs/2104.07179 +- https://arxiv.org/pdf/2106.09449.pdf +- https://github.com/facebookresearch/anli +- https://github.com/easonnie/combine-FEVER-NSMN/blob/master/other_resources/nli_fever.md +- https://arxiv.org/abs/2104.07179 +- https://arxiv.org/pdf/2106.09449.pdf +- https://github.com/facebookresearch/anli +- https://www.linkedin.com/in/moritz-laurer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_croatian_bert_hr.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_croatian_bert_hr.md new file mode 100644 index 000000000000..35273fca697c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_croatian_bert_hr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Croatian bert_sequence_classifier_multi2convai_logistics_croatian_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_logistics_croatian_bert +date: 2023-10-31 +tags: [bert, hr, open_source, sequence_classification, onnx] +task: Text Classification +language: hr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained 
BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_logistics_croatian_bert` is a Croatian model originally trained by inovex. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_croatian_bert_hr_5.1.4_3.4_1698796691193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_croatian_bert_hr_5.1.4_3.4_1698796691193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_croatian_bert","hr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_croatian_bert","hr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_logistics_croatian_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|hr| +|Size:|667.3 MB| + +## References + +https://huggingface.co/inovex/multi2convai-logistics-hr-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_polish_bert_pl.md b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_polish_bert_pl.md new file mode 100644 index 000000000000..857764cb6e3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_sequence_classifier_multi2convai_logistics_polish_bert_pl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Polish bert_sequence_classifier_multi2convai_logistics_polish_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_logistics_polish_bert +date: 2023-10-31 +tags: [bert, pl, open_source, sequence_classification, onnx] +task: Text Classification +language: pl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_logistics_polish_bert` is a Polish model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_polish_bert_pl_5.1.4_3.4_1698796778482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_polish_bert_pl_5.1.4_3.4_1698796778482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_polish_bert","pl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_polish_bert","pl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_logistics_polish_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pl| +|Size:|495.8 MB| + +## References + +https://huggingface.co/inovex/multi2convai-logistics-pl-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_12_h_768_a_12_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_12_h_768_a_12_emotion_en.md new file mode 100644 index 000000000000..0b7e5e4b0003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_12_h_768_a_12_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_uncased_l_12_h_768_a_12_emotion BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_uncased_l_12_h_768_a_12_emotion +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_l_12_h_768_a_12_emotion` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_l_12_h_768_a_12_emotion_en_5.1.4_3.4_1698781677094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_l_12_h_768_a_12_emotion_en_5.1.4_3.4_1698781677094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_12_h_768_a_12_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_12_h_768_a_12_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_l_12_h_768_a_12_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/gokuls/bert_uncased_L-12_H-768_A-12_emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_4_h_128_a_2_emotion_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_4_h_128_a_2_emotion_en.md new file mode 100644 index 000000000000..990c75da7bff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_4_h_128_a_2_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_uncased_l_4_h_128_a_2_emotion BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_uncased_l_4_h_128_a_2_emotion +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_l_4_h_128_a_2_emotion` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_l_4_h_128_a_2_emotion_en_5.1.4_3.4_1698786205383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_l_4_h_128_a_2_emotion_en_5.1.4_3.4_1698786205383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_4_h_128_a_2_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_4_h_128_a_2_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_l_4_h_128_a_2_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|18.2 MB| + +## References + +https://huggingface.co/gokuls/bert_uncased_L-4_H-128_A-2_emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_8_h_512_a_8_massive_en.md b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_8_h_512_a_8_massive_en.md new file mode 100644 index 000000000000..f770a2cb0e34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-bert_uncased_l_8_h_512_a_8_massive_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_uncased_l_8_h_512_a_8_massive BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_uncased_l_8_h_512_a_8_massive +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_l_8_h_512_a_8_massive` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_512_a_8_massive_en_5.1.4_3.4_1698787335771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_l_8_h_512_a_8_massive_en_5.1.4_3.4_1698787335771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_8_h_512_a_8_massive","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_uncased_l_8_h_512_a_8_massive","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_l_8_h_512_a_8_massive| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|155.3 MB| + +## References + +https://huggingface.co/gokuls/bert_uncased_L-8_H-512_A-8_massive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-best_model_sst_2_32_100_en.md b/docs/_posts/ahmedlone127/2023-10-31-best_model_sst_2_32_100_en.md new file mode 100644 index 000000000000..f4a418e43e6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-best_model_sst_2_32_100_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English best_model_sst_2_32_100 BertForSequenceClassification from simonycl +author: John Snow Labs +name: best_model_sst_2_32_100 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `best_model_sst_2_32_100` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/best_model_sst_2_32_100_en_5.1.4_3.4_1698781676402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/best_model_sst_2_32_100_en_5.1.4_3.4_1698781676402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_32_100","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("best_model_sst_2_32_100","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|best_model_sst_2_32_100| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/simonycl/best_model-sst-2-32-100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_arabic_en.md b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_arabic_en.md new file mode 100644 index 000000000000..1c5a5590376b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_arabic_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_clas_model_arabic BertForSequenceClassification from Axel-0087 +author: John Snow Labs +name: burmese_awesome_clas_model_arabic +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_clas_model_arabic` is an English model originally trained by Axel-0087. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_clas_model_arabic_en_5.1.4_3.4_1698788531704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_clas_model_arabic_en_5.1.4_3.4_1698788531704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_clas_model_arabic","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_clas_model_arabic","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_clas_model_arabic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Axel-0087/my_awesome_clas_model_ar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_en.md b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_en.md new file mode 100644 index 000000000000..717a15255cbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_clas_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_clas_model BertForSequenceClassification from Axel-0087 +author: John Snow Labs +name: burmese_awesome_clas_model +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_clas_model` is an English model originally trained by Axel-0087. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_clas_model_en_5.1.4_3.4_1698783681748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_clas_model_en_5.1.4_3.4_1698783681748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_clas_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_clas_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_clas_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Axel-0087/my_awesome_clas_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_ryonsd_en.md b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_ryonsd_en.md new file mode 100644 index 000000000000..a91cc86b2de4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_ryonsd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_ryonsd BertForSequenceClassification from ryonsd +author: John Snow Labs +name: burmese_awesome_model_ryonsd +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_ryonsd` is an English model originally trained by ryonsd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_ryonsd_en_5.1.4_3.4_1698786070614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_ryonsd_en_5.1.4_3.4_1698786070614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_model_ryonsd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_model_ryonsd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_ryonsd| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ryonsd/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_wskhanh_en.md b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_wskhanh_en.md new file mode 100644 index 000000000000..ed2d13436077 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-burmese_awesome_model_wskhanh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_wskhanh BertForSequenceClassification from wskhanh +author: John Snow Labs +name: burmese_awesome_model_wskhanh +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_wskhanh` is an English model originally trained by wskhanh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_wskhanh_en_5.1.4_3.4_1698783396538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_wskhanh_en_5.1.4_3.4_1698783396538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_model_wskhanh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("burmese_awesome_model_wskhanh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_wskhanh| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/wskhanh/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md b/docs/_posts/ahmedlone127/2023-10-31-chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md new file mode 100644 index 000000000000..2a29a886ff17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en_5.1.4_3.4_1698786379840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en_5.1.4_3.4_1698786379840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_macbert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/chinese-macbert-base-wallstreetcn-morning-news-market-overview-SSEC-f1-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10_en.md b/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10_en.md new file mode 100644 index 000000000000..f6ae09f4a31d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10_en_5.1.4_3.4_1698784429437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10_en_5.1.4_3.4_1698784429437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_pert_base_wallstreetcn_morning_news_market_overview_sse50_v10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.4 MB| + +## References + +https://huggingface.co/hw2942/chinese-pert-base-wallstreetcn-morning-news-market-overview-SSE50-v10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md b/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md new file mode 100644 index 000000000000..a524aea910f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en_5.1.4_3.4_1698784862444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3_en_5.1.4_3.4_1698784862444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_pert_base_wallstreetcn_morning_news_market_overview_ssec_f1_3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.4 MB| + +## References + +https://huggingface.co/hw2942/chinese-pert-base-wallstreetcn-morning-news-market-overview-SSEC-f1-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_ssec_en.md b/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_ssec_en.md new file mode 100644 index 000000000000..054d3057d9dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_ssec_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_ssec BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_roberta_wwm_ext_ssec +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_wwm_ext_ssec` is an English model originally trained by hw2942. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_ssec_en_5.1.4_3.4_1698781854521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_ssec_en_5.1.4_3.4_1698781854521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_ssec","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_ssec","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_ssec| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/hw2942/chinese-roberta-wwm-ext-SSEC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1_en.md b/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1_en.md new file mode 100644 index 000000000000..c9bc4abc2f79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1` is an English model originally trained by hw2942. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1_en_5.1.4_3.4_1698781849733.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1_en_5.1.4_3.4_1698781849733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_roberta_wwm_ext_wallstreetcn_morning_news_market_overview_sse50_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.2 MB| + +## References + +https://huggingface.co/hw2942/chinese-roberta-wwm-ext-wallstreetcn-morning-news-market-overview-SSE50-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-classifiereutoplevellongertrainaugmented_en.md b/docs/_posts/ahmedlone127/2023-10-31-classifiereutoplevellongertrainaugmented_en.md new file mode 100644 index 000000000000..6b529695699e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-classifiereutoplevellongertrainaugmented_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English classifiereutoplevellongertrainaugmented BertForSequenceClassification from gianma +author: John Snow Labs +name: classifiereutoplevellongertrainaugmented +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `classifiereutoplevellongertrainaugmented` is an English model originally trained by gianma. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/classifiereutoplevellongertrainaugmented_en_5.1.4_3.4_1698781198495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/classifiereutoplevellongertrainaugmented_en_5.1.4_3.4_1698781198495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("classifiereutoplevellongertrainaugmented","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("classifiereutoplevellongertrainaugmented","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|classifiereutoplevellongertrainaugmented| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|416.4 MB| + +## References + +https://huggingface.co/gianma/classifierEUtopLevelLongerTrainAugmented \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-deepset_covid_bert_litcovid_v1_0_en.md b/docs/_posts/ahmedlone127/2023-10-31-deepset_covid_bert_litcovid_v1_0_en.md new file mode 100644 index 000000000000..13c84b9f2e0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-deepset_covid_bert_litcovid_v1_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English deepset_covid_bert_litcovid_v1_0 BertForSequenceClassification from sofia-todeschini +author: John Snow Labs +name: deepset_covid_bert_litcovid_v1_0 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset_covid_bert_litcovid_v1_0` is an English model originally trained by sofia-todeschini. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deepset_covid_bert_litcovid_v1_0_en_5.1.4_3.4_1698784107452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deepset_covid_bert_litcovid_v1_0_en_5.1.4_3.4_1698784107452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("deepset_covid_bert_litcovid_v1_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("deepset_covid_bert_litcovid_v1_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deepset_covid_bert_litcovid_v1_0| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sofia-todeschini/deepset-Covid-bert-LitCovid-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-dummy_model_sdinger_en.md b/docs/_posts/ahmedlone127/2023-10-31-dummy_model_sdinger_en.md new file mode 100644 index 000000000000..96a524a608af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-dummy_model_sdinger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dummy_model_sdinger BertForSequenceClassification from sdinger +author: John Snow Labs +name: dummy_model_sdinger +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_sdinger` is an English model originally trained by sdinger. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_sdinger_en_5.1.4_3.4_1698787502495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_sdinger_en_5.1.4_3.4_1698787502495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dummy_model_sdinger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dummy_model_sdinger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy_model_sdinger| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sdinger/dummy-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-eganet_mripoti_swahili_bert_sw.md b/docs/_posts/ahmedlone127/2023-10-31-eganet_mripoti_swahili_bert_sw.md new file mode 100644 index 000000000000..f2a4e3420e2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-eganet_mripoti_swahili_bert_sw.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Swahili (macrolanguage) eganet_mripoti_swahili_bert BertForSequenceClassification from shadyAI +author: John Snow Labs +name: eganet_mripoti_swahili_bert +date: 2023-10-31 +tags: [bert, sw, open_source, sequence_classification, onnx] +task: Text Classification +language: sw +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eganet_mripoti_swahili_bert` is a Swahili (macrolanguage) model originally trained by shadyAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eganet_mripoti_swahili_bert_sw_5.1.4_3.4_1698784093528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eganet_mripoti_swahili_bert_sw_5.1.4_3.4_1698784093528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("eganet_mripoti_swahili_bert","sw")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("eganet_mripoti_swahili_bert","sw") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eganet_mripoti_swahili_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sw| +|Size:|666.4 MB| + +## References + +https://huggingface.co/shadyAI/eganet-mripoti-swahili-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-finetuning_sentiment_spanish_synthetic_train_orig_val_en.md b/docs/_posts/ahmedlone127/2023-10-31-finetuning_sentiment_spanish_synthetic_train_orig_val_en.md new file mode 100644 index 000000000000..255c59252c58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-finetuning_sentiment_spanish_synthetic_train_orig_val_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_spanish_synthetic_train_orig_val BertForSequenceClassification from jclynn +author: John Snow Labs +name: finetuning_sentiment_spanish_synthetic_train_orig_val +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_spanish_synthetic_train_orig_val` is an English model originally trained by jclynn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_spanish_synthetic_train_orig_val_en_5.1.4_3.4_1698781198530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_spanish_synthetic_train_orig_val_en_5.1.4_3.4_1698781198530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_sentiment_spanish_synthetic_train_orig_val","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_sentiment_spanish_synthetic_train_orig_val","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_spanish_synthetic_train_orig_val| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/jclynn/finetuning-sentiment-es-synthetic-train-orig-val \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-google_256_finetuned_financial_headline_en.md b/docs/_posts/ahmedlone127/2023-10-31-google_256_finetuned_financial_headline_en.md new file mode 100644 index 000000000000..d3338b32aba1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-google_256_finetuned_financial_headline_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English google_256_finetuned_financial_headline BertForSequenceClassification from odunola +author: John Snow Labs +name: google_256_finetuned_financial_headline +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `google_256_finetuned_financial_headline` is an English model originally trained by odunola.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/google_256_finetuned_financial_headline_en_5.1.4_3.4_1698783796450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/google_256_finetuned_financial_headline_en_5.1.4_3.4_1698783796450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("google_256_finetuned_financial_headline","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("google_256_finetuned_financial_headline","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|google_256_finetuned_financial_headline| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|65.7 MB| + +## References + +https://huggingface.co/odunola/google-256-finetuned-financial-headline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz_en.md b/docs/_posts/ahmedlone127/2023-10-31-legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz_en.md new file mode 100644 index 000000000000..4448cbb2bbb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz BertForSequenceClassification from wiorz +author: John Snow Labs +name: legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz` is an English model originally trained by wiorz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz_en_5.1.4_3.4_1698781198477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz_en_5.1.4_3.4_1698781198477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_bert_samoan_gen1_large_defined_summarized_chuvash_1_wiorz| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/wiorz/legal_bert_sm_gen1_large_defined_summarized_cv_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-minilm_finetuned_emotion_jerry3238_en.md b/docs/_posts/ahmedlone127/2023-10-31-minilm_finetuned_emotion_jerry3238_en.md new file mode 100644 index 000000000000..801eb4d1e419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-minilm_finetuned_emotion_jerry3238_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_finetuned_emotion_jerry3238 BertForSequenceClassification from jerry3238 +author: John Snow Labs +name: minilm_finetuned_emotion_jerry3238 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_finetuned_emotion_jerry3238` is an English model originally trained by jerry3238. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_finetuned_emotion_jerry3238_en_5.1.4_3.4_1698783015832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_finetuned_emotion_jerry3238_en_5.1.4_3.4_1698783015832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_emotion_jerry3238","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_emotion_jerry3238","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_finetuned_emotion_jerry3238| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|118.5 MB| + +## References + +https://huggingface.co/jerry3238/minilm-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-model_dir_en.md b/docs/_posts/ahmedlone127/2023-10-31-model_dir_en.md new file mode 100644 index 000000000000..9f299d3a78ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-model_dir_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English model_dir BertForSequenceClassification from rmhirota +author: John Snow Labs +name: model_dir +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_dir` is an English model originally trained by rmhirota. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_dir_en_5.1.4_3.4_1698781446726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_dir_en_5.1.4_3.4_1698781446726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("model_dir","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("model_dir","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_dir| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/rmhirota/model_dir \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-modelo_boicot_reviews_negativas_en.md b/docs/_posts/ahmedlone127/2023-10-31-modelo_boicot_reviews_negativas_en.md new file mode 100644 index 000000000000..8252a169da73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-modelo_boicot_reviews_negativas_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English modelo_boicot_reviews_negativas BertForSequenceClassification from VictorGil75 +author: John Snow Labs +name: modelo_boicot_reviews_negativas +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelo_boicot_reviews_negativas` is an English model originally trained by VictorGil75. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelo_boicot_reviews_negativas_en_5.1.4_3.4_1698782426921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelo_boicot_reviews_negativas_en_5.1.4_3.4_1698782426921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("modelo_boicot_reviews_negativas","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("modelo_boicot_reviews_negativas","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelo_boicot_reviews_negativas| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.9 MB| + +## References + +https://huggingface.co/VictorGil75/Modelo_Boicot_reviews_negativas \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-modelo_clasificacion_taller_notaller_en.md b/docs/_posts/ahmedlone127/2023-10-31-modelo_clasificacion_taller_notaller_en.md new file mode 100644 index 000000000000..de429cdb97b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-modelo_clasificacion_taller_notaller_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English modelo_clasificacion_taller_notaller BertForSequenceClassification from VictorGil75 +author: John Snow Labs +name: modelo_clasificacion_taller_notaller +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelo_clasificacion_taller_notaller` is an English model originally trained by VictorGil75. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelo_clasificacion_taller_notaller_en_5.1.4_3.4_1698782775013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelo_clasificacion_taller_notaller_en_5.1.4_3.4_1698782775013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("modelo_clasificacion_taller_notaller","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("modelo_clasificacion_taller_notaller","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelo_clasificacion_taller_notaller| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/VictorGil75/Modelo_Clasificacion_Taller_NoTaller \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-nextquarter_status_v1_0_2_en.md b/docs/_posts/ahmedlone127/2023-10-31-nextquarter_status_v1_0_2_en.md new file mode 100644 index 000000000000..7b2fce1bb983 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-nextquarter_status_v1_0_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nextquarter_status_v1_0_2 BertForSequenceClassification from AhmedTaha012 +author: John Snow Labs +name: nextquarter_status_v1_0_2 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nextquarter_status_v1_0_2` is an English model originally trained by AhmedTaha012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nextquarter_status_v1_0_2_en_5.1.4_3.4_1698782062502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nextquarter_status_v1_0_2_en_5.1.4_3.4_1698782062502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nextquarter_status_v1_0_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nextquarter_status_v1_0_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nextquarter_status_v1_0_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/AhmedTaha012/nextQuarter-status-V1.0.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-output_jaminone_en.md b/docs/_posts/ahmedlone127/2023-10-31-output_jaminone_en.md new file mode 100644 index 000000000000..0eb7673b7c13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-output_jaminone_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English output_jaminone BertForSequenceClassification from JaminOne +author: John Snow Labs +name: output_jaminone +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output_jaminone` is an English model originally trained by JaminOne. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/output_jaminone_en_5.1.4_3.4_1698782063448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/output_jaminone_en_5.1.4_3.4_1698782063448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("output_jaminone","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("output_jaminone","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|output_jaminone| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/JaminOne/output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_elrarun_en.md b/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_elrarun_en.md new file mode 100644 index 000000000000..09aadd8bd6c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_elrarun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_elrarun BertForSequenceClassification from elrarun +author: John Snow Labs +name: phrasebank_sentiment_analysis_elrarun +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_elrarun` is an English model originally trained by elrarun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_elrarun_en_5.1.4_3.4_1698783189728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_elrarun_en_5.1.4_3.4_1698783189728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_elrarun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_elrarun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_elrarun| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/elrarun/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_giantist_en.md b/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_giantist_en.md new file mode 100644 index 000000000000..8cd1597c26ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-phrasebank_sentiment_analysis_giantist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English phrasebank_sentiment_analysis_giantist BertForSequenceClassification from giantist +author: John Snow Labs +name: phrasebank_sentiment_analysis_giantist +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phrasebank_sentiment_analysis_giantist` is an English model originally trained by giantist. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_giantist_en_5.1.4_3.4_1698781451522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phrasebank_sentiment_analysis_giantist_en_5.1.4_3.4_1698781451522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_giantist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("phrasebank_sentiment_analysis_giantist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phrasebank_sentiment_analysis_giantist| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/giantist/phrasebank-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-pubmed_1e_en.md b/docs/_posts/ahmedlone127/2023-10-31-pubmed_1e_en.md new file mode 100644 index 000000000000..ac1b60b00423 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-pubmed_1e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pubmed_1e BertForSequenceClassification from Shana4 +author: John Snow Labs +name: pubmed_1e +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmed_1e` is an English model originally trained by Shana4. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmed_1e_en_5.1.4_3.4_1698785068680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmed_1e_en_5.1.4_3.4_1698785068680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pubmed_1e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pubmed_1e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmed_1e| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/Shana4/PubMed_1E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-pubmedbert_large_litcovid_v1_3_1_en.md b/docs/_posts/ahmedlone127/2023-10-31-pubmedbert_large_litcovid_v1_3_1_en.md new file mode 100644 index 000000000000..0c33d4ff7d02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-pubmedbert_large_litcovid_v1_3_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pubmedbert_large_litcovid_v1_3_1 BertForSequenceClassification from sofia-todeschini +author: John Snow Labs +name: pubmedbert_large_litcovid_v1_3_1 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmedbert_large_litcovid_v1_3_1` is an English model originally trained by sofia-todeschini. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmedbert_large_litcovid_v1_3_1_en_5.1.4_3.4_1698782434723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmedbert_large_litcovid_v1_3_1_en_5.1.4_3.4_1698782434723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pubmedbert_large_litcovid_v1_3_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pubmedbert_large_litcovid_v1_3_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmedbert_large_litcovid_v1_3_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/sofia-todeschini/PubMedBERT-Large-LitCovid-v1.3.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-rubert_rusentitweet_sturgunbaev_en.md b/docs/_posts/ahmedlone127/2023-10-31-rubert_rusentitweet_sturgunbaev_en.md new file mode 100644 index 000000000000..1858dbcdd560 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-rubert_rusentitweet_sturgunbaev_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_rusentitweet_sturgunbaev BertForSequenceClassification from sturgunbaev +author: John Snow Labs +name: rubert_rusentitweet_sturgunbaev +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_rusentitweet_sturgunbaev` is an English model originally trained by sturgunbaev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_rusentitweet_sturgunbaev_en_5.1.4_3.4_1698783040543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_rusentitweet_sturgunbaev_en_5.1.4_3.4_1698783040543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_rusentitweet_sturgunbaev","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_rusentitweet_sturgunbaev","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_rusentitweet_sturgunbaev| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|666.5 MB| + +## References + +https://huggingface.co/sturgunbaev/rubert-rusentitweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-sentiment_analysis_karlopintaric_en.md b/docs/_posts/ahmedlone127/2023-10-31-sentiment_analysis_karlopintaric_en.md new file mode 100644 index 000000000000..8121a1e6b690 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-sentiment_analysis_karlopintaric_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_karlopintaric BertForSequenceClassification from karlopintaric +author: John Snow Labs +name: sentiment_analysis_karlopintaric +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_karlopintaric` is an English model originally trained by karlopintaric. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_karlopintaric_en_5.1.4_3.4_1698784659236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_karlopintaric_en_5.1.4_3.4_1698784659236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_karlopintaric","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_karlopintaric","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_karlopintaric| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/karlopintaric/sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-sts_klue_bert_base_ep5_ckpt_en.md b/docs/_posts/ahmedlone127/2023-10-31-sts_klue_bert_base_ep5_ckpt_en.md new file mode 100644 index 000000000000..ede1ab0c72a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-sts_klue_bert_base_ep5_ckpt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sts_klue_bert_base_ep5_ckpt BertForSequenceClassification from ys7yoo +author: John Snow Labs +name: sts_klue_bert_base_ep5_ckpt +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sts_klue_bert_base_ep5_ckpt` is an English model originally trained by ys7yoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sts_klue_bert_base_ep5_ckpt_en_5.1.4_3.4_1698785805240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sts_klue_bert_base_ep5_ckpt_en_5.1.4_3.4_1698785805240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sts_klue_bert_base_ep5_ckpt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sts_klue_bert_base_ep5_ckpt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sts_klue_bert_base_ep5_ckpt| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/ys7yoo/sts_klue_bert_base_ep5_ckpt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-swahili_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-10-31-swahili_sentiment_analysis_en.md new file mode 100644 index 000000000000..c859c5399a6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-swahili_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English swahili_sentiment_analysis BertForSequenceClassification from nairaxo +author: John Snow Labs +name: swahili_sentiment_analysis +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `swahili_sentiment_analysis` is an English model originally trained by nairaxo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swahili_sentiment_analysis_en_5.1.4_3.4_1698786976813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swahili_sentiment_analysis_en_5.1.4_3.4_1698786976813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("swahili_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("swahili_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swahili_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|755.1 MB| + +## References + +https://huggingface.co/nairaxo/swahili-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-test_upload_en.md b/docs/_posts/ahmedlone127/2023-10-31-test_upload_en.md new file mode 100644 index 000000000000..bceb0fbe9966 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-test_upload_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English test_upload BertForSequenceClassification from augustinLib +author: John Snow Labs +name: test_upload +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_upload` is an English model originally trained by augustinLib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_upload_en_5.1.4_3.4_1698781641758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_upload_en_5.1.4_3.4_1698781641758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("test_upload","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_upload","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_upload| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/augustinLib/test_upload \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-10-31-text_tonga_tonga_islands_subfunction_v2_en.md b/docs/_posts/ahmedlone127/2023-10-31-text_tonga_tonga_islands_subfunction_v2_en.md new file mode 100644 index 000000000000..4491c41cb7f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-10-31-text_tonga_tonga_islands_subfunction_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_tonga_tonga_islands_subfunction_v2 BertForSequenceClassification from Sandrro +author: John Snow Labs +name: text_tonga_tonga_islands_subfunction_v2 +date: 2023-10-31 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_tonga_tonga_islands_subfunction_v2` is an English model originally trained by Sandrro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_subfunction_v2_en_5.1.4_3.4_1698783718345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_subfunction_v2_en_5.1.4_3.4_1698783718345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("text_tonga_tonga_islands_subfunction_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("text_tonga_tonga_islands_subfunction_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_tonga_tonga_islands_subfunction_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.9 MB| + +## References + +https://huggingface.co/Sandrro/text_to_subfunction_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-16_combo_webscrap_1709_v1_en.md b/docs/_posts/ahmedlone127/2023-11-01-16_combo_webscrap_1709_v1_en.md new file mode 100644 index 000000000000..bb558b70d2f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-16_combo_webscrap_1709_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 16_combo_webscrap_1709_v1 BertForSequenceClassification from dsmsb +author: John Snow Labs +name: 16_combo_webscrap_1709_v1 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `16_combo_webscrap_1709_v1` is an English model originally trained by dsmsb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/16_combo_webscrap_1709_v1_en_5.1.4_3.4_1698870052077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/16_combo_webscrap_1709_v1_en_5.1.4_3.4_1698870052077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("16_combo_webscrap_1709_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("16_combo_webscrap_1709_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|16_combo_webscrap_1709_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/dsmsb/16_combo_webscrap_1709_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-absa_aspectsentiment_hotels_en.md b/docs/_posts/ahmedlone127/2023-11-01-absa_aspectsentiment_hotels_en.md new file mode 100644 index 000000000000..aa6a54a530a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-absa_aspectsentiment_hotels_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English absa_aspectsentiment_hotels BertForSequenceClassification from MutazYoune +author: John Snow Labs +name: absa_aspectsentiment_hotels +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `absa_aspectsentiment_hotels` is an English model originally trained by MutazYoune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/absa_aspectsentiment_hotels_en_5.1.4_3.4_1698871310830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/absa_aspectsentiment_hotels_en_5.1.4_3.4_1698871310830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("absa_aspectsentiment_hotels","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("absa_aspectsentiment_hotels","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|absa_aspectsentiment_hotels| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.6 MB| + +## References + +https://huggingface.co/MutazYoune/Absa_AspectSentiment_hotels \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-abusive_tagalog_profanity_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-abusive_tagalog_profanity_detection_en.md new file mode 100644 index 000000000000..eac0b40e01b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-abusive_tagalog_profanity_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English abusive_tagalog_profanity_detection BertForSequenceClassification from Dabid +author: John Snow Labs +name: abusive_tagalog_profanity_detection +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `abusive_tagalog_profanity_detection` is an English model originally trained by Dabid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/abusive_tagalog_profanity_detection_en_5.1.4_3.4_1698828589109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/abusive_tagalog_profanity_detection_en_5.1.4_3.4_1698828589109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("abusive_tagalog_profanity_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("abusive_tagalog_profanity_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|abusive_tagalog_profanity_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/Dabid/abusive-tagalog-profanity-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-adultcontentclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-adultcontentclassifier_en.md new file mode 100644 index 000000000000..3eba7ee51e66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-adultcontentclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English adultcontentclassifier BertForSequenceClassification from ziadA123 +author: John Snow Labs +name: adultcontentclassifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adultcontentclassifier` is an English model originally trained by ziadA123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adultcontentclassifier_en_5.1.4_3.4_1698868207729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adultcontentclassifier_en_5.1.4_3.4_1698868207729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("adultcontentclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("adultcontentclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adultcontentclassifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|608.8 MB| + +## References + +https://huggingface.co/ziadA123/adultcontentclassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-ag_news1_en.md b/docs/_posts/ahmedlone127/2023-11-01-ag_news1_en.md new file mode 100644 index 000000000000..5727be361471 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-ag_news1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ag_news1 BertForSequenceClassification from Lumos +author: John Snow Labs +name: ag_news1 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_news1` is an English model originally trained by Lumos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_news1_en_5.1.4_3.4_1698864743178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_news1_en_5.1.4_3.4_1698864743178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("ag_news1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ag_news1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_news1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Lumos/ag_news1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-agribert_clfmodel_en.md b/docs/_posts/ahmedlone127/2023-11-01-agribert_clfmodel_en.md new file mode 100644 index 000000000000..7c3592c86d19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-agribert_clfmodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English agribert_clfmodel BertForSequenceClassification from divyanshu94 +author: John Snow Labs +name: agribert_clfmodel +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `agribert_clfmodel` is an English model originally trained by divyanshu94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/agribert_clfmodel_en_5.1.4_3.4_1698818952783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/agribert_clfmodel_en_5.1.4_3.4_1698818952783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("agribert_clfmodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("agribert_clfmodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|agribert_clfmodel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/divyanshu94/agriBERT_clfModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-ai_oriya_human_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-ai_oriya_human_text_classification_en.md new file mode 100644 index 000000000000..69a42ebe83c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-ai_oriya_human_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ai_oriya_human_text_classification BertForSequenceClassification from priyabrat +author: John Snow Labs +name: ai_oriya_human_text_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ai_oriya_human_text_classification` is an English model originally trained by priyabrat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_oriya_human_text_classification_en_5.1.4_3.4_1698810202853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_oriya_human_text_classification_en_5.1.4_3.4_1698810202853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("ai_oriya_human_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ai_oriya_human_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_oriya_human_text_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/priyabrat/AI.or.Human.text.classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-aimesoft_test_text_classification_4_en.md b/docs/_posts/ahmedlone127/2023-11-01-aimesoft_test_text_classification_4_en.md new file mode 100644 index 000000000000..c4d6af09f4c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-aimesoft_test_text_classification_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English aimesoft_test_text_classification_4 BertForSequenceClassification from bunbohue +author: John Snow Labs +name: aimesoft_test_text_classification_4 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aimesoft_test_text_classification_4` is an English model originally trained by bunbohue. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aimesoft_test_text_classification_4_en_5.1.4_3.4_1698815743643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aimesoft_test_text_classification_4_en_5.1.4_3.4_1698815743643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +import sparknlp +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Starts a Spark session with Spark NLP; skip if `spark` already exists +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("aimesoft_test_text_classification_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // required for .toDS; assumes a SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("aimesoft_test_text_classification_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aimesoft_test_text_classification_4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/bunbohue/Aimesoft-test_Text-Classification-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-ais_ru.md b/docs/_posts/ahmedlone127/2023-11-01-ais_ru.md new file mode 100644 index 000000000000..8311119bbbbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-ais_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian ais BertForSequenceClassification from Neira +author: John Snow Labs +name: ais +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ais` is a Russian model originally trained by Neira. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ais_ru_5.1.4_3.4_1698814835820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ais_ru_5.1.4_3.4_1698814835820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("ais","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ais","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ais| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| + +## References + +https://huggingface.co/Neira/ais \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-albert_sentiment_tw.md b/docs/_posts/ahmedlone127/2023-11-01-albert_sentiment_tw.md new file mode 100644 index 000000000000..72ab3796e2a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-albert_sentiment_tw.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Twi albert_sentiment BertForSequenceClassification from clhuang +author: John Snow Labs +name: albert_sentiment +date: 2023-11-01 +tags: [bert, tw, open_source, sequence_classification, onnx] +task: Text Classification +language: tw +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`albert_sentiment` is a Twi model originally trained by clhuang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_sentiment_tw_5.1.4_3.4_1698813351136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_sentiment_tw_5.1.4_3.4_1698813351136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("albert_sentiment","tw")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("albert_sentiment","tw") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|albert_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tw| +|Size:|43.4 MB| + +## References + +https://huggingface.co/clhuang/albert-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-allenai_scibert_scivocab_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-allenai_scibert_scivocab_uncased_en.md new file mode 100644 index 000000000000..9b1c9d805d79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-allenai_scibert_scivocab_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English allenai_scibert_scivocab_uncased BertForSequenceClassification from Mahmoud8 +author: John Snow Labs +name: allenai_scibert_scivocab_uncased +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `allenai_scibert_scivocab_uncased` is an English model originally trained by Mahmoud8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/allenai_scibert_scivocab_uncased_en_5.1.4_3.4_1698806541940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/allenai_scibert_scivocab_uncased_en_5.1.4_3.4_1698806541940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("allenai_scibert_scivocab_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("allenai_scibert_scivocab_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|allenai_scibert_scivocab_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/Mahmoud8/allenai-scibert_scivocab_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-aspect_level_certainty_en.md b/docs/_posts/ahmedlone127/2023-11-01-aspect_level_certainty_en.md new file mode 100644 index 000000000000..5c870f6646bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-aspect_level_certainty_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English aspect_level_certainty BertForSequenceClassification from pedropei +author: John Snow Labs +name: aspect_level_certainty +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aspect_level_certainty` is an English model originally trained by pedropei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aspect_level_certainty_en_5.1.4_3.4_1698806738445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aspect_level_certainty_en_5.1.4_3.4_1698806738445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("aspect_level_certainty","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("aspect_level_certainty","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aspect_level_certainty| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.1 MB| + +## References + +https://huggingface.co/pedropei/aspect-level-certainty \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-asr_question_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-asr_question_detection_en.md new file mode 100644 index 000000000000..8f7256b83d41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-asr_question_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English asr_question_detection BertForSequenceClassification from mrsinghania +author: John Snow Labs +name: asr_question_detection +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asr_question_detection` is an English model originally trained by mrsinghania. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asr_question_detection_en_5.1.4_3.4_1698800718681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asr_question_detection_en_5.1.4_3.4_1698800718681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("asr_question_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("asr_question_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|asr_question_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/mrsinghania/asr-question-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-autonlp_swahili_sentiment_615517563_en.md b/docs/_posts/ahmedlone127/2023-11-01-autonlp_swahili_sentiment_615517563_en.md new file mode 100644 index 000000000000..ebee0584f73e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-autonlp_swahili_sentiment_615517563_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autonlp_swahili_sentiment_615517563 BertForSequenceClassification from abhishek +author: John Snow Labs +name: autonlp_swahili_sentiment_615517563 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_swahili_sentiment_615517563` is an English model originally trained by abhishek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_swahili_sentiment_615517563_en_5.1.4_3.4_1698835093463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_swahili_sentiment_615517563_en_5.1.4_3.4_1698835093463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("autonlp_swahili_sentiment_615517563","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autonlp_swahili_sentiment_615517563","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_swahili_sentiment_615517563| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| + +## References + +https://huggingface.co/abhishek/autonlp-swahili-sentiment-615517563 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-autotrain_crenderping1_0_89162143881_en.md b/docs/_posts/ahmedlone127/2023-11-01-autotrain_crenderping1_0_89162143881_en.md new file mode 100644 index 000000000000..a4034bd4985b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-autotrain_crenderping1_0_89162143881_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_crenderping1_0_89162143881 BertForSequenceClassification from matejmicek +author: John Snow Labs +name: autotrain_crenderping1_0_89162143881 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_crenderping1_0_89162143881` is an English model originally trained by matejmicek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_crenderping1_0_89162143881_en_5.1.4_3.4_1698872488292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_crenderping1_0_89162143881_en_5.1.4_3.4_1698872488292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_crenderping1_0_89162143881","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_crenderping1_0_89162143881","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_crenderping1_0_89162143881| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/matejmicek/autotrain-crenderping1.0-89162143881 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-autotrain_intent_classification_5categories_90278144252_ko.md b/docs/_posts/ahmedlone127/2023-11-01-autotrain_intent_classification_5categories_90278144252_ko.md new file mode 100644 index 000000000000..fec33bf969d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-autotrain_intent_classification_5categories_90278144252_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean autotrain_intent_classification_5categories_90278144252 BertForSequenceClassification from yeye776 +author: John Snow Labs +name: autotrain_intent_classification_5categories_90278144252 +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`autotrain_intent_classification_5categories_90278144252` is a Korean model originally trained by yeye776. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_intent_classification_5categories_90278144252_ko_5.1.4_3.4_1698843838615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_intent_classification_5categories_90278144252_ko_5.1.4_3.4_1698843838615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_intent_classification_5categories_90278144252","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_intent_classification_5categories_90278144252","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_intent_classification_5categories_90278144252| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|443.4 MB| + +## References + +https://huggingface.co/yeye776/autotrain-intent-classification-5categories-90278144252 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-autotrain_text_sentiment_indonlu_smse_2885384370_id.md b/docs/_posts/ahmedlone127/2023-11-01-autotrain_text_sentiment_indonlu_smse_2885384370_id.md new file mode 100644 index 000000000000..86fe5e5ff1e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-autotrain_text_sentiment_indonlu_smse_2885384370_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian autotrain_text_sentiment_indonlu_smse_2885384370 BertForSequenceClassification from mkhairil +author: John Snow Labs +name: autotrain_text_sentiment_indonlu_smse_2885384370 +date: 2023-11-01 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_text_sentiment_indonlu_smse_2885384370` is an Indonesian model originally trained by mkhairil. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_text_sentiment_indonlu_smse_2885384370_id_5.1.4_3.4_1698808806562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_text_sentiment_indonlu_smse_2885384370_id_5.1.4_3.4_1698808806562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_text_sentiment_indonlu_smse_2885384370","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("autotrain_text_sentiment_indonlu_smse_2885384370","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_text_sentiment_indonlu_smse_2885384370| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|667.3 MB| + +## References + +https://huggingface.co/mkhairil/autotrain-text-sentiment-indonlu-smse-2885384370 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-awesome_japanese_nlp_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-awesome_japanese_nlp_classification_model_en.md new file mode 100644 index 000000000000..5d7422cd4889 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-awesome_japanese_nlp_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English awesome_japanese_nlp_classification_model BertForSequenceClassification from taishi-i +author: John Snow Labs +name: awesome_japanese_nlp_classification_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `awesome_japanese_nlp_classification_model` is an English model originally trained by taishi-i. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/awesome_japanese_nlp_classification_model_en_5.1.4_3.4_1698814804044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/awesome_japanese_nlp_classification_model_en_5.1.4_3.4_1698814804044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("awesome_japanese_nlp_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("awesome_japanese_nlp_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|awesome_japanese_nlp_classification_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/taishi-i/awesome-japanese-nlp-classification-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-base_model_stevendee5_en.md b/docs/_posts/ahmedlone127/2023-11-01-base_model_stevendee5_en.md new file mode 100644 index 000000000000..8eddd4fb55c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-base_model_stevendee5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English base_model_stevendee5 BertForSequenceClassification from stevendee5 +author: John Snow Labs +name: base_model_stevendee5 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_model_stevendee5` is an English model originally trained by stevendee5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_model_stevendee5_en_5.1.4_3.4_1698822467670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_model_stevendee5_en_5.1.4_3.4_1698822467670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("base_model_stevendee5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("base_model_stevendee5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_model_stevendee5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/stevendee5/base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bbb_prediction_classification_iupac_en.md b/docs/_posts/ahmedlone127/2023-11-01-bbb_prediction_classification_iupac_en.md new file mode 100644 index 000000000000..4465f7ccd02a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bbb_prediction_classification_iupac_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bbb_prediction_classification_iupac BertForSequenceClassification from Parsa +author: John Snow Labs +name: bbb_prediction_classification_iupac +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bbb_prediction_classification_iupac` is an English model originally trained by Parsa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bbb_prediction_classification_iupac_en_5.1.4_3.4_1698842615392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bbb_prediction_classification_iupac_en_5.1.4_3.4_1698842615392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bbb_prediction_classification_iupac","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bbb_prediction_classification_iupac","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bbb_prediction_classification_iupac| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|326.4 MB| + +## References + +https://huggingface.co/Parsa/BBB_prediction_classification_IUPAC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_agnews_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_agnews_en.md new file mode 100644 index 000000000000..0914a72dc18a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_agnews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_agnews BertForSequenceClassification from tzhao3 +author: John Snow Labs +name: bert_agnews +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_agnews` is an English model originally trained by tzhao3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_agnews_en_5.1.4_3.4_1698810360136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_agnews_en_5.1.4_3.4_1698810360136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_agnews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_agnews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_agnews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tzhao3/Bert-AGnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_amazon_product_classification_epoch_0_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_amazon_product_classification_epoch_0_en.md new file mode 100644 index 000000000000..e5c9c402f78b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_amazon_product_classification_epoch_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_amazon_product_classification_epoch_0 BertForSequenceClassification from nthieu +author: John Snow Labs +name: bert_amazon_product_classification_epoch_0 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_amazon_product_classification_epoch_0` is an English model originally trained by nthieu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_amazon_product_classification_epoch_0_en_5.1.4_3.4_1698862822162.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_amazon_product_classification_epoch_0_en_5.1.4_3.4_1698862822162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_amazon_product_classification_epoch_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_amazon_product_classification_epoch_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_amazon_product_classification_epoch_0| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/nthieu/bert-amazon-product-classification-epoch-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_bank_model_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_bank_model_tr.md new file mode 100644 index 000000000000..3c05fa40558e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_bank_model_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish bert_bank_model BertForSequenceClassification from elifftosunn +author: John Snow Labs +name: bert_bank_model +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_bank_model` is a Turkish model originally trained by elifftosunn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_bank_model_tr_5.1.4_3.4_1698862531850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_bank_model_tr_5.1.4_3.4_1698862531850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_bank_model","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_bank_model","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_bank_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.6 MB| + +## References + +https://huggingface.co/elifftosunn/Bert-Bank-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_lucasresck_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_lucasresck_en.md new file mode 100644 index 000000000000..e9996d2fc051 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_lucasresck_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_ag_news_lucasresck BertForSequenceClassification from lucasresck +author: John Snow Labs +name: bert_base_cased_ag_news_lucasresck +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ag_news_lucasresck` is an English model originally trained by lucasresck. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ag_news_lucasresck_en_5.1.4_3.4_1698801601428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ag_news_lucasresck_en_5.1.4_3.4_1698801601428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ag_news_lucasresck","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ag_news_lucasresck","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ag_news_lucasresck| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/lucasresck/bert-base-cased-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_odunola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_odunola_en.md new file mode 100644 index 000000000000..d866b9a8d212 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_ag_news_odunola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_ag_news_odunola BertForSequenceClassification from odunola +author: John Snow Labs +name: bert_base_cased_ag_news_odunola +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ag_news_odunola` is an English model originally trained by odunola. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ag_news_odunola_en_5.1.4_3.4_1698839508494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ag_news_odunola_en_5.1.4_3.4_1698839508494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ag_news_odunola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_ag_news_odunola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ag_news_odunola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/odunola/bert-base-cased-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_fever_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_fever_en.md new file mode 100644 index 000000000000..c426aa7644d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_fever_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_fever BertForSequenceClassification from sagnikrayc +author: John Snow Labs +name: bert_base_cased_fever +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_fever` is an English model originally trained by sagnikrayc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_fever_en_5.1.4_3.4_1698810975118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_fever_en_5.1.4_3.4_1698810975118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_fever","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_fever","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_fever| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/sagnikrayc/bert-base-cased-fever \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_finetuned_finbert_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_finetuned_finbert_en.md new file mode 100644 index 000000000000..a9341401a06e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_finetuned_finbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_finetuned_finbert BertForSequenceClassification from ipuneetrathore +author: John Snow Labs +name: bert_base_cased_finetuned_finbert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_finbert` is an English model originally trained by ipuneetrathore. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_finbert_en_5.1.4_3.4_1698807749279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_finbert_en_5.1.4_3.4_1698807749279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_finbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_finetuned_finbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_finbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/ipuneetrathore/bert-base-cased-finetuned-finBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_korean_sentiment_ko.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_korean_sentiment_ko.md new file mode 100644 index 000000000000..dfa60273178a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_korean_sentiment_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean bert_base_cased_korean_sentiment BertForSequenceClassification from WhitePeak +author: John Snow Labs +name: bert_base_cased_korean_sentiment +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_korean_sentiment` is a Korean model originally trained by WhitePeak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_korean_sentiment_ko_5.1.4_3.4_1698804265297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_korean_sentiment_ko_5.1.4_3.4_1698804265297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_korean_sentiment","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_korean_sentiment","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_korean_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|667.3 MB| + +## References + +https://huggingface.co/WhitePeak/bert-base-cased-Korean-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_qa_evaluator_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_qa_evaluator_en.md new file mode 100644 index 000000000000..64063a92257b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_cased_qa_evaluator_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_cased_qa_evaluator BertForSequenceClassification from iarfmoose +author: John Snow Labs +name: bert_base_cased_qa_evaluator +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_qa_evaluator` is an English model originally trained by iarfmoose. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_qa_evaluator_en_5.1.4_3.4_1698803516608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_qa_evaluator_en_5.1.4_3.4_1698803516608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_qa_evaluator","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_cased_qa_evaluator","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_qa_evaluator| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/iarfmoose/bert-base-cased-qa-evaluator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_chinese_finetuning_financial_news_sentiment_v2_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_chinese_finetuning_financial_news_sentiment_v2_zh.md new file mode 100644 index 000000000000..589f73fa7e91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_chinese_finetuning_financial_news_sentiment_v2_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese bert_base_chinese_finetuning_financial_news_sentiment_v2 BertForSequenceClassification from hw2942 +author: John Snow Labs +name: bert_base_chinese_finetuning_financial_news_sentiment_v2 +date: 2023-11-01 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuning_financial_news_sentiment_v2` is a Chinese model originally trained by hw2942.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_financial_news_sentiment_v2_zh_5.1.4_3.4_1698809002868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuning_financial_news_sentiment_v2_zh_5.1.4_3.4_1698809002868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_financial_news_sentiment_v2","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_chinese_finetuning_financial_news_sentiment_v2","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuning_financial_news_sentiment_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/hw2942/bert-base-chinese-finetuning-financial-news-sentiment-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_claimbuster_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_claimbuster_en.md new file mode 100644 index 000000000000..90ac59c59357 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_claimbuster_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_claimbuster BertForSequenceClassification from Nithiwat +author: John Snow Labs +name: bert_base_claimbuster +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_claimbuster` is an English model originally trained by Nithiwat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_claimbuster_en_5.1.4_3.4_1698820050536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_claimbuster_en_5.1.4_3.4_1698820050536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_claimbuster","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_claimbuster","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_claimbuster| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Nithiwat/bert-base_claimbuster \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_codemixed_uncased_sentiment_hi.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_codemixed_uncased_sentiment_hi.md new file mode 100644 index 000000000000..c60991866dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_codemixed_uncased_sentiment_hi.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hindi bert_base_codemixed_uncased_sentiment BertForSequenceClassification from rohanrajpal +author: John Snow Labs +name: bert_base_codemixed_uncased_sentiment +date: 2023-11-01 +tags: [bert, hi, open_source, sequence_classification, onnx] +task: Text Classification +language: hi +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_codemixed_uncased_sentiment` is a Hindi model originally trained by rohanrajpal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_codemixed_uncased_sentiment_hi_5.1.4_3.4_1698813250317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_codemixed_uncased_sentiment_hi_5.1.4_3.4_1698813250317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_codemixed_uncased_sentiment","hi")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_codemixed_uncased_sentiment","hi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_codemixed_uncased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hi| +|Size:|667.3 MB| + +## References + +https://huggingface.co/rohanrajpal/bert-base-codemixed-uncased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_german_cased_hatespeech_germeval18_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_german_cased_hatespeech_germeval18_en.md new file mode 100644 index 000000000000..0148a4ea453a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_german_cased_hatespeech_germeval18_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_german_cased_hatespeech_germeval18 BertForSequenceClassification from GeorgHCundK +author: John Snow Labs +name: bert_base_german_cased_hatespeech_germeval18 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_hatespeech_germeval18` is an English model originally trained by GeorgHCundK. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_hatespeech_germeval18_en_5.1.4_3.4_1698841522224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_hatespeech_germeval18_en_5.1.4_3.4_1698841522224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_german_cased_hatespeech_germeval18","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_german_cased_hatespeech_germeval18","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_hatespeech_germeval18| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/GeorgHCundK/bert-base-german-cased-hatespeech-GermEval18 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_intent_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_intent_en.md new file mode 100644 index 000000000000..d0ace87eeb15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_intent BertForSequenceClassification from JeswinMS4 +author: John Snow Labs +name: bert_base_intent +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_intent` is an English model originally trained by JeswinMS4. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_intent_en_5.1.4_3.4_1698815056244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_intent_en_5.1.4_3.4_1698815056244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_intent| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JeswinMS4/bert-base-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md new file mode 100644 index 000000000000..8a05a4fabdae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_emotion_toshifumi BertForSequenceClassification from Toshifumi +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_emotion_toshifumi +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_emotion_toshifumi` is a Multilingual model originally trained by Toshifumi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_emotion_toshifumi_xx_5.1.4_3.4_1698861894596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_emotion_toshifumi_xx_5.1.4_3.4_1698861894596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_emotion_toshifumi","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_finetuned_emotion_toshifumi","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_emotion_toshifumi| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Toshifumi/bert-base-multilingual-cased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_language_detection_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_language_detection_xx.md new file mode 100644 index 000000000000..e5992b64868e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_language_detection_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_language_detection BertForSequenceClassification from jb2k +author: John Snow Labs +name: bert_base_multilingual_cased_language_detection +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_language_detection` is a Multilingual model originally trained by jb2k. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_language_detection_xx_5.1.4_3.4_1698811819686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_language_detection_xx_5.1.4_3.4_1698811819686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_language_detection","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_language_detection","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_language_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.4 MB| + +## References + +https://huggingface.co/jb2k/bert-base-multilingual-cased-language-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_nsmc_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_nsmc_xx.md new file mode 100644 index 000000000000..30c22db1564e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_cased_nsmc_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_nsmc BertForSequenceClassification from sackoh +author: John Snow Labs +name: bert_base_multilingual_cased_nsmc +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_nsmc` is a Multilingual model originally trained by sackoh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_nsmc_xx_5.1.4_3.4_1698805940370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_nsmc_xx_5.1.4_3.4_1698805940370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_nsmc","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_cased_nsmc","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_nsmc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/sackoh/bert-base-multilingual-cased-nsmc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_codemixed_cased_sentiment_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_codemixed_cased_sentiment_xx.md new file mode 100644 index 000000000000..b8c81c1c6cca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_codemixed_cased_sentiment_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_codemixed_cased_sentiment BertForSequenceClassification from rohanrajpal +author: John Snow Labs +name: bert_base_multilingual_codemixed_cased_sentiment +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_codemixed_cased_sentiment` is a Multilingual model originally trained by rohanrajpal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_codemixed_cased_sentiment_xx_5.1.4_3.4_1698813293869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_codemixed_cased_sentiment_xx_5.1.4_3.4_1698813293869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_codemixed_cased_sentiment","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_codemixed_cased_sentiment","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_codemixed_cased_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/rohanrajpal/bert-base-multilingual-codemixed-cased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_3labels_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_3labels_xx.md new file mode 100644 index 000000000000..dd34c94a241e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_3labels_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_3labels BertForSequenceClassification from gilesitorr +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_3labels +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_3labels` is a Multilingual model originally trained by gilesitorr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_3labels_xx_5.1.4_3.4_1698813884609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_3labels_xx_5.1.4_3.4_1698813884609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_3labels","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_3labels","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_3labels| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/gilesitorr/bert-base-multilingual-uncased-sentiment-3labels \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_b1tk40s_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_b1tk40s_xx.md new file mode 100644 index 000000000000..35312f488b98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_multilingual_uncased_sentiment_b1tk40s_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_sentiment_b1tk40s BertForSequenceClassification from b1tk40s +author: John Snow Labs +name: bert_base_multilingual_uncased_sentiment_b1tk40s +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_uncased_sentiment_b1tk40s` is a Multilingual model originally trained by b1tk40s. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_b1tk40s_xx_5.1.4_3.4_1698829427264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_sentiment_b1tk40s_xx_5.1.4_3.4_1698829427264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_b1tk40s","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_multilingual_uncased_sentiment_b1tk40s","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_sentiment_b1tk40s| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|627.7 MB| + +## References + +https://huggingface.co/b1tk40s/bert-base-multilingual-uncased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_personality_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_personality_en.md new file mode 100644 index 000000000000..57ccd6f08078 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_personality_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_personality BertForSequenceClassification from Minej +author: John Snow Labs +name: bert_base_personality +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_personality` is an English model originally trained by Minej. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_personality_en_5.1.4_3.4_1698814698287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_personality_en_5.1.4_3.4_1698814698287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_personality","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_personality","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_personality| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Minej/bert-base-personality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_polish_cyberbullying_pl.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_polish_cyberbullying_pl.md new file mode 100644 index 000000000000..84a336bcdaba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_polish_cyberbullying_pl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Polish bert_base_polish_cyberbullying BertForSequenceClassification from ptaszynski +author: John Snow Labs +name: bert_base_polish_cyberbullying +date: 2023-11-01 +tags: [bert, pl, open_source, sequence_classification, onnx] +task: Text Classification +language: pl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_polish_cyberbullying` is a Polish model originally trained by ptaszynski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_polish_cyberbullying_pl_5.1.4_3.4_1698817046777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_polish_cyberbullying_pl_5.1.4_3.4_1698817046777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_polish_cyberbullying","pl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_polish_cyberbullying","pl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_polish_cyberbullying| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pl| +|Size:|495.8 MB| + +## References + +https://huggingface.co/ptaszynski/bert-base-polish-cyberbullying \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_portuguese_cased_assin2_similarity_pt.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_portuguese_cased_assin2_similarity_pt.md new file mode 100644 index 000000000000..47bf92cf5af8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_portuguese_cased_assin2_similarity_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_cased_assin2_similarity BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_base_portuguese_cased_assin2_similarity +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_assin2_similarity` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin2_similarity_pt_5.1.4_3.4_1698812798408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_assin2_similarity_pt_5.1.4_3.4_1698812798408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin2_similarity","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_portuguese_cased_assin2_similarity","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_assin2_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ruanchaves/bert-base-portuguese-cased-assin2-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_sentiment_analysis_french_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_sentiment_analysis_french_en.md new file mode 100644 index 000000000000..9480f0c81a8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_sentiment_analysis_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_sentiment_analysis_french BertForSequenceClassification from ulrichING +author: John Snow Labs +name: bert_base_sentiment_analysis_french +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_sentiment_analysis_french` is an English model originally trained by ulrichING. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_analysis_french_en_5.1.4_3.4_1698829099931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_sentiment_analysis_french_en_5.1.4_3.4_1698829099931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sentiment_analysis_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_sentiment_analysis_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_sentiment_analysis_french| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| + +## References + +https://huggingface.co/ulrichING/bert-base-sentiment-analysis-french \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_cased_offensive_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_cased_offensive_tr.md new file mode 100644 index 000000000000..114fd56fc600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_cased_offensive_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish bert_base_turkish_cased_offensive BertForSequenceClassification from Overfit-GM +author: John Snow Labs +name: bert_base_turkish_cased_offensive +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_cased_offensive` is a Turkish model originally trained by Overfit-GM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_offensive_tr_5.1.4_3.4_1698871166738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_cased_offensive_tr_5.1.4_3.4_1698871166738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_cased_offensive","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_cased_offensive","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_cased_offensive| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| + +## References + +https://huggingface.co/Overfit-GM/bert-base-turkish-cased-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_job_advertisement_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_job_advertisement_tr.md new file mode 100644 index 000000000000..02335f9308a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_turkish_job_advertisement_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish bert_base_turkish_job_advertisement BertForSequenceClassification from nanelimon +author: John Snow Labs +name: bert_base_turkish_job_advertisement +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_turkish_job_advertisement` is a Turkish model originally trained by nanelimon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_turkish_job_advertisement_tr_5.1.4_3.4_1698830333963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_turkish_job_advertisement_tr_5.1.4_3.4_1698830333963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_job_advertisement","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_turkish_job_advertisement","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_turkish_job_advertisement| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.6 MB| + +## References + +https://huggingface.co/nanelimon/bert-base-turkish-job-advertisement \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ag_news_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ag_news_textattack_en.md new file mode 100644 index 000000000000..3bae0ce953c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ag_news_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_ag_news_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_ag_news_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ag_news_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ag_news_textattack_en_5.1.4_3.4_1698803827207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ag_news_textattack_en_5.1.4_3.4_1698803827207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ag_news_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ag_news_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ag_news_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_amazon_massive_intent_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_amazon_massive_intent_en.md new file mode 100644 index 000000000000..a7f3a19003f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_amazon_massive_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_amazon_massive_intent BertForSequenceClassification from cartesinus +author: John Snow Labs +name: bert_base_uncased_amazon_massive_intent +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_amazon_massive_intent` is an English model originally trained by cartesinus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_amazon_massive_intent_en_5.1.4_3.4_1698815335204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_amazon_massive_intent_en_5.1.4_3.4_1698815335204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_amazon_massive_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_amazon_massive_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_amazon_massive_intent| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/cartesinus/bert-base-uncased-amazon-massive-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_en.md new file mode 100644 index 000000000000..c5b4d2b582ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_cola BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_cola +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_en_5.1.4_3.4_1698819940237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_en_5.1.4_3.4_1698819940237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session named spark, e.g. from sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables toDS; assumes an active SparkSession named spark + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-CoLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_modeltc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_modeltc_en.md new file mode 100644 index 000000000000..7b9ad10de9fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_modeltc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_cola_modeltc BertForSequenceClassification from ModelTC +author: John Snow Labs +name: bert_base_uncased_cola_modeltc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola_modeltc` is an English model originally trained by ModelTC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_modeltc_en_5.1.4_3.4_1698810549116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_modeltc_en_5.1.4_3.4_1698810549116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_modeltc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_modeltc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola_modeltc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ModelTC/bert-base-uncased-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..585b2a343f59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_cola_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_cola_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_cola_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_cola_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_yoshitomo_matsubara_en_5.1.4_3.4_1698872378368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_cola_yoshitomo_matsubara_en_5.1.4_3.4_1698872378368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_cola_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_cola_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ear_misogyny_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ear_misogyny_en.md new file mode 100644 index 000000000000..97e396b09266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_ear_misogyny_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_ear_misogyny BertForSequenceClassification from MilaNLProc +author: John Snow Labs +name: bert_base_uncased_ear_misogyny +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ear_misogyny` is an English model originally trained by MilaNLProc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ear_misogyny_en_5.1.4_3.4_1698833227309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ear_misogyny_en_5.1.4_3.4_1698833227309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ear_misogyny","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_ear_misogyny","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ear_misogyny| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/MilaNLProc/bert-base-uncased-ear-misogyny \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_boolq_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_boolq_en.md new file mode 100644 index 000000000000..a74aef3c8fd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_boolq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_boolq BertForSequenceClassification from lewtun +author: John Snow Labs +name: bert_base_uncased_finetuned_boolq +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_boolq` is an English model originally trained by lewtun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_boolq_en_5.1.4_3.4_1698862097697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_boolq_en_5.1.4_3.4_1698862097697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_boolq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_boolq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_boolq| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/lewtun/bert-base-uncased-finetuned-boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_emotion_vasanth_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_emotion_vasanth_en.md new file mode 100644 index 000000000000..cd56ee48c19e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_emotion_vasanth_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_emotion_vasanth BertForSequenceClassification from Vasanth +author: John Snow Labs +name: bert_base_uncased_finetuned_emotion_vasanth +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_emotion_vasanth` is an English model originally trained by Vasanth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_emotion_vasanth_en_5.1.4_3.4_1698812328407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_emotion_vasanth_en_5.1.4_3.4_1698812328407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_emotion_vasanth","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_emotion_vasanth","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_emotion_vasanth| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Vasanth/bert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_sentiments_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_sentiments_en.md new file mode 100644 index 000000000000..1d88bbdba0a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_finetuned_sentiments_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_sentiments BertForSequenceClassification from RinInori +author: John Snow Labs +name: bert_base_uncased_finetuned_sentiments +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_sentiments` is an English model originally trained by RinInori. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sentiments_en_5.1.4_3.4_1698809733113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_sentiments_en_5.1.4_3.4_1698809733113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sentiments","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_finetuned_sentiments","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_sentiments| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/RinInori/bert-base-uncased_finetuned_sentiments \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_header_textsim_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_header_textsim_en.md new file mode 100644 index 000000000000..f2773c9ac481 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_header_textsim_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_header_textsim BertForSequenceClassification from kaanakdeniz +author: John Snow Labs +name: bert_base_uncased_header_textsim +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_header_textsim` is an English model originally trained by kaanakdeniz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_textsim_en_5.1.4_3.4_1698861281679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_header_textsim_en_5.1.4_3.4_1698861281679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_header_textsim","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_header_textsim","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_header_textsim| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kaanakdeniz/bert_base_uncased_header_textsim \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_imdb_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_imdb_textattack_en.md new file mode 100644 index 000000000000..1eb6dc82d10b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_imdb_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_imdb_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_imdb_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_imdb_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_textattack_en_5.1.4_3.4_1698809533724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_imdb_textattack_en_5.1.4_3.4_1698809533724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_imdb_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_imdb_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_imdb_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_en.md new file mode 100644 index 000000000000..374bcc931708 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mnli BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_mnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_en_5.1.4_3.4_1698811194109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_en_5.1.4_3.4_1698811194109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-MNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_ishan_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_ishan_en.md new file mode 100644 index 000000000000..f1bbe21647f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_ishan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mnli_ishan BertForSequenceClassification from ishan +author: John Snow Labs +name: bert_base_uncased_mnli_ishan +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_ishan` is an English model originally trained by ishan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_ishan_en_5.1.4_3.4_1698819074187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_ishan_en_5.1.4_3.4_1698819074187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_ishan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_ishan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_ishan| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ishan/bert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_modeltc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_modeltc_en.md new file mode 100644 index 000000000000..14d844c9dbb8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_modeltc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mnli_modeltc BertForSequenceClassification from ModelTC +author: John Snow Labs +name: bert_base_uncased_mnli_modeltc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_modeltc` is an English model originally trained by ModelTC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_modeltc_en_5.1.4_3.4_1698811199337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_modeltc_en_5.1.4_3.4_1698811199337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_modeltc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_modeltc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_modeltc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ModelTC/bert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..e09fd46b8a7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mnli_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mnli_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_mnli_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mnli_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_yoshitomo_matsubara_en_5.1.4_3.4_1698803081169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mnli_yoshitomo_matsubara_en_5.1.4_3.4_1698803081169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mnli_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mnli_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_modeltc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_modeltc_en.md new file mode 100644 index 000000000000..96ba19536127 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_modeltc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_modeltc BertForSequenceClassification from ModelTC +author: John Snow Labs +name: bert_base_uncased_mrpc_modeltc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_modeltc` is an English model originally trained by ModelTC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_modeltc_en_5.1.4_3.4_1698828860504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_modeltc_en_5.1.4_3.4_1698828860504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_modeltc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_modeltc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_modeltc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ModelTC/bert-base-uncased-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_textattack_en.md new file mode 100644 index 000000000000..0834f1cbf705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_mrpc_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_textattack_en_5.1.4_3.4_1698811019872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_textattack_en_5.1.4_3.4_1698811019872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..5db2833361a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_mrpc_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_mrpc_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_mrpc_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_mrpc_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_yoshitomo_matsubara_en_5.1.4_3.4_1698812598411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_mrpc_yoshitomo_matsubara_en_5.1.4_3.4_1698812598411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_mrpc_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_mrpc_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qnli_en.md new file mode 100644 index 000000000000..9f9163cdc67d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_qnli BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_qnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_en_5.1.4_3.4_1698813250680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qnli_en_5.1.4_3.4_1698813250680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-QNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qqp_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qqp_textattack_en.md new file mode 100644 index 000000000000..1c5bb5c8e8d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_qqp_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_qqp_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_qqp_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_qqp_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_textattack_en_5.1.4_3.4_1698803610871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_qqp_textattack_en_5.1.4_3.4_1698803610871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_qqp_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_qqp_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-QQP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rotten_tomatoes_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rotten_tomatoes_en.md new file mode 100644 index 000000000000..04e961de1eec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rotten_tomatoes_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English bert_base_uncased_rotten_tomatoes BertEmbeddings from textattack +author: John Snow Labs +name: bert_base_uncased_rotten_tomatoes +date: 2023-11-01 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_rotten_tomatoes` is an English model originally trained by textattack. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rotten_tomatoes_en_5.1.4_3.4_1698802282734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rotten_tomatoes_en_5.1.4_3.4_1698802282734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("bert_base_uncased_rotten_tomatoes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("bert_base_uncased_rotten_tomatoes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_rotten_tomatoes| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +References + +https://huggingface.co/textattack/bert-base-uncased-rotten_tomatoes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_en.md new file mode 100644 index 000000000000..e28ea39ac133 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_rte BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_rte +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_rte` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_en_5.1.4_3.4_1698804584079.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_en_5.1.4_3.4_1698804584079.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-RTE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_howey_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_howey_en.md new file mode 100644 index 000000000000..237554f88072 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_howey_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_rte_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_rte_howey +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_rte_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_howey_en_5.1.4_3.4_1698811190189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_howey_en_5.1.4_3.4_1698811190189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte_howey","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte_howey","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_rte_howey| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..c7b7dbe02de7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_rte_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_rte_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_rte_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_rte_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_yoshitomo_matsubara_en_5.1.4_3.4_1698836489621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_rte_yoshitomo_matsubara_en_5.1.4_3.4_1698836489621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_rte_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_rte_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_snli_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_snli_textattack_en.md new file mode 100644 index 000000000000..fd6a882c379b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_snli_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_snli_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_snli_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_snli_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_snli_textattack_en_5.1.4_3.4_1698826540593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_snli_textattack_en_5.1.4_3.4_1698826540593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_snli_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_snli_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_snli_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-snli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_howey_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_howey_en.md new file mode 100644 index 000000000000..092c7682e787 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_howey_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_howey BertForSequenceClassification from howey +author: John Snow Labs +name: bert_base_uncased_sst2_howey +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_howey_en_5.1.4_3.4_1698808234409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_howey_en_5.1.4_3.4_1698808234409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_howey","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_howey","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_howey| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/howey/bert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_modeltc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_modeltc_en.md new file mode 100644 index 000000000000..2c6453a4fec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_modeltc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_modeltc BertForSequenceClassification from ModelTC +author: John Snow Labs +name: bert_base_uncased_sst2_modeltc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_modeltc` is an English model originally trained by ModelTC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_modeltc_en_5.1.4_3.4_1698861856914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_modeltc_en_5.1.4_3.4_1698861856914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_modeltc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_modeltc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_modeltc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ModelTC/bert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..92f6d469f533 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst2_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst2_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_sst2_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst2_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_yoshitomo_matsubara_en_5.1.4_3.4_1698808968150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst2_yoshitomo_matsubara_en_5.1.4_3.4_1698808968150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst2_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst2_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst_2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst_2_en.md new file mode 100644 index 000000000000..e94b567ca592 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sst_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sst_2 BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_sst_2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sst_2` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_2_en_5.1.4_3.4_1698808229890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sst_2_en_5.1.4_3.4_1698808229890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sst_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sst_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-SST-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sts_b_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sts_b_en.md new file mode 100644 index 000000000000..fd1d29be81ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_sts_b_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_sts_b BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_sts_b +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_sts_b` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sts_b_en_5.1.4_3.4_1698809295505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_sts_b_en_5.1.4_3.4_1698809295505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sts_b","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_sts_b","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_sts_b| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-STS-B \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_stsb_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_stsb_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..e1c0af019a2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_stsb_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_stsb_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_base_uncased_stsb_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_stsb_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_stsb_yoshitomo_matsubara_en_5.1.4_3.4_1698820050544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_stsb_yoshitomo_matsubara_en_5.1.4_3.4_1698820050544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_stsb_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_stsb_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_stsb_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-base-uncased-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_wnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_wnli_en.md new file mode 100644 index 000000000000..c920f3856779 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_wnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_wnli BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_wnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_wnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_en_5.1.4_3.4_1698808053861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_wnli_en_5.1.4_3.4_1698808053861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_wnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_wnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_wnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-WNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yahoo_answers_topics_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yahoo_answers_topics_en.md new file mode 100644 index 000000000000..b33b28076543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yahoo_answers_topics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_yahoo_answers_topics BertForSequenceClassification from fabriceyhc +author: John Snow Labs +name: bert_base_uncased_yahoo_answers_topics +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_yahoo_answers_topics` is an English model originally trained by fabriceyhc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yahoo_answers_topics_en_5.1.4_3.4_1698808766199.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yahoo_answers_topics_en_5.1.4_3.4_1698808766199.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yahoo_answers_topics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yahoo_answers_topics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_yahoo_answers_topics| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/fabriceyhc/bert-base-uncased-yahoo_answers_topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yelp_polarity_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yelp_polarity_textattack_en.md new file mode 100644 index 000000000000..565c7d450213 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_base_uncased_yelp_polarity_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_base_uncased_yelp_polarity_textattack BertForSequenceClassification from textattack +author: John Snow Labs +name: bert_base_uncased_yelp_polarity_textattack +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_yelp_polarity_textattack` is an English model originally trained by textattack. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_polarity_textattack_en_5.1.4_3.4_1698809720709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_yelp_polarity_textattack_en_5.1.4_3.4_1698809720709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yelp_polarity_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_base_uncased_yelp_polarity_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_yelp_polarity_textattack| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/textattack/bert-base-uncased-yelp-polarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_book_zipper_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_book_zipper_en.md new file mode 100644 index 000000000000..7ab59c2ecaf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_book_zipper_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_book_zipper BertForSequenceClassification from abragin +author: John Snow Labs +name: bert_book_zipper +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_book_zipper` is an English model originally trained by abragin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_book_zipper_en_5.1.4_3.4_1698833281207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_book_zipper_en_5.1.4_3.4_1698833281207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_book_zipper","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_book_zipper","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_book_zipper| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/abragin/bert_book_zipper \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_nli_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_nli_zh.md new file mode 100644 index 000000000000..99e11ea0c74f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_nli_zh.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_classifier_erlangshen_roberta_110m_nli +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-110M-NLI` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`ENTAILMENT`, `NEUTRAL`, `CONTRADICTION` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_110m_nli_zh_5.1.4_3.4_1698797017458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_110m_nli_zh_5.1.4_3.4_1698797017458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_110m_nli","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_110m_nli","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.lang_110m").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_erlangshen_roberta_110m_nli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-110M-NLI +- https://github.com/IDEA-CCNL/Fengshenbang-LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_sentiment_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_sentiment_zh.md new file mode 100644 index 000000000000..4fb82a40eee3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_110m_sentiment_zh.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_classifier_erlangshen_roberta_110m_sentiment +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-110M-Sentiment` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_110m_sentiment_zh_5.1.4_3.4_1698797242694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_110m_sentiment_zh_5.1.4_3.4_1698797242694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_110m_sentiment","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_110m_sentiment","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.sentiment.lang_110m").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_erlangshen_roberta_110m_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment +- https://github.com/IDEA-CCNL/Fengshenbang-LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_330m_sentiment_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_330m_sentiment_zh.md new file mode 100644 index 000000000000..35847a2cb388 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_erlangshen_roberta_330m_sentiment_zh.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_classifier_erlangshen_roberta_330m_sentiment +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-330M-Sentiment` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_330m_sentiment_zh_5.1.4_3.4_1698797796164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_erlangshen_roberta_330m_sentiment_zh_5.1.4_3.4_1698797796164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_330m_sentiment","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_erlangshen_roberta_330m_sentiment","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.sentiment.lang_330m").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_erlangshen_roberta_330m_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-330M-Sentiment +- https://github.com/IDEA-CCNL/Fengshenbang-LM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_extra_bio_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_extra_bio_en.md new file mode 100644 index 000000000000..690a4b4f168a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_extra_bio_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from k-partha) +author: John Snow Labs +name: bert_classifier_extra_bio +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extrabert_bio` is an English model originally trained by `k-partha`. 
+ +## Predicted Entities + +`Introvert`, `Extravert` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_extra_bio_en_5.1.4_3.4_1698798071817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_extra_bio_en_5.1.4_3.4_1698798071817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_extra_bio","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_extra_bio","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.extra_bio.bert.by_k_partha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_extra_bio| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/k-partha/extrabert_bio +- https://arxiv.org/abs/2109.06402 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_finetuned_location_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_finetuned_location_en.md new file mode 100644 index 000000000000..eeecb9262983 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_finetuned_location_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Abderrahim2) +author: John Snow Labs +name: bert_classifier_finetuned_location +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-Location` is an English model originally trained by `Abderrahim2`. 
+ +## Predicted Entities + +`United Kingdom`, `United States`, `Australia` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_location_en_5.1.4_3.4_1698798379757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_finetuned_location_en_5.1.4_3.4_1698798379757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_location","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_finetuned_location","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.finetuned.by_abderrahim2").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_finetuned_location| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Abderrahim2/bert-finetuned-Location \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_gbert_base_finetuned_cefr_de.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_gbert_base_finetuned_cefr_de.md new file mode 100644 index 000000000000..867562721ee4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_gbert_base_finetuned_cefr_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Base Cased model (from BramVanroy) +author: John Snow Labs +name: bert_classifier_gbert_base_finetuned_cefr +date: 2023-11-01 +tags: [de, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert-base-finetuned-cefr` is a German model originally trained by `BramVanroy`. 
+ +## Predicted Entities + +`A2`, `B2`, `B1`, `C1`, `A1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_gbert_base_finetuned_cefr_de_5.1.4_3.4_1698798666320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_gbert_base_finetuned_cefr_de_5.1.4_3.4_1698798666320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_gbert_base_finetuned_cefr","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_gbert_base_finetuned_cefr","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_gbert_base_finetuned_cefr| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BramVanroy/gbert-base-finetuned-cefr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_italian_news_classification_headlines_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_italian_news_classification_headlines_xx.md new file mode 100644 index 000000000000..9e7186a205f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_italian_news_classification_headlines_xx.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from M47Labs) +author: John Snow Labs +name: bert_classifier_italian_news_classification_headlines +date: 2023-11-01 +tags: [distilbert, sequence_classification, open_source, it, en, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `italian_news_classification_headlines` is a Multilingual model originally trained by `M47Labs`. 
+ +## Predicted Entities + +`science and technology`, `health`, `society`, `weather`, `enviroment`, `sport`, `lifestyle and leisure`, `labour` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_italian_news_classification_headlines_xx_5.1.4_3.4_1698798959820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_italian_news_classification_headlines_xx_5.1.4_3.4_1698798959820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_italian_news_classification_headlines","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_italian_news_classification_headlines","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.news.by_m47labs").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_italian_news_classification_headlines| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|411.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/M47Labs/italian_news_classification_headlines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_letr_sol_profanity_filter_ko.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_letr_sol_profanity_filter_ko.md new file mode 100644 index 000000000000..b86724fa8fef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_letr_sol_profanity_filter_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean BertForSequenceClassification Cased model (from dobbytk) +author: John Snow Labs +name: bert_classifier_letr_sol_profanity_filter +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, ko, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `letr-sol-profanity-filter` is a Korean model originally trained by `dobbytk`. 
+ +## Predicted Entities + +`offensive`, `none`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_letr_sol_profanity_filter_ko_5.1.4_3.4_1698799293938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_letr_sol_profanity_filter_ko_5.1.4_3.4_1698799293938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_letr_sol_profanity_filter","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["나는 Spark NLP를 좋아합니다"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_letr_sol_profanity_filter","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("나는 Spark NLP를 좋아합니다").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_letr_sol_profanity_filter| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|408.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/dobbytk/letr-sol-profanity-filter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mental_health_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mental_health_trainer_en.md new file mode 100644 index 000000000000..9d6abc11255b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mental_health_trainer_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from edmundhui) +author: John Snow Labs +name: bert_classifier_mental_health_trainer +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_health_trainer` is an English model originally trained by `edmundhui`. 
+ +## Predicted Entities + +`aspergers`, `depression`, `ptsd`, `ADHD`, `OCD` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_mental_health_trainer_en_5.1.4_3.4_1698799555994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_mental_health_trainer_en_5.1.4_3.4_1698799555994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_mental_health_trainer","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_mental_health_trainer","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_edmundhui").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_mental_health_trainer| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/edmundhui/mental_health_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mini_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mini_sst2_distilled_en.md new file mode 100644 index 000000000000..26d9475b462b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_mini_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from moshew) +author: John Snow Labs +name: bert_classifier_mini_sst2_distilled +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-sst2-distilled` is an English model originally trained by `moshew`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_mini_sst2_distilled_en_5.1.4_3.4_1698799673425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_mini_sst2_distilled_en_5.1.4_3.4_1698799673425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_mini_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_mini_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.distilled_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_mini_sst2_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/moshew/bert-mini-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_navid_test_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_navid_test_en.md new file mode 100644 index 000000000000..2506b720fe6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_navid_test_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from navsad) +author: John Snow Labs +name: bert_classifier_navid_test +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `navid_test_bert` is a English model originally trained by `navsad`. + +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_navid_test_en_5.1.4_3.4_1698799922847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_navid_test_en_5.1.4_3.4_1698799922847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_navid_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_navid_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.by_navsad").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_navid_test| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/navsad/navid_test_bert +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_obgv_gder_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_obgv_gder_xx.md new file mode 100644 index 000000000000..8c35055b7fb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_obgv_gder_xx.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Multilingual BertForSequenceClassification Cased model (from mlkorra) +author: John Snow Labs +name: bert_classifier_obgv_gder +date: 2023-11-01 +tags: [distilbert, sequence_classification, open_source, hi, en, xx, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `obgv-gender-bert-hi-en` is a Multilingual model originally trained by `mlkorra`. 
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_obgv_gder_xx_5.1.4_3.4_1698800258732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_obgv_gder_xx_5.1.4_3.4_1698800258732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_obgv_gder","xx") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_obgv_gder","xx") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.classify.bert.by_mlkorra").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_obgv_gder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mlkorra/obgv-gender-bert-hi-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_pubmed_pubmed200krct_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_pubmed_pubmed200krct_en.md new file mode 100644 index 000000000000..4c7c6ac4f3df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_pubmed_pubmed200krct_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from pritamdeka) +author: John Snow Labs +name: bert_classifier_pubmed_pubmed200krct +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `PubMedBert-PubMed200kRCT` is an English model originally trained by `pritamdeka`.
+ +## Predicted Entities + +`METHODS`, `BACKGROUND`, `RESULTS`, `OBJECTIVE`, `CONCLUSIONS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_pubmed_pubmed200krct_en_5.1.4_3.4_1698800509521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_pubmed_pubmed200krct_en_5.1.4_3.4_1698800509521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pubmed_pubmed200krct","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_pubmed_pubmed200krct","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.pubmed.by_pritamdeka").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_pubmed_pubmed200krct| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pritamdeka/PubMedBert-PubMed200kRCT +- https://github.com/Franck-Dernoncourt/pubmed-rct/tree/master/PubMed_200k_RCT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_question_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_question_detection_en.md new file mode 100644 index 000000000000..255e19ff6a13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_question_detection_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from huaen) +author: John Snow Labs +name: bert_classifier_question_detection +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_detection` is an English model originally trained by `huaen`.
+ +## Predicted Entities + +`question`, `non_question` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_question_detection_en_5.1.4_3.4_1698800762344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_question_detection_en_5.1.4_3.4_1698800762344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_question_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_question_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_huaen").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_question_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/huaen/question_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_republic_nl.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_republic_nl.md new file mode 100644 index 000000000000..6775ef2b2d0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_republic_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from clips) +author: John Snow Labs +name: bert_classifier_republic +date: 2023-11-01 +tags: [nl, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `republic` is a Dutch model originally trained by `clips`. + +## Predicted Entities + +`neu`, `pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_republic_nl_5.1.4_3.4_1698801109096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_republic_nl_5.1.4_3.4_1698801109096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_republic","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_republic","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_republic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/clips/republic +- https://www.uantwerpen.be/en/staff/jan-boon/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_chinanews_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_chinanews_chinese_zh.md new file mode 100644 index 000000000000..433d5dc74738 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_chinanews_chinese_zh.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from uer) +author: John Snow Labs +name: bert_classifier_roberta_base_finetuned_chinanews_chinese +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-chinanews-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`entertainment`, `mainland China politics`, `financial news`, `sports`, `Hong Kong - Macau politics`, `culture`, `International news` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_chinanews_chinese_zh_5.1.4_3.4_1698801375541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_chinanews_chinese_zh_5.1.4_3.4_1698801375541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_chinanews_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_chinanews_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.news.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_roberta_base_finetuned_chinanews_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-chinanews-chinese +- https://arxiv.org/abs/1909.05658 +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/zhangxiangxiao/glyph +- https://arxiv.org/abs/1708.02657 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_dianping_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_dianping_chinese_zh.md new file mode 100644 index 000000000000..3ed361cdc3b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_dianping_chinese_zh.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from uer) +author: John Snow Labs +name: bert_classifier_roberta_base_finetuned_dianping_chinese +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-dianping-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`positive (stars 4 and 5)` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_dianping_chinese_zh_5.1.4_3.4_1698801596846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_dianping_chinese_zh_5.1.4_3.4_1698801596846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_dianping_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_dianping_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.base_finetuned_dianping_chinese.by_uer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_roberta_base_finetuned_dianping_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-dianping-chinese +- https://arxiv.org/abs/1909.05658 +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/zhangxiangxiao/glyph +- https://arxiv.org/abs/1708.02657 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_ifeng_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_ifeng_chinese_zh.md new file mode 100644 index 000000000000..500d742db223 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_roberta_base_finetuned_ifeng_chinese_zh.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from uer) +author: John Snow Labs +name: bert_classifier_roberta_base_finetuned_ifeng_chinese +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-ifeng-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`mainland China politics`, `military news`, `Taiwan - Hong Kong - Macau politics`, `society news`, `International news` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_ifeng_chinese_zh_5.1.4_3.4_1698801904266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_roberta_base_finetuned_ifeng_chinese_zh_5.1.4_3.4_1698801904266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_ifeng_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_roberta_base_finetuned_ifeng_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.classify.bert.base_finetuned_ifeng_chinese.by_uer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_roberta_base_finetuned_ifeng_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-ifeng-chinese +- https://arxiv.org/abs/1909.05658 +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/zhangxiangxiao/glyph +- https://arxiv.org/abs/1708.02657 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_russian_toxicity_ru.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_russian_toxicity_ru.md new file mode 100644 index 000000000000..df78711d16f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_russian_toxicity_ru.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Russian BertForSequenceClassification Cased model (from SkolkovoInstitute) +author: John Snow Labs +name: bert_classifier_russian_toxicity +date: 2023-11-01 +tags: [ru, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `russian_toxicity_classifier` is a Russian model originally trained by `SkolkovoInstitute`. 
+ +## Predicted Entities + +`toxic`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_russian_toxicity_ru_5.1.4_3.4_1698802257253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_russian_toxicity_ru_5.1.4_3.4_1698802257253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_russian_toxicity","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_russian_toxicity","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.bert.by_skolkovoinstitute").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_russian_toxicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/SkolkovoInstitute/russian_toxicity_classifier +- https://www.kaggle.com/blackmoon/russian-language-toxic-comments/metadata +- https://www.kaggle.com/alexandersemiletov/toxic-russian-comments +- http://creativecommons.org/licenses/by-nc-sa/4.0/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sec_finetuned_finance_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sec_finetuned_finance_classification_en.md new file mode 100644 index 000000000000..e35de4a6084e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sec_finetuned_finance_classification_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from nickmuchi) +author: John Snow Labs +name: bert_classifier_sec_finetuned_finance_classification +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sec-bert-finetuned-finance-classification` is an English model originally trained by `nickmuchi`. 
+ +## Predicted Entities + +`bearish`, `neutral`, `bullish` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sec_finetuned_finance_classification_en_5.1.4_3.4_1698802680649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sec_finetuned_finance_classification_en_5.1.4_3.4_1698802680649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sec_finetuned_finance_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sec_finetuned_finance_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.finetuned.by_nickmuchi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sec_finetuned_finance_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nickmuchi/sec-bert-finetuned-finance-classification +- https://www.kaggle.com/percyzheng/sentiment-classification-selflabel-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_semantic_relations_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_semantic_relations_en.md new file mode 100644 index 000000000000..ce4ede8bbad9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_semantic_relations_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from leonweber) +author: John Snow Labs +name: bert_classifier_semantic_relations +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `semantic_relations` is an English model originally trained by `leonweber`. 
+ +## Predicted Entities + +`PREVENT`, `SIDE_EFF`, `TREAT_FOR_DIS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_semantic_relations_en_5.1.4_3.4_1698802964453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_semantic_relations_en_5.1.4_3.4_1698802964453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_semantic_relations","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_semantic_relations","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_leonweber").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_semantic_relations| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/leonweber/semantic_relations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentence_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentence_en.md new file mode 100644 index 000000000000..c1dc0e2b099d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentence_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from juancavallotti) +author: John Snow Labs +name: bert_classifier_sentence +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sentence_classifier` is an English model originally trained by `juancavallotti`. 
+ +## Predicted Entities + +`HOME & LIVING`, `ARTS & CULTURE`, `ENVIRONMENT`, `MEDIA`, `STYLE & BEAUTY`, `FOOD & DRINK`, `GREEN`, `TRAVEL`, `BUSINESS`, `POLITICS`, `SCIENCE`, `WORLD NEWS`, `WELLNESS`, `TECH`, `COMEDY`, `SPORTS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sentence_en_5.1.4_3.4_1698796871939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sentence_en_5.1.4_3.4_1698796871939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentence","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentence","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_juancavallotti").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sentence| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/juancavallotti/bert_sentence_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentiment_analysis_en.md new file mode 100644 index 000000000000..aded28ff220b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from tomato) +author: John Snow Labs +name: bert_classifier_sentiment_analysis +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis` is an English model originally trained by `tomato`. 
+ +## Predicted Entities + +`3 stars`, `4 stars`, `2 stars`, `5 stars`, `1 star` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sentiment_analysis_en_5.1.4_3.4_1698803299571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sentiment_analysis_en_5.1.4_3.4_1698803299571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sentiment.by_tomato").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tomato/sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_fine_tuned_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_fine_tuned_cola_en.md new file mode 100644 index 000000000000..ba89c0a37ac3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_fine_tuned_cola_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sgugger) +author: John Snow Labs +name: bert_classifier_sgugger_fine_tuned_cola +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-fine-tuned-cola` is an English model originally trained by `sgugger`. 
+ +## Predicted Entities + +`unacceptable`, `acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sgugger_fine_tuned_cola_en_5.1.4_3.4_1698803588125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sgugger_fine_tuned_cola_en_5.1.4_3.4_1698803588125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sgugger_fine_tuned_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sgugger_fine_tuned_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue_cola1.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sgugger_fine_tuned_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgugger/bert-fine-tuned-cola +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_finetuned_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_finetuned_mrpc_en.md new file mode 100644 index 000000000000..5072509ab9fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_sgugger_finetuned_mrpc_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from sgugger) +author: John Snow Labs +name: bert_classifier_sgugger_finetuned_mrpc +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned-bert-mrpc` is an English model originally trained by `sgugger`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_sgugger_finetuned_mrpc_en_5.1.4_3.4_1698803854427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_sgugger_finetuned_mrpc_en_5.1.4_3.4_1698803854427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sgugger_finetuned_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_sgugger_finetuned_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.finetuned.by_sgugger").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_sgugger_finetuned_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sgugger/finetuned-bert-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing3_multilavel_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing3_multilavel_en.md new file mode 100644 index 000000000000..cd4dd1f6a655 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing3_multilavel_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from shogumbo) +author: John Snow Labs +name: bert_classifier_testing3_multilavel +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testing3-multilavel-classifier` is an English model originally trained by `shogumbo`. 
+ +## Predicted Entities + +`Amusing`, `Suspenseful`, `Emotional`, `Dark`, `Thrilling` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_testing3_multilavel_en_5.1.4_3.4_1698804122814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_testing3_multilavel_en_5.1.4_3.4_1698804122814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_testing3_multilavel","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_testing3_multilavel","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v1by_shogumbo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_testing3_multilavel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shogumbo/testing3-multilavel-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing4_multilabel_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing4_multilabel_en.md new file mode 100644 index 000000000000..1587c30c8c8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_testing4_multilabel_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from shogumbo) +author: John Snow Labs +name: bert_classifier_testing4_multilabel +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testing4-multilabel-classifier` is an English model originally trained by `shogumbo`. 
+ +## Predicted Entities + +`Amusing`, `Suspenseful`, `Emotional`, `Dark`, `Thrilling` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_testing4_multilabel_en_5.1.4_3.4_1698797443300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_testing4_multilabel_en_5.1.4_3.4_1698797443300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_testing4_multilabel","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_testing4_multilabel","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.v2by_shogumbo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_testing4_multilabel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shogumbo/testing4-multilabel-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_tiny_mnli_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_tiny_mnli_distilled_en.md new file mode 100644 index 000000000000..454f01813c8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_tiny_mnli_distilled_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from nbhimte) +author: John Snow Labs +name: bert_classifier_tiny_mnli_distilled +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-mnli-distilled` is an English model originally trained by `nbhimte`. 
+ +## Predicted Entities + +`contradiction`, `entailment`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_mnli_distilled_en_5.1.4_3.4_1698804272057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_tiny_mnli_distilled_en_5.1.4_3.4_1698804272057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_mnli_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_tiny_mnli_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.glue.distilled_tiny.by_nbhimte").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_tiny_mnli_distilled| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nbhimte/tiny-bert-mnli-distilled +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_titlewave_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_titlewave_base_uncased_en.md new file mode 100644 index 000000000000..1b53f09d49ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_titlewave_base_uncased_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from tennessejoyce) +author: John Snow Labs +name: bert_classifier_titlewave_base_uncased +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `titlewave-bert-base-uncased` is an English model originally trained by `tennessejoyce`.
+ +## Predicted Entities + +`Unanswered`, `Answered` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_titlewave_base_uncased_en_5.1.4_3.4_1698804528437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_titlewave_base_uncased_en_5.1.4_3.4_1698804528437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_titlewave_base_uncased","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_titlewave_base_uncased","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.uncased_base.by_tennessejoyce").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_titlewave_base_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tennessejoyce/titlewave-bert-base-uncased +- https://github.com/tennessejoyce/TitleWave +- https://archive.org/details/stackexchange \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_topic_v5_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_topic_v5_en.md new file mode 100644 index 000000000000..a017a8424583 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_topic_v5_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from slowturtle) +author: John Snow Labs +name: bert_classifier_topic_v5 +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_v5` is an English model originally trained by `slowturtle`.
+ +## Predicted Entities + +`bottlenecks`, `responsability`, `srcs`, `justify`, `agree`, `procedures`, `world`, `experience`, `planning`, `jobrole`, `efficiency`, `ergonomics`, `suggestion`, `motivation_commitment`, `shame`, `impact`, `career`, `timesheet`, `transport`, `autonomy`, `collaboration`, `personal`, `burnout_stress`, `integration`, `employee`, `relationship`, `mental_health`, `learning`, `proud`, `delivery`, `communication`, `bureaucracy`, `support`, `ethics`, `turnover`, `changes`, `lifebalance`, `rwxp`, `fear`, `workload`, `environment`, `improvement`, `pandemics`, `allgood`, `diversity`, `questioning_criticism`, `clients`, `salary`, `safety`, `performance`, `health`, `clarity`, `growth`, `behaviour`, `product`, `recognition`, `challenges`, `skills`, `facilities`, `respect`, `routine`, `benefits`, `leadership`, `training`, `culture_values`, `feedback`, `disagree`, `attrition` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_topic_v5_en_5.1.4_3.4_1698804800868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_topic_v5_en_5.1.4_3.4_1698804800868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_topic_v5","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_topic_v5","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_slowturtle").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_topic_v5| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/slowturtle/topic_v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_toxic_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_toxic_en.md new file mode 100644 index 000000000000..bd9766807dde --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_toxic_en.md @@ -0,0 +1,120 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from unitary) +author: John Snow Labs +name: bert_classifier_toxic +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic-bert` is an English model originally trained by `unitary`. + +## Predicted Entities + +`obscene`, `insult`, `severe_toxic`, `identity_hate`, `threat`, `toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_toxic_en_5.1.4_3.4_1698797776713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_toxic_en_5.1.4_3.4_1698797776713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_toxic","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_toxic","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.by_unitary").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_toxic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/unitary/toxic-bert +- https://github.com/unitaryai/detoxify/issues/15 +- https://github.com/unitaryai/detoxify +- https://laurahanu.github.io/ +- https://www.unitary.ai/ +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification +- https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification +- https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf +- https://arxiv.org/pdf/1703.04009.pdf%201.pdf +- https://arxiv.org/pdf/1905.12516.pdf +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data +- https://www.kaggle.com/miklgr500/jigsaw-train-multilingual-coments-google-api +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/overview/evaluation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_product_comment_sentiment_classification_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_product_comment_sentiment_classification_tr.md new file mode 100644 index 000000000000..826dbdc5a031 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_product_comment_sentiment_classification_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from gurkan08) +author: John Snow Labs +name: bert_classifier_turkish_product_comment_sentiment_classification +date: 2023-11-01 +tags: [bert, sequence_classification, 
classification, open_source, tr, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish-product-comment-sentiment-classification` is a Turkish model originally trained by `gurkan08`. + +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_turkish_product_comment_sentiment_classification_tr_5.1.4_3.4_1698805112222.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_turkish_product_comment_sentiment_classification_tr_5.1.4_3.4_1698805112222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_turkish_product_comment_sentiment_classification","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_turkish_product_comment_sentiment_classification","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.sentiment.by_gurkan08").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_turkish_product_comment_sentiment_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gurkan08/turkish-product-comment-sentiment-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_sentiment_analysis_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_sentiment_analysis_tr.md new file mode 100644 index 000000000000..beb8961945ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_turkish_sentiment_analysis_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from emre) +author: John Snow Labs +name: bert_classifier_turkish_sentiment_analysis +date: 2023-11-01 +tags: [tr, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish-sentiment-analysis` is a Turkish model originally trained by `emre`. 
+ +## Predicted Entities + +`Positive`, `Notr`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_turkish_sentiment_analysis_tr_5.1.4_3.4_1698805518770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_turkish_sentiment_analysis_tr_5.1.4_3.4_1698805518770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_turkish_sentiment_analysis","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_turkish_sentiment_analysis","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.bert.sentiment.by_emre").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_turkish_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|691.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/emre/turkish-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09_nl.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09_nl.md new file mode 100644 index 000000000000..6bae6baa9cd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from Jeska) +author: John Snow Labs +name: bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09 +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialogQonly09` is a Dutch model originally trained by `Jeska`. 
+ +## Predicted Entities + +`faq_ask_astrazeneca`, `faq_ask_kinderen`, `faq_ask_risicopatient_kanker`, `faq_ask_begeleiding`, `faq_ask_vakantie`, `faq_ask_eerste_prik_buitenland`, `faq_ask_mantelzorger`, `faq_ask_waarom_ouderen_eerst`, `faq_ask_corona_vermijden`, `faq_ask_janssen_een_dosis`, `faq_ask_risicopatient_luchtwegaandoening`, `faq_ask_wie_nu`, `faq_ask_probleem_registratie`, `faq_ask_hersenziekte`, `faq_ask_twijfel_effectiviteit`, `faq_ask_auto-immuun`, `faq_ask_buitenlander`, `faq_ask_prioritaire_gropen`, `faq_ask_astrazeneca_bij_ouderen`, `faq_ask_tweede_dosis_afspraak`, `faq_ask_iedereen`, `faq_ask_vaccine_covid_gehad`, `faq_ask_algemeen_info`, `faq_ask_astrazeneca_prik_2`, `faq_ask_man_vrouw_verschillen`, `faq_ask_twijfel_ontwikkeling`, `faq_ask_janssen`, `faq_ask_keuze_vaccinatiecentrum`, `faq_ask_logistiek_veilig`, `faq_ask_wat_is_vaccin`, `chitchat_ask_name`, `faq_ask_zwanger`, `faq_ask_wat_na_vaccinatie`, `faq_ask_duur_vaccinatie`, `nlu_fallback`, `faq_ask_risicopatient_immuunziekte`, `faq_ask_vrijwilliger`, `faq_ask_huisdieren`, `faq_ask_hoeveel_dosissen`, `faq_ask_waarom_twee_prikken`, `faq_ask_groepsimmuniteit`, `faq_ask_risicopatient`, `faq_ask_bijwerking_JJ`, `faq_ask_onvruchtbaar`, `faq_ask_keuze`, `faq_ask_alternatieve_medicatie`, `faq_ask_kosjer_halal`, `faq_ask_snel_ontwikkeld`, `faq_ask_chronisch_ziek`, `faq_ask_tweede_dosis_vervroegen`, `faq_ask_wie_is_risicopatient`, `faq_ask_complottheorie_5G`, `faq_ask_gezondheidstoestand_gekend`, `chitchat_ask_hi_fr`, `faq_ask_besmetten_na_vaccin`, `faq_ask_logistiek`, `faq_ask_autisme_na_vaccinatie`, `faq_ask_bijsluiter`, `faq_ask_corona_is_griep`, `faq_ask_aantal_gevaccineerd`, `faq_ask_betalen_voor_vaccin`, `faq_ask_vaccinatiecentrum`, `faq_ask_wilsonbekwaam`, `test`, `faq_ask_huisarts`, `faq_ask_moderna`, `faq_ask_bijwerking_algemeen`, `faq_ask_covid_door_vaccin`, `faq_ask_hoe_weet_overheid`, `faq_ask_uitnodiging_na_vaccinatie`, `faq_ask_verschillen`, `faq_ask_vrijwillig_Janssen`, 
`chitchat_ask_hi_de`, `faq_ask_betrouwbaar`, `faq_ask_wanneer_algemene_bevolking`, `faq_ask_andere_vaccins`, `faq_ask_geen_uitnodiging`, `faq_ask_allergisch_na_vaccinatie`, `chitchat_ask_bye`, `faq_ask_qvax_probleem`, `faq_ask_vaccin_doorgeven`, `faq_ask_vaccin_immuunsysteem`, `faq_ask_welke_vaccin`, `faq_ask_hoe_dodelijk`, `faq_ask_geen_antwoord`, `faq_ask_jong_en_gezond`, `faq_ask_twijfel_ivm_vaccinatie`, `faq_ask_eerst_weigeren`, `faq_ask_reproductiegetal`, `faq_ask_waarom_niet_verplicht`, `faq_ask_test_voor_vaccin`, `faq_ask_taxi`, `faq_ask_waarom_twijfel`, `faq_ask_wie_ben_ik`, `chitchat_ask_hi`, `faq_ask_uit_flacon`, `faq_ask_risicopatient_diabetes`, `faq_ask_privacy`, `faq_ask_wanneer_iedereen_gevaccineerd`, `faq_ask_tijd_tot_tweede_dosis`, `faq_ask_borstvoeding`, `get_started`, `faq_ask_contra_ind`, `faq_ask_trage_start`, `faq_ask_aantal_gevaccineerd_wereldwijd`, `faq_ask_twijfel_praktisch`, `faq_ask_waarom`, `faq_ask_bijwerking_lange_termijn`, `faq_ask_naaldangst`, `faq_ask_ontwikkeling`, `faq_ask_wat_is_rna`, `faq_ask_mondmasker`, `faq_ask_twijfel_bijwerkingen`, `faq_ask_twijfel_vaccins_zelf`, `faq_ask_uitnodiging_afspraak_kwijt`, `faq_ask_hartspierontsteking`, `faq_ask_astrazeneca_bloedklonters`, `faq_ask_pfizer`, `faq_ask_twijfel_noodzaak`, `faq_ask_wie_doet_inenting`, `faq_ask_curevac`, `faq_ask_welk_vaccin_krijg_ik`, `faq_ask_bloed_geven`, `faq_ask_dna`, `faq_ask_planning_ouderen`, `faq_ask_pijnstiller`, `faq_ask_mrna_vs_andere_vaccins`, `faq_ask_nadelen`, `faq_ask_beschermen`, `faq_ask_gestockeerd`, `faq_ask_gif_in_vaccin`, `faq_ask_twijfel_inhoud`, `faq_ask_vaccine_covid_gehad_effect`, `faq_ask_veelgestelde_vragen`, `faq_ask_info_vaccins`, `faq_ask_vaccin_variant`, `faq_ask_vegan`, `faq_ask_bijwerking_pfizer`, `faq_ask_planning_eerstelijnszorg`, `faq_ask_timing_andere_vaccins`, `chitchat_ask_hoe_gaat_het`, `faq_ask_twijfel_vrijheid`, `faq_ask_risicopatient_hartvaat`, `faq_ask_beschermingsduur`, `faq_ask_betrouwbare_bronnen`, `faq_ask_smaakverlies`, 
`faq_ask_waar_en_wanneer`, `faq_ask_derde_prik`, `faq_ask_afspraak_gemist`, `faq_ask_motiveren`, `faq_ask_positieve_test_na_vaccin`, `faq_ask_beschermingspercentage`, `faq_ask_magnetisch`, `faq_ask_problemen_uitnodiging`, `faq_ask_experimenteel`, `chitchat_ask_hi_en`, `faq_ask_bijwerking_moderna`, `faq_ask_meer_bijwerkingen_tweede_dosis`, `faq_ask_testen`, `faq_ask_nuchter`, `faq_ask_quarantaine`, `faq_ask_essentieel_beroep`, `faq_ask_bijwerking_AZ`, `faq_ask_sneller_aan_de_beurt`, `faq_ask_complottheorie_Bill_Gates`, `faq_ask_complottheorie`, `faq_ask_afspraak_afzeggen`, `faq_ask_minder_mobiel`, `faq_ask_combi`, `faq_ask_maximaal_een_dosis`, `faq_ask_phishing`, `faq_ask_goedkeuring`, `faq_ask_verplicht`, `faq_ask_geen_risicopatient`, `faq_ask_attest`, `chitchat_ask_thanks`, `faq_ask_leveringen`, `faq_ask_oplopen_vaccinatie`, `faq_ask_wat_is_corona`, `faq_ask_foetus`, `faq_ask_in_vaccin` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09_nl_5.1.4_3.4_1698797055685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09_nl_5.1.4_3.4_1698797055685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.frombertje2_dadialogqonly09.by_jeska").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vaccinchatsentenceclassifierdutch_frombertje2_dadialogqonly09| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jeska/VaccinChatSentenceClassifierDutch_fromBERTje2_DAdialogQonly09 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial_nl.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial_nl.md new file mode 100644 index 000000000000..e2ad3acac288 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch BertForSequenceClassification Cased model (from Jeska) +author: John Snow Labs +name: bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `VaccinChatSentenceClassifierDutch_fromBERTjeDIAL` is a Dutch model originally trained by `Jeska`. 
+ +## Predicted Entities + +`faq_ask_taxi`, `faq_ask_twijfel_ivm_vaccinatie`, `faq_ask_naaldangst`, `faq_ask_positieve_test_na_vaccin`, `faq_ask_experimenteel`, `faq_ask_risicopatient`, `faq_ask_geen_uitnodiging`, `faq_ask_beschermingspercentage`, `faq_ask_vaccin_doorgeven`, `faq_ask_curevac`, `faq_ask_waarom`, `nlu_fallback`, `faq_ask_bijwerking_moderna`, `faq_ask_risicopatient_kanker`, `faq_ask_verschillen`, `faq_ask_keuze`, `faq_ask_huisarts`, `faq_ask_wie_doet_inenting`, `chitchat_ask_hi`, `faq_ask_algemeen_info`, `faq_ask_tijd_tot_tweede_dosis`, `faq_ask_twijfel_ontwikkeling`, `faq_ask_eerst_weigeren`, `faq_ask_hoe_weet_overheid`, `faq_ask_wanneer_iedereen_gevaccineerd`, `faq_ask_jong_en_gezond`, `faq_ask_mondmasker`, `faq_ask_privacy`, `faq_ask_derde_prik`, `faq_ask_moderna`, `faq_ask_vaccine_covid_gehad`, `faq_ask_betrouwbaar`, `faq_ask_hersenziekte`, `faq_ask_waarom_niet_verplicht`, `faq_ask_bijwerking_pfizer`, `faq_ask_buitenlander`, `chitchat_ask_bye`, `faq_ask_wie_ben_ik`, `faq_ask_quarantaine`, `faq_ask_wie_nu`, `faq_ask_beschermen`, `faq_ask_mantelzorger`, `faq_ask_testen`, `faq_ask_borstvoeding`, `faq_ask_afspraak_afzeggen`, `faq_ask_twijfel_effectiviteit`, `faq_ask_betalen_voor_vaccin`, `faq_ask_welk_vaccin_krijg_ik`, `faq_ask_vaccinatiecentrum`, `faq_ask_logistiek_veilig`, `faq_ask_aantal_gevaccineerd`, `faq_ask_tweede_dosis_vervroegen`, `faq_ask_corona_vermijden`, `faq_ask_info_vaccins`, `faq_ask_risicopatient_immuunziekte`, `faq_ask_in_vaccin`, `test`, `faq_ask_geen_risicopatient`, `faq_ask_twijfel_inhoud`, `faq_ask_keuze_vaccinatiecentrum`, `faq_ask_nadelen`, `faq_ask_astrazeneca_prik_2`, `faq_ask_twijfel_vrijheid`, `faq_ask_bijwerking_AZ`, `faq_ask_contra_ind`, `faq_ask_gestockeerd`, `faq_ask_wanneer_algemene_bevolking`, `faq_ask_wat_is_vaccin`, `faq_ask_waarom_twijfel`, `faq_ask_veelgestelde_vragen`, `faq_ask_gezondheidstoestand_gekend`, `faq_ask_risicopatient_diabetes`, `faq_ask_vrijwilliger`, `faq_ask_wat_is_corona`, `faq_ask_iedereen`, 
`chitchat_ask_hi_fr`, `faq_ask_nuchter`, `faq_ask_wat_na_vaccinatie`, `faq_ask_alternatieve_medicatie`, `faq_ask_bijwerking_algemeen`, `faq_ask_begeleiding`, `faq_ask_duur_vaccinatie`, `faq_ask_janssen`, `faq_ask_hoeveel_dosissen`, `faq_ask_hartspierontsteking`, `faq_ask_bijwerking_lange_termijn`, `faq_ask_dna`, `faq_ask_gif_in_vaccin`, `faq_ask_planning_eerstelijnszorg`, `faq_ask_reproductiegetal`, `chitchat_ask_thanks`, `faq_ask_problemen_uitnodiging`, `faq_ask_covid_door_vaccin`, `faq_ask_combi`, `faq_ask_tweede_dosis_afspraak`, `faq_ask_kosjer_halal`, `get_started`, `faq_ask_vrijwillig_Janssen`, `faq_ask_groepsimmuniteit`, `faq_ask_smaakverlies`, `faq_ask_astrazeneca_bloedklonters`, `faq_ask_complottheorie_Bill_Gates`, `faq_ask_ontwikkeling`, `faq_ask_vaccin_immuunsysteem`, `faq_ask_magnetisch`, `faq_ask_mrna_vs_andere_vaccins`, `faq_ask_test_voor_vaccin`, `faq_ask_betrouwbare_bronnen`, `faq_ask_astrazeneca`, `faq_ask_man_vrouw_verschillen`, `faq_ask_twijfel_bijwerkingen`, `faq_ask_eerste_prik_buitenland`, `faq_ask_sneller_aan_de_beurt`, `faq_ask_complottheorie_5G`, `faq_ask_leveringen`, `faq_ask_essentieel_beroep`, `faq_ask_geen_antwoord`, `faq_ask_twijfel_vaccins_zelf`, `faq_ask_waarom_twee_prikken`, `faq_ask_andere_vaccins`, `faq_ask_beschermingsduur`, `faq_ask_complottheorie`, `faq_ask_uit_flacon`, `faq_ask_qvax_probleem`, `faq_ask_waar_en_wanneer`, `faq_ask_onvruchtbaar`, `faq_ask_janssen_een_dosis`, `chitchat_ask_hoe_gaat_het`, `faq_ask_probleem_registratie`, `faq_ask_kinderen`, `faq_ask_trage_start`, `faq_ask_timing_andere_vaccins`, `faq_ask_uitnodiging_na_vaccinatie`, `faq_ask_snel_ontwikkeld`, `faq_ask_vakantie`, `faq_ask_foetus`, `faq_ask_risicopatient_luchtwegaandoening`, `faq_ask_bijwerking_JJ`, `faq_ask_risicopatient_hartvaat`, `faq_ask_afspraak_gemist`, `faq_ask_meer_bijwerkingen_tweede_dosis`, `faq_ask_zwanger`, `faq_ask_pijnstiller`, `faq_ask_verplicht`, `faq_ask_autisme_na_vaccinatie`, `faq_ask_chronisch_ziek`, `faq_ask_wilsonbekwaam`, 
`faq_ask_vaccin_variant`, `faq_ask_auto-immuun`, `faq_ask_besmetten_na_vaccin`, `faq_ask_huisdieren`, `faq_ask_prioritaire_gropen`, `faq_ask_maximaal_een_dosis`, `faq_ask_goedkeuring`, `faq_ask_wie_is_risicopatient`, `faq_ask_pfizer`, `faq_ask_bijsluiter`, `faq_ask_corona_is_griep`, `faq_ask_welke_vaccin`, `faq_ask_vaccine_covid_gehad_effect`, `faq_ask_waarom_ouderen_eerst`, `faq_ask_vegan`, `faq_ask_bloed_geven`, `faq_ask_oplopen_vaccinatie`, `faq_ask_minder_mobiel`, `faq_ask_hoe_dodelijk`, `chitchat_ask_hi_en`, `faq_ask_logistiek`, `faq_ask_attest`, `chitchat_ask_hi_de`, `faq_ask_astrazeneca_bij_ouderen`, `faq_ask_planning_ouderen`, `faq_ask_motiveren`, `faq_ask_uitnodiging_afspraak_kwijt`, `chitchat_ask_name`, `faq_ask_phishing`, `faq_ask_twijfel_praktisch`, `faq_ask_wat_is_rna`, `faq_ask_aantal_gevaccineerd_wereldwijd`, `faq_ask_allergisch_na_vaccinatie`, `faq_ask_twijfel_noodzaak` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial_nl_5.1.4_3.4_1698797300348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial_nl_5.1.4_3.4_1698797300348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.bert.frombertjedial.by_jeska").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_vaccinchatsentenceclassifierdutch_frombertjedial| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jeska/VaccinChatSentenceClassifierDutch_fromBERTjeDIAL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_emotion_en.md new file mode 100644 index 000000000000..18d5d3ad0854 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bergum) +author: John Snow Labs +name: bert_classifier_xtremedistil_emotion +date: 2023-11-01 +tags: [bert, sequence_classification, classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-emotion` is an English model originally trained by `bergum`. 
+ +## Predicted Entities + +`anger`, `sadness`, `fear`, `joy`, `love`, `surprise` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_emotion_en_5.1.4_3.4_1698797452848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_emotion_en_5.1.4_3.4_1698797452848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.emotion.xtremedistiled").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_xtremedistil_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bergum/xtremedistil-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_l12_h384_uncased_pub_section_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_l12_h384_uncased_pub_section_en.md new file mode 100644 index 000000000000..656cf26f07fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_classifier_xtremedistil_l12_h384_uncased_pub_section_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Uncased model (from ml4pubmed) +author: John Snow Labs +name: bert_classifier_xtremedistil_l12_h384_uncased_pub_section +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l12-h384-uncased_pub_section` is an English model originally trained by `ml4pubmed`. 
+ +## Predicted Entities + +`CONCLUSIONS`, `METHODS`, `OBJECTIVE`, `RESULTS`, `BACKGROUND` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_l12_h384_uncased_pub_section_en_5.1.4_3.4_1698805755471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classifier_xtremedistil_l12_h384_uncased_pub_section_en_5.1.4_3.4_1698805755471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l12_h384_uncased_pub_section","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("bert_classifier_xtremedistil_l12_h384_uncased_pub_section","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.xtremedistiled_uncased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classifier_xtremedistil_l12_h384_uncased_pub_section| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|91.6 MB| +|Case sensitive:|false| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ml4pubmed/xtremedistil-l12-h384-uncased_pub_section \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_eec_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_eec_emotion_en.md new file mode 100644 index 000000000000..2cf1cdf4bda4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_eec_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_eec_emotion BertForSequenceClassification from Cameron +author: John Snow Labs +name: bert_eec_emotion +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_eec_emotion` is an English model originally trained by Cameron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_eec_emotion_en_5.1.4_3.4_1698861083584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_eec_emotion_en_5.1.4_3.4_1698861083584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_eec_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_eec_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_eec_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Cameron/BERT-eec-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_biden_ke_mlm_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_biden_ke_mlm_en.md new file mode 100644 index 000000000000..03cdf8a8dc59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_biden_ke_mlm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_election2020_twitter_stance_biden_ke_mlm BertForSequenceClassification from kornosk +author: John Snow Labs +name: bert_election2020_twitter_stance_biden_ke_mlm +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_election2020_twitter_stance_biden_ke_mlm` is an English model originally trained by kornosk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_biden_ke_mlm_en_5.1.4_3.4_1698839222494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_biden_ke_mlm_en_5.1.4_3.4_1698839222494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_biden_ke_mlm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_biden_ke_mlm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_election2020_twitter_stance_biden_ke_mlm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/kornosk/bert-election2020-twitter-stance-biden-KE-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_en.md new file mode 100644 index 000000000000..2350dd73c156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_election2020_twitter_stance_trump BertForSequenceClassification from kornosk +author: John Snow Labs +name: bert_election2020_twitter_stance_trump +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_election2020_twitter_stance_trump` is an English model originally trained by kornosk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_trump_en_5.1.4_3.4_1698862770589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_trump_en_5.1.4_3.4_1698862770589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_trump","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_trump","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_election2020_twitter_stance_trump| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.8 MB| + +## References + +https://huggingface.co/kornosk/bert-election2020-twitter-stance-trump \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_ke_mlm_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_ke_mlm_en.md new file mode 100644 index 000000000000..b85069444d67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_election2020_twitter_stance_trump_ke_mlm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_election2020_twitter_stance_trump_ke_mlm BertForSequenceClassification from kornosk +author: John Snow Labs +name: bert_election2020_twitter_stance_trump_ke_mlm +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_election2020_twitter_stance_trump_ke_mlm` is an English model originally trained by kornosk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_trump_ke_mlm_en_5.1.4_3.4_1698812770635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_election2020_twitter_stance_trump_ke_mlm_en_5.1.4_3.4_1698812770635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_trump_ke_mlm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_election2020_twitter_stance_trump_ke_mlm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_election2020_twitter_stance_trump_ke_mlm| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/kornosk/bert-election2020-twitter-stance-trump-KE-MLM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_emails_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_emails_en.md new file mode 100644 index 000000000000..7373d20c2be9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_emails_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emails BertForSequenceClassification from kudoshinichi +author: John Snow Labs +name: bert_emails +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emails` is an English model originally trained by kudoshinichi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emails_en_5.1.4_3.4_1698810378259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emails_en_5.1.4_3.4_1698810378259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_emails","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_emails","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emails| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.3 MB| + +## References + +https://huggingface.co/kudoshinichi/bert_emails \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_embedding_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_embedding_en.md new file mode 100644 index 000000000000..000b037fdda9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_embedding_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_embedding BertForSequenceClassification from guialfaro +author: John Snow Labs +name: bert_embedding +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_embedding` is an English model originally trained by guialfaro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_embedding_en_5.1.4_3.4_1698826689586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_embedding_en_5.1.4_3.4_1698826689586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_embedding","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_embedding","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_embedding| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|387.0 MB| + +## References + +https://huggingface.co/guialfaro/bert-embedding \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_emotion_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_emotion_analysis_en.md new file mode 100644 index 000000000000..21a05fa75f91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_emotion_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_analysis BertForSequenceClassification from mariogiordano +author: John Snow Labs +name: bert_emotion_analysis +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_analysis` is an English model originally trained by mariogiordano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_analysis_en_5.1.4_3.4_1698810422703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_analysis_en_5.1.4_3.4_1698810422703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_emotion_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_emotion_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/mariogiordano/Bert-emotion-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_fine_tuned_hatexplain_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_fine_tuned_hatexplain_en.md new file mode 100644 index 000000000000..07bc594f796c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_fine_tuned_hatexplain_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_fine_tuned_hatexplain BertForSequenceClassification from pasan-SK +author: John Snow Labs +name: bert_fine_tuned_hatexplain +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_fine_tuned_hatexplain` is an English model originally trained by pasan-SK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_hatexplain_en_5.1.4_3.4_1698810505435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_fine_tuned_hatexplain_en_5.1.4_3.4_1698810505435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_hatexplain","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_fine_tuned_hatexplain","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_fine_tuned_hatexplain| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pasan-SK/bert-fine-tuned-hatexplain \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_gender_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_gender_classification_en.md new file mode 100644 index 000000000000..81a8bfc550a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_gender_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuned_gender_classification BertForSequenceClassification from Abderrahim2 +author: John Snow Labs +name: bert_finetuned_gender_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_gender_classification` is an English model originally trained by Abderrahim2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_gender_classification_en_5.1.4_3.4_1698862743607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_gender_classification_en_5.1.4_3.4_1698862743607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_gender_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_gender_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_gender_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Abderrahim2/bert-finetuned-gender_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_imdb_katrin_kc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_imdb_katrin_kc_en.md new file mode 100644 index 000000000000..0a9d351cafaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_finetuned_imdb_katrin_kc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_finetuned_imdb_katrin_kc BertForSequenceClassification from katrin-kc +author: John Snow Labs +name: bert_finetuned_imdb_katrin_kc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_imdb_katrin_kc` is an English model originally trained by katrin-kc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_imdb_katrin_kc_en_5.1.4_3.4_1698861113680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_imdb_katrin_kc_en_5.1.4_3.4_1698861113680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_imdb_katrin_kc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_finetuned_imdb_katrin_kc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_imdb_katrin_kc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/katrin-kc/bert-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_french_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_french_en.md new file mode 100644 index 000000000000..b3ba36999817 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_french_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_french BertForSequenceClassification from baayematar +author: John Snow Labs +name: bert_french +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_french` is an English model originally trained by baayematar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_french_en_5.1.4_3.4_1698870052101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_french_en_5.1.4_3.4_1698870052101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_french","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_french","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_french| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.8 MB| + +## References + +https://huggingface.co/baayematar/bert-french \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_ft_qqp_6ep_42_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_ft_qqp_6ep_42_en.md new file mode 100644 index 000000000000..85eff3a0071c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_ft_qqp_6ep_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_ft_qqp_6ep_42 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: bert_ft_qqp_6ep_42 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ft_qqp_6ep_42` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ft_qqp_6ep_42_en_5.1.4_3.4_1698862984355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ft_qqp_6ep_42_en_5.1.4_3.4_1698862984355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_ft_qqp_6ep_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_ft_qqp_6ep_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ft_qqp_6ep_42| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/bert_ft_qqp_6ep-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_hatexplain_tum_nlp_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_hatexplain_tum_nlp_en.md new file mode 100644 index 000000000000..b6fe9b4f2140 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_hatexplain_tum_nlp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_hatexplain_tum_nlp BertForSequenceClassification from tum-nlp +author: John Snow Labs +name: bert_hatexplain_tum_nlp +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_hatexplain_tum_nlp` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_hatexplain_tum_nlp_en_5.1.4_3.4_1698835846932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_hatexplain_tum_nlp_en_5.1.4_3.4_1698835846932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_hatexplain_tum_nlp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_hatexplain_tum_nlp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_hatexplain_tum_nlp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/tum-nlp/bert-hateXplain \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_italian_uncased_iptc_headlines_it.md b/docs/_posts/ahmedlone127/2023-11-01-bert_italian_uncased_iptc_headlines_it.md new file mode 100644 index 000000000000..08f3c558d69a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_italian_uncased_iptc_headlines_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_italian_uncased_iptc_headlines BertForSequenceClassification from nlpodyssey +author: John Snow Labs +name: bert_italian_uncased_iptc_headlines +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_italian_uncased_iptc_headlines` is an Italian model originally trained by nlpodyssey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_italian_uncased_iptc_headlines_it_5.1.4_3.4_1698814021963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_italian_uncased_iptc_headlines_it_5.1.4_3.4_1698814021963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_italian_uncased_iptc_headlines","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_italian_uncased_iptc_headlines","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_italian_uncased_iptc_headlines| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|412.0 MB| + +## References + +https://huggingface.co/nlpodyssey/bert-italian-uncased-iptc-headlines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_en.md new file mode 100644 index 000000000000..3d397c8a89ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_jigsaw BertForSequenceClassification from Cameron +author: John Snow Labs +name: bert_jigsaw +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_jigsaw` is an English model originally trained by Cameron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_jigsaw_en_5.1.4_3.4_1698861256639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_jigsaw_en_5.1.4_3.4_1698861256639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_jigsaw","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_jigsaw","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_jigsaw| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Cameron/BERT-Jigsaw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_identityhate_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_identityhate_en.md new file mode 100644 index 000000000000..a2221a581dbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_jigsaw_identityhate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_jigsaw_identityhate BertForSequenceClassification from Cameron +author: John Snow Labs +name: bert_jigsaw_identityhate +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_jigsaw_identityhate` is an English model originally trained by Cameron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_jigsaw_identityhate_en_5.1.4_3.4_1698830302627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_jigsaw_identityhate_en_5.1.4_3.4_1698830302627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_jigsaw_identityhate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_jigsaw_identityhate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_jigsaw_identityhate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Cameron/BERT-jigsaw-identityhate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_legal_sentence_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_legal_sentence_classification_en.md new file mode 100644 index 000000000000..414ad1ac8057 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_legal_sentence_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_legal_sentence_classification BertForSequenceClassification from samkas125 +author: John Snow Labs +name: bert_large_legal_sentence_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_legal_sentence_classification` is an English model originally trained by samkas125. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_legal_sentence_classification_en_5.1.4_3.4_1698812412977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_legal_sentence_classification_en_5.1.4_3.4_1698812412977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_legal_sentence_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_legal_sentence_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_legal_sentence_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/samkas125/bert-large-legal-sentence-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_portuguese_cased_assin2_similarity_pt.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_portuguese_cased_assin2_similarity_pt.md new file mode 100644 index 000000000000..0d5451907060 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_portuguese_cased_assin2_similarity_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese bert_large_portuguese_cased_assin2_similarity BertForSequenceClassification from ruanchaves +author: John Snow Labs +name: bert_large_portuguese_cased_assin2_similarity +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_portuguese_cased_assin2_similarity` is a Portuguese model originally trained by ruanchaves. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_cased_assin2_similarity_pt_5.1.4_3.4_1698810793512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_cased_assin2_similarity_pt_5.1.4_3.4_1698810793512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_portuguese_cased_assin2_similarity","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_portuguese_cased_assin2_similarity","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_portuguese_cased_assin2_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ruanchaves/bert-large-portuguese-cased-assin2-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_cola_en.md new file mode 100644 index 000000000000..58055ada6766 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_cola BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_cola +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_cola` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_cola_en_5.1.4_3.4_1698860993115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_cola_en_5.1.4_3.4_1698860993115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_mnli_yoshitomo_matsubara_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_mnli_yoshitomo_matsubara_en.md new file mode 100644 index 000000000000..a86e9c16ef9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_mnli_yoshitomo_matsubara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_mnli_yoshitomo_matsubara BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_mnli_yoshitomo_matsubara +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_mnli_yoshitomo_matsubara` is an English model originally trained by yoshitomo-matsubara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_mnli_yoshitomo_matsubara_en_5.1.4_3.4_1698813845296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_mnli_yoshitomo_matsubara_en_5.1.4_3.4_1698813845296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_mnli_yoshitomo_matsubara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_mnli_yoshitomo_matsubara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_mnli_yoshitomo_matsubara| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qnli_en.md new file mode 100644 index 000000000000..0df42c19e4fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_qnli BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_qnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_qnli` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_qnli_en_5.1.4_3.4_1698808580293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_qnli_en_5.1.4_3.4_1698808580293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qqp_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qqp_en.md new file mode 100644 index 000000000000..0fa4febf33f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_qqp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_qqp BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_qqp +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_qqp` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_qqp_en_5.1.4_3.4_1698807486905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_qqp_en_5.1.4_3.4_1698807486905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_qqp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_qqp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_qqp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_rte_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_rte_en.md new file mode 100644 index 000000000000..89ffaf5c9bca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_rte_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_rte BertForSequenceClassification from yoshitomo-matsubara +author: John Snow Labs +name: bert_large_uncased_rte +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_rte` is an English model originally trained by yoshitomo-matsubara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_rte_en_5.1.4_3.4_1698805308603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_rte_en_5.1.4_3.4_1698805308603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_rte","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_rte","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/yoshitomo-matsubara/bert-large-uncased-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_wwm_finetuned_boolq_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_wwm_finetuned_boolq_en.md new file mode 100644 index 000000000000..82209a6c545b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_large_uncased_wwm_finetuned_boolq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_large_uncased_wwm_finetuned_boolq BertForSequenceClassification from lewtun +author: John Snow Labs +name: bert_large_uncased_wwm_finetuned_boolq +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_uncased_wwm_finetuned_boolq` is an English model originally trained by lewtun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wwm_finetuned_boolq_en_5.1.4_3.4_1698861451713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_uncased_wwm_finetuned_boolq_en_5.1.4_3.4_1698861451713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_wwm_finetuned_boolq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_large_uncased_wwm_finetuned_boolq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_uncased_wwm_finetuned_boolq| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/lewtun/bert-large-uncased-wwm-finetuned-boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_m_agnews_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_m_agnews_en.md new file mode 100644 index 000000000000..7052a7be23ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_m_agnews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_m_agnews BertForSequenceClassification from tzhao3 +author: John Snow Labs +name: bert_m_agnews +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_m_agnews` is an English model originally trained by tzhao3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_m_agnews_en_5.1.4_3.4_1698806366734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_m_agnews_en_5.1.4_3.4_1698806366734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_m_agnews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_m_agnews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_m_agnews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|155.2 MB| + +## References + +https://huggingface.co/tzhao3/Bert-M-AGnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_m_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_m_sst2_en.md new file mode 100644 index 000000000000..cafd758f3ca6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_m_sst2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_m_sst2 BertForSequenceClassification from tzhao3 +author: John Snow Labs +name: bert_m_sst2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_m_sst2` is an English model originally trained by tzhao3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_m_sst2_en_5.1.4_3.4_1698804274259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_m_sst2_en_5.1.4_3.4_1698804274259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_m_sst2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_m_sst2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_m_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|154.7 MB| + +## References + +https://huggingface.co/tzhao3/Bert-M-SST2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_medium_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_medium_mnli_en.md new file mode 100644 index 000000000000..29c66ea94c85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_medium_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_medium_mnli BertForSequenceClassification from prajjwal1 +author: John Snow Labs +name: bert_medium_mnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_medium_mnli` is an English model originally trained by prajjwal1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_medium_mnli_en_5.1.4_3.4_1698804574624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_medium_mnli_en_5.1.4_3.4_1698804574624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_medium_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_medium_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_medium_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|155.2 MB| + +## References + +https://huggingface.co/prajjwal1/bert-medium-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_mini_finetune_question_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_mini_finetune_question_detection_en.md new file mode 100644 index 000000000000..b9e821609e76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_mini_finetune_question_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_mini_finetune_question_detection BertForSequenceClassification from shahrukhx01 +author: John Snow Labs +name: bert_mini_finetune_question_detection +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mini_finetune_question_detection` is an English model originally trained by shahrukhx01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mini_finetune_question_detection_en_5.1.4_3.4_1698803310550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mini_finetune_question_detection_en_5.1.4_3.4_1698803310550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_finetune_question_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mini_finetune_question_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mini_finetune_question_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/shahrukhx01/bert-mini-finetune-question-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_mnli_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_mnli_classifier_en.md new file mode 100644 index 000000000000..117c859c2d3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_mnli_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_mnli_classifier BertForSequenceClassification from gayanin +author: John Snow Labs +name: bert_mnli_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mnli_classifier` is an English model originally trained by gayanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mnli_classifier_en_5.1.4_3.4_1698842072786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mnli_classifier_en_5.1.4_3.4_1698842072786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_mnli_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_mnli_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mnli_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/gayanin/bert-mnli-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_multilingual_uncased_geo_countries_headlines_xx.md b/docs/_posts/ahmedlone127/2023-11-01-bert_multilingual_uncased_geo_countries_headlines_xx.md new file mode 100644 index 000000000000..fca5a21523f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_multilingual_uncased_geo_countries_headlines_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bert_multilingual_uncased_geo_countries_headlines BertForSequenceClassification from nlpodyssey +author: John Snow Labs +name: bert_multilingual_uncased_geo_countries_headlines +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_uncased_geo_countries_headlines` is a Multilingual model originally trained by nlpodyssey.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_uncased_geo_countries_headlines_xx_5.1.4_3.4_1698802496760.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_uncased_geo_countries_headlines_xx_5.1.4_3.4_1698802496760.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_uncased_geo_countries_headlines","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_multilingual_uncased_geo_countries_headlines","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilingual_uncased_geo_countries_headlines| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|628.4 MB| + +## References + +https://huggingface.co/nlpodyssey/bert-multilingual-uncased-geo-countries-headlines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_multitask_query_classifiers_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_multitask_query_classifiers_en.md new file mode 100644 index 000000000000..cc1c148f9789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_multitask_query_classifiers_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_multitask_query_classifiers BertForSequenceClassification from shahrukhx01 +author: John Snow Labs +name: bert_multitask_query_classifiers +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multitask_query_classifiers` is an English model originally trained by shahrukhx01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multitask_query_classifiers_en_5.1.4_3.4_1698803184402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multitask_query_classifiers_en_5.1.4_3.4_1698803184402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_multitask_query_classifiers","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_multitask_query_classifiers","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multitask_query_classifiers| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/shahrukhx01/bert-multitask-query-classifiers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_portuguese_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_portuguese_emotion_en.md new file mode 100644 index 000000000000..a5c2fe870afc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_portuguese_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_portuguese_emotion BertForSequenceClassification from pysentimiento +author: John Snow Labs +name: bert_portuguese_emotion +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_portuguese_emotion` is an English model originally trained by pysentimiento. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_portuguese_emotion_en_5.1.4_3.4_1698814235638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_portuguese_emotion_en_5.1.4_3.4_1698814235638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_portuguese_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_portuguese_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_portuguese_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/pysentimiento/bert-pt-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_resume_job_recommender_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_resume_job_recommender_en.md new file mode 100644 index 000000000000..68e6f18f6666 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_resume_job_recommender_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_resume_job_recommender BertForSequenceClassification from liberatoratif +author: John Snow Labs +name: bert_resume_job_recommender +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_resume_job_recommender` is an English model originally trained by liberatoratif. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_resume_job_recommender_en_5.1.4_3.4_1698816075387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_resume_job_recommender_en_5.1.4_3.4_1698816075387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_resume_job_recommender","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_resume_job_recommender","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_resume_job_recommender| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/liberatoratif/BERT-resume-job-recommender \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_roman_urdu_ur.md b/docs/_posts/ahmedlone127/2023-11-01-bert_roman_urdu_ur.md new file mode 100644 index 000000000000..5308a90514a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_roman_urdu_ur.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Urdu bert_roman_urdu BertForSequenceClassification from YYAH +author: John Snow Labs +name: bert_roman_urdu +date: 2023-11-01 +tags: [bert, ur, open_source, sequence_classification, onnx] +task: Text Classification +language: ur +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_roman_urdu` is an Urdu model originally trained by YYAH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_roman_urdu_ur_5.1.4_3.4_1698831849403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_roman_urdu_ur_5.1.4_3.4_1698831849403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_roman_urdu","ur")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_roman_urdu","ur") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_roman_urdu| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ur| +|Size:|627.7 MB| + +## References + +https://huggingface.co/YYAH/Bert_Roman_Urdu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_seq_training_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_seq_training_model_en.md new file mode 100644 index 000000000000..682e7d284b15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_seq_training_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_seq_training_model BertForSequenceClassification from Brecon +author: John Snow Labs +name: bert_seq_training_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_seq_training_model` is an English model originally trained by Brecon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_seq_training_model_en_5.1.4_3.4_1698814060001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_seq_training_model_en_5.1.4_3.4_1698814060001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_seq_training_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_seq_training_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_seq_training_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Brecon/bert_seq_training_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_au_topics_452311620_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_au_topics_452311620_en.md new file mode 100644 index 000000000000..e12a2282465a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_au_topics_452311620_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Smone55) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_au_topics_452311620 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-au_topics-452311620` is an English model originally trained by `Smone55`.
+ +## Predicted Entities + +`113`, `112`, `24`, `98`, `54`, `62`, `114`, `69`, `68`, `15`, `47`, `45`, `107`, `34`, `14`, `37`, `96`, `9`, `81`, `51`, `83`, `79`, `111`, `27`, `50`, `4`, `95`, `101`, `61`, `56`, `64`, `104`, `10`, `78`, `41`, `55`, `103`, `87`, `124`, `120`, `80`, `25`, `53`, `22`, `90`, `1`, `5`, `29`, `20`, `97`, `86`, `32`, `16`, `85`, `94`, `105`, `91`, `93`, `88`, `48`, `102`, `13`, `35`, `40`, `121`, `49`, `23`, `63`, `72`, `39`, `2`, `109`, `122`, `125`, `12`, `21`, `66`, `11`, `67`, `30`, `0`, `43`, `74`, `58`, `73`, `75`, `108`, `38`, `116`, `6`, `33`, `123`, `100`, `65`, `77`, `19`, `106`, `117`, `44`, `8`, `46`, `92`, `57`, `115`, `118`, `70`, `31`, `17`, `7`, `60`, `82`, `110`, `26`, `28`, `71`, `59`, `42`, `119`, `99`, `18`, `3`, `-1`, `84`, `36`, `76`, `89`, `52` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_au_topics_452311620_en_5.1.4_3.4_1698798115513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_au_topics_452311620_en_5.1.4_3.4_1698798115513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_au_topics_452311620","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_au_topics_452311620","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_au_topics_452311620| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Smone55/autonlp-au_topics-452311620 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_cai_out_of_scope_649919116_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_cai_out_of_scope_649919116_en.md new file mode 100644 index 000000000000..ff0806a3d09e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_cai_out_of_scope_649919116_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from msamogh) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_cai_out_of_scope_649919116 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-cai-out-of-scope-649919116` is an English model originally trained by `msamogh`.
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_cai_out_of_scope_649919116_en_5.1.4_3.4_1698798749091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_cai_out_of_scope_649919116_en_5.1.4_3.4_1698798749091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_cai_out_of_scope_649919116","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_cai_out_of_scope_649919116","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_cai_out_of_scope_649919116| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/msamogh/autonlp-cai-out-of-scope-649919116 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_esperanto_590516680_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_esperanto_590516680_en.md new file mode 100644 index 000000000000..d219bddca15b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_esperanto_590516680_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sequence_classifier_autonlp_esperanto_590516680 BertForSequenceClassification from panashe +author: John Snow Labs +name: bert_sequence_classifier_autonlp_esperanto_590516680 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_autonlp_esperanto_590516680` is an English model originally trained by panashe.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_esperanto_590516680_en_5.1.4_3.4_1698798965790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_esperanto_590516680_en_5.1.4_3.4_1698798965790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_esperanto_590516680","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_esperanto_590516680","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_esperanto_590516680| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/panashe/autonlp-eo-590516680 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_gibb_detect_515314387_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_gibb_detect_515314387_en.md new file mode 100644 index 000000000000..e70f06e69359 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_gibb_detect_515314387_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from MadhurJindalWorkMail) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_gibb_detect_515314387 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-Gibb-Detect-515314387` is an English model originally trained by `MadhurJindalWorkMail`. 
+ +## Predicted Entities + +`clean`, `word salad`, `noise`, `mild gibberish` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_gibb_detect_515314387_en_5.1.4_3.4_1698806311760.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_gibb_detect_515314387_en_5.1.4_3.4_1698806311760.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_gibb_detect_515314387","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_gibb_detect_515314387","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_gibb_detect_515314387| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/MadhurJindalWorkMail/autonlp-Gibb-Detect-515314387 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388_en.md new file mode 100644 index 000000000000..5ad300c3f373 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from yosemite) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-sentiment-analysis-english-470512388` is an English model originally trained by `yosemite`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388_en_5.1.4_3.4_1698798398387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388_en_5.1.4_3.4_1698798398387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_imdb_sentiment_analysis_english_470512388| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yosemite/autonlp-imdb-sentiment-analysis-english-470512388 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_nlpisfun_251844_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_nlpisfun_251844_en.md new file mode 100644 index 000000000000..9015e0c6015d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_nlpisfun_251844_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from m3tafl0ps) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_nlpisfun_251844 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-NLPIsFun-251844` is an English model originally trained by `m3tafl0ps`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_nlpisfun_251844_en_5.1.4_3.4_1698799645398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_nlpisfun_251844_en_5.1.4_3.4_1698799645398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_nlpisfun_251844","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_nlpisfun_251844","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_nlpisfun_251844| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/m3tafl0ps/autonlp-NLPIsFun-251844 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_predict_roi_1_29797722_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_predict_roi_1_29797722_en.md new file mode 100644 index 000000000000..952e9bf674af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_predict_roi_1_29797722_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ds198799) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_predict_roi_1_29797722 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-predict_ROI_1-29797722` is an English model originally trained by `ds198799`. 
+ +## Predicted Entities + +`1.0`, `3.0`, `2.0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_predict_roi_1_29797722_en_5.1.4_3.4_1698799954023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_predict_roi_1_29797722_en_5.1.4_3.4_1698799954023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_predict_roi_1_29797722","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_predict_roi_1_29797722","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_predict_roi_1_29797722| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ds198799/autonlp-predict_ROI_1-29797722 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_spinner_check_16492731_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_spinner_check_16492731_en.md new file mode 100644 index 000000000000..1c7bb415ef69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_spinner_check_16492731_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from imzachjohnson) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_spinner_check_16492731 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-spinner-check-16492731` is an English model originally trained by `imzachjohnson`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_spinner_check_16492731_en_5.1.4_3.4_1698798710929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_spinner_check_16492731_en_5.1.4_3.4_1698798710929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_spinner_check_16492731","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_spinner_check_16492731","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_spinner_check_16492731| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/imzachjohnson/autonlp-spinner-check-16492731 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_459011902_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_459011902_zh.md new file mode 100644 index 000000000000..780b4122faaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_459011902_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from ysslang) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_test_459011902 +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-test-459011902` is a Chinese model originally trained by `ysslang`. 
+ +## Predicted Entities + +`6`, `0`, `7`, `2`, `8`, `4`, `9`, `1`, `3`, `5` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_test_459011902_zh_5.1.4_3.4_1698798985293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_test_459011902_zh_5.1.4_3.4_1698798985293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_test_459011902","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_test_459011902","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_test_459011902| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ysslang/autonlp-test-459011902 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_530014983_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_530014983_en.md new file mode 100644 index 000000000000..7c1cd6588430 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_test_530014983_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Ajay191191) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_test_530014983 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-Test-530014983` is an English model originally trained by `Ajay191191`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_test_530014983_en_5.1.4_3.4_1698800245280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_test_530014983_en_5.1.4_3.4_1698800245280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_test_530014983","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_test_530014983","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_test_530014983| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Ajay191191/autonlp-Test-530014983 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323_en.md new file mode 100644 index 000000000000..91877864f08c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from DrishtiSharma) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-Text-Classification-Catalonia-Independence-AutoNLP-633018323` is an English model originally trained by `DrishtiSharma`. 
+ +## Predicted Entities + +`AGAINST`, `NEUTRAL`, `FAVOR` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323_en_5.1.4_3.4_1698800517393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323_en_5.1.4_3.4_1698800517393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_text_classification_catalonia_independence_autonlp_633018323| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/DrishtiSharma/autonlp-Text-Classification-Catalonia-Independence-AutoNLP-633018323 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_traffic_nlp_451311592_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_traffic_nlp_451311592_en.md new file mode 100644 index 000000000000..7a204804b7d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_traffic_nlp_451311592_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from zwang199) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_traffic_nlp_451311592 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-traffic-nlp-451311592` is an English model originally trained by `zwang199`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_traffic_nlp_451311592_en_5.1.4_3.4_1698799319008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_traffic_nlp_451311592_en_5.1.4_3.4_1698799319008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_traffic_nlp_451311592","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_traffic_nlp_451311592","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_traffic_nlp_451311592| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/zwang199/autonlp-traffic-nlp-451311592 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_triage_35248482_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_triage_35248482_en.md new file mode 100644 index 000000000000..3d43986b392a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_triage_35248482_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Aimendo) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_triage_35248482 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-triage-35248482` is an English model originally trained by `Aimendo`. 
+ +## Predicted Entities + +`away`, `new_booking`, `refund`, `approval`, `doc_request`, `acknowledgement`, `inquirey`, `modification`, `cancellation`, `ads` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_triage_35248482_en_5.1.4_3.4_1698806612481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_triage_35248482_en_5.1.4_3.4_1698806612481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_triage_35248482","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_triage_35248482","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_triage_35248482| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Aimendo/autonlp-triage-35248482 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_txc_17923129_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_txc_17923129_en.md new file mode 100644 index 000000000000..6a26b8e3f845 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autonlp_txc_17923129_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from emekaboris) +author: John Snow Labs +name: bert_sequence_classifier_autonlp_txc_17923129 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-txc-17923129` is an English model originally trained by `emekaboris`. 
+ +## Predicted Entities + +`1.0`, `13.0`, `9.0`, `22.0`, `4.0`, `12.0`, `17.0`, `7.0`, `15.0`, `8.0`, `21.0`, `5.0`, `10.0`, `19.0`, `14.0`, `3.0`, `6.0`, `16.0`, `18.0`, `24.0`, `23.0`, `2.0`, `20.0`, `11.0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_txc_17923129_en_5.1.4_3.4_1698801109156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autonlp_txc_17923129_en_5.1.4_3.4_1698801109156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_txc_17923129","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autonlp_txc_17923129","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autonlp_txc_17923129| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/emekaboris/autonlp-txc-17923129 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227_ar.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227_ar.md new file mode 100644 index 000000000000..647c6e4ad0f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from zenkri) +author: John Snow Labs +name: bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227 +date: 2023-11-01 +tags: [ar, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-Arabic_Poetry_by_Subject-920730227` is an Arabic model originally trained by `zenkri`. 
+ +## Predicted Entities + +`اعتذار`, `مدح`, `صبر`, `سياسية`, `وطنيه`, `رثاء`, `عامه`, `ابتهال`, `قصيره`, `رومنسيه`, `حزينه`, `عتاب`, `رحمة`, `الاناشيد`, `المعلقات`, `فراق`, `هجاء`, `نصيحة`, `جود`, `حكمة`, `شوق`, `دينية`, `عدل`, `غزل`, `ذم` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227_ar_5.1.4_3.4_1698799618038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227_ar_5.1.4_3.4_1698799618038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730227| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|414.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/zenkri/autotrain-Arabic_Poetry_by_Subject-920730227 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230_ar.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230_ar.md new file mode 100644 index 000000000000..d18d2d06df4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from zenkri) +author: John Snow Labs +name: bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230 +date: 2023-11-01 +tags: [ar, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-Arabic_Poetry_by_Subject-920730230` is an Arabic model originally trained by `zenkri`. 
+ +## Predicted Entities + +`اعتذار`, `مدح`, `صبر`, `سياسية`, `وطنيه`, `رثاء`, `عامه`, `ابتهال`, `قصيره`, `رومنسيه`, `حزينه`, `عتاب`, `رحمة`, `الاناشيد`, `المعلقات`, `فراق`, `هجاء`, `نصيحة`, `جود`, `حكمة`, `شوق`, `دينية`, `عدل`, `غزل`, `ذم` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230_ar_5.1.4_3.4_1698801427269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230_ar_5.1.4_3.4_1698801427269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autotrain_arabic_poetry_by_subject_920730230| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|466.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/zenkri/autotrain-Arabic_Poetry_by_Subject-920730230 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_mut_all_text_680820343_es.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_mut_all_text_680820343_es.md new file mode 100644 index 000000000000..3d0dee52ca88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_mut_all_text_680820343_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Cased model (from gabitoo1234) +author: John Snow Labs +name: bert_sequence_classifier_autotrain_mut_all_text_680820343 +date: 2023-11-01 +tags: [es, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-mut_all_text-680820343` is a Spanish model originally trained by `gabitoo1234`. 
+ +## Predicted Entities + +`523.0`, `232.0`, `192.0`, `526.0`, `262.0`, `422.0`, `330.0`, `131.0`, `539.0`, `424.0`, `342.0`, `234.2`, `513.0`, `423.0`, `234.3`, `380.0`, `240.0`, `159.0`, `521.0`, `325.0`, `234.1`, `429.0`, `234.4`, `236.0`, `212.0`, `142.0`, `449.0`, `234.0`, `370.0`, `519.0`, `512.0`, `252.0`, `690.0`, `222.0`, `529.0`, `151.0`, `313.0`, `239.0`, `361.0`, `511.0`, `410.0`, `149.0`, `390.0`, `321.0`, `193.0`, `199.0`, `611.0`, `231.0`, `314.0`, `319.0`, `490.0`, `362.0`, `191.0`, `129.0`, `235.0`, `350.0`, `251.0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_mut_all_text_680820343_es_5.1.4_3.4_1698807124594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_mut_all_text_680820343_es_5.1.4_3.4_1698807124594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_mut_all_text_680820343","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_mut_all_text_680820343","es") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autotrain_mut_all_text_680820343| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gabitoo1234/autotrain-mut_all_text-680820343 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_trec_fine_739422530_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_trec_fine_739422530_en.md new file mode 100644 index 000000000000..3a60c2a47ece --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_autotrain_trec_fine_739422530_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from ndavid) +author: John Snow Labs +name: bert_sequence_classifier_autotrain_trec_fine_739422530 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-trec-fine-bert-739422530` is an English model originally trained by `ndavid`. 
+ +## Predicted Entities + +`veh`, `exp`, `techmeth`, `religion`, `currency`, `reason`, `event`, `letter`, `country`, `manner`, `city`, `other`, `abb`, `plant`, `title`, `period`, `temp`, `lang`, `weight`, `mount`, `state`, `desc`, `code`, `money`, `cremat`, `gr`, `volsize`, `dist`, `dismed`, `instru`, `sport`, `count`, `food`, `perc`, `product`, `termeq`, `ord`, `word`, `def`, `color`, `speed`, `date`, `substance`, `symbol`, `ind`, `body`, `animal` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_trec_fine_739422530_en_5.1.4_3.4_1698801687311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_autotrain_trec_fine_739422530_en_5.1.4_3.4_1698801687311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_trec_fine_739422530","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_autotrain_trec_fine_739422530","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_autotrain_trec_fine_739422530| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ndavid/autotrain-trec-fine-bert-739422530 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_cola_en.md new file mode 100644 index 000000000000..d4b8cad073eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_cola_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_cola +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-cola` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`acceptable`, `unacceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_cola_en_5.1.4_3.4_1698799958744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_cola_en_5.1.4_3.4_1698799958744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+COLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_mrpc_en.md new file mode 100644 index 000000000000..1ddffcd2d55d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_mrpc_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_mrpc +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-mrpc` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_mrpc_en_5.1.4_3.4_1698801966451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_mrpc_en_5.1.4_3.4_1698801966451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_qnli_en.md new file mode 100644 index 000000000000..c66fd4ddb531 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_qnli_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_qnli +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-qnli` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_qnli_en_5.1.4_3.4_1698802267376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_qnli_en_5.1.4_3.4_1698802267376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_qnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_qnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+QNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_rte_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_rte_en.md new file mode 100644 index 000000000000..44c6f58b45b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_rte_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_rte +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-rte` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_rte_en_5.1.4_3.4_1698800249263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_rte_en_5.1.4_3.4_1698800249263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_rte","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_rte","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-rte +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+RTE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_sst2_en.md new file mode 100644 index 000000000000..f51ff44fd327 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_cased_finetuned_sst2_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForSequenceClassification Base Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_base_cased_finetuned_sst2 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-sst2` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_sst2_en_5.1.4_3.4_1698800513438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_cased_finetuned_sst2_en_5.1.4_3.4_1698800513438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_cased_finetuned_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_cased_finetuned_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2 +- https://arxiv.org/abs/2105.03824 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+SST2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx_de.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx_de.md new file mode 100644 index 000000000000..f6bb1ed2c953 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx_de.md @@ -0,0 +1,106 @@ +--- +layout: model +title: German BertForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx +date: 2023-11-01 +tags: [de, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-dbmdz-cased-finetuned-pawsx-de` is a German model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx_de_5.1.4_3.4_1698807442272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx_de_5.1.4_3.4_1698807442272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.bert.pawsx_xtreme.cased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_german_dbmdz_cased_finetuned_pawsx| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/bert-base-german-dbmdz-cased-finetuned-pawsx-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_multilingual_cased_nsmc_ko.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_multilingual_cased_nsmc_ko.md new file mode 100644 index 000000000000..6e672e792f94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_multilingual_cased_nsmc_ko.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Korean BertForSequenceClassification Base Cased model (from sangrimlee) +author: John Snow Labs +name: bert_sequence_classifier_base_multilingual_cased_nsmc +date: 2023-11-01 +tags: [ko, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-nsmc` is a Korean model originally trained by `sangrimlee`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_multilingual_cased_nsmc_ko_5.1.4_3.4_1698802656109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_multilingual_cased_nsmc_ko_5.1.4_3.4_1698802656109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_multilingual_cased_nsmc","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_multilingual_cased_nsmc","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_multilingual_cased_nsmc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ko| +|Size:|667.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sangrimlee/bert-base-multilingual-cased-nsmc +- https://github.com/e9t/nsmc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_spanish_wwm_cased_xnli_es.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_spanish_wwm_cased_xnli_es.md new file mode 100644 index 000000000000..662d0ddd8f08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_spanish_wwm_cased_xnli_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Base Cased model (from Recognai) +author: John Snow Labs +name: bert_sequence_classifier_base_spanish_wwm_cased_xnli +date: 2023-11-01 +tags: [es, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-xnli` is a Spanish model originally trained by `Recognai`. 
+ +## Predicted Entities + +`neutral`, `contradiction`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_spanish_wwm_cased_xnli_es_5.1.4_3.4_1698802954642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_spanish_wwm_cased_xnli_es_5.1.4_3.4_1698802954642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_spanish_wwm_cased_xnli","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_spanish_wwm_cased_xnli","es") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_spanish_wwm_cased_xnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_turkish_sentiment_cased_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_turkish_sentiment_cased_tr.md new file mode 100644 index 000000000000..f6948e3e3aaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_turkish_sentiment_cased_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Base Cased model (from savasy) +author: John Snow Labs +name: bert_sequence_classifier_base_turkish_sentiment_cased +date: 2023-11-01 +tags: [tr, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-sentiment-cased` is a Turkish model originally trained by `savasy`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_turkish_sentiment_cased_tr_5.1.4_3.4_1698807741859.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_turkish_sentiment_cased_tr_5.1.4_3.4_1698807741859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_turkish_sentiment_cased","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_turkish_sentiment_cased","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_turkish_sentiment_cased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|414.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/savasy/bert-base-turkish-sentiment-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_ag_news_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_ag_news_en.md new file mode 100644 index 000000000000..32482628dede --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_ag_news_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from nateraw) +author: John Snow Labs +name: bert_sequence_classifier_base_uncased_ag_news +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-ag-news` is an English model originally trained by `nateraw`. 
+ +## Predicted Entities + +`Sports`, `Business`, `Sci/Tech`, `World` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_ag_news_en_5.1.4_3.4_1698800947044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_ag_news_en_5.1.4_3.4_1698800947044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_ag_news","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_ag_news","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_uncased_ag_news| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/nateraw/bert-base-uncased-ag-news +- https://github.com/nateraw/hf-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_emotion_en.md new file mode 100644 index 000000000000..004f062b9643 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_base_uncased_emotion_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Base Uncased model (from nateraw) +author: John Snow Labs +name: bert_sequence_classifier_base_uncased_emotion +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-emotion` is an English model originally trained by `nateraw`. 
+ +## Predicted Entities + +`surprise`, `anger`, `joy`, `sadness`, `fear`, `love` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_emotion_en_5.1.4_3.4_1698801222088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_base_uncased_emotion_en_5.1.4_3.4_1698801222088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_base_uncased_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_base_uncased_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/nateraw/bert-base-uncased-emotion +- https://github.com/nateraw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_bounti_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_bounti_tr.md new file mode 100644 index 000000000000..fdb1e3c7f749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_bounti_tr.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from akoksal) +author: John Snow Labs +name: bert_sequence_classifier_bounti +date: 2023-11-01 +tags: [tr, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bounti` is a Turkish model originally trained by `akoksal`. 
+ +## Predicted Entities + +`positive`, `neutral`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_bounti_tr_5.1.4_3.4_1698803340886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_bounti_tr_5.1.4_3.4_1698803340886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_bounti","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_bounti","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_bounti| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|691.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/akoksal/bounti +- https://ieeexplore.ieee.org/document/9477814 +- https://github.com/boun-tabi/BounTi-Turkish-Sentiment-Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_comments_text_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_comments_text_classification_model_en.md new file mode 100644 index 000000000000..8e1f5790de0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_comments_text_classification_model_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from EricPeter) +author: John Snow Labs +name: bert_sequence_classifier_comments_text_classification_model +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comments-text-classification-model` is an English model originally trained by `EricPeter`. 
+ +## Predicted Entities + +`Good`, `Neutral`, `Very Poor`, `Poor`, `Excellent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_comments_text_classification_model_en_5.1.4_3.4_1698803925499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_comments_text_classification_model_en_5.1.4_3.4_1698803925499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_comments_text_classification_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_comments_text_classification_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_comments_text_classification_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/EricPeter/comments-text-classification-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_danish_emotion_classification_da.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_danish_emotion_classification_da.md new file mode 100644 index 000000000000..e06a288487c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_danish_emotion_classification_da.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Danish BertForSequenceClassification Cased model (from NikolajMunch) +author: John Snow Labs +name: bert_sequence_classifier_danish_emotion_classification +date: 2023-11-01 +tags: [da, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish-emotion-classification` is a Danish model originally trained by `NikolajMunch`. 
+ +## Predicted Entities + +`Glæde`, `Afsky`, `Vrede`, `Frygt`, `Tristhed`, `Overraskelse` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_danish_emotion_classification_da_5.1.4_3.4_1698804224003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_danish_emotion_classification_da_5.1.4_3.4_1698804224003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_danish_emotion_classification","da") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_danish_emotion_classification","da") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_danish_emotion_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/NikolajMunch/danish-emotion-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_electricidad_base_finetuned_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_electricidad_base_finetuned_sst2_en.md new file mode 100644 index 000000000000..31637bb5908b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_electricidad_base_finetuned_sst2_en.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_electricidad_base_finetuned_sst2 +date: 2023-11-01 +tags: [bert, es, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electricidad-base-finetuned-sst2-es` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + +`NEG`, `NEU`, `POS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_electricidad_base_finetuned_sst2_en_5.1.4_3.4_1698808012843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_electricidad_base_finetuned_sst2_en_5.1.4_3.4_1698808012843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_electricidad_base_finetuned_sst2","es") .setInputCols(["document", "token"]) .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_electricidad_base_finetuned_sst2","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.base_finetuned").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_electricidad_base_finetuned_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +https://huggingface.co/mrm8488/electricidad-base-finetuned-sst2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_110m_similarity_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_110m_similarity_zh.md new file mode 100644 index 000000000000..6caa4be31b20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_110m_similarity_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_sequence_classifier_erlangshen_roberta_110m_similarity +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-110M-Similarity` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`not similar`, `similar` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_erlangshen_roberta_110m_similarity_zh_5.1.4_3.4_1698808280759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_erlangshen_roberta_110m_similarity_zh_5.1.4_3.4_1698808280759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_erlangshen_roberta_110m_similarity","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_erlangshen_roberta_110m_similarity","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_erlangshen_roberta_110m_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|383.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-110M-Similarity +- https://github.com/IDEA-CCNL/Fengshenbang-LM +- https://fengshenbang-doc.readthedocs.io/ +- https://arxiv.org/abs/2209.02970 +- https://arxiv.org/abs/2209.02970 +- https://github.com/IDEA-CCNL/Fengshenbang-LM/ +- https://github.com/IDEA-CCNL/Fengshenbang-LM/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_330m_nli_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_330m_nli_zh.md new file mode 100644 index 000000000000..9cb162cedf9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_erlangshen_roberta_330m_nli_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from IDEA-CCNL) +author: John Snow Labs +name: bert_sequence_classifier_erlangshen_roberta_330m_nli +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Erlangshen-Roberta-330M-NLI` is a Chinese model originally trained by `IDEA-CCNL`. 
+ +## Predicted Entities + +`ENTAILMENT`, `NEUTRAL`, `CONTRADICTION` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_erlangshen_roberta_330m_nli_zh_5.1.4_3.4_1698808886818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_erlangshen_roberta_330m_nli_zh_5.1.4_3.4_1698808886818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_erlangshen_roberta_330m_nli","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_erlangshen_roberta_330m_nli","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_erlangshen_roberta_330m_nli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-330M-NLI +- https://github.com/IDEA-CCNL/Fengshenbang-LM +- https://fengshenbang-doc.readthedocs.io/ +- https://arxiv.org/abs/2209.02970 +- https://arxiv.org/abs/2209.02970 +- https://github.com/IDEA-CCNL/Fengshenbang-LM/ +- https://github.com/IDEA-CCNL/Fengshenbang-LM/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_kaggle_comp_test_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_kaggle_comp_test_en.md new file mode 100644 index 000000000000..c2fdf9a30e11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_kaggle_comp_test_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Crasher222) +author: John Snow Labs +name: bert_sequence_classifier_kaggle_comp_test +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kaggle-comp-test` is an English model originally trained by `Crasher222`. 
+ +## Predicted Entities + +`3`, `0`, `4`, `2`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_kaggle_comp_test_en_5.1.4_3.4_1698801846570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_kaggle_comp_test_en_5.1.4_3.4_1698801846570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_kaggle_comp_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_kaggle_comp_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_kaggle_comp_test| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Crasher222/kaggle-comp-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_cola_en.md new file mode 100644 index 000000000000..ae2319b991cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_cola_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_large_cased_finetuned_cola +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-cola` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`acceptable`, `unacceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_cola_en_5.1.4_3.4_1698802439061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_cola_en_5.1.4_3.4_1698802439061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_large_cased_finetuned_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-large-cased-finetuned-cola +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+COLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_mrpc_en.md new file mode 100644 index 000000000000..09627b513d22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_mrpc_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_large_cased_finetuned_mrpc +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-mrpc` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_mrpc_en_5.1.4_3.4_1698809468427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_mrpc_en_5.1.4_3.4_1698809468427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_large_cased_finetuned_mrpc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-large-cased-finetuned-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_rte_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_rte_en.md new file mode 100644 index 000000000000..b5462f702548 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_rte_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_large_cased_finetuned_rte +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-rte` is an English model originally trained by `gchhablani`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_rte_en_5.1.4_3.4_1698803000285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_rte_en_5.1.4_3.4_1698803000285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_rte","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_rte","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_large_cased_finetuned_rte| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-large-cased-finetuned-rte +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+RTE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_wnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_wnli_en.md new file mode 100644 index 000000000000..f8e47ba67d46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_large_cased_finetuned_wnli_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Large Cased model (from gchhablani) +author: John Snow Labs +name: bert_sequence_classifier_large_cased_finetuned_wnli +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-wnli` is an English model originally trained by `gchhablani`.
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_wnli_en_5.1.4_3.4_1698810047635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_large_cased_finetuned_wnli_en_5.1.4_3.4_1698810047635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_wnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_large_cased_finetuned_wnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_large_cased_finetuned_wnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gchhablani/bert-large-cased-finetuned-wnli +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+WNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_mini_finetuned_age_news_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_mini_finetuned_age_news_classification_en.md new file mode 100644 index 000000000000..d9d49a89a0b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_mini_finetuned_age_news_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_mini_finetuned_age_news_classification +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-finetuned-age_news-classification` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_mini_finetuned_age_news_classification_en_5.1.4_3.4_1698810216676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_mini_finetuned_age_news_classification_en_5.1.4_3.4_1698810216676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_mini_finetuned_age_news_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_mini_finetuned_age_news_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.news.mini_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_mini_finetuned_age_news_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/bert-mini-finetuned-age_news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_minilm_l6_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_minilm_l6_mnli_en.md new file mode 100644 index 000000000000..1b6cfe01bb7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_minilm_l6_mnli_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Mini Cased model (from MoritzLaurer) +author: John Snow Labs +name: bert_sequence_classifier_minilm_l6_mnli +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L6-mnli` is an English model originally trained by `MoritzLaurer`.
+ +## Predicted Entities + +`neutral`, `contradiction`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_en_5.1.4_3.4_1698804396860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_minilm_l6_mnli_en_5.1.4_3.4_1698804396860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_minilm_l6_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_minilm_l6_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|84.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/MoritzLaurer/MiniLM-L6-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_english_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_english_bert_en.md new file mode 100644 index 000000000000..0b35cef97c2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_english_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sequence_classifier_multi2convai_corona_english_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_corona_english_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_corona_english_bert` is an English model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_english_bert_en_5.1.4_3.4_1698810382265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_english_bert_en_5.1.4_3.4_1698810382265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_english_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_english_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_corona_english_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/inovex/multi2convai-corona-en-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_german_bert_de.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_german_bert_de.md new file mode 100644 index 000000000000..116bd1309199 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_german_bert_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German bert_sequence_classifier_multi2convai_corona_german_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_corona_german_bert +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_corona_german_bert` is a German model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_german_bert_de_5.1.4_3.4_1698803196024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_german_bert_de_5.1.4_3.4_1698803196024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_german_bert","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_german_bert","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_corona_german_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.2 MB| + +## References + +https://huggingface.co/inovex/multi2convai-corona-de-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_italian_bert_it.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_italian_bert_it.md new file mode 100644 index 000000000000..873a83d0349d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_corona_italian_bert_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_sequence_classifier_multi2convai_corona_italian_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_corona_italian_bert +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_corona_italian_bert` is an Italian model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_italian_bert_it_5.1.4_3.4_1698803359952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_corona_italian_bert_it_5.1.4_3.4_1698803359952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_italian_bert","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_corona_italian_bert","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_corona_italian_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|411.9 MB| + +## References + +https://huggingface.co/inovex/multi2convai-corona-it-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_english_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_english_bert_en.md new file mode 100644 index 000000000000..c86fd37ddf44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_english_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sequence_classifier_multi2convai_logistics_english_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_logistics_english_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_logistics_english_bert` is an English model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_english_bert_en_5.1.4_3.4_1698810551760.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_english_bert_en_5.1.4_3.4_1698810551760.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_english_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_english_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_logistics_english_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/inovex/multi2convai-logistics-en-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_german_bert_de.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_german_bert_de.md new file mode 100644 index 000000000000..7aac061b4648 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_german_bert_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German bert_sequence_classifier_multi2convai_logistics_german_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_logistics_german_bert +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_logistics_german_bert` is a German model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_german_bert_de_5.1.4_3.4_1698804585957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_german_bert_de_5.1.4_3.4_1698804585957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_german_bert","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_german_bert","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_logistics_german_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.2 MB| + +## References + +https://huggingface.co/inovex/multi2convai-logistics-de-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_turkish_bert_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_turkish_bert_tr.md new file mode 100644 index 000000000000..37f0406e6bf0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_logistics_turkish_bert_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish bert_sequence_classifier_multi2convai_logistics_turkish_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_logistics_turkish_bert +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_logistics_turkish_bert` is a Turkish model originally trained by inovex.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_turkish_bert_tr_5.1.4_3.4_1698796893774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_logistics_turkish_bert_tr_5.1.4_3.4_1698796893774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_turkish_bert","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_logistics_turkish_bert","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_logistics_turkish_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.6 MB| + +## References + +https://huggingface.co/inovex/multi2convai-logistics-tr-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_bert_fr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_bert_fr.md new file mode 100644 index 000000000000..89d6d0e67453 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_bert_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French bert_sequence_classifier_multi2convai_quality_french_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_quality_french_bert +date: 2023-11-01 +tags: [bert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sequence_classifier_multi2convai_quality_french_bert` is a French model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_french_bert_fr_5.1.4_3.4_1698796962104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_french_bert_fr_5.1.4_3.4_1698796962104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_french_bert","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_french_bert","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_quality_french_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|414.6 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-fr-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_mbert_fr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_mbert_fr.md new file mode 100644 index 000000000000..c317dbc3654a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_french_mbert_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French bert_sequence_classifier_multi2convai_quality_french_mbert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_quality_french_mbert +date: 2023-11-01 +tags: [bert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sequence_classifier_multi2convai_quality_french_mbert` is a French model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_french_mbert_fr_5.1.4_3.4_1698810793471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_french_mbert_fr_5.1.4_3.4_1698810793471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_french_mbert","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_french_mbert","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_quality_french_mbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|667.3 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-fr-mbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_german_bert_de.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_german_bert_de.md new file mode 100644 index 000000000000..73dbab5991b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_german_bert_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German bert_sequence_classifier_multi2convai_quality_german_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_quality_german_bert +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sequence_classifier_multi2convai_quality_german_bert` is a German model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_german_bert_de_5.1.4_3.4_1698804795775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_german_bert_de_5.1.4_3.4_1698804795775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_german_bert","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_german_bert","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_quality_german_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.2 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-de-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_bert_it.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_bert_it.md new file mode 100644 index 000000000000..51317e891c19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_bert_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_sequence_classifier_multi2convai_quality_italian_bert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_quality_italian_bert +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_quality_italian_bert` is an Italian model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_italian_bert_it_5.1.4_3.4_1698797154893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_italian_bert_it_5.1.4_3.4_1698797154893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_italian_bert","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_italian_bert","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_quality_italian_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|411.9 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-it-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_mbert_it.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_mbert_it.md new file mode 100644 index 000000000000..144b6f20ab02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multi2convai_quality_italian_mbert_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_sequence_classifier_multi2convai_quality_italian_mbert BertForSequenceClassification from inovex +author: John Snow Labs +name: bert_sequence_classifier_multi2convai_quality_italian_mbert +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_multi2convai_quality_italian_mbert` is an Italian model originally trained by inovex. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_italian_mbert_it_5.1.4_3.4_1698803601667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multi2convai_quality_italian_mbert_it_5.1.4_3.4_1698803601667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_italian_mbert","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multi2convai_quality_italian_mbert","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multi2convai_quality_italian_mbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|667.3 MB| + +## References + +https://huggingface.co/inovex/multi2convai-quality-it-mbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multiclass_textclassification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multiclass_textclassification_en.md new file mode 100644 index 000000000000..58edc6a80047 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_multiclass_textclassification_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from palakagl) +author: John Snow Labs +name: bert_sequence_classifier_multiclass_textclassification +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_MultiClass_TextClassification` is an English model originally trained by `palakagl`. 
+ +## Predicted Entities + +`alarm_remove`, `transport_query`, `email_addcontact`, `general_praise`, `general_dontcare`, `takeaway_query`, `email_query`, `transport_traffic`, `iot_wemo_off`, `weather_query`, `iot_hue_lightchange`, `calendar_query`, `iot_wemo_on`, `email_sendemail`, `general_negate`, `qa_currency`, `general_joke`, `alarm_query`, `alarm_set`, `general_repeat`, `datetime_convert`, `transport_taxi`, `lists_query`, `general_quirky`, `recommendation_movies`, `calendar_remove`, `qa_factoid`, `iot_hue_lighton`, `iot_hue_lightup`, `audio_volume_up`, `social_query`, `general_explain`, `general_confirm`, `news_query`, `qa_definition`, `iot_coffee`, `play_audiobook`, `qa_maths`, `lists_createoradd`, `play_podcasts`, `music_query`, `recommendation_locations`, `play_music`, `calendar_set`, `email_querycontact`, `general_affirm`, `recommendation_events`, `play_radio`, `audio_volume_down`, `social_post`, `general_commandstop`, `iot_hue_lightdim`, `transport_ticket`, `cooking_recipe`, `iot_hue_lightoff`, `audio_volume_mute`, `lists_remove`, `music_settings`, `iot_cleaning`, `takeaway_order`, `music_likeness`, `qa_stock`, `datetime_query`, `play_game` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multiclass_textclassification_en_5.1.4_3.4_1698805116890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_multiclass_textclassification_en_5.1.4_3.4_1698805116890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multiclass_textclassification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_multiclass_textclassification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_multiclass_textclassification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/palakagl/bert_MultiClass_TextClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_off_detection_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_off_detection_turkish_tr.md new file mode 100644 index 000000000000..206d82e662b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_off_detection_turkish_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from hemekci) +author: John Snow Labs +name: bert_sequence_classifier_off_detection_turkish +date: 2023-11-01 +tags: [tr, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `off_detection_turkish` is a Turkish model originally trained by `hemekci`. 
+ +## Predicted Entities + +`not offensive`, `offensive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_off_detection_turkish_tr_5.1.4_3.4_1698797767017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_off_detection_turkish_tr_5.1.4_3.4_1698797767017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_off_detection_turkish","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_off_detection_turkish","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_off_detection_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|691.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hemekci/off_detection_turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_poem_qafiyah_detection_ar.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_poem_qafiyah_detection_ar.md new file mode 100644 index 000000000000..c5f4cfc07c48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_poem_qafiyah_detection_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic BertForSequenceClassification Cased model (from Yah216) +author: John Snow Labs +name: bert_sequence_classifier_poem_qafiyah_detection +date: 2023-11-01 +tags: [ar, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Poem_Qafiyah_Detection` is an Arabic model originally trained by `Yah216`. 
+ +## Predicted Entities + +`ؤ`, `ح`, `م`, `ل`, `ه`, `ز`, `د`, `ء`, `غ`, `ي`, `ص`, `ف`, `ذ`, `خ`, `ث`, `ج`, `ن`, `هـ`, `ط`, `س`, `طن`, `ى`, `ب`, `ت`, `لا`, `ش`, `ر`, `ا`, `ع`, `ض`, `ك`, `و`, `هن`, `ق`, `ظ` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_poem_qafiyah_detection_ar_5.1.4_3.4_1698805471681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_poem_qafiyah_detection_ar_5.1.4_3.4_1698805471681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_poem_qafiyah_detection","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_poem_qafiyah_detection","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_poem_qafiyah_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|466.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Yah216/Poem_Qafiyah_Detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_genre_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_genre_zh.md new file mode 100644 index 000000000000..f248cac9e80e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_genre_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from Herais) +author: John Snow Labs +name: bert_sequence_classifier_pred_genre +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pred_genre` is a Chinese model originally trained by `Herais`. 
+ +## Predicted Entities + +`科幻`, `其它`, `武打`, `农村`, `传奇`, `都市`, `神话`, `军旅`, `宫廷`, `传记`, `青少`, `涉案`, `革命` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_pred_genre_zh_5.1.4_3.4_1698798092183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_pred_genre_zh_5.1.4_3.4_1698798092183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_pred_genre","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_pred_genre","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_pred_genre| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Herais/pred_genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_timeperiod_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_timeperiod_zh.md new file mode 100644 index 000000000000..ad39e910480a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_pred_timeperiod_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Cased model (from Herais) +author: John Snow Labs +name: bert_sequence_classifier_pred_timeperiod +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pred_timeperiod` is a Chinese model originally trained by `Herais`. 
+ +## Predicted Entities + +`当代`, `近代`, `古代`, `重大`, `现代` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_pred_timeperiod_zh_5.1.4_3.4_1698805754358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_pred_timeperiod_zh_5.1.4_3.4_1698805754358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_pred_timeperiod","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_pred_timeperiod","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_pred_timeperiod| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Herais/pred_timeperiod \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_priv_consent_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_priv_consent_en.md new file mode 100644 index 000000000000..d684f32050af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_priv_consent_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from Adi2K) +author: John Snow Labs +name: bert_sequence_classifier_priv_consent +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Priv-Consent` is an English model originally trained by `Adi2K`. 
+ +## Predicted Entities + +`NOT`, `CON` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_priv_consent_en_5.1.4_3.4_1698803896285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_priv_consent_en_5.1.4_3.4_1698803896285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_priv_consent","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_priv_consent","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_priv_consent| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Adi2K/Priv-Consent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_response_quality_base_ru.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_response_quality_base_ru.md new file mode 100644 index 000000000000..4113e7ea244b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_response_quality_base_ru.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Russian BertForSequenceClassification Base Cased model (from tinkoff-ai) +author: John Snow Labs +name: bert_sequence_classifier_response_quality_base +date: 2023-11-01 +tags: [ru, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `response-quality-classifier-base` is a Russian model originally trained by `tinkoff-ai`. 
+ +## Predicted Entities + +`specificity`, `relevance` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_response_quality_base_ru_5.1.4_3.4_1698806162339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_response_quality_base_ru_5.1.4_3.4_1698806162339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_response_quality_base","ru") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_response_quality_base","ru") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_response_quality_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ru| +|Size:|666.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tinkoff-ai/response-quality-classifier-base +- https://github.com/egoriyaa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese_zh.md new file mode 100644 index 000000000000..df98f5309dd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForSequenceClassification Base Cased model (from uer) +author: John Snow Labs +name: bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese +date: 2023-11-01 +tags: [zh, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-jd-full-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`star 4`, `star 5`, `star 1`, `star 2`, `star 3` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese_zh_5.1.4_3.4_1698797201948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese_zh_5.1.4_3.4_1698797201948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_roberta_base_finetuned_jd_full_chinese| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|383.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-jd-full-chinese +- https://arxiv.org/abs/1909.05658 +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/zhangxiangxiao/glyph +- https://arxiv.org/abs/1708.02657 +- https://github.com/dbiir/UER-py/ +- https://cloud.tencent.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_robertabase_ana4_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_robertabase_ana4_en.md new file mode 100644 index 000000000000..ebd8fe76a8d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_robertabase_ana4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from vinaydngowda) +author: John Snow Labs +name: bert_sequence_classifier_robertabase_ana4 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Robertabase_Ana4` is an English model originally trained by `vinaydngowda`. 
+ +## Predicted Entities + +`Credit card or prepaid card`, `Mortgage`, `Student loan`, `Checking or savings account`, `Debt collection`, `Vehicle loan or lease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_robertabase_ana4_en_5.1.4_3.4_1698798712928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_robertabase_ana4_en_5.1.4_3.4_1698798712928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_robertabase_ana4","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_robertabase_ana4","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_robertabase_ana4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/vinaydngowda/Robertabase_Ana4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_threeway_ru.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_threeway_ru.md new file mode 100644 index 000000000000..1fbbda8fb99c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_threeway_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_sequence_classifier_russian_base_cased_nli_threeway BertForSequenceClassification from cointegrated +author: John Snow Labs +name: bert_sequence_classifier_russian_base_cased_nli_threeway +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_russian_base_cased_nli_threeway` is a Russian model originally trained by cointegrated. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_nli_threeway_ru_5.1.4_3.4_1698797443866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_nli_threeway_ru_5.1.4_3.4_1698797443866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_nli_threeway","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_nli_threeway","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_russian_base_cased_nli_threeway| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/cointegrated/rubert-base-cased-nli-threeway \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_twoway_ru.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_twoway_ru.md new file mode 100644 index 000000000000..9e4129f83420 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_russian_base_cased_nli_twoway_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian bert_sequence_classifier_russian_base_cased_nli_twoway BertForSequenceClassification from cointegrated +author: John Snow Labs +name: bert_sequence_classifier_russian_base_cased_nli_twoway +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sequence_classifier_russian_base_cased_nli_twoway` is a Russian model originally trained by cointegrated. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_nli_twoway_ru_5.1.4_3.4_1698806399420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_russian_base_cased_nli_twoway_ru_5.1.4_3.4_1698806399420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_nli_twoway","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_russian_base_cased_nli_twoway","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_russian_base_cased_nli_twoway| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/cointegrated/rubert-base-cased-nli-twoway \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_scientific_challenges_and_directions_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_scientific_challenges_and_directions_en.md new file mode 100644 index 000000000000..f22217b63588 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_scientific_challenges_and_directions_en.md @@ -0,0 +1,104 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from DanL) +author: John Snow Labs +name: bert_sequence_classifier_scientific_challenges_and_directions +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scientific-challenges-and-directions` is an English model originally trained by `DanL`. 
+ +## Predicted Entities + +`Direction`, `Challenge` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_scientific_challenges_and_directions_en_5.1.4_3.4_1698799323207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_scientific_challenges_and_directions_en_5.1.4_3.4_1698799323207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_scientific_challenges_and_directions","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_scientific_challenges_and_directions","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_scientific_challenges_and_directions| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/DanL/scientific-challenges-and-directions +- https://arxiv.org/abs/2108.13751 +- https://challenges.apps.allenai.org/ +- https://arxiv.org/abs/2108.13751 +- https://github.com/Dan-La/scientific-challenges-and-directions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_senda_da.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_senda_da.md new file mode 100644 index 000000000000..451ab01b5d1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_senda_da.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Danish BertForSequenceClassification Cased model (from pin) +author: John Snow Labs +name: bert_sequence_classifier_senda +date: 2023-11-01 +tags: [da, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `senda` is a Danish model originally trained by `pin`. 
+ +## Predicted Entities + +`negativ`, `positiv`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_senda_da_5.1.4_3.4_1698797792686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_senda_da_5.1.4_3.4_1698797792686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_senda","da") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_senda","da") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_senda| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/pin/senda +- https://github.com/alexandrainst +- https://github.com/ebanalyse/senda +- https://github.com/alexandrainst/danlp/blob/master/docs/docs/datasets.md#twitter-sentiment +- https://github.com/ebanalyse/senda \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_sent_sci_irrelevance_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_sent_sci_irrelevance_en.md new file mode 100644 index 000000000000..d93e1d4118df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_sent_sci_irrelevance_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from world-wide) +author: John Snow Labs +name: bert_sequence_classifier_sent_sci_irrelevance +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent-sci-irrelevance` is an English model originally trained by `world-wide`. 
+ +## Predicted Entities + +`True`, `False` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_sent_sci_irrelevance_en_5.1.4_3.4_1698804708639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_sent_sci_irrelevance_en_5.1.4_3.4_1698804708639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_sent_sci_irrelevance","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_sent_sci_irrelevance","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_sent_sci_irrelevance| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/world-wide/sent-sci-irrelevance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli_es.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli_es.md new file mode 100644 index 000000000000..e8f25a1c778c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Tiny Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli +date: 2023-11-01 +tags: [es, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish-TinyBERT-betito-finetuned-mnli` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli_es_5.1.4_3.4_1698804845511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli_es_5.1.4_3.4_1698804845511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.bert.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_spanish_tinybert_betito_finetuned_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|54.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/spanish-TinyBERT-betito-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli_es.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli_es.md new file mode 100644 index 000000000000..417ee206bedf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish BertForSequenceClassification Tiny Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli +date: 2023-11-01 +tags: [es, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish-TinyBERT-betito-finetuned-xnli-es` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli_es_5.1.4_3.4_1698799486383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli_es_5.1.4_3.4_1698799486383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.bert.xnli.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_spanish_tinybert_betito_finetuned_xnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|54.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/spanish-TinyBERT-betito-finetuned-xnli-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_test_hub_pr_1_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_test_hub_pr_1_en.md new file mode 100644 index 000000000000..8b0357faffee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_test_hub_pr_1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from lewtun) +author: John Snow Labs +name: bert_sequence_classifier_test_hub_pr_1 +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test-hub-pr-1` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`neg`, `pos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_test_hub_pr_1_en_5.1.4_3.4_1698806961672.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_test_hub_pr_1_en_5.1.4_3.4_1698806961672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_test_hub_pr_1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_test_hub_pr_1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_test_hub_pr_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/lewtun/test-hub-pr-1 +- https://paperswithcode.com/sota?task=Multi-class+Classification&dataset=Emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_textclassification_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_textclassification_en.md new file mode 100644 index 000000000000..2b3ba97f6872 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_textclassification_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from palakagl) +author: John Snow Labs +name: bert_sequence_classifier_textclassification +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_TextClassification` is an English model originally trained by `palakagl`. 
+ +## Predicted Entities + +`alarm_remove`, `transport_query`, `email_addcontact`, `general_praise`, `general_dontcare`, `takeaway_query`, `email_query`, `transport_traffic`, `iot_wemo_off`, `weather_query`, `iot_hue_lightchange`, `calendar_query`, `iot_wemo_on`, `email_sendemail`, `general_negate`, `qa_currency`, `general_joke`, `alarm_query`, `alarm_set`, `general_repeat`, `datetime_convert`, `transport_taxi`, `lists_query`, `general_quirky`, `recommendation_movies`, `calendar_remove`, `qa_factoid`, `iot_hue_lighton`, `iot_hue_lightup`, `audio_volume_up`, `social_query`, `general_explain`, `general_confirm`, `news_query`, `qa_definition`, `iot_coffee`, `play_audiobook`, `qa_maths`, `lists_createoradd`, `play_podcasts`, `music_query`, `recommendation_locations`, `play_music`, `calendar_set`, `email_querycontact`, `general_affirm`, `recommendation_events`, `play_radio`, `audio_volume_down`, `social_post`, `general_commandstop`, `iot_hue_lightdim`, `transport_ticket`, `cooking_recipe`, `iot_hue_lightoff`, `audio_volume_mute`, `lists_remove`, `music_settings`, `iot_cleaning`, `takeaway_order`, `music_likeness`, `qa_stock`, `datetime_query`, `play_game` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_textclassification_en_5.1.4_3.4_1698798145186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_textclassification_en_5.1.4_3.4_1698798145186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_textclassification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_textclassification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_textclassification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/palakagl/bert_TextClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_fake_news_detection_en.md new file mode 100644 index 000000000000..8246aaa877b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_fake_news_detection_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_tiny_finetuned_fake_news_detection +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-fake-news-detection` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_fake_news_detection_en_5.1.4_3.4_1698807093307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_fake_news_detection_en_5.1.4_3.4_1698807093307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_fake_news_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_fake_news_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.news.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_tiny_finetuned_fake_news_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-finetuned-fake-news-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_sms_spam_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_sms_spam_detection_en.md new file mode 100644 index 000000000000..b27b101406a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_sms_spam_detection_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_tiny_finetuned_sms_spam_detection +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-sms-spam-detection` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_sms_spam_detection_en_5.1.4_3.4_1698807193444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_sms_spam_detection_en_5.1.4_3.4_1698807193444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_sms_spam_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_sms_spam_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.sms_spam.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_tiny_finetuned_sms_spam_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-finetuned-sms-spam-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics_en.md new file mode 100644 index 000000000000..b7ab337aa5e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English BertForSequenceClassification Tiny Cased model (from mrm8488) +author: John Snow Labs +name: bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-yahoo_answers_topics` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics_en_5.1.4_3.4_1698811480357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics_en_5.1.4_3.4_1698811480357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bert.tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_tiny_finetuned_yahoo_answers_topics| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-finetuned-yahoo_answers_topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_toxicity_ru.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_toxicity_ru.md new file mode 100644 index 000000000000..726fff1b338b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_toxicity_ru.md @@ -0,0 +1,119 @@ +--- +layout: model +title: Toxic content classifier for Russian +author: John Snow Labs +name: bert_sequence_classifier_toxicity +date: 2023-11-01 +tags: [sentiment, bert, sequence, russian, ru, open_source, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +This model was imported from `Hugging Face` and it's been fine-tuned for the Russian language, leveraging `Bert` embeddings and `BertForSequenceClassification` for text classification purposes. 
+ +## Predicted Entities + +`neutral`, `toxic` + +{:.btn-box} +[Live Demo](https://demo.johnsnowlabs.com/public/CLASSIFICATION_RU_TOXIC/){:.button.button-orange} +[Open in Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/CLASSIFICATION_RU_TOXIC.ipynb){:.button.button-orange.button-orange-trans.co.button-icon} +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_toxicity_ru_5.1.4_3.4_1698807535211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_toxicity_ru_5.1.4_3.4_1698807535211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ +.setInputCol('text') \ +.setOutputCol('document') + +tokenizer = Tokenizer() \ +.setInputCols(['document']) \ +.setOutputCol('token') + +sequenceClassifier = BertForSequenceClassification \ +.pretrained('bert_sequence_classifier_toxicity', 'ru') \ +.setInputCols(['token', 'document']) \ +.setOutputCol('class') + +pipeline = Pipeline(stages=[document_assembler, tokenizer, sequenceClassifier]) + +example = spark.createDataFrame([["Ненавижу тебя, идиот."]]).toDF("text") +result = pipeline.fit(example).transform(example) +``` +```scala +val document_assembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val tokenizer = new Tokenizer() +.setInputCols("document") +.setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_toxicity", "ru") +.setInputCols("document", "token") +.setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val example = Seq("Ненавижу тебя, идиот.").toDS.toDF("text") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ru.classify.toxic").predict("""Ненавижу тебя, идиот.""") +``` +
+ +## Results + +```bash + +['toxic'] +``` + +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_toxicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## Benchmarking + +```bash + +label precision recall f1-score support +neutral 0.98 0.99 0.98 21384 +toxic 0.94 0.92 0.93 4886 +accuracy - - 0.97 26270 +macro-avg 0.96 0.96 0.96 26270 +weighted-avg 0.97 0.97 0.97 26270 +``` \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_turkish_text_classification_tr.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_turkish_text_classification_tr.md new file mode 100644 index 000000000000..f00fca24f593 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_turkish_text_classification_tr.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Turkish BertForSequenceClassification Cased model (from gurkan08) +author: John Snow Labs +name: bert_sequence_classifier_turkish_text_classification +date: 2023-11-01 +tags: [tr, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-turkish-text-classification` is a Turkish model originally trained by `gurkan08`. 
+ +## Predicted Entities + +`kultur_sanat`, `bilim_teknoloji`, `ekonomi`, `spor`, `egitim`, `saglik` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_turkish_text_classification_tr_5.1.4_3.4_1698807810972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_turkish_text_classification_tr_5.1.4_3.4_1698807810972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_turkish_text_classification","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_turkish_text_classification","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_turkish_text_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|414.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gurkan08/bert-turkish-text-classification +- https://www.trthaber.com/ +- https://github.com/gurkan08/datasets/tree/master/trt_11_category \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tweet_eval_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tweet_eval_emotion_en.md new file mode 100644 index 000000000000..f6f73dd2a010 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_tweet_eval_emotion_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from philschmid) +author: John Snow Labs +name: bert_sequence_classifier_tweet_eval_emotion +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERT-tweet-eval-emotion` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`1`, `0`, `3`, `2` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tweet_eval_emotion_en_5.1.4_3.4_1698812019239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_tweet_eval_emotion_en_5.1.4_3.4_1698812019239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tweet_eval_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_tweet_eval_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_tweet_eval_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/philschmid/BERT-tweet-eval-emotion +- https://paperswithcode.com/sota?task=Sentiment+Analysis&dataset=tweeteval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_twitter_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_twitter_sentiment_en.md new file mode 100644 index 000000000000..7c5c8f9cab19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_twitter_sentiment_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForSequenceClassification Cased model (from bgoel4132) +author: John Snow Labs +name: bert_sequence_classifier_twitter_sentiment +date: 2023-11-01 +tags: [en, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-sentiment` is an English model originally trained by `bgoel4132`. 
+ +## Predicted Entities + +`flood`, `tornado`, `medical`, `fire`, `cyclone`, `hurricane`, `pollution`, `earthquake`, `volcano`, `typhoon`, `explosion` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_twitter_sentiment_en_5.1.4_3.4_1698799713257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_twitter_sentiment_en_5.1.4_3.4_1698799713257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_twitter_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_twitter_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_twitter_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/bgoel4132/twitter-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_uzbek_news_category_uz.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_uzbek_news_category_uz.md new file mode 100644 index 000000000000..0425832a638e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sequence_classifier_uzbek_news_category_uz.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Uzbek BertForSequenceClassification Cased model (from coppercitylabs) +author: John Snow Labs +name: bert_sequence_classifier_uzbek_news_category +date: 2023-11-01 +tags: [uz, open_source, bert, sequence_classification, ner, onnx] +task: Named Entity Recognition +language: uz +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uzbek-news-category-classifier` is an Uzbek model originally trained by `coppercitylabs`. 
+ +## Predicted Entities + +`сиёсат`, `дунё`, `спорт`, `иқтисодиёт`, `фан ва техника`, `шоу-бизнес`, `реклама`, `саломатлик`, `маданият`, `жиноят`, `жамият` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_uzbek_news_category_uz_5.1.4_3.4_1698799971106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sequence_classifier_uzbek_news_category_uz_5.1.4_3.4_1698799971106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_uzbek_news_category","uz") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sequence_classifier_uzbek_news_category","uz") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sequence_classifier_uzbek_news_category| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|uz| +|Size:|409.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/coppercitylabs/uzbek-news-category-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_sst2_en.md new file mode 100644 index 000000000000..935e8e59f907 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_sst2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sst2 BertForSequenceClassification from tzhao3 +author: John Snow Labs +name: bert_sst2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sst2` is an English model originally trained by tzhao3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sst2_en_5.1.4_3.4_1698815550292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sst2_en_5.1.4_3.4_1698815550292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_sst2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/tzhao3/Bert-SST2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_emotion_intent_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_emotion_intent_en.md new file mode 100644 index 000000000000..32816b6144fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_emotion_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_emotion_intent BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_emotion_intent +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_emotion_intent` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_emotion_intent_en_5.1.4_3.4_1698814737401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_emotion_intent_en_5.1.4_3.4_1698814737401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_emotion_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_emotion_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_emotion_intent| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/gokuls/BERT-tiny-emotion-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_finetuned_enron_spam_detection_mrm8488_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_finetuned_enron_spam_detection_mrm8488_en.md new file mode 100644 index 000000000000..1ae0dbd17e48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_finetuned_enron_spam_detection_mrm8488_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_finetuned_enron_spam_detection_mrm8488 BertForSequenceClassification from mrm8488 +author: John Snow Labs +name: bert_tiny_finetuned_enron_spam_detection_mrm8488 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_enron_spam_detection_mrm8488` is an English model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_enron_spam_detection_mrm8488_en_5.1.4_3.4_1698815131721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_enron_spam_detection_mrm8488_en_5.1.4_3.4_1698815131721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_enron_spam_detection_mrm8488","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_finetuned_enron_spam_detection_mrm8488","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_enron_spam_detection_mrm8488| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/mrm8488/bert-tiny-finetuned-enron-spam-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_ita_lemma_classification_it.md b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_ita_lemma_classification_it.md new file mode 100644 index 000000000000..d5aa26d8b1a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_ita_lemma_classification_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian bert_tiny_ita_lemma_classification BertForSequenceClassification from mascIT +author: John Snow Labs +name: bert_tiny_ita_lemma_classification +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_ita_lemma_classification` is an Italian model originally trained by mascIT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_ita_lemma_classification_it_5.1.4_3.4_1698846121649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_ita_lemma_classification_it_5.1.4_3.4_1698846121649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_ita_lemma_classification","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_ita_lemma_classification","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_ita_lemma_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|11.6 MB| + +## References + +https://huggingface.co/mascIT/bert-tiny-ita-lemma-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_mnli_en.md new file mode 100644 index 000000000000..443b36ba422b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_mnli BertForSequenceClassification from prajjwal1 +author: John Snow Labs +name: bert_tiny_mnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_mnli` is an English model originally trained by prajjwal1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_mnli_en_5.1.4_3.4_1698816448242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_mnli_en_5.1.4_3.4_1698816448242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/prajjwal1/bert-tiny-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_sst2_en.md new file mode 100644 index 000000000000..1c50c108afcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_tiny_sst2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_tiny_sst2 BertForSequenceClassification from gokuls +author: John Snow Labs +name: bert_tiny_sst2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_sst2` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_en_5.1.4_3.4_1698810110260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_sst2_en_5.1.4_3.4_1698810110260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_tiny_sst2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/gokuls/BERT-tiny-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_german_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_german_classifier_en.md new file mode 100644 index 000000000000..697fec00aff8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_german_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_xnli_german_classifier BertForSequenceClassification from gayanin +author: John Snow Labs +name: bert_xnli_german_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_xnli_german_classifier` is an English model originally trained by gayanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_xnli_german_classifier_en_5.1.4_3.4_1698805963337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_xnli_german_classifier_en_5.1.4_3.4_1698805963337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_xnli_german_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_xnli_german_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_xnli_german_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/gayanin/bert-xnli-de-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_spanish_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_spanish_classifier_en.md new file mode 100644 index 000000000000..0bc2ef33653b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bert_xnli_spanish_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_xnli_spanish_classifier BertForSequenceClassification from gayanin +author: John Snow Labs +name: bert_xnli_spanish_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_xnli_spanish_classifier` is an English model originally trained by gayanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_xnli_spanish_classifier_en_5.1.4_3.4_1698813716284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_xnli_spanish_classifier_en_5.1.4_3.4_1698813716284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bert_xnli_spanish_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bert_xnli_spanish_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_xnli_spanish_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/gayanin/bert-xnli-es-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15_pt.md b/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15_pt.md new file mode 100644 index 000000000000..b1cb90ad205c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15 BertForSequenceClassification from Luciano +author: John Snow Labs +name: bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15 +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15` is a Portuguese model originally trained by Luciano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15_pt_5.1.4_3.4_1698807990685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15_pt_5.1.4_3.4_1698807990685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertimbau_base_finetuned_brazilian_court_decisions_bt16_ep15| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/Luciano/bertimbau-base-finetuned-brazilian_court_decisions_bt16_ep15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions_pt.md b/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions_pt.md new file mode 100644 index 000000000000..3d418ecf6ef0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions BertForSequenceClassification from Luciano +author: John Snow Labs +name: bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions` is a Portuguese model originally trained by Luciano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions_pt_5.1.4_3.4_1698815208286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions_pt_5.1.4_3.4_1698815208286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertimbau_base_finetuned_lener_breton_finetuned_brazilian_court_decisions| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.1 MB| + +## References + +https://huggingface.co/Luciano/bertimbau-base-finetuned-lener-br-finetuned-brazilian_court_decisions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-beto_ciuo08cl_4d_en.md b/docs/_posts/ahmedlone127/2023-11-01-beto_ciuo08cl_4d_en.md new file mode 100644 index 000000000000..729ae0595f98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-beto_ciuo08cl_4d_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English beto_ciuo08cl_4d BertForSequenceClassification from WIC-Uchile +author: John Snow Labs +name: beto_ciuo08cl_4d +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_ciuo08cl_4d` is an English model originally trained by WIC-Uchile. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_ciuo08cl_4d_en_5.1.4_3.4_1698805753787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_ciuo08cl_4d_en_5.1.4_3.4_1698805753787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("beto_ciuo08cl_4d","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beto_ciuo08cl_4d","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_ciuo08cl_4d| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/WIC-Uchile/BETO_CIUO08CL_4D \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-beto_contextualized_hate_speech_es.md b/docs/_posts/ahmedlone127/2023-11-01-beto_contextualized_hate_speech_es.md new file mode 100644 index 000000000000..cc6df39b0756 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-beto_contextualized_hate_speech_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish beto_contextualized_hate_speech BertForSequenceClassification from piuba-bigdata +author: John Snow Labs +name: beto_contextualized_hate_speech +date: 2023-11-01 +tags: [bert, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_contextualized_hate_speech` is a Castilian, Spanish model originally trained by piuba-bigdata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_contextualized_hate_speech_es_5.1.4_3.4_1698815373932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_contextualized_hate_speech_es_5.1.4_3.4_1698815373932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("beto_contextualized_hate_speech","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beto_contextualized_hate_speech","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_contextualized_hate_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.4 MB| + +## References + +https://huggingface.co/piuba-bigdata/beto-contextualized-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-beto_sentiment_analysis_spanish_es.md b/docs/_posts/ahmedlone127/2023-11-01-beto_sentiment_analysis_spanish_es.md new file mode 100644 index 000000000000..5c4db8478cdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-beto_sentiment_analysis_spanish_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish beto_sentiment_analysis_spanish BertForSequenceClassification from edumunozsala +author: John Snow Labs +name: beto_sentiment_analysis_spanish +date: 2023-11-01 +tags: [bert, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_sentiment_analysis_spanish` is a Castilian, Spanish model originally trained by edumunozsala. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_spanish_es_5.1.4_3.4_1698861672530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_spanish_es_5.1.4_3.4_1698861672530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("beto_sentiment_analysis_spanish","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("beto_sentiment_analysis_spanish","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_sentiment_analysis_spanish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|411.7 MB| + +## References + +https://huggingface.co/edumunozsala/beto_sentiment_analysis_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bibert_subjectivity_en.md b/docs/_posts/ahmedlone127/2023-11-01-bibert_subjectivity_en.md new file mode 100644 index 000000000000..a9f8d472da54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bibert_subjectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bibert_subjectivity BertForSequenceClassification from HCKLab +author: John Snow Labs +name: bibert_subjectivity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bibert_subjectivity` is an English model originally trained by HCKLab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bibert_subjectivity_en_5.1.4_3.4_1698862578236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bibert_subjectivity_en_5.1.4_3.4_1698862578236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bibert_subjectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bibert_subjectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bibert_subjectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/HCKLab/BiBert-Subjectivity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-binary_question_classifier_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-binary_question_classifier_bert_en.md new file mode 100644 index 000000000000..246143f3bdf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-binary_question_classifier_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English binary_question_classifier_bert BertForSequenceClassification from ndavid +author: John Snow Labs +name: binary_question_classifier_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_question_classifier_bert` is an English model originally trained by ndavid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_question_classifier_bert_en_5.1.4_3.4_1698860889018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_question_classifier_bert_en_5.1.4_3.4_1698860889018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("binary_question_classifier_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("binary_question_classifier_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_question_classifier_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/ndavid/binary-question-classifier-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_finetuned_medicalcondition_en.md b/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_finetuned_medicalcondition_en.md new file mode 100644 index 000000000000..2f345ed83c4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_finetuned_medicalcondition_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bio_clinicalbert_finetuned_medicalcondition BertForSequenceClassification from sid321axn +author: John Snow Labs +name: bio_clinicalbert_finetuned_medicalcondition +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_finetuned_medicalcondition` is an English model originally trained by sid321axn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_finetuned_medicalcondition_en_5.1.4_3.4_1698814499367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_finetuned_medicalcondition_en_5.1.4_3.4_1698814499367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_finetuned_medicalcondition","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_finetuned_medicalcondition","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_finetuned_medicalcondition| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/sid321axn/Bio_ClinicalBERT-finetuned-medicalcondition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_zero_shot_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_zero_shot_sentiment_model_en.md new file mode 100644 index 000000000000..a4d199907915 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bio_clinicalbert_zero_shot_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bio_clinicalbert_zero_shot_sentiment_model BertForSequenceClassification from okho0653 +author: John Snow Labs +name: bio_clinicalbert_zero_shot_sentiment_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_zero_shot_sentiment_model` is an English model originally trained by okho0653.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_zero_shot_sentiment_model_en_5.1.4_3.4_1698813506029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_zero_shot_sentiment_model_en_5.1.4_3.4_1698813506029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_zero_shot_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bio_clinicalbert_zero_shot_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_zero_shot_sentiment_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/okho0653/Bio_ClinicalBERT-zero-shot-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_en.md b/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_en.md new file mode 100644 index 000000000000..7c92610a49d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biobert_icd10_l3 BertForSequenceClassification from rjac +author: John Snow Labs +name: biobert_icd10_l3 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_icd10_l3` is an English model originally trained by rjac. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_icd10_l3_en_5.1.4_3.4_1698804914293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_icd10_l3_en_5.1.4_3.4_1698804914293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biobert_icd10_l3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_icd10_l3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_icd10_l3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.1 MB| + +## References + +https://huggingface.co/rjac/biobert-ICD10-L3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_mimic_en.md b/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_mimic_en.md new file mode 100644 index 000000000000..3e550c639b3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biobert_icd10_l3_mimic_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biobert_icd10_l3_mimic BertForSequenceClassification from rjac +author: John Snow Labs +name: biobert_icd10_l3_mimic +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_icd10_l3_mimic` is an English model originally trained by rjac. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_icd10_l3_mimic_en_5.1.4_3.4_1698818952755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_icd10_l3_mimic_en_5.1.4_3.4_1698818952755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biobert_icd10_l3_mimic","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biobert_icd10_l3_mimic","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_icd10_l3_mimic| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.8 MB| + +## References + +https://huggingface.co/rjac/biobert-ICD10-L3-mimic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biobertrelationgenesdiseases_en.md b/docs/_posts/ahmedlone127/2023-11-01-biobertrelationgenesdiseases_en.md new file mode 100644 index 000000000000..0d640a6e8437 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biobertrelationgenesdiseases_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biobertrelationgenesdiseases BertForSequenceClassification from JacopoBandoni +author: John Snow Labs +name: biobertrelationgenesdiseases +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobertrelationgenesdiseases` is an English model originally trained by JacopoBandoni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobertrelationgenesdiseases_en_5.1.4_3.4_1698820036374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobertrelationgenesdiseases_en_5.1.4_3.4_1698820036374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biobertrelationgenesdiseases","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biobertrelationgenesdiseases","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobertrelationgenesdiseases| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.3 MB| + +## References + +https://huggingface.co/JacopoBandoni/BioBertRelationGenesDiseases \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bioformer_8l_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-bioformer_8l_qnli_en.md new file mode 100644 index 000000000000..b1a7d03675b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bioformer_8l_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bioformer_8l_qnli BertForSequenceClassification from bioformers +author: John Snow Labs +name: bioformer_8l_qnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer_8l_qnli` is an English model originally trained by bioformers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioformer_8l_qnli_en_5.1.4_3.4_1698815009192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioformer_8l_qnli_en_5.1.4_3.4_1698815009192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bioformer_8l_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bioformer_8l_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioformer_8l_qnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|159.5 MB| + +## References + +https://huggingface.co/bioformers/bioformer-8L-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biolinkbert_large_mnli_snli_en.md b/docs/_posts/ahmedlone127/2023-11-01-biolinkbert_large_mnli_snli_en.md new file mode 100644 index 000000000000..b9422fd066a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biolinkbert_large_mnli_snli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biolinkbert_large_mnli_snli BertForSequenceClassification from cnut1648 +author: John Snow Labs +name: biolinkbert_large_mnli_snli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biolinkbert_large_mnli_snli` is an English model originally trained by cnut1648. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biolinkbert_large_mnli_snli_en_5.1.4_3.4_1698812180297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biolinkbert_large_mnli_snli_en_5.1.4_3.4_1698812180297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biolinkbert_large_mnli_snli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biolinkbert_large_mnli_snli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biolinkbert_large_mnli_snli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/cnut1648/biolinkbert-large-mnli-snli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli_en.md new file mode 100644 index 000000000000..b67fb9e2531a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli BertForSequenceClassification from lighteternal +author: John Snow Labs +name: biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli` is an English model originally trained by lighteternal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli_en_5.1.4_3.4_1698811724194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli_en_5.1.4_3.4_1698811724194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_mnli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/lighteternal/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification_en.md b/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification_en.md new file mode 100644 index 000000000000..351e4ad4a667 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification BertForSequenceClassification from Kekelilii +author: John Snow Labs +name: biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification` is an English model originally trained by Kekelilii.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification_en_5.1.4_3.4_1698836722487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification_en_5.1.4_3.4_1698836722487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_textclassification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/Kekelilii/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_finetuned_TextClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bleurt_base_512_en.md b/docs/_posts/ahmedlone127/2023-11-01-bleurt_base_512_en.md new file mode 100644 index 000000000000..456554a77c72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bleurt_base_512_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bleurt_base_512 BertForSequenceClassification from Elron +author: John Snow Labs +name: bleurt_base_512 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bleurt_base_512` is an English model originally trained by Elron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bleurt_base_512_en_5.1.4_3.4_1698816807874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bleurt_base_512_en_5.1.4_3.4_1698816807874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_base_512","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_base_512","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bleurt_base_512| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/Elron/bleurt-base-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bleurt_large_128_en.md b/docs/_posts/ahmedlone127/2023-11-01-bleurt_large_128_en.md new file mode 100644 index 000000000000..584b275d15de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bleurt_large_128_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bleurt_large_128 BertForSequenceClassification from Elron +author: John Snow Labs +name: bleurt_large_128 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bleurt_large_128` is an English model originally trained by Elron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bleurt_large_128_en_5.1.4_3.4_1698829427298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bleurt_large_128_en_5.1.4_3.4_1698829427298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_large_128","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_large_128","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bleurt_large_128| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Elron/bleurt-large-128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_128_en.md b/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_128_en.md new file mode 100644 index 000000000000..6f07fcc007a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_128_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bleurt_tiny_128 BertForSequenceClassification from Elron +author: John Snow Labs +name: bleurt_tiny_128 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bleurt_tiny_128` is an English model originally trained by Elron. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bleurt_tiny_128_en_5.1.4_3.4_1698806050055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bleurt_tiny_128_en_5.1.4_3.4_1698806050055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_tiny_128","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_tiny_128","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bleurt_tiny_128|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|16.7 MB|
+
+## References
+
+https://huggingface.co/Elron/bleurt-tiny-128
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_512_en.md b/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_512_en.md
new file mode 100644
index 000000000000..56cb573a1a26
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-bleurt_tiny_512_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bleurt_tiny_512 BertForSequenceClassification from Elron
+author: John Snow Labs
+name: bleurt_tiny_512
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bleurt_tiny_512` is an English model originally trained by Elron.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bleurt_tiny_512_en_5.1.4_3.4_1698809112093.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bleurt_tiny_512_en_5.1.4_3.4_1698809112093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_tiny_512","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("bleurt_tiny_512","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bleurt_tiny_512|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|16.7 MB|
+
+## References
+
+https://huggingface.co/Elron/bleurt-tiny-512
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-category_categorization_en.md b/docs/_posts/ahmedlone127/2023-11-01-category_categorization_en.md
new file mode 100644
index 000000000000..9c7b9da1398f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-category_categorization_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English category_categorization BertForSequenceClassification from cihan-lyons
+author: John Snow Labs
+name: category_categorization
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `category_categorization` is an English model originally trained by cihan-lyons.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/category_categorization_en_5.1.4_3.4_1698846572678.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/category_categorization_en_5.1.4_3.4_1698846572678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("category_categorization","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("category_categorization","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|category_categorization|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/cihan-lyons/category-categorization
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-category_of_article_bert_nepal_bhasa_en.md b/docs/_posts/ahmedlone127/2023-11-01-category_of_article_bert_nepal_bhasa_en.md
new file mode 100644
index 000000000000..f9bb65c72c96
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-category_of_article_bert_nepal_bhasa_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English category_of_article_bert_nepal_bhasa BertForSequenceClassification from priyabrat
+author: John Snow Labs
+name: category_of_article_bert_nepal_bhasa
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `category_of_article_bert_nepal_bhasa` is an English model originally trained by priyabrat.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/category_of_article_bert_nepal_bhasa_en_5.1.4_3.4_1698840984170.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/category_of_article_bert_nepal_bhasa_en_5.1.4_3.4_1698840984170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("category_of_article_bert_nepal_bhasa","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("category_of_article_bert_nepal_bhasa","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|category_of_article_bert_nepal_bhasa|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/priyabrat/Category_of_article_bert_new
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-chatgpt_detector_roberta_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-chatgpt_detector_roberta_chinese_zh.md
new file mode 100644
index 000000000000..aacd1d8fa1f2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-chatgpt_detector_roberta_chinese_zh.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Chinese chatgpt_detector_roberta_chinese BertForSequenceClassification from Hello-SimpleAI
+author: John Snow Labs
+name: chatgpt_detector_roberta_chinese
+date: 2023-11-01
+tags: [bert, zh, open_source, sequence_classification, onnx]
+task: Text Classification
+language: zh
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_detector_roberta_chinese` is a Chinese model originally trained by Hello-SimpleAI.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_detector_roberta_chinese_zh_5.1.4_3.4_1698802892989.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_detector_roberta_chinese_zh_5.1.4_3.4_1698802892989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("chatgpt_detector_roberta_chinese","zh")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("chatgpt_detector_roberta_chinese","zh")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|chatgpt_detector_roberta_chinese|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|zh|
+|Size:|383.2 MB|
+
+## References
+
+https://huggingface.co/Hello-SimpleAI/chatgpt-detector-roberta-chinese
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-chatgpt_qa_detector_roberta_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-01-chatgpt_qa_detector_roberta_chinese_zh.md
new file mode 100644
index 000000000000..4547b5f08a66
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-chatgpt_qa_detector_roberta_chinese_zh.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Chinese chatgpt_qa_detector_roberta_chinese BertForSequenceClassification from Hello-SimpleAI
+author: John Snow Labs
+name: chatgpt_qa_detector_roberta_chinese
+date: 2023-11-01
+tags: [bert, zh, open_source, sequence_classification, onnx]
+task: Text Classification
+language: zh
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_qa_detector_roberta_chinese` is a Chinese model originally trained by Hello-SimpleAI.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_qa_detector_roberta_chinese_zh_5.1.4_3.4_1698814048085.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_qa_detector_roberta_chinese_zh_5.1.4_3.4_1698814048085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("chatgpt_qa_detector_roberta_chinese","zh")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("chatgpt_qa_detector_roberta_chinese","zh")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|chatgpt_qa_detector_roberta_chinese|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|zh|
+|Size:|383.2 MB|
+
+## References
+
+https://huggingface.co/Hello-SimpleAI/chatgpt-qa-detector-roberta-chinese
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-chemical_bert_uncased_pharmaceutical_chemical_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-chemical_bert_uncased_pharmaceutical_chemical_classifier_en.md
new file mode 100644
index 000000000000..6ed8e2fe65c8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-chemical_bert_uncased_pharmaceutical_chemical_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English chemical_bert_uncased_pharmaceutical_chemical_classifier BertForSequenceClassification from recobo
+author: John Snow Labs
+name: chemical_bert_uncased_pharmaceutical_chemical_classifier
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemical_bert_uncased_pharmaceutical_chemical_classifier` is an English model originally trained by recobo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_pharmaceutical_chemical_classifier_en_5.1.4_3.4_1698829715831.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chemical_bert_uncased_pharmaceutical_chemical_classifier_en_5.1.4_3.4_1698829715831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("chemical_bert_uncased_pharmaceutical_chemical_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("chemical_bert_uncased_pharmaceutical_chemical_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|chemical_bert_uncased_pharmaceutical_chemical_classifier|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|411.3 MB|
+
+## References
+
+https://huggingface.co/recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-chinese_macbert_base_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-chinese_macbert_base_text_classification_en.md
new file mode 100644
index 000000000000..269663749044
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-chinese_macbert_base_text_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English chinese_macbert_base_text_classification BertForSequenceClassification from CeroShrijver
+author: John Snow Labs
+name: chinese_macbert_base_text_classification
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_macbert_base_text_classification` is an English model originally trained by CeroShrijver.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_text_classification_en_5.1.4_3.4_1698827688712.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_text_classification_en_5.1.4_3.4_1698827688712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("chinese_macbert_base_text_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("chinese_macbert_base_text_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|chinese_macbert_base_text_classification|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|383.3 MB|
+
+## References
+
+https://huggingface.co/CeroShrijver/chinese-macbert-base-text-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-chinesesequenceclassification_en.md b/docs/_posts/ahmedlone127/2023-11-01-chinesesequenceclassification_en.md
new file mode 100644
index 000000000000..ae83dcbbfffb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-chinesesequenceclassification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English chinesesequenceclassification BertForSequenceClassification from LeoFeng
+author: John Snow Labs
+name: chinesesequenceclassification
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinesesequenceclassification` is an English model originally trained by LeoFeng.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinesesequenceclassification_en_5.1.4_3.4_1698867629281.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinesesequenceclassification_en_5.1.4_3.4_1698867629281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("chinesesequenceclassification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("chinesesequenceclassification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|chinesesequenceclassification|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|383.1 MB|
+
+## References
+
+https://huggingface.co/LeoFeng/ChineseSequenceClassification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-clf_bert_e_en.md b/docs/_posts/ahmedlone127/2023-11-01-clf_bert_e_en.md
new file mode 100644
index 000000000000..e8e186d5aa94
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-clf_bert_e_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English clf_bert_e BertForSequenceClassification from nllg
+author: John Snow Labs
+name: clf_bert_e
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clf_bert_e` is an English model originally trained by nllg.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clf_bert_e_en_5.1.4_3.4_1698827140305.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clf_bert_e_en_5.1.4_3.4_1698827140305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("clf_bert_e","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("clf_bert_e","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|clf_bert_e|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|412.1 MB|
+
+## References
+
+https://huggingface.co/nllg/clf-bert-e
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-codeswitch_spaeng_sentiment_analysis_lince_es.md b/docs/_posts/ahmedlone127/2023-11-01-codeswitch_spaeng_sentiment_analysis_lince_es.md
new file mode 100644
index 000000000000..675a52afd491
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-codeswitch_spaeng_sentiment_analysis_lince_es.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Castilian, Spanish codeswitch_spaeng_sentiment_analysis_lince BertForSequenceClassification from sagorsarker
+author: John Snow Labs
+name: codeswitch_spaeng_sentiment_analysis_lince
+date: 2023-11-01
+tags: [bert, es, open_source, sequence_classification, onnx]
+task: Text Classification
+language: es
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codeswitch_spaeng_sentiment_analysis_lince` is a Castilian, Spanish model originally trained by sagorsarker.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codeswitch_spaeng_sentiment_analysis_lince_es_5.1.4_3.4_1698861796448.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codeswitch_spaeng_sentiment_analysis_lince_es_5.1.4_3.4_1698861796448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("codeswitch_spaeng_sentiment_analysis_lince","es")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("codeswitch_spaeng_sentiment_analysis_lince","es")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|codeswitch_spaeng_sentiment_analysis_lince|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|es|
+|Size:|667.3 MB|
+
+## References
+
+https://huggingface.co/sagorsarker/codeswitch-spaeng-sentiment-analysis-lince
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-comment_detection_prop_16_en.md b/docs/_posts/ahmedlone127/2023-11-01-comment_detection_prop_16_en.md
new file mode 100644
index 000000000000..7c4e1cc40761
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-comment_detection_prop_16_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English comment_detection_prop_16 BertForSequenceClassification from ultra-coder54732
+author: John Snow Labs
+name: comment_detection_prop_16
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comment_detection_prop_16` is an English model originally trained by ultra-coder54732.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comment_detection_prop_16_en_5.1.4_3.4_1698862307459.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comment_detection_prop_16_en_5.1.4_3.4_1698862307459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("comment_detection_prop_16","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("comment_detection_prop_16","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comment_detection_prop_16| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/ultra-coder54732/comment-detection-prop-16 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-community_sentiment_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-community_sentiment_bert_en.md new file mode 100644 index 000000000000..9a2f995029d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-community_sentiment_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English community_sentiment_bert BertForSequenceClassification from KernAI +author: John Snow Labs +name: community_sentiment_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `community_sentiment_bert` is an English model originally trained by KernAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/community_sentiment_bert_en_5.1.4_3.4_1698805506687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/community_sentiment_bert_en_5.1.4_3.4_1698805506687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("community_sentiment_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("community_sentiment_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|community_sentiment_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/KernAI/community-sentiment-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-convxai_diversity_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-convxai_diversity_model_en.md new file mode 100644 index 000000000000..e544f9421af1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-convxai_diversity_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English convxai_diversity_model BertForSequenceClassification from huashen218 +author: John Snow Labs +name: convxai_diversity_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `convxai_diversity_model` is an English model originally trained by huashen218. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/convxai_diversity_model_en_5.1.4_3.4_1698833043976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/convxai_diversity_model_en_5.1.4_3.4_1698833043976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("convxai_diversity_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("convxai_diversity_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|convxai_diversity_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/huashen218/convxai-diversity-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-core_clinical_diagnosis_prediction_en.md b/docs/_posts/ahmedlone127/2023-11-01-core_clinical_diagnosis_prediction_en.md new file mode 100644 index 000000000000..2aa26dbee5db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-core_clinical_diagnosis_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English core_clinical_diagnosis_prediction BertForSequenceClassification from DATEXIS +author: John Snow Labs +name: core_clinical_diagnosis_prediction +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `core_clinical_diagnosis_prediction` is an English model originally trained by DATEXIS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/core_clinical_diagnosis_prediction_en_5.1.4_3.4_1698804805747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/core_clinical_diagnosis_prediction_en_5.1.4_3.4_1698804805747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("core_clinical_diagnosis_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("core_clinical_diagnosis_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|core_clinical_diagnosis_prediction| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|432.1 MB| + +## References + +https://huggingface.co/DATEXIS/CORe-clinical-diagnosis-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_bert_base_zh.md b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_bert_base_zh.md new file mode 100644 index 000000000000..851db6925357 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_bert_base_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese cross_encoder_bert_base BertForSequenceClassification from tuhailong +author: John Snow Labs +name: cross_encoder_bert_base +date: 2023-11-01 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_bert_base` is a Chinese model originally trained by tuhailong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_bert_base_zh_5.1.4_3.4_1698807125325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_bert_base_zh_5.1.4_3.4_1698807125325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_bert_base","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_bert_base","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_bert_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/tuhailong/cross-encoder-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_italian_bert_stsb_it.md b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_italian_bert_stsb_it.md new file mode 100644 index 000000000000..a3c2ff744a78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_italian_bert_stsb_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian cross_encoder_italian_bert_stsb BertForSequenceClassification from nickprock +author: John Snow Labs +name: cross_encoder_italian_bert_stsb +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_italian_bert_stsb` is an Italian model originally trained by nickprock. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_italian_bert_stsb_it_5.1.4_3.4_1698815415692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_italian_bert_stsb_it_5.1.4_3.4_1698815415692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_italian_bert_stsb","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_italian_bert_stsb","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_italian_bert_stsb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|411.9 MB| + +## References + +https://huggingface.co/nickprock/cross-encoder-italian-bert-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_llamaindex_demo_v2_en.md b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_llamaindex_demo_v2_en.md new file mode 100644 index 000000000000..5d1680088c39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_llamaindex_demo_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_llamaindex_demo_v2 BertForSequenceClassification from bpHigh +author: John Snow Labs +name: cross_encoder_llamaindex_demo_v2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_llamaindex_demo_v2` is an English model originally trained by bpHigh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_demo_v2_en_5.1.4_3.4_1698833310651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_llamaindex_demo_v2_en_5.1.4_3.4_1698833310651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_llamaindex_demo_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_llamaindex_demo_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_llamaindex_demo_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.3 MB| + +## References + +https://huggingface.co/bpHigh/Cross-Encoder-LLamaIndex-Demo-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1_en.md b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1_en.md new file mode 100644 index 000000000000..ff892f99b578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1 BertForSequenceClassification from ivan-savchuk +author: John Snow Labs +name: cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1` is an English model originally trained by ivan-savchuk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1_en_5.1.4_3.4_1698810316912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1_en_5.1.4_3.4_1698810316912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_malay_marco_minilm_l_12_v2_tuned_mediqa_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.2 MB| + +## References + +https://huggingface.co/ivan-savchuk/cross-encoder-ms-marco-MiniLM-L-12-v2-tuned_mediqa-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_portuguese_sentence_similarity_en.md b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_portuguese_sentence_similarity_en.md new file mode 100644 index 000000000000..f1fb32e06ddf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cross_encoder_portuguese_sentence_similarity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_portuguese_sentence_similarity BertForSequenceClassification from anatel +author: John Snow Labs +name: cross_encoder_portuguese_sentence_similarity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_portuguese_sentence_similarity` is an English model originally trained by anatel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_portuguese_sentence_similarity_en_5.1.4_3.4_1698808916778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_portuguese_sentence_similarity_en_5.1.4_3.4_1698808916778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_portuguese_sentence_similarity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cross_encoder_portuguese_sentence_similarity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_portuguese_sentence_similarity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/anatel/cross-encoder-pt-sentence-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-cryptobert_en.md b/docs/_posts/ahmedlone127/2023-11-01-cryptobert_en.md new file mode 100644 index 000000000000..a978dbecd72f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-cryptobert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cryptobert BertForSequenceClassification from kk08 +author: John Snow Labs +name: cryptobert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cryptobert` is an English model originally trained by kk08. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cryptobert_en_5.1.4_3.4_1698813721975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cryptobert_en_5.1.4_3.4_1698813721975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("cryptobert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("cryptobert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cryptobert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/kk08/CryptoBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_binary_emotion_classification_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_binary_emotion_classification_base_da.md new file mode 100644 index 000000000000..c2ba3cc6e97c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_binary_emotion_classification_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_binary_emotion_classification_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_binary_emotion_classification_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_binary_emotion_classification_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_binary_emotion_classification_base_da_5.1.4_3.4_1698812521342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_binary_emotion_classification_base_da_5.1.4_3.4_1698812521342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_binary_emotion_classification_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_binary_emotion_classification_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_binary_emotion_classification_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/alexandrainst/da-binary-emotion-classification-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_discourse_coherence_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_discourse_coherence_base_da.md new file mode 100644 index 000000000000..738be12be03f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_discourse_coherence_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_discourse_coherence_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_discourse_coherence_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_discourse_coherence_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_discourse_coherence_base_da_5.1.4_3.4_1698812599768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_discourse_coherence_base_da_5.1.4_3.4_1698812599768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_discourse_coherence_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_discourse_coherence_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_discourse_coherence_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|668.4 MB| + +## References + +https://huggingface.co/alexandrainst/da-discourse-coherence-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_emotion_classification_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_emotion_classification_base_da.md new file mode 100644 index 000000000000..ec527cd4b2b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_emotion_classification_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_emotion_classification_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_emotion_classification_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_emotion_classification_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_emotion_classification_base_da_5.1.4_3.4_1698805332628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_emotion_classification_base_da_5.1.4_3.4_1698805332628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_emotion_classification_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_emotion_classification_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_emotion_classification_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/alexandrainst/da-emotion-classification-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_classification_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_classification_base_da.md new file mode 100644 index 000000000000..b316b565ea6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_classification_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_hatespeech_classification_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_hatespeech_classification_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_hatespeech_classification_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_hatespeech_classification_base_da_5.1.4_3.4_1698814449782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_hatespeech_classification_base_da_5.1.4_3.4_1698814449782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_hatespeech_classification_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_hatespeech_classification_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_hatespeech_classification_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/alexandrainst/da-hatespeech-classification-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_detection_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_detection_base_da.md new file mode 100644 index 000000000000..6083f22266f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_hatespeech_detection_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_hatespeech_detection_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_hatespeech_detection_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_hatespeech_detection_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_hatespeech_detection_base_da_5.1.4_3.4_1698804998192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_hatespeech_detection_base_da_5.1.4_3.4_1698804998192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_hatespeech_detection_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_hatespeech_detection_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_hatespeech_detection_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/alexandrainst/da-hatespeech-detection-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_sentiment_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_sentiment_base_da.md new file mode 100644 index 000000000000..3c98c5322785 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_sentiment_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_sentiment_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_sentiment_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_sentiment_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_sentiment_base_da_5.1.4_3.4_1698802910210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_sentiment_base_da_5.1.4_3.4_1698802910210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_sentiment_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_sentiment_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_sentiment_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/alexandrainst/da-sentiment-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-danish_subjectivivity_classification_base_da.md b/docs/_posts/ahmedlone127/2023-11-01-danish_subjectivivity_classification_base_da.md new file mode 100644 index 000000000000..9e0fea42a436 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-danish_subjectivivity_classification_base_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish danish_subjectivivity_classification_base BertForSequenceClassification from alexandrainst +author: John Snow Labs +name: danish_subjectivivity_classification_base +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_subjectivivity_classification_base` is a Danish model originally trained by alexandrainst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_subjectivivity_classification_base_da_5.1.4_3.4_1698806475476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_subjectivivity_classification_base_da_5.1.4_3.4_1698806475476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("danish_subjectivivity_classification_base","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("danish_subjectivivity_classification_base","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_subjectivivity_classification_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.6 MB| + +## References + +https://huggingface.co/alexandrainst/da-subjectivivity-classification-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-deepfeatecosystemcrossencoder_en.md b/docs/_posts/ahmedlone127/2023-11-01-deepfeatecosystemcrossencoder_en.md new file mode 100644 index 000000000000..8361562b9842 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-deepfeatecosystemcrossencoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English deepfeatecosystemcrossencoder BertForSequenceClassification from DecisionOptimizationSystem +author: John Snow Labs +name: deepfeatecosystemcrossencoder +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepfeatecosystemcrossencoder` is an English model originally trained by DecisionOptimizationSystem. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deepfeatecosystemcrossencoder_en_5.1.4_3.4_1698815176677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deepfeatecosystemcrossencoder_en_5.1.4_3.4_1698815176677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("deepfeatecosystemcrossencoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("deepfeatecosystemcrossencoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deepfeatecosystemcrossencoder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.2 MB| + +## References + +https://huggingface.co/DecisionOptimizationSystem/DeepFeatEcosystemCrossEncoder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-dialect_recognitionv2_en.md b/docs/_posts/ahmedlone127/2023-11-01-dialect_recognitionv2_en.md new file mode 100644 index 000000000000..1051ff9c99d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-dialect_recognitionv2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dialect_recognitionv2 BertForSequenceClassification from asalhi85 +author: John Snow Labs +name: dialect_recognitionv2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dialect_recognitionv2` is an English model originally trained by asalhi85. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dialect_recognitionv2_en_5.1.4_3.4_1698831978131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dialect_recognitionv2_en_5.1.4_3.4_1698831978131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dialect_recognitionv2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dialect_recognitionv2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dialect_recognitionv2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|610.9 MB| + +## References + +https://huggingface.co/asalhi85/Dialect_Recognitionv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-disaster_tweet_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-disaster_tweet_bert_en.md new file mode 100644 index 000000000000..a826742a0b4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-disaster_tweet_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English disaster_tweet_bert BertForSequenceClassification from garynguyen1174 +author: John Snow Labs +name: disaster_tweet_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disaster_tweet_bert` is an English model originally trained by garynguyen1174. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disaster_tweet_bert_en_5.1.4_3.4_1698846997493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disaster_tweet_bert_en_5.1.4_3.4_1698846997493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("disaster_tweet_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("disaster_tweet_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|disaster_tweet_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/garynguyen1174/disaster_tweet_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_task_multi_label_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_task_multi_label_classification_en.md new file mode 100644 index 000000000000..78a3b3f6cb70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_task_multi_label_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_task_multi_label_classification BertForSequenceClassification from LinaSaba +author: John Snow Labs +name: distilbert_base_task_multi_label_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_task_multi_label_classification` is an English model originally trained by LinaSaba.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_task_multi_label_classification_en_5.1.4_3.4_1698818773781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_task_multi_label_classification_en_5.1.4_3.4_1698818773781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_task_multi_label_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_task_multi_label_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_task_multi_label_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/LinaSaba/distilbert-base-task-multi-label-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2_en.md b/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2_en.md new file mode 100644 index 000000000000..27df4535417d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2 BertForSequenceClassification from bdotloh +author: John Snow Labs +name: distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2` is an English model originally trained by bdotloh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2_en_5.1.4_3.4_1698837848194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2_en_5.1.4_3.4_1698837848194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_go_emotion_empathetic_dialogues_context_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.1 MB| + +## References + +https://huggingface.co/bdotloh/distilbert-base-uncased-go-emotion-empathetic-dialogues-context-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-distilbert_portuguese_cased_finetuned_quantity_en.md b/docs/_posts/ahmedlone127/2023-11-01-distilbert_portuguese_cased_finetuned_quantity_en.md new file mode 100644 index 000000000000..c7df66b52cce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-distilbert_portuguese_cased_finetuned_quantity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_portuguese_cased_finetuned_quantity BertForSequenceClassification from alexia20816 +author: John Snow Labs +name: distilbert_portuguese_cased_finetuned_quantity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_portuguese_cased_finetuned_quantity` is an English model originally trained by alexia20816. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_portuguese_cased_finetuned_quantity_en_5.1.4_3.4_1698862017696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_portuguese_cased_finetuned_quantity_en_5.1.4_3.4_1698862017696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_portuguese_cased_finetuned_quantity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("distilbert_portuguese_cased_finetuned_quantity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_portuguese_cased_finetuned_quantity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|285.1 MB| + +## References + +https://huggingface.co/alexia20816/distilbert-portuguese-cased-finetuned-quantity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-dr_en.md b/docs/_posts/ahmedlone127/2023-11-01-dr_en.md new file mode 100644 index 000000000000..a67bc200c879 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-dr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dr BertForSequenceClassification from Tianlin668 +author: John Snow Labs +name: dr +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dr` is an English model originally trained by Tianlin668. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dr_en_5.1.4_3.4_1698862561303.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dr_en_5.1.4_3.4_1698862561303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dr| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.8 MB| + +## References + +https://huggingface.co/Tianlin668/DR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-dronology_bert_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-dronology_bert_uncased_en.md new file mode 100644 index 000000000000..1ab00175335a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-dronology_bert_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dronology_bert_uncased BertForSequenceClassification from SarmadBashir +author: John Snow Labs +name: dronology_bert_uncased +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dronology_bert_uncased` is an English model originally trained by SarmadBashir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dronology_bert_uncased_en_5.1.4_3.4_1698833912228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dronology_bert_uncased_en_5.1.4_3.4_1698833912228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dronology_bert_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dronology_bert_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dronology_bert_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/SarmadBashir/dronology_bert_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-dummy_model_helloworld2307_en.md b/docs/_posts/ahmedlone127/2023-11-01-dummy_model_helloworld2307_en.md new file mode 100644 index 000000000000..c3d8f2e2210a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-dummy_model_helloworld2307_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dummy_model_helloworld2307 BertForSequenceClassification from HelloWorld2307 +author: John Snow Labs +name: dummy_model_helloworld2307 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_helloworld2307` is an English model originally trained by HelloWorld2307. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_helloworld2307_en_5.1.4_3.4_1698861987207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_helloworld2307_en_5.1.4_3.4_1698861987207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dummy_model_helloworld2307","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dummy_model_helloworld2307","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy_model_helloworld2307| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/HelloWorld2307/dummy-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-dziribert_sentiment_ar.md b/docs/_posts/ahmedlone127/2023-11-01-dziribert_sentiment_ar.md new file mode 100644 index 000000000000..079907ff06d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-dziribert_sentiment_ar.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Arabic dziribert_sentiment BertForSequenceClassification from alger-ia +author: John Snow Labs +name: dziribert_sentiment +date: 2023-11-01 +tags: [bert, ar, open_source, sequence_classification, onnx] +task: Text Classification +language: ar +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dziribert_sentiment` is an Arabic model originally trained by alger-ia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dziribert_sentiment_ar_5.1.4_3.4_1698804438836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dziribert_sentiment_ar_5.1.4_3.4_1698804438836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("dziribert_sentiment","ar")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("dziribert_sentiment","ar") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dziribert_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ar| +|Size:|464.8 MB| + +## References + +https://huggingface.co/alger-ia/dziribert_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_base_discriminator_offenseval2019_downsample_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_base_discriminator_offenseval2019_downsample_en.md new file mode 100644 index 000000000000..93f4b18364ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_base_discriminator_offenseval2019_downsample_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Base Cased model (from mohsenfayyaz) +author: John Snow Labs +name: electra_classifier_base_discriminator_offenseval2019_downsample +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-discriminator-offenseval2019-downsample` is an English model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + +`NOT`, `OFF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_base_discriminator_offenseval2019_downsample_en_5.1.4_3.4_1698805095428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_base_discriminator_offenseval2019_downsample_en_5.1.4_3.4_1698805095428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_base_discriminator_offenseval2019_downsample","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_base_discriminator_offenseval2019_downsample","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_base_discriminator_offenseval2019_downsample| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/electra-base-discriminator-offenseval2019-downsample \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_bias_ko.md new file mode 100644 index 000000000000..458c2b8b77f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_bias_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Base Cased model (from beomi) +author: John Snow Labs +name: electra_classifier_beep_kc_base_bias +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beep-KcELECTRA-base-bias` is a Korean model originally trained by `beomi`. 
+ +## Predicted Entities + +`none`, `others`, `gender` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_kc_base_bias_ko_5.1.4_3.4_1698812331856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_kc_base_bias_ko_5.1.4_3.4_1698812331856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_beep_kc_base_bias","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_beep_kc_base_bias","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_beep_kc_base_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/beomi/beep-KcELECTRA-base-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_hate_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_hate_ko.md new file mode 100644 index 000000000000..7bcbeccd25ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_kc_base_hate_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Base Cased model (from beomi) +author: John Snow Labs +name: electra_classifier_beep_kc_base_hate +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beep-KcELECTRA-base-hate` is a Korean model originally trained by `beomi`. 
+ +## Predicted Entities + +`hate`, `none`, `offensive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_kc_base_hate_ko_5.1.4_3.4_1698800228890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_kc_base_hate_ko_5.1.4_3.4_1698800228890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_beep_kc_base_hate","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_beep_kc_base_hate","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.hate.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_beep_kc_base_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/beomi/beep-KcELECTRA-base-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_bias_ko.md new file mode 100644 index 000000000000..7f4a08eb9dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_bias_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_beep_korean_base_v3_discriminator_bias BertForSequenceClassification from beomi +author: John Snow Labs +name: electra_classifier_beep_korean_base_v3_discriminator_bias +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_beep_korean_base_v3_discriminator_bias` is a Korean model originally trained by beomi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_korean_base_v3_discriminator_bias_ko_5.1.4_3.4_1698798370071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_korean_base_v3_discriminator_bias_ko_5.1.4_3.4_1698798370071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_beep_korean_base_v3_discriminator_bias","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_beep_korean_base_v3_discriminator_bias","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_beep_korean_base_v3_discriminator_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/beomi/beep-koelectra-base-v3-discriminator-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_hate_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_hate_ko.md new file mode 100644 index 000000000000..1e6f650db321 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_beep_korean_base_v3_discriminator_hate_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_beep_korean_base_v3_discriminator_hate BertForSequenceClassification from beomi +author: John Snow Labs +name: electra_classifier_beep_korean_base_v3_discriminator_hate +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_beep_korean_base_v3_discriminator_hate` is a Korean model originally trained by beomi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_korean_base_v3_discriminator_hate_ko_5.1.4_3.4_1698800414337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_beep_korean_base_v3_discriminator_hate_ko_5.1.4_3.4_1698800414337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_beep_korean_base_v3_discriminator_hate","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_beep_korean_base_v3_discriminator_hate","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_beep_korean_base_v3_discriminator_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/beomi/beep-koelectra-base-v3-discriminator-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_divehi_small_news_classification_dv.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_divehi_small_news_classification_dv.md new file mode 100644 index 000000000000..adc6ed48125d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_divehi_small_news_classification_dv.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dhivehi, Divehi, Maldivian electra_classifier_divehi_small_news_classification BertForSequenceClassification from ashraq +author: John Snow Labs +name: electra_classifier_divehi_small_news_classification +date: 2023-11-01 +tags: [bert, dv, open_source, sequence_classification, onnx] +task: Text Classification +language: dv +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_divehi_small_news_classification` is a Dhivehi, Divehi, Maldivian model originally trained by ashraq.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_divehi_small_news_classification_dv_5.1.4_3.4_1698800542591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_divehi_small_news_classification_dv_5.1.4_3.4_1698800542591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_divehi_small_news_classification","dv")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_divehi_small_news_classification","dv") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_divehi_small_news_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|dv| +|Size:|50.9 MB| + +## References + +https://huggingface.co/ashraq/dv-electra-small-news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_electricidad_base_finetuned_go_emotions_es.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_electricidad_base_finetuned_go_emotions_es.md new file mode 100644 index 000000000000..b7d8b69e614b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_electricidad_base_finetuned_go_emotions_es.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Spanish ElectraForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: electra_classifier_electricidad_base_finetuned_go_emotions +date: 2023-11-01 +tags: [es, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electricidad-base-finetuned-go_emotions-es` is a Spanish model originally trained by `mrm8488`.
+ +## Predicted Entities + +`asco`, `deseo`, `remordimiento`, `aprobación`, `gratitud`, `enfado`, `neutral`, `alivio`, `realización`, `molestia`, `dolor`, `sorpresa`, `miedo`, `orgullo`, `decepción`, `admiración`, `amor`, `diversión`, `alegría`, `desaprobación`, `cuidando`, `curiosidad`, `vergüenza`, `excitación`, `optimismo`, `nerviosismo`, `confusión`, `tristeza` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_electricidad_base_finetuned_go_emotions_es_5.1.4_3.4_1698805393920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_electricidad_base_finetuned_go_emotions_es_5.1.4_3.4_1698805393920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_electricidad_base_finetuned_go_emotions","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_electricidad_base_finetuned_go_emotions","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.electra.go_emotions.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_electricidad_base_finetuned_go_emotions| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|410.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/electricidad-base-finetuned-go_emotions-es +- https://paperswithcode.com/sota?task=Text+Classification&dataset=go_emotions-es-mt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kc_base_bad_sentence_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kc_base_bad_sentence_ko.md new file mode 100644 index 000000000000..bdb7ee1a35ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kc_base_bad_sentence_ko.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Base Cased model (from JminJ) +author: John Snow Labs +name: electra_classifier_kc_base_bad_sentence +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcElectra_base_Bad_Sentence_Classifier` is a Korean model originally trained by `JminJ`. 
+ +## Predicted Entities + +`bad_sen`, `ok_sen` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_kc_base_bad_sentence_ko_5.1.4_3.4_1698805681680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_kc_base_bad_sentence_ko_5.1.4_3.4_1698805681680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_kc_base_bad_sentence","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_kc_base_bad_sentence","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.base.kc.by_jminj").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_kc_base_bad_sentence| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JminJ/kcElectra_base_Bad_Sentence_Classifier +- https://github.com/smilegate-ai/korean_unsmile_dataset +- https://github.com/kocohub/korean-hate-speech +- https://github.com/Beomi/KcELECTRA +- https://github.com/monologg/KoELECTRA +- https://github.com/JminJ/Bad_text_classifier +- https://github.com/Beomi/KcELECTRA +- https://github.com/monologg/KoELECTRA +- https://github.com/smilegate-ai/korean_unsmile_dataset +- https://github.com/kocohub/korean-hate-speech +- https://arxiv.org/abs/2003.10555 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bad_sentence_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bad_sentence_ko.md new file mode 100644 index 000000000000..014b4ed75bc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bad_sentence_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_bad_sentence BertForSequenceClassification from JminJ +author: John Snow Labs +name: electra_classifier_korean_base_bad_sentence +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `electra_classifier_korean_base_bad_sentence` is a Korean model originally trained by JminJ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_bad_sentence_ko_5.1.4_3.4_1698800734287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_bad_sentence_ko_5.1.4_3.4_1698800734287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_bad_sentence","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_bad_sentence","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_bad_sentence| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/JminJ/koElectra_base_Bad_Sentence_Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bias_ko.md new file mode 100644 index 000000000000..6cd7360edfb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_bias_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_bias BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_bias +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_bias` is a Korean model originally trained by monologg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_bias_ko_5.1.4_3.4_1698812532553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_bias_ko_5.1.4_3.4_1698812532553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_bias","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_bias","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_nsmc_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_nsmc_ko.md new file mode 100644 index 000000000000..1e0373387c59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_nsmc_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_finetuned_nsmc BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_finetuned_nsmc +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_finetuned_nsmc` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_finetuned_nsmc_ko_5.1.4_3.4_1698798593211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_finetuned_nsmc_ko_5.1.4_3.4_1698798593211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_finetuned_nsmc","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_finetuned_nsmc","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_finetuned_nsmc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|414.4 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-finetuned-nsmc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_sentiment_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_sentiment_ko.md new file mode 100644 index 000000000000..27a2acd2e0fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_finetuned_sentiment_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_finetuned_sentiment BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_finetuned_sentiment +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_finetuned_sentiment` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_finetuned_sentiment_ko_5.1.4_3.4_1698812743439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_finetuned_sentiment_ko_5.1.4_3.4_1698812743439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_finetuned_sentiment","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_finetuned_sentiment","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_finetuned_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|414.4 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_gender_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_gender_bias_ko.md new file mode 100644 index 000000000000..c959294848d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_gender_bias_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_gender_bias BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_gender_bias +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_gender_bias` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_gender_bias_ko_5.1.4_3.4_1698812957562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_gender_bias_ko_5.1.4_3.4_1698812957562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_gender_bias","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_gender_bias","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_gender_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-gender-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_bias_ko.md new file mode 100644 index 000000000000..9486601bb294 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_bias_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_v3_bias BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_v3_bias +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_v3_bias` is a Korean model originally trained by monologg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_bias_ko_5.1.4_3.4_1698800922156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_bias_ko_5.1.4_3.4_1698800922156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_bias","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_bias","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_v3_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-v3-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_gender_bias_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_gender_bias_ko.md new file mode 100644 index 000000000000..e8f4363457b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_gender_bias_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_v3_gender_bias BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_v3_gender_bias +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_v3_gender_bias` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_gender_bias_ko_5.1.4_3.4_1698798764702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_gender_bias_ko_5.1.4_3.4_1698798764702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_gender_bias","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_gender_bias","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_v3_gender_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-v3-gender-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_generalized_sentiment_analysis_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_generalized_sentiment_analysis_ko.md new file mode 100644 index 000000000000..d707a2f7367b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_generalized_sentiment_analysis_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_v3_generalized_sentiment_analysis BertForSequenceClassification from jaehyeong +author: John Snow Labs +name: electra_classifier_korean_base_v3_generalized_sentiment_analysis +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_v3_generalized_sentiment_analysis` is a Korean model originally trained by jaehyeong.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_generalized_sentiment_analysis_ko_5.1.4_3.4_1698805885291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_generalized_sentiment_analysis_ko_5.1.4_3.4_1698805885291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_generalized_sentiment_analysis","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_generalized_sentiment_analysis","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_v3_generalized_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/jaehyeong/koelectra-base-v3-generalized-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_hate_speech_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_hate_speech_ko.md new file mode 100644 index 000000000000..2e08b0dcc700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_base_v3_hate_speech_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_base_v3_hate_speech BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_base_v3_hate_speech +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_base_v3_hate_speech` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_hate_speech_ko_5.1.4_3.4_1698806072350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_base_v3_hate_speech_ko_5.1.4_3.4_1698806072350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_hate_speech","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_base_v3_hate_speech","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_base_v3_hate_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/monologg/koelectra-base-v3-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_ko.md new file mode 100644 index 000000000000..2f8d5beedbb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Cased model (from beomi) +author: John Snow Labs +name: electra_classifier_korean_hatespeech +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `korean-hatespeech-classifier` is a Korean model originally trained by `beomi`. 
+ +## Predicted Entities + +`Offensive`, `None`, `Hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_hatespeech_ko_5.1.4_3.4_1698808225174.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_hatespeech_ko_5.1.4_3.4_1698808225174.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_korean_hatespeech","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_korean_hatespeech","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.hate.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_hatespeech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/beomi/korean-hatespeech-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_multilabel_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_multilabel_ko.md new file mode 100644 index 000000000000..74df8261d6dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_hatespeech_multilabel_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Cased model (from beomi) +author: John Snow Labs +name: electra_classifier_korean_hatespeech_multilabel +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `korean-hatespeech-multilabel` is a Korean model originally trained by `beomi`. 
+ +## Predicted Entities + +`bias_gender`, `offensive`, `bias_others`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_hatespeech_multilabel_ko_5.1.4_3.4_1698808542460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_hatespeech_multilabel_ko_5.1.4_3.4_1698808542460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_korean_hatespeech_multilabel","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_korean_hatespeech_multilabel","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.hate.by_beomi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_hatespeech_multilabel| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/beomi/korean-hatespeech-multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_intent_cls_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_intent_cls_ko.md new file mode 100644 index 000000000000..c4c4e32d1ca8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_intent_cls_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_small_finetuned_intent_cls BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_small_finetuned_intent_cls +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_small_finetuned_intent_cls` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_intent_cls_ko_5.1.4_3.4_1698813246340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_intent_cls_ko_5.1.4_3.4_1698813246340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_intent_cls","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_intent_cls","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_small_finetuned_intent_cls| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|54.1 MB| + +## References + +https://huggingface.co/monologg/koelectra-small-finetuned-intent-cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_nsmc_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_nsmc_ko.md new file mode 100644 index 000000000000..86464aabbe20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_nsmc_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_small_finetuned_nsmc BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_small_finetuned_nsmc +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_small_finetuned_nsmc` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_nsmc_ko_5.1.4_3.4_1698813342493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_nsmc_ko_5.1.4_3.4_1698813342493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_nsmc","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_nsmc","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_small_finetuned_nsmc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|51.9 MB| + +## References + +https://huggingface.co/monologg/koelectra-small-finetuned-nsmc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_sentiment_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_sentiment_ko.md new file mode 100644 index 000000000000..ac7e196ffd50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_korean_small_finetuned_sentiment_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_korean_small_finetuned_sentiment BertForSequenceClassification from monologg +author: John Snow Labs +name: electra_classifier_korean_small_finetuned_sentiment +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_korean_small_finetuned_sentiment` is a Korean model originally trained by monologg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_sentiment_ko_5.1.4_3.4_1698807948068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_korean_small_finetuned_sentiment_ko_5.1.4_3.4_1698807948068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_sentiment","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_korean_small_finetuned_sentiment","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_korean_small_finetuned_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|51.9 MB| + +## References + +https://huggingface.co/monologg/koelectra-small-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kote_for_easygoing_people_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kote_for_easygoing_people_ko.md new file mode 100644 index 000000000000..ec8990216454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_kote_for_easygoing_people_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Cased model (from searle-j) +author: John Snow Labs +name: electra_classifier_kote_for_easygoing_people +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kote_for_easygoing_people` is a Korean model originally trained by `searle-j`. 
+ +## Predicted Entities + +`깨달음`, `놀람`, `기쁨`, `부담/안_내킴`, `우쭐댐/무시함`, `공포/무서움`, `흐뭇함(귀여움/예쁨)`, `환영/호의`, `부끄러움`, `화남/분노`, `패배/자기혐오`, `귀찮음`, `짜증`, `불쌍함/연민`, `증오/혐오`, `기대감`, `안심/신뢰`, `행복`, `재미없음`, `절망`, `비장함`, `어이없음`, `지긋지긋`, `불평/불만`, `고마움`, `안타까움/실망`, `불안/걱정`, `즐거움/신남`, `한심함`, `뿌듯함`, `슬픔`, `죄책감`, `경악`, `없음`, `역겨움/징그러움`, `힘듦/지침`, `신기함/관심`, `편안/쾌적`, `당황/난처`, `의심/불신`, `감동/감탄`, `아껴주는`, `존경`, `서러움` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_kote_for_easygoing_people_ko_5.1.4_3.4_1698806378075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_kote_for_easygoing_people_ko_5.1.4_3.4_1698806378075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_kote_for_easygoing_people","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_kote_for_easygoing_people","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_kote_for_easygoing_people| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/searle-j/kote_for_easygoing_people \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_hateval_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_hateval_en.md new file mode 100644 index 000000000000..9737d7c9d5cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_hateval_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Large Cased model (from ChrisZeng) +author: John Snow Labs +name: electra_classifier_large_discriminator_nli_efl_hateval +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-large-discriminator-nli-efl-hateval` is an English model originally trained by `ChrisZeng`.
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_nli_efl_hateval_en_5.1.4_3.4_1698809142656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_nli_efl_hateval_en_5.1.4_3.4_1698809142656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_nli_efl_hateval","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_nli_efl_hateval","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra.hate.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_large_discriminator_nli_efl_hateval| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ChrisZeng/electra-large-discriminator-nli-efl-hateval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_tweeteval_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_tweeteval_en.md new file mode 100644 index 000000000000..2f8fe632460e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_nli_efl_tweeteval_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Large Cased model (from ChrisZeng) +author: John Snow Labs +name: electra_classifier_large_discriminator_nli_efl_tweeteval +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-large-discriminator-nli-efl-tweeteval` is an English model originally trained by `ChrisZeng`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_nli_efl_tweeteval_en_5.1.4_3.4_1698799302207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_nli_efl_tweeteval_en_5.1.4_3.4_1698799302207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_nli_efl_tweeteval","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_nli_efl_tweeteval","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra.tweet.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_large_discriminator_nli_efl_tweeteval| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ChrisZeng/electra-large-discriminator-nli-efl-tweeteval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli_en.md new file mode 100644 index 000000000000..5fadda3147c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Large Cased model (from ynie) +author: John Snow Labs +name: electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-large-discriminator-snli_mnli_fever_anli_R1_R2_R3-nli` is an English model originally trained by `ynie`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli_en_5.1.4_3.4_1698801658839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli_en_5.1.4_3.4_1698801658839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra.fever.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_large_discriminator_snli_mnli_fever_anli_r1_r2_r3_nli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|801.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ynie/electra-large-discriminator-snli_mnli_fever_anli_R1_R2_R3-nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mfma_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mfma_en.md new file mode 100644 index 000000000000..12840001d60c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mfma_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Cased model (from henry931007) +author: John Snow Labs +name: electra_classifier_mfma +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mfma` is an English model originally trained by `henry931007`. 
+ +## Predicted Entities + +`entailment`, `not_entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_mfma_en_5.1.4_3.4_1698801939933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_mfma_en_5.1.4_3.4_1698801939933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_mfma","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_mfma","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_mfma| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/henry931007/mfma \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mindlogic_korean_ai_citizen_base_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mindlogic_korean_ai_citizen_base_ko.md new file mode 100644 index 000000000000..4057f5a9d785 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mindlogic_korean_ai_citizen_base_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_mindlogic_korean_ai_citizen_base BertForSequenceClassification from mindlogic +author: John Snow Labs +name: electra_classifier_mindlogic_korean_ai_citizen_base +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_mindlogic_korean_ai_citizen_base` is a Korean model originally trained by mindlogic. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_mindlogic_korean_ai_citizen_base_ko_5.1.4_3.4_1698813533983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_mindlogic_korean_ai_citizen_base_ko_5.1.4_3.4_1698813533983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_mindlogic_korean_ai_citizen_base","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_mindlogic_korean_ai_citizen_base","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_mindlogic_korean_ai_citizen_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|466.8 MB| + +## References + +https://huggingface.co/mindlogic/mindlogic-electra-ko-ai-citizen-classifier-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification_es.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification_es.md new file mode 100644 index 000000000000..e911c0611e09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification_es.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Spanish ElectraForSequenceClassification Small Cased model (from mrm8488) +author: John Snow Labs +name: electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification +date: 2023-11-01 +tags: [es, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electricidad-small-finetuned-amazon-review-classification` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + +`⭐⭐⭐`, `⭐⭐`, `⭐⭐⭐⭐`, `⭐`, `⭐⭐⭐⭐⭐` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification_es_5.1.4_3.4_1698809298687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification_es_5.1.4_3.4_1698809298687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_mrm8488_electricidad_small_finetuned_amazon_review_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|51.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/electricidad-small-finetuned-amazon-review-classification +- https://paperswithcode.com/sota?task=Text+Classification&dataset=amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_nsmc_korean_test_model_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_nsmc_korean_test_model_ko.md new file mode 100644 index 000000000000..c738a667e97e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_nsmc_korean_test_model_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean electra_classifier_nsmc_korean_test_model BertForSequenceClassification from JaeCheol +author: John Snow Labs +name: electra_classifier_nsmc_korean_test_model +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_classifier_nsmc_korean_test_model` is a Korean model originally trained by JaeCheol. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_nsmc_korean_test_model_ko_5.1.4_3.4_1698799515871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_nsmc_korean_test_model_ko_5.1.4_3.4_1698799515871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_nsmc_korean_test_model","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("electra_classifier_nsmc_korean_test_model","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_nsmc_korean_test_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|421.6 MB| + +## References + +https://huggingface.co/JaeCheol/nsmc_koelectra_test_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_dialog_base_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_dialog_base_turkish_tr.md new file mode 100644 index 000000000000..289a19960d27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_dialog_base_turkish_tr.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Turkish ElectraForSequenceClassification Base Cased model (from Izzet) +author: John Snow Labs +name: electra_classifier_qd_dialog_base_turkish +date: 2023-11-01 +tags: [tr, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_dialog_electra-base-turkish` is a Turkish model originally trained by `Izzet`. 
+ +## Predicted Entities + +`NQ`, `Q` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_dialog_base_turkish_tr_5.1.4_3.4_1698802212262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_dialog_base_turkish_tr_5.1.4_3.4_1698802212262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_dialog_base_turkish","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_dialog_base_turkish","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.electra.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_qd_dialog_base_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Izzet/qd_dialog_electra-base-turkish +- https://github.com/izzetkalic/botcuk-dataset-analyze/tree/main/datasets/qd-dialog \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_quora_base_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_quora_base_turkish_tr.md new file mode 100644 index 000000000000..367b997e7f55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_quora_base_turkish_tr.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Turkish ElectraForSequenceClassification Base Cased model (from Izzet) +author: John Snow Labs +name: electra_classifier_qd_quora_base_turkish +date: 2023-11-01 +tags: [tr, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_quora_electra-base-turkish` is a Turkish model originally trained by `Izzet`. 
+ +## Predicted Entities + +`NQ`, `Q` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_quora_base_turkish_tr_5.1.4_3.4_1698806617458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_quora_base_turkish_tr_5.1.4_3.4_1698806617458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_quora_base_turkish","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_quora_base_turkish","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.electra.base.by_izzet").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_qd_quora_base_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Izzet/qd_quora_electra-base-turkish +- https://github.com/izzetkalic/botcuk-dataset-analyze/tree/main/datasets/qd-quora \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_tweet_base_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_tweet_base_turkish_tr.md new file mode 100644 index 000000000000..8a947e401a27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_qd_tweet_base_turkish_tr.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Turkish ElectraForSequenceClassification Base Cased model (from Izzet) +author: John Snow Labs +name: electra_classifier_qd_tweet_base_turkish +date: 2023-11-01 +tags: [tr, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_tweet_electra-base-turkish` is a Turkish model originally trained by `Izzet`. 
+ +## Predicted Entities + +`OQ`, `NQ`, `FK`, `RQ` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_tweet_base_turkish_tr_5.1.4_3.4_1698802515925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_qd_tweet_base_turkish_tr_5.1.4_3.4_1698802515925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_tweet_base_turkish","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_qd_tweet_base_turkish","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.electra.tweet.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_qd_tweet_base_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Izzet/qd_tweet_electra-base-turkish +- https://github.com/izzetkalic/botcuk-dataset-analyze/tree/main/datasets/qd-tweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_small_finetuned_imdb_en.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_small_finetuned_imdb_en.md new file mode 100644 index 000000000000..889ce2fb32bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_small_finetuned_imdb_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English ElectraForSequenceClassification Small Cased model (from monologg) +author: John Snow Labs +name: electra_classifier_small_finetuned_imdb +date: 2023-11-01 +tags: [en, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-finetuned-imdb` is an English model originally trained by `monologg`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_small_finetuned_imdb_en_5.1.4_3.4_1698799641985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_small_finetuned_imdb_en_5.1.4_3.4_1698799641985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_small_finetuned_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_small_finetuned_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.electra.imdb.small_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_small_finetuned_imdb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|51.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/monologg/electra-small-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_tunib_base_bad_sentence_ko.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_tunib_base_bad_sentence_ko.md new file mode 100644 index 000000000000..c8f039f195f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_tunib_base_bad_sentence_ko.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Korean ElectraForSequenceClassification Base Cased model (from JminJ) +author: John Snow Labs +name: electra_classifier_tunib_base_bad_sentence +date: 2023-11-01 +tags: [ko, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tunibElectra_base_Bad_Sentence_Classifier` is a Korean model originally trained by `JminJ`. 
+ +## Predicted Entities + +`ok_sen`, `bad_sen` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_tunib_base_bad_sentence_ko_5.1.4_3.4_1698806932999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_tunib_base_bad_sentence_ko_5.1.4_3.4_1698806932999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_tunib_base_bad_sentence","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_tunib_base_bad_sentence","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.electra.tunib.base.by_jminj").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_tunib_base_bad_sentence| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|414.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JminJ/tunibElectra_base_Bad_Sentence_Classifier +- https://github.com/smilegate-ai/korean_unsmile_dataset +- https://github.com/kocohub/korean-hate-speech +- https://github.com/Beomi/KcELECTRA +- https://github.com/monologg/KoELECTRA +- https://github.com/JminJ/Bad_text_classifier +- https://github.com/Beomi/KcELECTRA +- https://github.com/monologg/KoELECTRA +- https://github.com/smilegate-ai/korean_unsmile_dataset +- https://github.com/kocohub/korean-hate-speech +- https://arxiv.org/abs/2003.10555 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_turkish_sentiment_analysis_tr.md b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_turkish_sentiment_analysis_tr.md new file mode 100644 index 000000000000..7fbef2f26cef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-electra_classifier_turkish_sentiment_analysis_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish ElectraForSequenceClassification Cased model (from kuzgunlar) +author: John Snow Labs +name: electra_classifier_turkish_sentiment_analysis +date: 2023-11-01 +tags: [tr, open_source, electra, sequence_classification, classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained ElectraForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using 
Spark NLP. `electra-turkish-sentiment-analysis` is a Turkish model originally trained by `kuzgunlar`. + +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_classifier_turkish_sentiment_analysis_tr_5.1.4_3.4_1698807200721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_classifier_turkish_sentiment_analysis_tr_5.1.4_3.4_1698807200721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_turkish_sentiment_analysis","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = BertForSequenceClassification.pretrained("electra_classifier_turkish_sentiment_analysis","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.electra.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_classifier_turkish_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kuzgunlar/electra-turkish-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-english_abusive_muril_en.md b/docs/_posts/ahmedlone127/2023-11-01-english_abusive_muril_en.md new file mode 100644 index 000000000000..a39da4bcb28c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-english_abusive_muril_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English english_abusive_muril BertForSequenceClassification from Hate-speech-CNERG +author: John Snow Labs +name: english_abusive_muril +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_abusive_muril` is an English model originally trained by Hate-speech-CNERG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_abusive_muril_en_5.1.4_3.4_1698813113328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_abusive_muril_en_5.1.4_3.4_1698813113328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("english_abusive_muril","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("english_abusive_muril","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_abusive_muril| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|892.6 MB| + +## References + +https://huggingface.co/Hate-speech-CNERG/english-abusive-MuRIL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-english_grammar_checker_en.md b/docs/_posts/ahmedlone127/2023-11-01-english_grammar_checker_en.md new file mode 100644 index 000000000000..279f607f4a61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-english_grammar_checker_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English english_grammar_checker BertForSequenceClassification from abdulmatinomotoso +author: John Snow Labs +name: english_grammar_checker +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_grammar_checker` is an English model originally trained by abdulmatinomotoso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_grammar_checker_en_5.1.4_3.4_1698838765796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_grammar_checker_en_5.1.4_3.4_1698838765796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("english_grammar_checker","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("english_grammar_checker","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_grammar_checker| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/abdulmatinomotoso/English_Grammar_Checker \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-english_sarcasm_detector_en.md b/docs/_posts/ahmedlone127/2023-11-01-english_sarcasm_detector_en.md new file mode 100644 index 000000000000..bdb7253b28be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-english_sarcasm_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English english_sarcasm_detector BertForSequenceClassification from helinivan +author: John Snow Labs +name: english_sarcasm_detector +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_sarcasm_detector` is an English model originally trained by helinivan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_sarcasm_detector_en_5.1.4_3.4_1698815777502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_sarcasm_detector_en_5.1.4_3.4_1698815777502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("english_sarcasm_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("english_sarcasm_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_sarcasm_detector| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/helinivan/english-sarcasm-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-ernie_finetuned_qqp_en.md b/docs/_posts/ahmedlone127/2023-11-01-ernie_finetuned_qqp_en.md new file mode 100644 index 000000000000..770ef47f58d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-ernie_finetuned_qqp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ernie_finetuned_qqp BertForSequenceClassification from rajiv003 +author: John Snow Labs +name: ernie_finetuned_qqp +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ernie_finetuned_qqp` is an English model originally trained by rajiv003. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ernie_finetuned_qqp_en_5.1.4_3.4_1698810030280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ernie_finetuned_qqp_en_5.1.4_3.4_1698810030280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("ernie_finetuned_qqp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("ernie_finetuned_qqp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ernie_finetuned_qqp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.7 MB| + +## References + +https://huggingface.co/rajiv003/ernie-finetuned-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-factcc_en.md b/docs/_posts/ahmedlone127/2023-11-01-factcc_en.md new file mode 100644 index 000000000000..9862497d5e35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-factcc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English factcc BertForSequenceClassification from manueldeprada +author: John Snow Labs +name: factcc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `factcc` is an English model originally trained by manueldeprada. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/factcc_en_5.1.4_3.4_1698834824541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/factcc_en_5.1.4_3.4_1698834824541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("factcc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("factcc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|factcc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/manueldeprada/FactCC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-feather_berts_0_connectivity_en.md b/docs/_posts/ahmedlone127/2023-11-01-feather_berts_0_connectivity_en.md new file mode 100644 index 000000000000..2e7cf1b0d837 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-feather_berts_0_connectivity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_0_connectivity BertForSequenceClassification from connectivity +author: John Snow Labs +name: feather_berts_0_connectivity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_0_connectivity` is an English model originally trained by connectivity. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_0_connectivity_en_5.1.4_3.4_1698821929730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_0_connectivity_en_5.1.4_3.4_1698821929730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_0_connectivity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_0_connectivity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_0_connectivity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/connectivity/feather_berts_0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-feather_berts_46_jeevesh8_en.md b/docs/_posts/ahmedlone127/2023-11-01-feather_berts_46_jeevesh8_en.md new file mode 100644 index 000000000000..c2d347cbd2d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-feather_berts_46_jeevesh8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English feather_berts_46_jeevesh8 BertForSequenceClassification from Jeevesh8 +author: John Snow Labs +name: feather_berts_46_jeevesh8 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feather_berts_46_jeevesh8` is an English model originally trained by Jeevesh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/feather_berts_46_jeevesh8_en_5.1.4_3.4_1698834486143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/feather_berts_46_jeevesh8_en_5.1.4_3.4_1698834486143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_46_jeevesh8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("feather_berts_46_jeevesh8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|feather_berts_46_jeevesh8| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Jeevesh8/feather_berts_46 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-final_project_finetuned_bert_base_multilingual_cased_english_xx.md b/docs/_posts/ahmedlone127/2023-11-01-final_project_finetuned_bert_base_multilingual_cased_english_xx.md new file mode 100644 index 000000000000..99e7149c8f2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-final_project_finetuned_bert_base_multilingual_cased_english_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual final_project_finetuned_bert_base_multilingual_cased_english BertForSequenceClassification from Worgu +author: John Snow Labs +name: final_project_finetuned_bert_base_multilingual_cased_english +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `final_project_finetuned_bert_base_multilingual_cased_english` is a Multilingual model originally trained by Worgu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/final_project_finetuned_bert_base_multilingual_cased_english_xx_5.1.4_3.4_1698824699421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/final_project_finetuned_bert_base_multilingual_cased_english_xx_5.1.4_3.4_1698824699421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("final_project_finetuned_bert_base_multilingual_cased_english","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("final_project_finetuned_bert_base_multilingual_cased_english","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|final_project_finetuned_bert_base_multilingual_cased_english| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/Worgu/Final_Project_finetuned_bert-base-multilingual-cased_english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_chinese_base_zh.md b/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_chinese_base_zh.md new file mode 100644 index 000000000000..d082455002af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_chinese_base_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese finance_sentiment_chinese_base BertForSequenceClassification from bardsai +author: John Snow Labs +name: finance_sentiment_chinese_base +date: 2023-11-01 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_sentiment_chinese_base` is a Chinese model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_sentiment_chinese_base_zh_5.1.4_3.4_1698809545296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_sentiment_chinese_base_zh_5.1.4_3.4_1698809545296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finance_sentiment_chinese_base","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finance_sentiment_chinese_base","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_sentiment_chinese_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/bardsai/finance-sentiment-zh-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_german_base_de.md b/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_german_base_de.md new file mode 100644 index 000000000000..cd2856420678 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finance_sentiment_german_base_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German finance_sentiment_german_base BertForSequenceClassification from bardsai +author: John Snow Labs +name: finance_sentiment_german_base +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_sentiment_german_base` is a German model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_sentiment_german_base_de_5.1.4_3.4_1698830420101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_sentiment_german_base_de_5.1.4_3.4_1698830420101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finance_sentiment_german_base","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finance_sentiment_german_base","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_sentiment_german_base| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/bardsai/finance-sentiment-de-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finbert_esg_9_categories_en.md b/docs/_posts/ahmedlone127/2023-11-01-finbert_esg_9_categories_en.md new file mode 100644 index 000000000000..a6a45037b891 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finbert_esg_9_categories_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finbert_esg_9_categories BertForSequenceClassification from yiyanghkust +author: John Snow Labs +name: finbert_esg_9_categories +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_esg_9_categories` is an English model originally trained by yiyanghkust. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_esg_9_categories_en_5.1.4_3.4_1698814527230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_esg_9_categories_en_5.1.4_3.4_1698814527230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finbert_esg_9_categories","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbert_esg_9_categories","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_esg_9_categories| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/yiyanghkust/finbert-esg-9-categories \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finbert_fomc_en.md b/docs/_posts/ahmedlone127/2023-11-01-finbert_fomc_en.md new file mode 100644 index 000000000000..2b1d940305c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finbert_fomc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finbert_fomc BertForSequenceClassification from ZiweiChen +author: John Snow Labs +name: finbert_fomc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_fomc` is an English model originally trained by ZiweiChen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_fomc_en_5.1.4_3.4_1698862556393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_fomc_en_5.1.4_3.4_1698862556393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finbert_fomc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbert_fomc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_fomc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| + +## References + +https://huggingface.co/ZiweiChen/FinBERT-FOMC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finbert_tone_finetuned_fintwitter_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-finbert_tone_finetuned_fintwitter_classification_en.md new file mode 100644 index 000000000000..ef7f5ae55556 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finbert_tone_finetuned_fintwitter_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finbert_tone_finetuned_fintwitter_classification BertForSequenceClassification from nickmuchi +author: John Snow Labs +name: finbert_tone_finetuned_fintwitter_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_tone_finetuned_fintwitter_classification` is an English model originally trained by nickmuchi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_tone_finetuned_fintwitter_classification_en_5.1.4_3.4_1698821416440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_tone_finetuned_fintwitter_classification_en_5.1.4_3.4_1698821416440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finbert_tone_finetuned_fintwitter_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbert_tone_finetuned_fintwitter_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbert_tone_finetuned_fintwitter_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.6 MB| + +## References + +https://huggingface.co/nickmuchi/finbert-tone-finetuned-fintwitter-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finbertptbr_pt.md b/docs/_posts/ahmedlone127/2023-11-01-finbertptbr_pt.md new file mode 100644 index 000000000000..192680ca4eac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finbertptbr_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese finbertptbr BertForSequenceClassification from turing-usp +author: John Snow Labs +name: finbertptbr +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbertptbr` is a Portuguese model originally trained by turing-usp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbertptbr_pt_5.1.4_3.4_1698817850799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbertptbr_pt_5.1.4_3.4_1698817850799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finbertptbr","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finbertptbr","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finbertptbr| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/turing-usp/FinBertPTBR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-fine_tune_chinese_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-01-fine_tune_chinese_sentiment_en.md new file mode 100644 index 000000000000..0027e5fe86a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-fine_tune_chinese_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tune_chinese_sentiment BertForSequenceClassification from DavidLanz +author: John Snow Labs +name: fine_tune_chinese_sentiment +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tune_chinese_sentiment` is an English model originally trained by DavidLanz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tune_chinese_sentiment_en_5.1.4_3.4_1698818916544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tune_chinese_sentiment_en_5.1.4_3.4_1698818916544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_chinese_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tune_chinese_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tune_chinese_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.0 MB| + +## References + +https://huggingface.co/DavidLanz/fine_tune_chinese_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-fine_tuned_indonesian_sentiment_classifier_id.md b/docs/_posts/ahmedlone127/2023-11-01-fine_tuned_indonesian_sentiment_classifier_id.md new file mode 100644 index 000000000000..2c62ab971be3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-fine_tuned_indonesian_sentiment_classifier_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian fine_tuned_indonesian_sentiment_classifier BertForSequenceClassification from hanifnoerr +author: John Snow Labs +name: fine_tuned_indonesian_sentiment_classifier +date: 2023-11-01 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_indonesian_sentiment_classifier` is an Indonesian model originally trained by hanifnoerr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_indonesian_sentiment_classifier_id_5.1.4_3.4_1698802700117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_indonesian_sentiment_classifier_id_5.1.4_3.4_1698802700117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_indonesian_sentiment_classifier","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("fine_tuned_indonesian_sentiment_classifier","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_indonesian_sentiment_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|466.4 MB| + +## References + +https://huggingface.co/hanifnoerr/Fine-tuned-Indonesian-Sentiment-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetunebertclsfaq_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetunebertclsfaq_en.md new file mode 100644 index 000000000000..a251787d4116 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetunebertclsfaq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetunebertclsfaq BertForSequenceClassification from Joe99 +author: John Snow Labs +name: finetunebertclsfaq +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetunebertclsfaq` is an English model originally trained by Joe99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetunebertclsfaq_en_5.1.4_3.4_1698814620360.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetunebertclsfaq_en_5.1.4_3.4_1698814620360.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetunebertclsfaq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetunebertclsfaq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetunebertclsfaq| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/Joe99/FinetuneBERTClsFAQ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_bert_base_on_iemocap_1_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bert_base_on_iemocap_1_en.md new file mode 100644 index 000000000000..e9a1b57e2386 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bert_base_on_iemocap_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_bert_base_on_iemocap_1 BertForSequenceClassification from minoosh +author: John Snow Labs +name: finetuned_bert_base_on_iemocap_1 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_on_iemocap_1` is an English model originally trained by minoosh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_1_en_5.1.4_3.4_1698810207802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_on_iemocap_1_en_5.1.4_3.4_1698810207802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_base_on_iemocap_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bert_base_on_iemocap_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_on_iemocap_1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/minoosh/finetuned_bert-base-on-IEMOCAP_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_bertu_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bertu_sentiment_en.md new file mode 100644 index 000000000000..1c81b220c2e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bertu_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_bertu_sentiment BertForSequenceClassification from DGurgurov +author: John Snow Labs +name: finetuned_bertu_sentiment +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bertu_sentiment` is an English model originally trained by DGurgurov. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bertu_sentiment_en_5.1.4_3.4_1698816110396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bertu_sentiment_en_5.1.4_3.4_1698816110396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bertu_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bertu_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bertu_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|470.9 MB| + +## References + +https://huggingface.co/DGurgurov/finetuned-bertu-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_bleurt_large_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bleurt_large_en.md new file mode 100644 index 000000000000..51588b1dce5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_bleurt_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_bleurt_large BertForSequenceClassification from vaiibhavgupta +author: John Snow Labs +name: finetuned_bleurt_large +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bleurt_large` is an English model originally trained by vaiibhavgupta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bleurt_large_en_5.1.4_3.4_1698845118896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bleurt_large_en_5.1.4_3.4_1698845118896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bleurt_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_bleurt_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bleurt_large| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/vaiibhavgupta/finetuned-bleurt-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_donlapark_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_donlapark_en.md new file mode 100644 index 000000000000..7743a08b460c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_donlapark_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_yelp_donlapark BertForSequenceClassification from Donlapark +author: John Snow Labs +name: finetuned_yelp_donlapark +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_yelp_donlapark` is an English model originally trained by Donlapark. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_yelp_donlapark_en_5.1.4_3.4_1698807859648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_yelp_donlapark_en_5.1.4_3.4_1698807859648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_donlapark","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_donlapark","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_yelp_donlapark| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Donlapark/finetuned_yelp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_oatnapat_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_oatnapat_en.md new file mode 100644 index 000000000000..eb0dc8ecd024 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_oatnapat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_yelp_oatnapat BertForSequenceClassification from OatNapat +author: John Snow Labs +name: finetuned_yelp_oatnapat +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_yelp_oatnapat` is an English model originally trained by OatNapat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_yelp_oatnapat_en_5.1.4_3.4_1698812604334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_yelp_oatnapat_en_5.1.4_3.4_1698812604334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_oatnapat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_oatnapat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_yelp_oatnapat| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/OatNapat/finetuned_yelp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_shelterrp_en.md b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_shelterrp_en.md new file mode 100644 index 000000000000..8011cc3c1a3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuned_yelp_shelterrp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_yelp_shelterrp BertForSequenceClassification from shelterrp +author: John Snow Labs +name: finetuned_yelp_shelterrp +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_yelp_shelterrp` is an English model originally trained by shelterrp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_yelp_shelterrp_en_5.1.4_3.4_1698808353966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_yelp_shelterrp_en_5.1.4_3.4_1698808353966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_shelterrp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuned_yelp_shelterrp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_yelp_shelterrp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/shelterrp/finetuned_yelp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_other_xx.md b/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_other_xx.md new file mode 100644 index 000000000000..c26927b87045 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_other_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual finetuning_multilingual_bert_other BertForSequenceClassification from charisma-entertainment +author: John Snow Labs +name: finetuning_multilingual_bert_other +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_multilingual_bert_other` is a Multilingual model originally trained by charisma-entertainment. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_multilingual_bert_other_xx_5.1.4_3.4_1698870485408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_multilingual_bert_other_xx_5.1.4_3.4_1698870485408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_multilingual_bert_other","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_multilingual_bert_other","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_multilingual_bert_other| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/charisma-entertainment/finetuning-multilingual-bert-other \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_thumbs_xx.md b/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_thumbs_xx.md new file mode 100644 index 000000000000..b3af2f49943a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-finetuning_multilingual_bert_thumbs_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual finetuning_multilingual_bert_thumbs BertForSequenceClassification from charisma-entertainment +author: John Snow Labs +name: finetuning_multilingual_bert_thumbs +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_multilingual_bert_thumbs` is a Multilingual model originally trained by charisma-entertainment. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_multilingual_bert_thumbs_xx_5.1.4_3.4_1698814568354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_multilingual_bert_thumbs_xx_5.1.4_3.4_1698814568354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_multilingual_bert_thumbs","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("finetuning_multilingual_bert_thumbs","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_multilingual_bert_thumbs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/charisma-entertainment/finetuning-multilingual-bert-thumbs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-frenchmedmcqa_biobert_v1_1_wikipedia_bm25_fr.md b/docs/_posts/ahmedlone127/2023-11-01-frenchmedmcqa_biobert_v1_1_wikipedia_bm25_fr.md new file mode 100644 index 000000000000..498cd37d8977 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-frenchmedmcqa_biobert_v1_1_wikipedia_bm25_fr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: French frenchmedmcqa_biobert_v1_1_wikipedia_bm25 BertForSequenceClassification from qanastek +author: John Snow Labs +name: frenchmedmcqa_biobert_v1_1_wikipedia_bm25 +date: 2023-11-01 +tags: [bert, fr, open_source, sequence_classification, onnx] +task: Text Classification +language: fr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frenchmedmcqa_biobert_v1_1_wikipedia_bm25` is a French model originally trained by qanastek. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frenchmedmcqa_biobert_v1_1_wikipedia_bm25_fr_5.1.4_3.4_1698815025255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frenchmedmcqa_biobert_v1_1_wikipedia_bm25_fr_5.1.4_3.4_1698815025255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("frenchmedmcqa_biobert_v1_1_wikipedia_bm25","fr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frenchmedmcqa_biobert_v1_1_wikipedia_bm25","fr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frenchmedmcqa_biobert_v1_1_wikipedia_bm25| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fr| +|Size:|405.4 MB| + +## References + +https://huggingface.co/qanastek/FrenchMedMCQA-BioBERT-V1.1-Wikipedia-BM25 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-frugalscore_medium_bert_base_mover_score_en.md b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_medium_bert_base_mover_score_en.md new file mode 100644 index 000000000000..b233e1020b40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_medium_bert_base_mover_score_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English frugalscore_medium_bert_base_mover_score BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_medium_bert_base_mover_score +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_medium_bert_base_mover_score` is an English model originally trained by moussaKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_medium_bert_base_mover_score_en_5.1.4_3.4_1698861056839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_medium_bert_base_mover_score_en_5.1.4_3.4_1698861056839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_medium_bert_base_mover_score","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_medium_bert_base_mover_score","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_medium_bert_base_mover_score| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|155.2 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_medium_bert-base_mover-score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_bert_score_en.md b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_bert_score_en.md new file mode 100644 index 000000000000..17e029aad531 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_bert_score_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English frugalscore_tiny_bert_base_bert_score BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_tiny_bert_base_bert_score +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_tiny_bert_base_bert_score` is an English model originally trained by moussaKam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_bert_base_bert_score_en_5.1.4_3.4_1698807771479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_bert_base_bert_score_en_5.1.4_3.4_1698807771479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_bert_base_bert_score","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_bert_base_bert_score","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_tiny_bert_base_bert_score| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_tiny_bert-base_bert-score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_mover_score_en.md b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_mover_score_en.md new file mode 100644 index 000000000000..cb4e0a8df9ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-frugalscore_tiny_bert_base_mover_score_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English frugalscore_tiny_bert_base_mover_score BertForSequenceClassification from moussaKam +author: John Snow Labs +name: frugalscore_tiny_bert_base_mover_score +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `frugalscore_tiny_bert_base_mover_score` is an English model originally trained by moussaKam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_bert_base_mover_score_en_5.1.4_3.4_1698861181413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/frugalscore_tiny_bert_base_mover_score_en_5.1.4_3.4_1698861181413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_bert_base_mover_score","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("frugalscore_tiny_bert_base_mover_score","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|frugalscore_tiny_bert_base_mover_score| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/moussaKam/frugalscore_tiny_bert-base_mover-score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-german_covid_vaccine_misinformation_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-german_covid_vaccine_misinformation_classifier_en.md new file mode 100644 index 000000000000..2a506aedb7da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-german_covid_vaccine_misinformation_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English german_covid_vaccine_misinformation_classifier BertForSequenceClassification from Ghunghru +author: John Snow Labs +name: german_covid_vaccine_misinformation_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german_covid_vaccine_misinformation_classifier` is an English model originally trained by Ghunghru.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_covid_vaccine_misinformation_classifier_en_5.1.4_3.4_1698862789759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_covid_vaccine_misinformation_classifier_en_5.1.4_3.4_1698862789759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("german_covid_vaccine_misinformation_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("german_covid_vaccine_misinformation_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_covid_vaccine_misinformation_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.1 MB| + +## References + +https://huggingface.co/Ghunghru/german_covid_vaccine_misinformation_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-gk_hinglish_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-01-gk_hinglish_sentiment_en.md new file mode 100644 index 000000000000..4a51bceb438e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-gk_hinglish_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gk_hinglish_sentiment BertForSequenceClassification from ganeshkharad +author: John Snow Labs +name: gk_hinglish_sentiment +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gk_hinglish_sentiment` is an English model originally trained by ganeshkharad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gk_hinglish_sentiment_en_5.1.4_3.4_1698814293283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gk_hinglish_sentiment_en_5.1.4_3.4_1698814293283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("gk_hinglish_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("gk_hinglish_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gk_hinglish_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/ganeshkharad/gk-hinglish-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-grammer_classiffication_en.md b/docs/_posts/ahmedlone127/2023-11-01-grammer_classiffication_en.md new file mode 100644 index 000000000000..fde9dc37bfdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-grammer_classiffication_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English grammer_classiffication BertForSequenceClassification from Vaibhavbrkn +author: John Snow Labs +name: grammer_classiffication +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `grammer_classiffication` is an English model originally trained by Vaibhavbrkn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/grammer_classiffication_en_5.1.4_3.4_1698812055645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/grammer_classiffication_en_5.1.4_3.4_1698812055645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("grammer_classiffication","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("grammer_classiffication","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|grammer_classiffication| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Vaibhavbrkn/grammer_classiffication \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-guardrail_en.md b/docs/_posts/ahmedlone127/2023-11-01-guardrail_en.md new file mode 100644 index 000000000000..3f6b52aa79bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-guardrail_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English guardrail BertForSequenceClassification from odunola +author: John Snow Labs +name: guardrail +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `guardrail` is an English model originally trained by odunola. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/guardrail_en_5.1.4_3.4_1698865740063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/guardrail_en_5.1.4_3.4_1698865740063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("guardrail","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("guardrail","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|guardrail| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/odunola/guardrail \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hackmit_finetuned_sst2_blaine_mason_en.md b/docs/_posts/ahmedlone127/2023-11-01-hackmit_finetuned_sst2_blaine_mason_en.md new file mode 100644 index 000000000000..9631e8f06aa4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hackmit_finetuned_sst2_blaine_mason_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hackmit_finetuned_sst2_blaine_mason BertForSequenceClassification from Blaine-Mason +author: John Snow Labs +name: hackmit_finetuned_sst2_blaine_mason +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hackmit_finetuned_sst2_blaine_mason` is an English model originally trained by Blaine-Mason. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hackmit_finetuned_sst2_blaine_mason_en_5.1.4_3.4_1698867015271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hackmit_finetuned_sst2_blaine_mason_en_5.1.4_3.4_1698867015271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hackmit_finetuned_sst2_blaine_mason","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hackmit_finetuned_sst2_blaine_mason","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hackmit_finetuned_sst2_blaine_mason| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/Blaine-Mason/hackMIT-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-halacha_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-halacha_classifier_en.md new file mode 100644 index 000000000000..a2ad60a4a539 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-halacha_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English halacha_classifier BertForSequenceClassification from sivan22 +author: John Snow Labs +name: halacha_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `halacha_classifier` is an English model originally trained by sivan22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/halacha_classifier_en_5.1.4_3.4_1698862784902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/halacha_classifier_en_5.1.4_3.4_1698862784902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("halacha_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("halacha_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|halacha_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|692.6 MB| + +## References + +https://huggingface.co/sivan22/halacha-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hate_speech_bert_ctoraman_en.md b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_bert_ctoraman_en.md new file mode 100644 index 000000000000..c0164432f0d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_bert_ctoraman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_bert_ctoraman BertForSequenceClassification from ctoraman +author: John Snow Labs +name: hate_speech_bert_ctoraman +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_bert_ctoraman` is an English model originally trained by ctoraman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_bert_ctoraman_en_5.1.4_3.4_1698801459879.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_bert_ctoraman_en_5.1.4_3.4_1698801459879.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_bert_ctoraman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_bert_ctoraman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_bert_ctoraman| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ctoraman/hate-speech-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hate_speech_berturk_en.md b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_berturk_en.md new file mode 100644 index 000000000000..63080fc8d03f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_berturk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_berturk BertForSequenceClassification from ctoraman +author: John Snow Labs +name: hate_speech_berturk +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_berturk` is an English model originally trained by ctoraman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_berturk_en_5.1.4_3.4_1698812379558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_berturk_en_5.1.4_3.4_1698812379558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_berturk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_berturk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_berturk| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.7 MB| + +## References + +https://huggingface.co/ctoraman/hate-speech-berturk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hate_speech_detection_in_amharic_language_mbert_am.md b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_detection_in_amharic_language_mbert_am.md new file mode 100644 index 000000000000..26002459b761 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_detection_in_amharic_language_mbert_am.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Amharic hate_speech_detection_in_amharic_language_mbert BertForSequenceClassification from NathyB +author: John Snow Labs +name: hate_speech_detection_in_amharic_language_mbert +date: 2023-11-01 +tags: [bert, am, open_source, sequence_classification, onnx] +task: Text Classification +language: am +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_detection_in_amharic_language_mbert` is an Amharic model originally trained by NathyB.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_detection_in_amharic_language_mbert_am_5.1.4_3.4_1698868208562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_detection_in_amharic_language_mbert_am_5.1.4_3.4_1698868208562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_detection_in_amharic_language_mbert","am")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_detection_in_amharic_language_mbert","am") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_detection_in_amharic_language_mbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|am| +|Size:|667.2 MB| + +## References + +https://huggingface.co/NathyB/Hate-Speech-Detection-in-Amharic-Language-mBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hate_speech_english_en.md b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_english_en.md new file mode 100644 index 000000000000..39f99a428a9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_english BertForSequenceClassification from IMSyPP +author: John Snow Labs +name: hate_speech_english +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_english` is an English model originally trained by IMSyPP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_english_en_5.1.4_3.4_1698800603225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_english_en_5.1.4_3.4_1698800603225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_english| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/IMSyPP/hate_speech_en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hate_speech_italian_it.md b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_italian_it.md new file mode 100644 index 000000000000..cf88bbdb34eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hate_speech_italian_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian hate_speech_italian BertForSequenceClassification from IMSyPP +author: John Snow Labs +name: hate_speech_italian +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_italian` is an Italian model originally trained by IMSyPP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_italian_it_5.1.4_3.4_1698814544205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_italian_it_5.1.4_3.4_1698814544205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_italian","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hate_speech_italian","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_italian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|691.9 MB| + +## References + +https://huggingface.co/IMSyPP/hate_speech_it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hatebert_hate_offensive_oriya_normal_speech_en.md b/docs/_posts/ahmedlone127/2023-11-01-hatebert_hate_offensive_oriya_normal_speech_en.md new file mode 100644 index 000000000000..2afeeb615a60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hatebert_hate_offensive_oriya_normal_speech_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hatebert_hate_offensive_oriya_normal_speech BertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: hatebert_hate_offensive_oriya_normal_speech +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hatebert_hate_offensive_oriya_normal_speech` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hatebert_hate_offensive_oriya_normal_speech_en_5.1.4_3.4_1698844107630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hatebert_hate_offensive_oriya_normal_speech_en_5.1.4_3.4_1698844107630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hatebert_hate_offensive_oriya_normal_speech","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hatebert_hate_offensive_oriya_normal_speech","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hatebert_hate_offensive_oriya_normal_speech| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/DunnBC22/hateBERT-Hate_Offensive_or_Normal_Speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-headlinepopularity_en.md b/docs/_posts/ahmedlone127/2023-11-01-headlinepopularity_en.md new file mode 100644 index 000000000000..dbf36d937383 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-headlinepopularity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English headlinepopularity BertForSequenceClassification from omidvaramin +author: John Snow Labs +name: headlinepopularity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `headlinepopularity` is an English model originally trained by omidvaramin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/headlinepopularity_en_5.1.4_3.4_1698863030193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/headlinepopularity_en_5.1.4_3.4_1698863030193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("headlinepopularity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("headlinepopularity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|headlinepopularity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/omidvaramin/HeadlinePopularity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_anger_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_anger_en.md new file mode 100644 index 000000000000..8de1af8c57a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_anger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_anger BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_anger +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_anger` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_anger_en_5.1.4_3.4_1698861232242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_anger_en_5.1.4_3.4_1698861232242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_anger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_anger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_anger| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_anger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_anticipation_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_anticipation_en.md new file mode 100644 index 000000000000..15551a542768 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_anticipation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_anticipation BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_anticipation +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_anticipation` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_anticipation_en_5.1.4_3.4_1698867881453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_anticipation_en_5.1.4_3.4_1698867881453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_anticipation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_anticipation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_anticipation| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_anticipation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_disgust_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_disgust_en.md new file mode 100644 index 000000000000..3c7b649b5f4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_disgust_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_disgust BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_disgust +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_disgust` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_disgust_en_5.1.4_3.4_1698836489609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_disgust_en_5.1.4_3.4_1698836489609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_disgust","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_disgust","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_disgust| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_disgust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_fear_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_fear_en.md new file mode 100644 index 000000000000..6afa8970ae36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_fear_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_fear BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_fear +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_fear` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_fear_en_5.1.4_3.4_1698834487481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_fear_en_5.1.4_3.4_1698834487481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_fear","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_fear","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_fear| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_fear \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_joy_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_joy_en.md new file mode 100644 index 000000000000..c77d4a29e38a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_joy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_joy BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_joy +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_joy` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_joy_en_5.1.4_3.4_1698861498663.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_joy_en_5.1.4_3.4_1698861498663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_joy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_joy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_joy| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_joy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_sadness_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_sadness_en.md new file mode 100644 index 000000000000..3ee3aaf8cc7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_sadness_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_sadness BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_sadness +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_sadness` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_sadness_en_5.1.4_3.4_1698864207020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_sadness_en_5.1.4_3.4_1698864207020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_sadness","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_sadness","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_sadness| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_sadness \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_surprise_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_surprise_en.md new file mode 100644 index 000000000000..67d11a8f4f84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_surprise_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_surprise BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_surprise +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_surprise` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_surprise_en_5.1.4_3.4_1698861685048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_surprise_en_5.1.4_3.4_1698861685048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_surprise","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_surprise","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_surprise| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_surprise \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hebemo_trust_en.md b/docs/_posts/ahmedlone127/2023-11-01-hebemo_trust_en.md new file mode 100644 index 000000000000..7d6f0b1d9017 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hebemo_trust_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hebemo_trust BertForSequenceClassification from avichr +author: John Snow Labs +name: hebemo_trust +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebemo_trust` is an English model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebemo_trust_en_5.1.4_3.4_1698842615442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebemo_trust_en_5.1.4_3.4_1698842615442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_trust","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hebemo_trust","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebemo_trust| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.3 MB| + +## References + +https://huggingface.co/avichr/hebEMO_trust \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hindi_abusive_muril_hi.md b/docs/_posts/ahmedlone127/2023-11-01-hindi_abusive_muril_hi.md new file mode 100644 index 000000000000..392ebe09710c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hindi_abusive_muril_hi.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hindi hindi_abusive_muril BertForSequenceClassification from Hate-speech-CNERG +author: John Snow Labs +name: hindi_abusive_muril +date: 2023-11-01 +tags: [bert, hi, open_source, sequence_classification, onnx] +task: Text Classification +language: hi +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hindi_abusive_muril` is a Hindi model originally trained by Hate-speech-CNERG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hindi_abusive_muril_hi_5.1.4_3.4_1698823414994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hindi_abusive_muril_hi_5.1.4_3.4_1698823414994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hindi_abusive_muril","hi")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hindi_abusive_muril","hi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hindi_abusive_muril| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hi| +|Size:|892.6 MB| + +## References + +https://huggingface.co/Hate-speech-CNERG/hindi-abusive-MuRIL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hinglish_bert_class_en.md b/docs/_posts/ahmedlone127/2023-11-01-hinglish_bert_class_en.md new file mode 100644 index 000000000000..1e6b5134219a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hinglish_bert_class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hinglish_bert_class BertForSequenceClassification from verloop +author: John Snow Labs +name: hinglish_bert_class +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hinglish_bert_class` is an English model originally trained by verloop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hinglish_bert_class_en_5.1.4_3.4_1698842957325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hinglish_bert_class_en_5.1.4_3.4_1698842957325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hinglish_bert_class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hinglish_bert_class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hinglish_bert_class| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.2 MB| + +## References + +https://huggingface.co/verloop/Hinglish-Bert-Class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version3_en.md b/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version3_en.md new file mode 100644 index 000000000000..016aeb227160 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hse_vk_nlp_sentiment_version3 BertForSequenceClassification from marcus2000 +author: John Snow Labs +name: hse_vk_nlp_sentiment_version3 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hse_vk_nlp_sentiment_version3` is an English model originally trained by marcus2000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hse_vk_nlp_sentiment_version3_en_5.1.4_3.4_1698872410091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hse_vk_nlp_sentiment_version3_en_5.1.4_3.4_1698872410091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hse_vk_nlp_sentiment_version3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hse_vk_nlp_sentiment_version3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hse_vk_nlp_sentiment_version3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|664.4 MB| + +## References + +https://huggingface.co/marcus2000/HSE_VK_NLP_sentiment_version3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version7_en.md b/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version7_en.md new file mode 100644 index 000000000000..d6eddf5e34c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-hse_vk_nlp_sentiment_version7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hse_vk_nlp_sentiment_version7 BertForSequenceClassification from marcus2000 +author: John Snow Labs +name: hse_vk_nlp_sentiment_version7 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hse_vk_nlp_sentiment_version7` is an English model originally trained by marcus2000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hse_vk_nlp_sentiment_version7_en_5.1.4_3.4_1698862241555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hse_vk_nlp_sentiment_version7_en_5.1.4_3.4_1698862241555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("hse_vk_nlp_sentiment_version7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("hse_vk_nlp_sentiment_version7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hse_vk_nlp_sentiment_version7| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|664.4 MB| + +## References + +https://huggingface.co/marcus2000/HSE_VK_NLP_sentiment_version7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-huggingface_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-01-huggingface_sentiment_analysis_en.md new file mode 100644 index 000000000000..ff30c150e543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-huggingface_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English huggingface_sentiment_analysis BertForSequenceClassification from raulangelj +author: John Snow Labs +name: huggingface_sentiment_analysis +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `huggingface_sentiment_analysis` is an English model originally trained by raulangelj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/huggingface_sentiment_analysis_en_5.1.4_3.4_1698845119166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/huggingface_sentiment_analysis_en_5.1.4_3.4_1698845119166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("huggingface_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("huggingface_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|huggingface_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/raulangelj/huggingface_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-iljp_incaselawbert_fine_tuned_5_epochs_en.md b/docs/_posts/ahmedlone127/2023-11-01-iljp_incaselawbert_fine_tuned_5_epochs_en.md new file mode 100644 index 000000000000..d1b760aed56a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-iljp_incaselawbert_fine_tuned_5_epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English iljp_incaselawbert_fine_tuned_5_epochs BertForSequenceClassification from AnonymousSub +author: John Snow Labs +name: iljp_incaselawbert_fine_tuned_5_epochs +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iljp_incaselawbert_fine_tuned_5_epochs` is an English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iljp_incaselawbert_fine_tuned_5_epochs_en_5.1.4_3.4_1698862231544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iljp_incaselawbert_fine_tuned_5_epochs_en_5.1.4_3.4_1698862231544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("iljp_incaselawbert_fine_tuned_5_epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("iljp_incaselawbert_fine_tuned_5_epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iljp_incaselawbert_fine_tuned_5_epochs| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.0 MB| + +## References + +https://huggingface.co/AnonymousSub/ILJP_InCaseLawBERT_fine-tuned-5-epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-imdb_finetuned_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-imdb_finetuned_bert_base_uncased_en.md new file mode 100644 index 000000000000..2719fc7540a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-imdb_finetuned_bert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdb_finetuned_bert_base_uncased BertForSequenceClassification from JiaqiLee +author: John Snow Labs +name: imdb_finetuned_bert_base_uncased +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb_finetuned_bert_base_uncased` is an English model originally trained by JiaqiLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb_finetuned_bert_base_uncased_en_5.1.4_3.4_1698815331180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb_finetuned_bert_base_uncased_en_5.1.4_3.4_1698815331180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("imdb_finetuned_bert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("imdb_finetuned_bert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdb_finetuned_bert_base_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/JiaqiLee/imdb-finetuned-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indic_abusive_allinone_muril_xx.md b/docs/_posts/ahmedlone127/2023-11-01-indic_abusive_allinone_muril_xx.md new file mode 100644 index 000000000000..a12185e84590 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indic_abusive_allinone_muril_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual indic_abusive_allinone_muril BertForSequenceClassification from Hate-speech-CNERG +author: John Snow Labs +name: indic_abusive_allinone_muril +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indic_abusive_allinone_muril` is a Multilingual model originally trained by Hate-speech-CNERG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indic_abusive_allinone_muril_xx_5.1.4_3.4_1698809564575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indic_abusive_allinone_muril_xx_5.1.4_3.4_1698809564575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indic_abusive_allinone_muril","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indic_abusive_allinone_muril","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indic_abusive_allinone_muril| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|892.6 MB| + +## References + +https://huggingface.co/Hate-speech-CNERG/indic-abusive-allInOne-MuRIL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indobert_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-indobert_classification_en.md new file mode 100644 index 000000000000..16b2e77f550d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indobert_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobert_classification BertForSequenceClassification from afbudiman +author: John Snow Labs +name: indobert_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_classification` is an English model originally trained by afbudiman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_classification_en_5.1.4_3.4_1698864009125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_classification_en_5.1.4_3.4_1698864009125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.4 MB| + +## References + +https://huggingface.co/afbudiman/indobert-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indobert_crossencoder_mmarco_en.md b/docs/_posts/ahmedlone127/2023-11-01-indobert_crossencoder_mmarco_en.md new file mode 100644 index 000000000000..34880eca118b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indobert_crossencoder_mmarco_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobert_crossencoder_mmarco BertForSequenceClassification from carles-undergrad-thesis +author: John Snow Labs +name: indobert_crossencoder_mmarco +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_crossencoder_mmarco` is an English model originally trained by carles-undergrad-thesis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_crossencoder_mmarco_en_5.1.4_3.4_1698866679506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_crossencoder_mmarco_en_5.1.4_3.4_1698866679506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_crossencoder_mmarco","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_crossencoder_mmarco","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_crossencoder_mmarco| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.9 MB| + +## References + +https://huggingface.co/carles-undergrad-thesis/indobert-crossencoder-mmarco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indobert_emotion_classification_id.md b/docs/_posts/ahmedlone127/2023-11-01-indobert_emotion_classification_id.md new file mode 100644 index 000000000000..1e6365c537ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indobert_emotion_classification_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian indobert_emotion_classification BertForSequenceClassification from thoriqfy +author: John Snow Labs +name: indobert_emotion_classification +date: 2023-11-01 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_emotion_classification` is an Indonesian model originally trained by thoriqfy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_emotion_classification_id_5.1.4_3.4_1698801957691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_emotion_classification_id_5.1.4_3.4_1698801957691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_emotion_classification","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_emotion_classification","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_emotion_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|1.3 GB| + +## References + +https://huggingface.co/thoriqfy/indobert-emotion-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indobert_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-01-indobert_sentiment_analysis_en.md new file mode 100644 index 000000000000..7dd1fbfa5fc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indobert_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobert_sentiment_analysis BertForSequenceClassification from dafex +author: John Snow Labs +name: indobert_sentiment_analysis +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_sentiment_analysis` is an English model originally trained by dafex. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_sentiment_analysis_en_5.1.4_3.4_1698832553764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_sentiment_analysis_en_5.1.4_3.4_1698832553764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indobert_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indobert_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_sentiment_analysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.0 MB| + +## References + +https://huggingface.co/dafex/indobert-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-indonesia_bert_sentiment_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-indonesia_bert_sentiment_classification_en.md new file mode 100644 index 000000000000..55de772518a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-indonesia_bert_sentiment_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indonesia_bert_sentiment_classification BertForSequenceClassification from mdhugol +author: John Snow Labs +name: indonesia_bert_sentiment_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesia_bert_sentiment_classification` is an English model originally trained by mdhugol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indonesia_bert_sentiment_classification_en_5.1.4_3.4_1698808027096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indonesia_bert_sentiment_classification_en_5.1.4_3.4_1698808027096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("indonesia_bert_sentiment_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("indonesia_bert_sentiment_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indonesia_bert_sentiment_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.4 MB| + +## References + +https://huggingface.co/mdhugol/indonesia-bert-sentiment-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-insurance_multiple_label_my83_v2_en.md b/docs/_posts/ahmedlone127/2023-11-01-insurance_multiple_label_my83_v2_en.md new file mode 100644 index 000000000000..93d68df2c648 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-insurance_multiple_label_my83_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English insurance_multiple_label_my83_v2 BertForSequenceClassification from Leonardolin +author: John Snow Labs +name: insurance_multiple_label_my83_v2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insurance_multiple_label_my83_v2` is an English model originally trained by Leonardolin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insurance_multiple_label_my83_v2_en_5.1.4_3.4_1698813898723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insurance_multiple_label_my83_v2_en_5.1.4_3.4_1698813898723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("insurance_multiple_label_my83_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("insurance_multiple_label_my83_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insurance_multiple_label_my83_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|383.4 MB| + +## References + +https://huggingface.co/Leonardolin/insurance_multiple_label_my83-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-intent_cls_en.md b/docs/_posts/ahmedlone127/2023-11-01-intent_cls_en.md new file mode 100644 index 000000000000..02b17116651e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-intent_cls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_cls BertForSequenceClassification from EthanChen0418 +author: John Snow Labs +name: intent_cls +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_cls` is an English model originally trained by EthanChen0418. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_cls_en_5.1.4_3.4_1698869803940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_cls_en_5.1.4_3.4_1698869803940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("intent_cls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("intent_cls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_cls| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/EthanChen0418/intent_cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-jailbreak_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-jailbreak_classifier_en.md new file mode 100644 index 000000000000..b4d7c4e1a341 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-jailbreak_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jailbreak_classifier BertForSequenceClassification from jackhhao +author: John Snow Labs +name: jailbreak_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jailbreak_classifier` is an English model originally trained by jackhhao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jailbreak_classifier_en_5.1.4_3.4_1698863726221.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jailbreak_classifier_en_5.1.4_3.4_1698863726221.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("jailbreak_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("jailbreak_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jailbreak_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jackhhao/jailbreak-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-jobclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-jobclassifier_en.md new file mode 100644 index 000000000000..3e48c801cb6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-jobclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jobclassifier BertForSequenceClassification from CleveGreen +author: John Snow Labs +name: jobclassifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobclassifier` is an English model originally trained by CleveGreen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobclassifier_en_5.1.4_3.4_1698842228263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobclassifier_en_5.1.4_3.4_1698842228263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("jobclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("jobclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobclassifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|406.3 MB| + +## References + +https://huggingface.co/CleveGreen/JobClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-kambert_en.md b/docs/_posts/ahmedlone127/2023-11-01-kambert_en.md new file mode 100644 index 000000000000..6ea24364c87d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-kambert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English kambert BertForSequenceClassification from Kamarin +author: John Snow Labs +name: kambert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kambert` is an English model originally trained by Kamarin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kambert_en_5.1.4_3.4_1698835093440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kambert_en_5.1.4_3.4_1698835093440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("kambert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kambert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kambert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Kamarin/kambert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-kcbert_base_finetuned_hate_ko.md b/docs/_posts/ahmedlone127/2023-11-01-kcbert_base_finetuned_hate_ko.md new file mode 100644 index 000000000000..328f17606f33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-kcbert_base_finetuned_hate_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean kcbert_base_finetuned_hate BertForSequenceClassification from hegelty +author: John Snow Labs +name: kcbert_base_finetuned_hate +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbert_base_finetuned_hate` is a Korean model originally trained by hegelty. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kcbert_base_finetuned_hate_ko_5.1.4_3.4_1698809919557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kcbert_base_finetuned_hate_ko_5.1.4_3.4_1698809919557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("kcbert_base_finetuned_hate","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kcbert_base_finetuned_hate","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kcbert_base_finetuned_hate| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|408.4 MB| + +## References + +https://huggingface.co/hegelty/KcBERT-Base-finetuned-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-kcbert_formal_classifier_ko.md b/docs/_posts/ahmedlone127/2023-11-01-kcbert_formal_classifier_ko.md new file mode 100644 index 000000000000..c851098d525e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-kcbert_formal_classifier_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean kcbert_formal_classifier BertForSequenceClassification from j5ng +author: John Snow Labs +name: kcbert_formal_classifier +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbert_formal_classifier` is a Korean model originally trained by j5ng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kcbert_formal_classifier_ko_5.1.4_3.4_1698812952417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kcbert_formal_classifier_ko_5.1.4_3.4_1698812952417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("kcbert_formal_classifier","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kcbert_formal_classifier","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kcbert_formal_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|408.4 MB| + +## References + +https://huggingface.co/j5ng/kcbert-formal-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-kpf_cross_encoder_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-01-kpf_cross_encoder_v1_ko.md new file mode 100644 index 000000000000..d7561337d52d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-kpf_cross_encoder_v1_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean kpf_cross_encoder_v1 BertForSequenceClassification from bongsoo +author: John Snow Labs +name: kpf_cross_encoder_v1 +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kpf_cross_encoder_v1` is a Korean model originally trained by bongsoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kpf_cross_encoder_v1_ko_5.1.4_3.4_1698809946198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kpf_cross_encoder_v1_ko_5.1.4_3.4_1698809946198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("kpf_cross_encoder_v1","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("kpf_cross_encoder_v1","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kpf_cross_encoder_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|427.3 MB| + +## References + +https://huggingface.co/bongsoo/kpf-cross-encoder-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-lawinorders_en.md b/docs/_posts/ahmedlone127/2023-11-01-lawinorders_en.md new file mode 100644 index 000000000000..e7eb3014f9d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-lawinorders_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lawinorders BertForSequenceClassification from circulartext +author: John Snow Labs +name: lawinorders +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lawinorders` is an English model originally trained by circulartext. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lawinorders_en_5.1.4_3.4_1698816085111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lawinorders_en_5.1.4_3.4_1698816085111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("lawinorders","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("lawinorders","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lawinorders| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/circulartext/lawinorders \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-log_classifier_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-01-log_classifier_distilbert_en.md new file mode 100644 index 000000000000..44f56c096ce7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-log_classifier_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English log_classifier_distilbert BertForSequenceClassification from Lowder +author: John Snow Labs +name: log_classifier_distilbert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `log_classifier_distilbert` is an English model originally trained by Lowder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/log_classifier_distilbert_en_5.1.4_3.4_1698862938466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/log_classifier_distilbert_en_5.1.4_3.4_1698862938466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("log_classifier_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("log_classifier_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|log_classifier_distilbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.8 MB| + +## References + +https://huggingface.co/Lowder/log_classifier_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-lyric_tonga_tonga_islands_genre_en.md b/docs/_posts/ahmedlone127/2023-11-01-lyric_tonga_tonga_islands_genre_en.md new file mode 100644 index 000000000000..7f6bfea9e8f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-lyric_tonga_tonga_islands_genre_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lyric_tonga_tonga_islands_genre BertForSequenceClassification from Veucci +author: John Snow Labs +name: lyric_tonga_tonga_islands_genre +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lyric_tonga_tonga_islands_genre` is an English model originally trained by Veucci. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lyric_tonga_tonga_islands_genre_en_5.1.4_3.4_1698863016852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lyric_tonga_tonga_islands_genre_en_5.1.4_3.4_1698863016852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("lyric_tonga_tonga_islands_genre","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("lyric_tonga_tonga_islands_genre","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lyric_tonga_tonga_islands_genre| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Veucci/lyric-to-genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_12_v2_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_12_v2_cross_encoder_en.md new file mode 100644 index 000000000000..b8ec23e21d2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_12_v2_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_minilm_l_12_v2_cross_encoder BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_minilm_l_12_v2_cross_encoder +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l_12_v2_cross_encoder` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_12_v2_cross_encoder_en_5.1.4_3.4_1698807364320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_12_v2_cross_encoder_en_5.1.4_3.4_1698807364320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_12_v2_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_12_v2_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l_12_v2_cross_encoder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.2 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_2_v2_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_2_v2_cross_encoder_en.md new file mode 100644 index 000000000000..454fff8673c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_2_v2_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_minilm_l_2_v2_cross_encoder BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_minilm_l_2_v2_cross_encoder +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l_2_v2_cross_encoder` is an English model originally trained by cross-encoder. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_2_v2_cross_encoder_en_5.1.4_3.4_1698802998479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_2_v2_cross_encoder_en_5.1.4_3.4_1698802998479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_2_v2_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_2_v2_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l_2_v2_cross_encoder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|57.7 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-2-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_4_v2_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_4_v2_en.md new file mode 100644 index 000000000000..9a587a5d8bdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_4_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_minilm_l_4_v2 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_minilm_l_4_v2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l_4_v2` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_4_v2_en_5.1.4_3.4_1698799750367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_4_v2_en_5.1.4_3.4_1698799750367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_4_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_4_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l_4_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|71.0 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-4-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_cross_encoder_en.md new file mode 100644 index 000000000000..c2a94b380e56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_minilm_l_6_v2_cross_encoder BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_minilm_l_6_v2_cross_encoder +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l_6_v2_cross_encoder` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_6_v2_cross_encoder_en_5.1.4_3.4_1698807460490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_6_v2_cross_encoder_en_5.1.4_3.4_1698807460490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_6_v2_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_6_v2_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l_6_v2_cross_encoder| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.3 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_navteca_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_navteca_en.md new file mode 100644 index 000000000000..9a11848f6740 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_minilm_l_6_v2_navteca_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_minilm_l_6_v2_navteca BertForSequenceClassification from navteca +author: John Snow Labs +name: malay_marco_minilm_l_6_v2_navteca +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_minilm_l_6_v2_navteca` is an English model originally trained by navteca. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_6_v2_navteca_en_5.1.4_3.4_1698800249951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_minilm_l_6_v2_navteca_en_5.1.4_3.4_1698800249951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_6_v2_navteca","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_minilm_l_6_v2_navteca","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_minilm_l_6_v2_navteca| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|84.3 MB| + +## References + +https://huggingface.co/navteca/ms-marco-MiniLM-L-6-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_en.md new file mode 100644 index 000000000000..afe1f29a2cec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_tinybert_l_2 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_tinybert_l_2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_tinybert_l_2` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_2_en_5.1.4_3.4_1698813857069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_2_en_5.1.4_3.4_1698813857069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_tinybert_l_2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_v2_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_v2_en.md new file mode 100644 index 000000000000..08b2cfd34b54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_2_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_tinybert_l_2_v2 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_tinybert_l_2_v2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_tinybert_l_2_v2` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_2_v2_en_5.1.4_3.4_1698807560421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_2_v2_en_5.1.4_3.4_1698807560421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_2_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_2_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_tinybert_l_2_v2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_4_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_4_en.md new file mode 100644 index 000000000000..3e81db162047 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_tinybert_l_4 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_tinybert_l_4 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_tinybert_l_4` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_4_en_5.1.4_3.4_1698809807596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_4_en_5.1.4_3.4_1698809807596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_tinybert_l_4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_6_en.md b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_6_en.md new file mode 100644 index 000000000000..1d454ec6fd04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malay_marco_tinybert_l_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malay_marco_tinybert_l_6 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: malay_marco_tinybert_l_6 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malay_marco_tinybert_l_6` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_6_en_5.1.4_3.4_1698816077435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malay_marco_tinybert_l_6_en_5.1.4_3.4_1698816077435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malay_marco_tinybert_l_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malay_marco_tinybert_l_6| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|251.2 MB| + +## References + +https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-malware_url_detect_en.md b/docs/_posts/ahmedlone127/2023-11-01-malware_url_detect_en.md new file mode 100644 index 000000000000..0e0fa100fa4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-malware_url_detect_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English malware_url_detect BertForSequenceClassification from elftsdmr +author: John Snow Labs +name: malware_url_detect +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malware_url_detect` is an English model originally trained by elftsdmr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/malware_url_detect_en_5.1.4_3.4_1698800965682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/malware_url_detect_en_5.1.4_3.4_1698800965682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("malware_url_detect","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("malware_url_detect","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|malware_url_detect| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/elftsdmr/malware-url-detect \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-marathi_sentiment_md_mr.md b/docs/_posts/ahmedlone127/2023-11-01-marathi_sentiment_md_mr.md new file mode 100644 index 000000000000..e9dc81ccf335 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-marathi_sentiment_md_mr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Marathi marathi_sentiment_md BertForSequenceClassification from l3cube-pune +author: John Snow Labs +name: marathi_sentiment_md +date: 2023-11-01 +tags: [bert, mr, open_source, sequence_classification, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marathi_sentiment_md` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_sentiment_md_mr_5.1.4_3.4_1698807039717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_sentiment_md_mr_5.1.4_3.4_1698807039717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("marathi_sentiment_md","mr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("marathi_sentiment_md","mr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_sentiment_md| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|mr| +|Size:|892.8 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-sentiment-md \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-maverick_midas_en.md b/docs/_posts/ahmedlone127/2023-11-01-maverick_midas_en.md new file mode 100644 index 000000000000..b9d033770204 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-maverick_midas_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English maverick_midas BertForSequenceClassification from lukasec +author: John Snow Labs +name: maverick_midas +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `maverick_midas` is an English model originally trained by lukasec. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/maverick_midas_en_5.1.4_3.4_1698871166771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/maverick_midas_en_5.1.4_3.4_1698871166771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("maverick_midas","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("maverick_midas","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|maverick_midas| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/lukasec/Maverick-Midas \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-maverick_moneyball_en.md b/docs/_posts/ahmedlone127/2023-11-01-maverick_moneyball_en.md new file mode 100644 index 000000000000..ba312cbacf41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-maverick_moneyball_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English maverick_moneyball BertForSequenceClassification from lukasec +author: John Snow Labs +name: maverick_moneyball +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `maverick_moneyball` is an English model originally trained by lukasec. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/maverick_moneyball_en_5.1.4_3.4_1698815004716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/maverick_moneyball_en_5.1.4_3.4_1698815004716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("maverick_moneyball","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("maverick_moneyball","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|maverick_moneyball| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/lukasec/Maverick-Moneyball \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-mbzuai_political_bias_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-mbzuai_political_bias_bert_en.md new file mode 100644 index 000000000000..713a3d3a3db5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-mbzuai_political_bias_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbzuai_political_bias_bert BertForSequenceClassification from theArif +author: John Snow Labs +name: mbzuai_political_bias_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbzuai_political_bias_bert` is an English model originally trained by theArif. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbzuai_political_bias_bert_en_5.1.4_3.4_1698865556745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbzuai_political_bias_bert_en_5.1.4_3.4_1698865556745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("mbzuai_political_bias_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("mbzuai_political_bias_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbzuai_political_bias_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/theArif/mbzuai-political-bias-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-minilm_evidence_types_en.md b/docs/_posts/ahmedlone127/2023-11-01-minilm_evidence_types_en.md new file mode 100644 index 000000000000..00ec19529a37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-minilm_evidence_types_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_evidence_types BertForSequenceClassification from marieke93 +author: John Snow Labs +name: minilm_evidence_types +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_evidence_types` is an English model originally trained by marieke93. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_evidence_types_en_5.1.4_3.4_1698802658961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_evidence_types_en_5.1.4_3.4_1698802658961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_evidence_types","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_evidence_types","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_evidence_types| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|117.2 MB| + +## References + +https://huggingface.co/marieke93/MiniLM-evidence-types \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_en.md b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_en.md new file mode 100644 index 000000000000..5e65f1d32781 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_finetuned_chi_dedupe BertForSequenceClassification from dhalladin +author: John Snow Labs +name: minilm_finetuned_chi_dedupe +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_finetuned_chi_dedupe` is an English model originally trained by dhalladin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_finetuned_chi_dedupe_en_5.1.4_3.4_1698815874565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_finetuned_chi_dedupe_en_5.1.4_3.4_1698815874565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_chi_dedupe","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_chi_dedupe","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_finetuned_chi_dedupe| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|111.2 MB| + +## References + +https://huggingface.co/dhalladin/minilm-finetuned-chi-dedupe \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_southern_sotho_en.md b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_southern_sotho_en.md new file mode 100644 index 000000000000..b2385b50a743 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_chi_dedupe_southern_sotho_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_finetuned_chi_dedupe_southern_sotho BertForSequenceClassification from dhalladin +author: John Snow Labs +name: minilm_finetuned_chi_dedupe_southern_sotho +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_finetuned_chi_dedupe_southern_sotho` is an English model originally trained by dhalladin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_finetuned_chi_dedupe_southern_sotho_en_5.1.4_3.4_1698816224552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_finetuned_chi_dedupe_southern_sotho_en_5.1.4_3.4_1698816224552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_chi_dedupe_southern_sotho","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_chi_dedupe_southern_sotho","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_finetuned_chi_dedupe_southern_sotho| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|111.2 MB| + +## References + +https://huggingface.co/dhalladin/minilm-finetuned-chi-dedupe-st \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_emotion_celorenzo2104_en.md b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_emotion_celorenzo2104_en.md new file mode 100644 index 000000000000..3ab66d8de96e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-minilm_finetuned_emotion_celorenzo2104_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_finetuned_emotion_celorenzo2104 BertForSequenceClassification from celorenzo2104 +author: John Snow Labs +name: minilm_finetuned_emotion_celorenzo2104 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_finetuned_emotion_celorenzo2104` is an English model originally trained by celorenzo2104. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_finetuned_emotion_celorenzo2104_en_5.1.4_3.4_1698823431283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_finetuned_emotion_celorenzo2104_en_5.1.4_3.4_1698823431283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_emotion_celorenzo2104","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_finetuned_emotion_celorenzo2104","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_finetuned_emotion_celorenzo2104| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|118.5 MB| + +## References + +https://huggingface.co/celorenzo2104/minilm-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-minilm_l6_h384_uncased_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-01-minilm_l6_h384_uncased_sst2_en.md new file mode 100644 index 000000000000..7822407e7b13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-minilm_l6_h384_uncased_sst2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilm_l6_h384_uncased_sst2 BertForSequenceClassification from philschmid +author: John Snow Labs +name: minilm_l6_h384_uncased_sst2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm_l6_h384_uncased_sst2` is an English model originally trained by philschmid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilm_l6_h384_uncased_sst2_en_5.1.4_3.4_1698807460605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilm_l6_h384_uncased_sst2_en_5.1.4_3.4_1698807460605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("minilm_l6_h384_uncased_sst2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("minilm_l6_h384_uncased_sst2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilm_l6_h384_uncased_sst2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|77.1 MB| + +## References + +https://huggingface.co/philschmid/MiniLM-L6-H384-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-miread_neuro_large_en.md b/docs/_posts/ahmedlone127/2023-11-01-miread_neuro_large_en.md new file mode 100644 index 000000000000..b6b4e5805d03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-miread_neuro_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English miread_neuro_large BertForSequenceClassification from biodatlab +author: John Snow Labs +name: miread_neuro_large +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `miread_neuro_large` is an English model originally trained by biodatlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/miread_neuro_large_en_5.1.4_3.4_1698811728236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/miread_neuro_large_en_5.1.4_3.4_1698811728236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("miread_neuro_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("miread_neuro_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|miread_neuro_large| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|418.8 MB| + +## References + +https://huggingface.co/biodatlab/MIReAD-Neuro-Large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-model_rpii2023_en.md b/docs/_posts/ahmedlone127/2023-11-01-model_rpii2023_en.md new file mode 100644 index 000000000000..def221637712 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-model_rpii2023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English model_rpii2023 BertForSequenceClassification from rpii2023 +author: John Snow Labs +name: model_rpii2023 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_rpii2023` is an English model originally trained by rpii2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_rpii2023_en_5.1.4_3.4_1698805121892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_rpii2023_en_5.1.4_3.4_1698805121892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("model_rpii2023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("model_rpii2023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_rpii2023| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/rpii2023/model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-modelloscenari_en.md b/docs/_posts/ahmedlone127/2023-11-01-modelloscenari_en.md new file mode 100644 index 000000000000..e96679474b02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-modelloscenari_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English modelloscenari BertForSequenceClassification from Marco127 +author: John Snow Labs +name: modelloscenari +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelloscenari` is an English model originally trained by Marco127. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelloscenari_en_5.1.4_3.4_1698813498837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelloscenari_en_5.1.4_3.4_1698813498837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("modelloscenari","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("modelloscenari","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelloscenari| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.8 MB| + +## References + +https://huggingface.co/Marco127/ModelloScenari \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-modified_bert_toxicity_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-modified_bert_toxicity_classification_en.md new file mode 100644 index 000000000000..48862ba871d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-modified_bert_toxicity_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English modified_bert_toxicity_classification BertForSequenceClassification from Ptato +author: John Snow Labs +name: modified_bert_toxicity_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modified_bert_toxicity_classification` is an English model originally trained by Ptato. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modified_bert_toxicity_classification_en_5.1.4_3.4_1698862102589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modified_bert_toxicity_classification_en_5.1.4_3.4_1698862102589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("modified_bert_toxicity_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("modified_bert_toxicity_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modified_bert_toxicity_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Ptato/Modified-Bert-Toxicity-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-moresexistbert_edos_en.md b/docs/_posts/ahmedlone127/2023-11-01-moresexistbert_edos_en.md new file mode 100644 index 000000000000..f7f57691da95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-moresexistbert_edos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English moresexistbert_edos BertForSequenceClassification from clincolnoz +author: John Snow Labs +name: moresexistbert_edos +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moresexistbert_edos` is an English model originally trained by clincolnoz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moresexistbert_edos_en_5.1.4_3.4_1698845601388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moresexistbert_edos_en_5.1.4_3.4_1698845601388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("moresexistbert_edos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("moresexistbert_edos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moresexistbert_edos| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/clincolnoz/MoreSexistBERT-edos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-negativeresultdetector_en.md b/docs/_posts/ahmedlone127/2023-11-01-negativeresultdetector_en.md new file mode 100644 index 000000000000..54fece4fe823 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-negativeresultdetector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English negativeresultdetector BertForSequenceClassification from ClinicalMetaScience +author: John Snow Labs +name: negativeresultdetector +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `negativeresultdetector` is an English model originally trained by ClinicalMetaScience. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/negativeresultdetector_en_5.1.4_3.4_1698861460145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/negativeresultdetector_en_5.1.4_3.4_1698861460145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("negativeresultdetector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("negativeresultdetector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|negativeresultdetector| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/ClinicalMetaScience/NegativeResultDetector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-negbleurt_en.md b/docs/_posts/ahmedlone127/2023-11-01-negbleurt_en.md new file mode 100644 index 000000000000..39c3b78cfa80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-negbleurt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English negbleurt BertForSequenceClassification from tum-nlp +author: John Snow Labs +name: negbleurt +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `negbleurt` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/negbleurt_en_5.1.4_3.4_1698831108772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/negbleurt_en_5.1.4_3.4_1698831108772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("negbleurt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("negbleurt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|negbleurt| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/tum-nlp/NegBLEURT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nepalisentimentanalysis_ne.md b/docs/_posts/ahmedlone127/2023-11-01-nepalisentimentanalysis_ne.md new file mode 100644 index 000000000000..442d3d0c81eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nepalisentimentanalysis_ne.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Nepali (macrolanguage) nepalisentimentanalysis BertForSequenceClassification from dpkrm +author: John Snow Labs +name: nepalisentimentanalysis +date: 2023-11-01 +tags: [bert, ne, open_source, sequence_classification, onnx] +task: Text Classification +language: ne +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepalisentimentanalysis` is a Nepali (macrolanguage) model originally trained by dpkrm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepalisentimentanalysis_ne_5.1.4_3.4_1698809346407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepalisentimentanalysis_ne_5.1.4_3.4_1698809346407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nepalisentimentanalysis","ne")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nepalisentimentanalysis","ne") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepalisentimentanalysis| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ne| +|Size:|409.4 MB| + +## References + +https://huggingface.co/dpkrm/NepaliSentimentAnalysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-news_category_classification_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-news_category_classification_turkish_tr.md new file mode 100644 index 000000000000..6ff4d37c01ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-news_category_classification_turkish_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish news_category_classification_turkish BertForSequenceClassification from Kodiks +author: John Snow Labs +name: news_category_classification_turkish +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_category_classification_turkish` is a Turkish model originally trained by Kodiks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_category_classification_turkish_tr_5.1.4_3.4_1698804143395.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_category_classification_turkish_tr_5.1.4_3.4_1698804143395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("news_category_classification_turkish","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("news_category_classification_turkish","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_category_classification_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| + +## References + +https://huggingface.co/Kodiks/news-category-classification-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-news_category_classifier_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-01-news_category_classifier_distilbert_en.md new file mode 100644 index 000000000000..e2d3145bd647 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-news_category_classifier_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_category_classifier_distilbert BertForSequenceClassification from dima806 +author: John Snow Labs +name: news_category_classifier_distilbert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_category_classifier_distilbert` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_category_classifier_distilbert_en_5.1.4_3.4_1698821352655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_category_classifier_distilbert_en_5.1.4_3.4_1698821352655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("news_category_classifier_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("news_category_classifier_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_category_classifier_distilbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/dima806/news-category-classifier-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-news_genre_classifier_xx.md b/docs/_posts/ahmedlone127/2023-11-01-news_genre_classifier_xx.md new file mode 100644 index 000000000000..f886ec81f0e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-news_genre_classifier_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual news_genre_classifier BertForSequenceClassification from lesyar +author: John Snow Labs +name: news_genre_classifier +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`news_genre_classifier` is a Multilingual model originally trained by lesyar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_genre_classifier_xx_5.1.4_3.4_1698815071534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_genre_classifier_xx_5.1.4_3.4_1698815071534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("news_genre_classifier","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("news_genre_classifier","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_genre_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/lesyar/news_genre_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nlp_classifier_news_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-01-nlp_classifier_news_turkish_tr.md new file mode 100644 index 000000000000..feab37ffee33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nlp_classifier_news_turkish_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish nlp_classifier_news_turkish BertForSequenceClassification from aimped +author: John Snow Labs +name: nlp_classifier_news_turkish +date: 2023-11-01 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_classifier_news_turkish` is a Turkish model originally trained by aimped. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_classifier_news_turkish_tr_5.1.4_3.4_1698819940242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_classifier_news_turkish_tr_5.1.4_3.4_1698819940242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_classifier_news_turkish","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_classifier_news_turkish","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_classifier_news_turkish| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|414.5 MB| + +## References + +https://huggingface.co/aimped/nlp-classifier-news-tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q1_en.md b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q1_en.md new file mode 100644 index 000000000000..a565fd105418 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_qual_q1 BertForSequenceClassification from maxspad +author: John Snow Labs +name: nlp_qual_q1 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_qual_q1` is an English model originally trained by maxspad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_qual_q1_en_5.1.4_3.4_1698806241364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_qual_q1_en_5.1.4_3.4_1698806241364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_qual_q1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/maxspad/nlp-qual-q1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q2i_en.md b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q2i_en.md new file mode 100644 index 000000000000..23d406b510ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q2i_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_qual_q2i BertForSequenceClassification from maxspad +author: John Snow Labs +name: nlp_qual_q2i +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_qual_q2i` is an English model originally trained by maxspad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_qual_q2i_en_5.1.4_3.4_1698814072931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_qual_q2i_en_5.1.4_3.4_1698814072931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q2i","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q2i","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_qual_q2i| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/maxspad/nlp-qual-q2i \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q3i_en.md b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q3i_en.md new file mode 100644 index 000000000000..1da6d5db3d09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_q3i_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_qual_q3i BertForSequenceClassification from maxspad +author: John Snow Labs +name: nlp_qual_q3i +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_qual_q3i` is an English model originally trained by maxspad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_qual_q3i_en_5.1.4_3.4_1698807684915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_qual_q3i_en_5.1.4_3.4_1698807684915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q3i","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_q3i","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_qual_q3i| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/maxspad/nlp-qual-q3i \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_qual_en.md b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_qual_en.md new file mode 100644 index 000000000000..34eb5140a891 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-nlp_qual_qual_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_qual_qual BertForSequenceClassification from maxspad +author: John Snow Labs +name: nlp_qual_qual +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_qual_qual` is an English model originally trained by maxspad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_qual_qual_en_5.1.4_3.4_1698813024287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_qual_qual_en_5.1.4_3.4_1698813024287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_qual","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("nlp_qual_qual","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_qual_qual| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.5 MB| + +## References + +https://huggingface.co/maxspad/nlp-qual-qual \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-odio_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-odio_bert_en.md new file mode 100644 index 000000000000..bf98cc2b132c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-odio_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English odio_bert BertForSequenceClassification from Mesay +author: John Snow Labs +name: odio_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`odio_bert` is an English model originally trained by Mesay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/odio_bert_en_5.1.4_3.4_1698872378435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/odio_bert_en_5.1.4_3.4_1698872378435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("odio_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("odio_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|odio_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/Mesay/Odio-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-one_for_all_toxicity_v3_xx.md b/docs/_posts/ahmedlone127/2023-11-01-one_for_all_toxicity_v3_xx.md new file mode 100644 index 000000000000..1927c25cd4b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-one_for_all_toxicity_v3_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual one_for_all_toxicity_v3 BertForSequenceClassification from FredZhang7 +author: John Snow Labs +name: one_for_all_toxicity_v3 +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`one_for_all_toxicity_v3` is a Multilingual model originally trained by FredZhang7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/one_for_all_toxicity_v3_xx_5.1.4_3.4_1698811158516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/one_for_all_toxicity_v3_xx_5.1.4_3.4_1698811158516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("one_for_all_toxicity_v3","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("one_for_all_toxicity_v3","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|one_for_all_toxicity_v3| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.2 MB| + +## References + +https://huggingface.co/FredZhang7/one-for-all-toxicity-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-paragraph_s36000_en.md b/docs/_posts/ahmedlone127/2023-11-01-paragraph_s36000_en.md new file mode 100644 index 000000000000..3687e06ab3fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-paragraph_s36000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English paragraph_s36000 BertForSequenceClassification from jjonhwa +author: John Snow Labs +name: paragraph_s36000 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`paragraph_s36000` is an English model originally trained by jjonhwa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paragraph_s36000_en_5.1.4_3.4_1698809754085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paragraph_s36000_en_5.1.4_3.4_1698809754085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("paragraph_s36000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("paragraph_s36000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paragraph_s36000| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|414.6 MB| + +## References + +https://huggingface.co/jjonhwa/paragraph_s36000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pari_conditions_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-01-pari_conditions_fine_tuned_en.md new file mode 100644 index 000000000000..384e07406684 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pari_conditions_fine_tuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pari_conditions_fine_tuned BertForSequenceClassification from eboubetana +author: John Snow Labs +name: pari_conditions_fine_tuned +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pari_conditions_fine_tuned` is an English model originally trained by eboubetana. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pari_conditions_fine_tuned_en_5.1.4_3.4_1698824696842.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pari_conditions_fine_tuned_en_5.1.4_3.4_1698824696842.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pari_conditions_fine_tuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pari_conditions_fine_tuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pari_conditions_fine_tuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/eboubetana/pari-conditions-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pari_features_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-01-pari_features_fine_tuned_en.md new file mode 100644 index 000000000000..4389033b03b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pari_features_fine_tuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pari_features_fine_tuned BertForSequenceClassification from eboubetana +author: John Snow Labs +name: pari_features_fine_tuned +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`pari_features_fine_tuned` is an English model originally trained by eboubetana. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pari_features_fine_tuned_en_5.1.4_3.4_1698825996321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pari_features_fine_tuned_en_5.1.4_3.4_1698825996321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pari_features_fine_tuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pari_features_fine_tuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pari_features_fine_tuned| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/eboubetana/pari-features-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-parlabert_en.md b/docs/_posts/ahmedlone127/2023-11-01-parlabert_en.md new file mode 100644 index 000000000000..ba240084920d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-parlabert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English parlabert BertForSequenceClassification from jesperjmb +author: John Snow Labs +name: parlabert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parlabert` is an English model originally trained by jesperjmb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parlabert_en_5.1.4_3.4_1698860899906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parlabert_en_5.1.4_3.4_1698860899906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("parlabert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("parlabert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parlabert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.4 MB| + +## References + +https://huggingface.co/jesperjmb/parlaBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-parrot_fluency_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-parrot_fluency_model_en.md new file mode 100644 index 000000000000..8ee86f4905f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-parrot_fluency_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English parrot_fluency_model BertForSequenceClassification from prithivida +author: John Snow Labs +name: parrot_fluency_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parrot_fluency_model` is an English model originally trained by prithivida. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parrot_fluency_model_en_5.1.4_3.4_1698809541472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parrot_fluency_model_en_5.1.4_3.4_1698809541472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("parrot_fluency_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("parrot_fluency_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parrot_fluency_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prithivida/parrot_fluency_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_denmark_da.md b/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_denmark_da.md new file mode 100644 index 000000000000..c97cde2c3af3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_denmark_da.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Danish partypress_monolingual_denmark BertForSequenceClassification from partypress +author: John Snow Labs +name: partypress_monolingual_denmark +date: 2023-11-01 +tags: [bert, da, open_source, sequence_classification, onnx] +task: Text Classification +language: da +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `partypress_monolingual_denmark` is a Danish model originally trained by partypress. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/partypress_monolingual_denmark_da_5.1.4_3.4_1698811007455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/partypress_monolingual_denmark_da_5.1.4_3.4_1698811007455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("partypress_monolingual_denmark","da")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("partypress_monolingual_denmark","da") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|partypress_monolingual_denmark| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|da| +|Size:|414.5 MB| + +## References + +https://huggingface.co/partypress/partypress-monolingual-denmark \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_germany_de.md b/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_germany_de.md new file mode 100644 index 000000000000..40c2348b85be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-partypress_monolingual_germany_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German partypress_monolingual_germany BertForSequenceClassification from partypress +author: John Snow Labs +name: partypress_monolingual_germany +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `partypress_monolingual_germany` is a German model originally trained by partypress. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/partypress_monolingual_germany_de_5.1.4_3.4_1698839698710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/partypress_monolingual_germany_de_5.1.4_3.4_1698839698710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("partypress_monolingual_germany","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("partypress_monolingual_germany","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|partypress_monolingual_germany| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.2 MB| + +## References + +https://huggingface.co/partypress/partypress-monolingual-germany \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-persian_text_sentiment_bert_v1_fa.md b/docs/_posts/ahmedlone127/2023-11-01-persian_text_sentiment_bert_v1_fa.md new file mode 100644 index 000000000000..e2be1f9ea0ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-persian_text_sentiment_bert_v1_fa.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Persian persian_text_sentiment_bert_v1 BertForSequenceClassification from SeyedAli +author: John Snow Labs +name: persian_text_sentiment_bert_v1 +date: 2023-11-01 +tags: [bert, fa, open_source, sequence_classification, onnx] +task: Text Classification +language: fa +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `persian_text_sentiment_bert_v1` is a Persian model originally trained by SeyedAli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/persian_text_sentiment_bert_v1_fa_5.1.4_3.4_1698862250105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/persian_text_sentiment_bert_v1_fa_5.1.4_3.4_1698862250105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("persian_text_sentiment_bert_v1","fa")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("persian_text_sentiment_bert_v1","fa") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|persian_text_sentiment_bert_v1| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|fa| +|Size:|608.6 MB| + +## References + +https://huggingface.co/SeyedAli/Persian-Text-Sentiment-Bert-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pico_evidence_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-pico_evidence_classification_model_en.md new file mode 100644 index 000000000000..7916cd363226 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pico_evidence_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pico_evidence_classification_model BertForSequenceClassification from owaiskha9654 +author: John Snow Labs +name: pico_evidence_classification_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pico_evidence_classification_model` is an English model originally trained by owaiskha9654. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pico_evidence_classification_model_en_5.1.4_3.4_1698843136513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pico_evidence_classification_model_en_5.1.4_3.4_1698843136513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pico_evidence_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pico_evidence_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pico_evidence_classification_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/owaiskha9654/PICO_Evidence_Classification_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pold_en.md b/docs/_posts/ahmedlone127/2023-11-01-pold_en.md new file mode 100644 index 000000000000..922293d92d84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pold_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pold BertForSequenceClassification from ijazulhaq +author: John Snow Labs +name: pold +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pold` is an English model originally trained by ijazulhaq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pold_en_5.1.4_3.4_1698823524801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pold_en_5.1.4_3.4_1698823524801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pold","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pold","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pold| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.7 MB| + +## References + +https://huggingface.co/ijazulhaq/pold \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-politicalbiasbert_en.md b/docs/_posts/ahmedlone127/2023-11-01-politicalbiasbert_en.md new file mode 100644 index 000000000000..2cf917c8573a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-politicalbiasbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English politicalbiasbert BertForSequenceClassification from bucketresearch +author: John Snow Labs +name: politicalbiasbert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `politicalbiasbert` is an English model originally trained by bucketresearch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/politicalbiasbert_en_5.1.4_3.4_1698800433251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/politicalbiasbert_en_5.1.4_3.4_1698800433251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("politicalbiasbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("politicalbiasbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|politicalbiasbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/bucketresearch/politicalBiasBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-product_name_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-product_name_classifier_en.md new file mode 100644 index 000000000000..590f4a69e839 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-product_name_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English product_name_classifier BertForSequenceClassification from cbrosch +author: John Snow Labs +name: product_name_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `product_name_classifier` is an English model originally trained by cbrosch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_name_classifier_en_5.1.4_3.4_1698816594651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_name_classifier_en_5.1.4_3.4_1698816594651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("product_name_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("product_name_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_name_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|667.3 MB| + +## References + +https://huggingface.co/cbrosch/product_name_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pubmed_biobert_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-pubmed_biobert_text_classification_en.md new file mode 100644 index 000000000000..0d1a2ec1d698 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pubmed_biobert_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pubmed_biobert_text_classification BertForSequenceClassification from saidhr20 +author: John Snow Labs +name: pubmed_biobert_text_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmed_biobert_text_classification` is an English model originally trained by saidhr20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmed_biobert_text_classification_en_5.1.4_3.4_1698868393647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmed_biobert_text_classification_en_5.1.4_3.4_1698868393647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pubmed_biobert_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pubmed_biobert_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmed_biobert_text_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/saidhr20/pubmed-biobert-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-pubmedbert_mnli_mednli_en.md b/docs/_posts/ahmedlone127/2023-11-01-pubmedbert_mnli_mednli_en.md new file mode 100644 index 000000000000..3783f377e481 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-pubmedbert_mnli_mednli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pubmedbert_mnli_mednli BertForSequenceClassification from pritamdeka +author: John Snow Labs +name: pubmedbert_mnli_mednli +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmedbert_mnli_mednli` is an English model originally trained by pritamdeka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmedbert_mnli_mednli_en_5.1.4_3.4_1698830429167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmedbert_mnli_mednli_en_5.1.4_3.4_1698830429167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("pubmedbert_mnli_mednli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("pubmedbert_mnli_mednli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmedbert_mnli_mednli| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|410.4 MB| + +## References + +https://huggingface.co/pritamdeka/PubMedBERT-MNLI-MedNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-question_vs_statement_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-question_vs_statement_classifier_en.md new file mode 100644 index 000000000000..bacffb25771f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-question_vs_statement_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English question_vs_statement_classifier BertForSequenceClassification from shahrukhx01 +author: John Snow Labs +name: question_vs_statement_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_vs_statement_classifier` is an English model originally trained by shahrukhx01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_vs_statement_classifier_en_5.1.4_3.4_1698808580126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_vs_statement_classifier_en_5.1.4_3.4_1698808580126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("question_vs_statement_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("question_vs_statement_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_vs_statement_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/shahrukhx01/question-vs-statement-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-racebert_ethnicity_en.md b/docs/_posts/ahmedlone127/2023-11-01-racebert_ethnicity_en.md new file mode 100644 index 000000000000..363c305bee65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-racebert_ethnicity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English racebert_ethnicity BertForSequenceClassification from pparasurama +author: John Snow Labs +name: racebert_ethnicity +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `racebert_ethnicity` is an English model originally trained by pparasurama. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/racebert_ethnicity_en_5.1.4_3.4_1698813415030.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/racebert_ethnicity_en_5.1.4_3.4_1698813415030.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("racebert_ethnicity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("racebert_ethnicity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|racebert_ethnicity| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pparasurama/raceBERT-ethnicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_nq_en.md b/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_nq_en.md new file mode 100644 index 000000000000..ae54ac04ad10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_nq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English re2g_reranker_nq BertForSequenceClassification from ibm +author: John Snow Labs +name: re2g_reranker_nq +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `re2g_reranker_nq` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/re2g_reranker_nq_en_5.1.4_3.4_1698810741451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/re2g_reranker_nq_en_5.1.4_3.4_1698810741451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("re2g_reranker_nq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("re2g_reranker_nq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|re2g_reranker_nq| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ibm/re2g-reranker-nq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_trex_en.md b/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_trex_en.md new file mode 100644 index 000000000000..3fe814e4ab3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-re2g_reranker_trex_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English re2g_reranker_trex BertForSequenceClassification from ibm +author: John Snow Labs +name: re2g_reranker_trex +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `re2g_reranker_trex` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/re2g_reranker_trex_en_5.1.4_3.4_1698837023460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/re2g_reranker_trex_en_5.1.4_3.4_1698837023460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("re2g_reranker_trex","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("re2g_reranker_trex","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|re2g_reranker_trex| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ibm/re2g-reranker-trex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class_en.md b/docs/_posts/ahmedlone127/2023-11-01-readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class_en.md new file mode 100644 index 000000000000..615e1d1eac2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class BertForSequenceClassification from lmvasque +author: John Snow Labs +name: readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class` is an English model originally trained by lmvasque.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class_en_5.1.4_3.4_1698837190035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class_en_5.1.4_3.4_1698837190035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|readability_spanish_benchmark_mbert_english_spanish_paragraphs_3class| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/lmvasque/readability-es-benchmark-mbert-en-es-paragraphs-3class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-roberta_base_cold_zh.md b/docs/_posts/ahmedlone127/2023-11-01-roberta_base_cold_zh.md new file mode 100644 index 000000000000..8ded257a5d4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-roberta_base_cold_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese roberta_base_cold BertForSequenceClassification from thu-coai +author: John Snow Labs +name: roberta_base_cold +date: 2023-11-01 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_cold` is a Chinese model originally trained by thu-coai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_cold_zh_5.1.4_3.4_1698805578650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_cold_zh_5.1.4_3.4_1698805578650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("roberta_base_cold","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("roberta_base_cold","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_cold| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.2 MB| + +## References + +https://huggingface.co/thu-coai/roberta-base-cold \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-roberta_based_uncased_finetuned_financial_headline_en.md b/docs/_posts/ahmedlone127/2023-11-01-roberta_based_uncased_finetuned_financial_headline_en.md new file mode 100644 index 000000000000..a70905ed21a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-roberta_based_uncased_finetuned_financial_headline_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_based_uncased_finetuned_financial_headline BertForSequenceClassification from odunola +author: John Snow Labs +name: roberta_based_uncased_finetuned_financial_headline +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_based_uncased_finetuned_financial_headline` is an English model originally trained by odunola.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_based_uncased_finetuned_financial_headline_en_5.1.4_3.4_1698813674183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_based_uncased_finetuned_financial_headline_en_5.1.4_3.4_1698813674183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("roberta_based_uncased_finetuned_financial_headline","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("roberta_based_uncased_finetuned_financial_headline","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_based_uncased_finetuned_financial_headline| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|108.0 MB| + +## References + +https://huggingface.co/odunola/roberta-based_uncased-finetuned-financial-headline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-roberta_tiny_imdb_en.md b/docs/_posts/ahmedlone127/2023-11-01-roberta_tiny_imdb_en.md new file mode 100644 index 000000000000..924c4296a60f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-roberta_tiny_imdb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_tiny_imdb BertForSequenceClassification from AntoineB +author: John Snow Labs +name: roberta_tiny_imdb +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_tiny_imdb` is an English model originally trained by AntoineB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_tiny_imdb_en_5.1.4_3.4_1698815916133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_tiny_imdb_en_5.1.4_3.4_1698815916133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("roberta_tiny_imdb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("roberta_tiny_imdb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_tiny_imdb| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|104.9 MB| + +## References + +https://huggingface.co/AntoineB/roberta-tiny-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rosatom_survey_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-rosatom_survey_sentiment_classifier_en.md new file mode 100644 index 000000000000..358e9609bb89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rosatom_survey_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rosatom_survey_sentiment_classifier BertForSequenceClassification from traptrip +author: John Snow Labs +name: rosatom_survey_sentiment_classifier +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rosatom_survey_sentiment_classifier` is an English model originally trained by traptrip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rosatom_survey_sentiment_classifier_en_5.1.4_3.4_1698861858024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rosatom_survey_sentiment_classifier_en_5.1.4_3.4_1698861858024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rosatom_survey_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rosatom_survey_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rosatom_survey_sentiment_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/traptrip/rosatom_survey_sentiment_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_cedr_russian_emotion_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_cedr_russian_emotion_ru.md new file mode 100644 index 000000000000..ea82c2b6b202 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_cedr_russian_emotion_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_cased_cedr_russian_emotion BertForSequenceClassification from seara +author: John Snow Labs +name: rubert_base_cased_cedr_russian_emotion +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_cased_cedr_russian_emotion` is a Russian model originally trained by seara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_cedr_russian_emotion_ru_5.1.4_3.4_1698811666816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_cedr_russian_emotion_ru_5.1.4_3.4_1698811666816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_cedr_russian_emotion","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_cedr_russian_emotion","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_cedr_russian_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/seara/rubert-base-cased-cedr-russian-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_go_emotions_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_go_emotions_ru.md new file mode 100644 index 000000000000..031eba467dff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_go_emotions_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_cased_russian_go_emotions BertForSequenceClassification from seara +author: John Snow Labs +name: rubert_base_cased_russian_go_emotions +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_cased_russian_go_emotions` is a Russian model originally trained by seara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_russian_go_emotions_ru_5.1.4_3.4_1698801280113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_russian_go_emotions_ru_5.1.4_3.4_1698801280113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_russian_go_emotions","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_russian_go_emotions","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_russian_go_emotions| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.6 MB| + +## References + +https://huggingface.co/seara/rubert-base-cased-ru-go-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_sentiment_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_sentiment_ru.md new file mode 100644 index 000000000000..27eb8deed565 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_russian_sentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_cased_russian_sentiment BertForSequenceClassification from seara +author: John Snow Labs +name: rubert_base_cased_russian_sentiment +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_cased_russian_sentiment` is a Russian model originally trained by seara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_russian_sentiment_ru_5.1.4_3.4_1698806745491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_russian_sentiment_ru_5.1.4_3.4_1698806745491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_russian_sentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_russian_sentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_russian_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/seara/rubert-base-cased-russian-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_sentiment_study_feedbacks_solyanka_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_sentiment_study_feedbacks_solyanka_ru.md new file mode 100644 index 000000000000..07bcaa3337c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_cased_sentiment_study_feedbacks_solyanka_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_cased_sentiment_study_feedbacks_solyanka BertForSequenceClassification from seninoseno +author: John Snow Labs +name: rubert_base_cased_sentiment_study_feedbacks_solyanka +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_cased_sentiment_study_feedbacks_solyanka` is a Russian model originally trained by seninoseno. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentiment_study_feedbacks_solyanka_ru_5.1.4_3.4_1698861737806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_sentiment_study_feedbacks_solyanka_ru_5.1.4_3.4_1698861737806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentiment_study_feedbacks_solyanka","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_cased_sentiment_study_feedbacks_solyanka","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_sentiment_study_feedbacks_solyanka| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/seninoseno/rubert-base-cased-sentiment-study-feedbacks-solyanka \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_intent_detection_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_intent_detection_ru.md new file mode 100644 index 000000000000..f4fdde78547e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_intent_detection_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_intent_detection BertForSequenceClassification from Den4ikAI +author: John Snow Labs +name: rubert_base_intent_detection +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_intent_detection` is a Russian model originally trained by Den4ikAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_intent_detection_ru_5.1.4_3.4_1698815603430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_intent_detection_ru_5.1.4_3.4_1698815603430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_intent_detection","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_intent_detection","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_intent_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|669.4 MB| + +## References + +https://huggingface.co/Den4ikAI/ruBert_base_intent_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_qa_ranker_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_qa_ranker_ru.md new file mode 100644 index 000000000000..d6a5e4323ac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_qa_ranker_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_qa_ranker BertForSequenceClassification from Den4ikAI +author: John Snow Labs +name: rubert_base_qa_ranker +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_qa_ranker` is a Russian model originally trained by Den4ikAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_qa_ranker_ru_5.1.4_3.4_1698812306841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_qa_ranker_ru_5.1.4_3.4_1698812306841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_qa_ranker","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_qa_ranker","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_qa_ranker| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.6 MB| + +## References + +https://huggingface.co/Den4ikAI/ruBert-base-qa-ranker \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_base_russian_emotion_detection_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_russian_emotion_detection_ru.md new file mode 100644 index 000000000000..143046530523 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_base_russian_emotion_detection_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_base_russian_emotion_detection BertForSequenceClassification from MaxKazak +author: John Snow Labs +name: rubert_base_russian_emotion_detection +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_russian_emotion_detection` is a Russian model originally trained by MaxKazak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_russian_emotion_detection_ru_5.1.4_3.4_1698844929269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_russian_emotion_detection_ru_5.1.4_3.4_1698844929269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_russian_emotion_detection","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_base_russian_emotion_detection","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_russian_emotion_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|669.3 MB| + +## References + +https://huggingface.co/MaxKazak/ruBert-base-russian-emotion-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_conversational_russian_sentiment_rusentiment_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_conversational_russian_sentiment_rusentiment_ru.md new file mode 100644 index 000000000000..db613b0baa00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_conversational_russian_sentiment_rusentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_conversational_russian_sentiment_rusentiment BertForSequenceClassification from sismetanin +author: John Snow Labs +name: rubert_conversational_russian_sentiment_rusentiment +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_conversational_russian_sentiment_rusentiment` is a Russian model originally trained by sismetanin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_conversational_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698864805427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_conversational_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698864805427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_conversational_russian_sentiment_rusentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_conversational_russian_sentiment_rusentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_conversational_russian_sentiment_rusentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/sismetanin/rubert_conversational-ru-sentiment-rusentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_russian_sentiment_rusentiment_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_russian_sentiment_rusentiment_ru.md new file mode 100644 index 000000000000..0ad5a0f740f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_russian_sentiment_rusentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_russian_sentiment_rusentiment BertForSequenceClassification from sismetanin +author: John Snow Labs +name: rubert_russian_sentiment_rusentiment +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_russian_sentiment_rusentiment` is a Russian model originally trained by sismetanin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698810246205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698810246205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_russian_sentiment_rusentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_russian_sentiment_rusentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_russian_sentiment_rusentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/sismetanin/rubert-ru-sentiment-rusentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_cognitive_bias_en.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_cognitive_bias_en.md new file mode 100644 index 000000000000..e537ab80e491 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_cognitive_bias_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_tiny2_cognitive_bias BertForSequenceClassification from amedvedev +author: John Snow Labs +name: rubert_tiny2_cognitive_bias +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny2_cognitive_bias` is an English model originally trained by amedvedev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_cognitive_bias_en_5.1.4_3.4_1698814713224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_cognitive_bias_en_5.1.4_3.4_1698814713224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_cognitive_bias","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_cognitive_bias","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_cognitive_bias| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/amedvedev/rubert-tiny2-cognitive-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_go_emotions_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_go_emotions_ru.md new file mode 100644 index 000000000000..46533bc84000 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_go_emotions_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_tiny2_russian_go_emotions BertForSequenceClassification from seara +author: John Snow Labs +name: rubert_tiny2_russian_go_emotions +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny2_russian_go_emotions` is a Russian model originally trained by seara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russian_go_emotions_ru_5.1.4_3.4_1698807677059.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russian_go_emotions_ru_5.1.4_3.4_1698807677059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russian_go_emotions","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russian_go_emotions","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_russian_go_emotions| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| + +## References + +https://huggingface.co/seara/rubert-tiny2-ru-go-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_sentiment_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_sentiment_ru.md new file mode 100644 index 000000000000..c91a66924e04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny2_russian_sentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_tiny2_russian_sentiment BertForSequenceClassification from seara +author: John Snow Labs +name: rubert_tiny2_russian_sentiment +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny2_russian_sentiment` is a Russian model originally trained by seara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russian_sentiment_ru_5.1.4_3.4_1698814356913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_russian_sentiment_ru_5.1.4_3.4_1698814356913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russian_sentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny2_russian_sentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_russian_sentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| + +## References + +https://huggingface.co/seara/rubert-tiny2-russian-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_comp_question_classification_en.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_comp_question_classification_en.md new file mode 100644 index 000000000000..ea8f9a551428 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_comp_question_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_tiny_comp_question_classification BertForSequenceClassification from lilaspourpre +author: John Snow Labs +name: rubert_tiny_comp_question_classification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny_comp_question_classification` is an English model originally trained by lilaspourpre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_comp_question_classification_en_5.1.4_3.4_1698815919038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_comp_question_classification_en_5.1.4_3.4_1698815919038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_comp_question_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_comp_question_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_comp_question_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|109.5 MB| + +## References + +https://huggingface.co/lilaspourpre/rubert-tiny-comp_question_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_questions_classifier_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_questions_classifier_ru.md new file mode 100644 index 000000000000..40fbcba3a08c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_questions_classifier_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_tiny_questions_classifier BertForSequenceClassification from Den4ikAI +author: John Snow Labs +name: rubert_tiny_questions_classifier +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny_questions_classifier` is a Russian model originally trained by Den4ikAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_questions_classifier_ru_5.1.4_3.4_1698827379278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_questions_classifier_ru_5.1.4_3.4_1698827379278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_questions_classifier","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_questions_classifier","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_questions_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| + +## References + +https://huggingface.co/Den4ikAI/ruBert-tiny-questions-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_replicas_classifier_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_replicas_classifier_ru.md new file mode 100644 index 000000000000..72acabf4d379 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_replicas_classifier_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_tiny_replicas_classifier BertForSequenceClassification from Den4ikAI +author: John Snow Labs +name: rubert_tiny_replicas_classifier +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny_replicas_classifier` is a Russian model originally trained by Den4ikAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_replicas_classifier_ru_5.1.4_3.4_1698811846715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_replicas_classifier_ru_5.1.4_3.4_1698811846715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_replicas_classifier","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_replicas_classifier","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_replicas_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|109.5 MB| + +## References + +https://huggingface.co/Den4ikAI/ruBert-tiny-replicas-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_stance_calssification_en.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_stance_calssification_en.md new file mode 100644 index 000000000000..5285960aebe9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_tiny_stance_calssification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rubert_tiny_stance_calssification BertForSequenceClassification from lilaspourpre +author: John Snow Labs +name: rubert_tiny_stance_calssification +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny_stance_calssification` is an English model originally trained by lilaspourpre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_stance_calssification_en_5.1.4_3.4_1698816201327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_stance_calssification_en_5.1.4_3.4_1698816201327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_stance_calssification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_tiny_stance_calssification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_stance_calssification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|44.2 MB| + +## References + +https://huggingface.co/lilaspourpre/rubert-tiny-stance-calssification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-rubert_toxic_pikabu_2ch_ru.md b/docs/_posts/ahmedlone127/2023-11-01-rubert_toxic_pikabu_2ch_ru.md new file mode 100644 index 000000000000..2a654c7af02f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-rubert_toxic_pikabu_2ch_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian rubert_toxic_pikabu_2ch BertForSequenceClassification from sismetanin +author: John Snow Labs +name: rubert_toxic_pikabu_2ch +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_toxic_pikabu_2ch` is a Russian model originally trained by sismetanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_toxic_pikabu_2ch_ru_5.1.4_3.4_1698803414664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_toxic_pikabu_2ch_ru_5.1.4_3.4_1698803414664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("rubert_toxic_pikabu_2ch","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("rubert_toxic_pikabu_2ch","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_toxic_pikabu_2ch| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|666.5 MB| + +## References + +https://huggingface.co/sismetanin/rubert-toxic-pikabu-2ch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-russian_inappropriate_messages_ru.md b/docs/_posts/ahmedlone127/2023-11-01-russian_inappropriate_messages_ru.md new file mode 100644 index 000000000000..89c9c00a432d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-russian_inappropriate_messages_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian russian_inappropriate_messages BertForSequenceClassification from apanc +author: John Snow Labs +name: russian_inappropriate_messages +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`russian_inappropriate_messages` is a Russian model originally trained by apanc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/russian_inappropriate_messages_ru_5.1.4_3.4_1698808490347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/russian_inappropriate_messages_ru_5.1.4_3.4_1698808490347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("russian_inappropriate_messages","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("russian_inappropriate_messages","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|russian_inappropriate_messages| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/apanc/russian-inappropriate-messages \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-russian_sensitive_topics_apanc_ru.md b/docs/_posts/ahmedlone127/2023-11-01-russian_sensitive_topics_apanc_ru.md new file mode 100644 index 000000000000..67c10318fc6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-russian_sensitive_topics_apanc_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian russian_sensitive_topics_apanc BertForSequenceClassification from apanc +author: John Snow Labs +name: russian_sensitive_topics_apanc +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`russian_sensitive_topics_apanc` is a Russian model originally trained by apanc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/russian_sensitive_topics_apanc_ru_5.1.4_3.4_1698799966870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/russian_sensitive_topics_apanc_ru_5.1.4_3.4_1698799966870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("russian_sensitive_topics_apanc","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("russian_sensitive_topics_apanc","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|russian_sensitive_topics_apanc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|665.6 MB| + +## References + +https://huggingface.co/apanc/russian-sensitive-topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-russian_toxicity_classifier_ru.md b/docs/_posts/ahmedlone127/2023-11-01-russian_toxicity_classifier_ru.md new file mode 100644 index 000000000000..e35d73dabdc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-russian_toxicity_classifier_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian russian_toxicity_classifier BertForSequenceClassification from s-nlp +author: John Snow Labs +name: russian_toxicity_classifier +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`russian_toxicity_classifier` is a Russian model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/russian_toxicity_classifier_ru_5.1.4_3.4_1698803773614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/russian_toxicity_classifier_ru_5.1.4_3.4_1698803773614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("russian_toxicity_classifier","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("russian_toxicity_classifier","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|russian_toxicity_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|664.4 MB| + +## References + +https://huggingface.co/s-nlp/russian_toxicity_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sa_bertchatgptapp_en.md b/docs/_posts/ahmedlone127/2023-11-01-sa_bertchatgptapp_en.md new file mode 100644 index 000000000000..5cef47ca76e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sa_bertchatgptapp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sa_bertchatgptapp BertForSequenceClassification from ninahf1503 +author: John Snow Labs +name: sa_bertchatgptapp +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sa_bertchatgptapp` is an English model originally trained by ninahf1503. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sa_bertchatgptapp_en_5.1.4_3.4_1698808546386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sa_bertchatgptapp_en_5.1.4_3.4_1698808546386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sa_bertchatgptapp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sa_bertchatgptapp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sa_bertchatgptapp| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ninahf1503/SA-BERTchatgptapp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sands_en.md b/docs/_posts/ahmedlone127/2023-11-01-sands_en.md new file mode 100644 index 000000000000..aad677e479f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sands_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sands BertForSequenceClassification from NCHS +author: John Snow Labs +name: sands +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sands` is an English model originally trained by NCHS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sands_en_5.1.4_3.4_1698809145442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sands_en_5.1.4_3.4_1698809145442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sands","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sands","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sands| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/NCHS/SANDS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sbert_russian_sentiment_rusentiment_ru.md b/docs/_posts/ahmedlone127/2023-11-01-sbert_russian_sentiment_rusentiment_ru.md new file mode 100644 index 000000000000..114a6cedec5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sbert_russian_sentiment_rusentiment_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian sbert_russian_sentiment_rusentiment BertForSequenceClassification from sismetanin +author: John Snow Labs +name: sbert_russian_sentiment_rusentiment +date: 2023-11-01 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbert_russian_sentiment_rusentiment` is a Russian model originally trained by sismetanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbert_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698811540099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbert_russian_sentiment_rusentiment_ru_5.1.4_3.4_1698811540099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sbert_russian_sentiment_rusentiment","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sbert_russian_sentiment_rusentiment","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbert_russian_sentiment_rusentiment| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|1.6 GB| + +## References + +https://huggingface.co/sismetanin/sbert-ru-sentiment-rusentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-scicite_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-scicite_classification_model_en.md new file mode 100644 index 000000000000..0de1517f929e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-scicite_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English scicite_classification_model BertForSequenceClassification from AlvianKhairi +author: John Snow Labs +name: scicite_classification_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scicite_classification_model` is an English model originally trained by AlvianKhairi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scicite_classification_model_en_5.1.4_3.4_1698840707113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scicite_classification_model_en_5.1.4_3.4_1698840707113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("scicite_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("scicite_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scicite_classification_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/AlvianKhairi/Scicite_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-scotus_v10_en.md b/docs/_posts/ahmedlone127/2023-11-01-scotus_v10_en.md new file mode 100644 index 000000000000..86f8dc975062 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-scotus_v10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English scotus_v10 BertForSequenceClassification from raminass +author: John Snow Labs +name: scotus_v10 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scotus_v10` is an English model originally trained by raminass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scotus_v10_en_5.1.4_3.4_1698844998471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scotus_v10_en_5.1.4_3.4_1698844998471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("scotus_v10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("scotus_v10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scotus_v10| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|131.6 MB| + +## References + +https://huggingface.co/raminass/scotus-v10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sdg_classification_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-sdg_classification_bert_en.md new file mode 100644 index 000000000000..5a94631504d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sdg_classification_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sdg_classification_bert BertForSequenceClassification from sadickam +author: John Snow Labs +name: sdg_classification_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sdg_classification_bert` is an English model originally trained by sadickam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sdg_classification_bert_en_5.1.4_3.4_1698812128717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sdg_classification_bert_en_5.1.4_3.4_1698812128717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sdg_classification_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sdg_classification_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sdg_classification_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/sadickam/sdg-classification-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sdg_classifier_osdg_en.md b/docs/_posts/ahmedlone127/2023-11-01-sdg_classifier_osdg_en.md new file mode 100644 index 000000000000..889312e4efa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sdg_classifier_osdg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sdg_classifier_osdg BertForSequenceClassification from jonas +author: John Snow Labs +name: sdg_classifier_osdg +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sdg_classifier_osdg` is an English model originally trained by jonas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sdg_classifier_osdg_en_5.1.4_3.4_1698807214966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sdg_classifier_osdg_en_5.1.4_3.4_1698807214966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sdg_classifier_osdg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sdg_classifier_osdg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sdg_classifier_osdg| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/jonas/sdg_classifier_osdg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sell_buy_intent_classifier_bert_mini_en.md b/docs/_posts/ahmedlone127/2023-11-01-sell_buy_intent_classifier_bert_mini_en.md new file mode 100644 index 000000000000..c8f3ab9c3dcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sell_buy_intent_classifier_bert_mini_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sell_buy_intent_classifier_bert_mini BertForSequenceClassification from obsei-ai +author: John Snow Labs +name: sell_buy_intent_classifier_bert_mini +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sell_buy_intent_classifier_bert_mini` is an English model originally trained by obsei-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sell_buy_intent_classifier_bert_mini_en_5.1.4_3.4_1698830302437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sell_buy_intent_classifier_bert_mini_en_5.1.4_3.4_1698830302437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sell_buy_intent_classifier_bert_mini","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sell_buy_intent_classifier_bert_mini","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sell_buy_intent_classifier_bert_mini| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|42.1 MB| + +## References + +https://huggingface.co/obsei-ai/sell-buy-intent-classifier-bert-mini \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentence_aldi_en.md b/docs/_posts/ahmedlone127/2023-11-01-sentence_aldi_en.md new file mode 100644 index 000000000000..29d139bf8170 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentence_aldi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_aldi BertForSequenceClassification from AMR-KELEG +author: John Snow Labs +name: sentence_aldi +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_aldi` is an English model originally trained by AMR-KELEG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_aldi_en_5.1.4_3.4_1698818155232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_aldi_en_5.1.4_3.4_1698818155232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentence_aldi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentence_aldi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_aldi| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|610.9 MB| + +## References + +https://huggingface.co/AMR-KELEG/Sentence-ALDi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentence_level_certainty_en.md b/docs/_posts/ahmedlone127/2023-11-01-sentence_level_certainty_en.md new file mode 100644 index 000000000000..895dd155e9fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentence_level_certainty_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_level_certainty BertForSequenceClassification from pedropei +author: John Snow Labs +name: sentence_level_certainty +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_level_certainty` is an English model originally trained by pedropei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_level_certainty_en_5.1.4_3.4_1698813298159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_level_certainty_en_5.1.4_3.4_1698813298159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentence_level_certainty","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentence_level_certainty","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_level_certainty| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.9 MB| + +## References + +https://huggingface.co/pedropei/sentence-level-certainty \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentiment_analysis_pretrained_en.md b/docs/_posts/ahmedlone127/2023-11-01-sentiment_analysis_pretrained_en.md new file mode 100644 index 000000000000..be61ec4cd157 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentiment_analysis_pretrained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_pretrained BertForSequenceClassification from prasadsawant7 +author: John Snow Labs +name: sentiment_analysis_pretrained +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_pretrained` is an English model originally trained by prasadsawant7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_pretrained_en_5.1.4_3.4_1698811867688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_pretrained_en_5.1.4_3.4_1698811867688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_pretrained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_analysis_pretrained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_pretrained| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/prasadsawant7/sentiment-analysis-pretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentiment_model_sample_27go_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-01-sentiment_model_sample_27go_emotion_en.md new file mode 100644 index 000000000000..5ae7bd6e8ab8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentiment_model_sample_27go_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_sample_27go_emotion BertForSequenceClassification from jkhan447 +author: John Snow Labs +name: sentiment_model_sample_27go_emotion +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_sample_27go_emotion` is an English model originally trained by jkhan447. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_sample_27go_emotion_en_5.1.4_3.4_1698815878849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_sample_27go_emotion_en_5.1.4_3.4_1698815878849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_model_sample_27go_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_model_sample_27go_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_sample_27go_emotion| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/jkhan447/sentiment-model-sample-27go-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentiment_ohb3_hubert_hungarian_hu.md b/docs/_posts/ahmedlone127/2023-11-01-sentiment_ohb3_hubert_hungarian_hu.md new file mode 100644 index 000000000000..ea8e5c9be738 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentiment_ohb3_hubert_hungarian_hu.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hungarian sentiment_ohb3_hubert_hungarian BertForSequenceClassification from NYTK +author: John Snow Labs +name: sentiment_ohb3_hubert_hungarian +date: 2023-11-01 +tags: [bert, hu, open_source, sequence_classification, onnx] +task: Text Classification +language: hu +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentiment_ohb3_hubert_hungarian` is a Hungarian model originally trained by NYTK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_ohb3_hubert_hungarian_hu_5.1.4_3.4_1698861079465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_ohb3_hubert_hungarian_hu_5.1.4_3.4_1698861079465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_ohb3_hubert_hungarian","hu")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_ohb3_hubert_hungarian","hu") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_ohb3_hubert_hungarian| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hu| +|Size:|414.7 MB| + +## References + +https://huggingface.co/NYTK/sentiment-ohb3-hubert-hungarian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sentiment_xdistil_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-sentiment_xdistil_uncased_en.md new file mode 100644 index 000000000000..9267cae5e118 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sentiment_xdistil_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_xdistil_uncased BertForSequenceClassification from hakonmh +author: John Snow Labs +name: sentiment_xdistil_uncased +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sentiment_xdistil_uncased` is an English model originally trained by hakonmh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_xdistil_uncased_en_5.1.4_3.4_1698811595924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_xdistil_uncased_en_5.1.4_3.4_1698811595924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_xdistil_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sentiment_xdistil_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_xdistil_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.1 MB| + +## References + +https://huggingface.co/hakonmh/sentiment-xdistil-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-spam_ham_classifier_just_text_en.md b/docs/_posts/ahmedlone127/2023-11-01-spam_ham_classifier_just_text_en.md new file mode 100644 index 000000000000..b7df368d31c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-spam_ham_classifier_just_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spam_ham_classifier_just_text BertForSequenceClassification from martin-bendik +author: John Snow Labs +name: spam_ham_classifier_just_text +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`spam_ham_classifier_just_text` is an English model originally trained by martin-bendik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spam_ham_classifier_just_text_en_5.1.4_3.4_1698862638749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spam_ham_classifier_just_text_en_5.1.4_3.4_1698862638749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("spam_ham_classifier_just_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("spam_ham_classifier_just_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spam_ham_classifier_just_text| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/martin-bendik/spam_ham_classifier_just_text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-spam_usernames_classifier_xx.md b/docs/_posts/ahmedlone127/2023-11-01-spam_usernames_classifier_xx.md new file mode 100644 index 000000000000..27b4d7086396 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-spam_usernames_classifier_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual spam_usernames_classifier BertForSequenceClassification from lokas +author: John Snow Labs +name: spam_usernames_classifier +date: 2023-11-01 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`spam_usernames_classifier` is a Multilingual model originally trained by lokas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spam_usernames_classifier_xx_5.1.4_3.4_1698815808620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spam_usernames_classifier_xx_5.1.4_3.4_1698815808620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("spam_usernames_classifier","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("spam_usernames_classifier","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spam_usernames_classifier| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|667.3 MB| + +## References + +https://huggingface.co/lokas/spam-usernames-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-spanish_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-01-spanish_sentiment_model_en.md new file mode 100644 index 000000000000..ea80af189385 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-spanish_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spanish_sentiment_model BertForSequenceClassification from karina-aquino +author: John Snow Labs +name: spanish_sentiment_model +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`spanish_sentiment_model` is an English model originally trained by karina-aquino. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanish_sentiment_model_en_5.1.4_3.4_1698806293284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanish_sentiment_model_en_5.1.4_3.4_1698806293284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("spanish_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("spanish_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanish_sentiment_model| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|627.7 MB| + +## References + +https://huggingface.co/karina-aquino/spanish-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-sscibert_politics_en.md b/docs/_posts/ahmedlone127/2023-11-01-sscibert_politics_en.md new file mode 100644 index 000000000000..ed5e739e2948 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-sscibert_politics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sscibert_politics BertForSequenceClassification from kalawinka +author: John Snow Labs +name: sscibert_politics +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sscibert_politics` is an English model originally trained by kalawinka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sscibert_politics_en_5.1.4_3.4_1698815594551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sscibert_politics_en_5.1.4_3.4_1698815594551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("sscibert_politics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("sscibert_politics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sscibert_politics| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|412.1 MB| + +## References + +https://huggingface.co/kalawinka/SSciBERT_politics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-stance_detection_en.md b/docs/_posts/ahmedlone127/2023-11-01-stance_detection_en.md new file mode 100644 index 000000000000..2ed55532a955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-stance_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stance_detection BertForSequenceClassification from cheese7858 +author: John Snow Labs +name: stance_detection +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`stance_detection` is an English model originally trained by cheese7858. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stance_detection_en_5.1.4_3.4_1698815243866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stance_detection_en_5.1.4_3.4_1698815243866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("stance_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stance_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stance_detection| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/cheese7858/stance_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-stsb_tinybert_l_4_en.md b/docs/_posts/ahmedlone127/2023-11-01-stsb_tinybert_l_4_en.md new file mode 100644 index 000000000000..47c40cdca569 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-stsb_tinybert_l_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stsb_tinybert_l_4 BertForSequenceClassification from cross-encoder +author: John Snow Labs +name: stsb_tinybert_l_4 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`stsb_tinybert_l_4` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_en_5.1.4_3.4_1698807848082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_tinybert_l_4_en_5.1.4_3.4_1698807848082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("stsb_tinybert_l_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_tinybert_l_4| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|54.2 MB| + +## References + +https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-student_reasoning_en.md b/docs/_posts/ahmedlone127/2023-11-01-student_reasoning_en.md new file mode 100644 index 000000000000..35e4f5b3e90d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-student_reasoning_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English student_reasoning BertForSequenceClassification from ddemszky +author: John Snow Labs +name: student_reasoning +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`student_reasoning` is an English model originally trained by ddemszky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/student_reasoning_en_5.1.4_3.4_1698805690235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/student_reasoning_en_5.1.4_3.4_1698805690235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("student_reasoning","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("student_reasoning","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|student_reasoning| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/ddemszky/student-reasoning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-swahbert_sw.md b/docs/_posts/ahmedlone127/2023-11-01-swahbert_sw.md new file mode 100644 index 000000000000..2905b1ca3cd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-swahbert_sw.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Swahili (macrolanguage) swahbert BertForSequenceClassification from metabloit +author: John Snow Labs +name: swahbert +date: 2023-11-01 +tags: [bert, sw, open_source, sequence_classification, onnx] +task: Text Classification +language: sw +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swahbert` is a Swahili (macrolanguage) model originally trained by metabloit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swahbert_sw_5.1.4_3.4_1698871463548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swahbert_sw_5.1.4_3.4_1698871463548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("swahbert","sw")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("swahbert","sw") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swahbert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sw| +|Size:|466.4 MB| + +## References + +https://huggingface.co/metabloit/swahBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_fear_sv.md b/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_fear_sv.md new file mode 100644 index 000000000000..c2240434bb1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_fear_sv.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Swedish swedish_sentiment_fear BertForSequenceClassification from RecordedFuture +author: John Snow Labs +name: swedish_sentiment_fear +date: 2023-11-01 +tags: [bert, sv, open_source, sequence_classification, onnx] +task: Text Classification +language: sv +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`swedish_sentiment_fear` is a Swedish model originally trained by RecordedFuture. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swedish_sentiment_fear_sv_5.1.4_3.4_1698825037720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swedish_sentiment_fear_sv_5.1.4_3.4_1698825037720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("swedish_sentiment_fear","sv")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("swedish_sentiment_fear","sv") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swedish_sentiment_fear| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sv| +|Size:|467.4 MB| + +## References + +https://huggingface.co/RecordedFuture/Swedish-Sentiment-Fear \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_violence_sv.md b/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_violence_sv.md new file mode 100644 index 000000000000..2c65666d7d8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-swedish_sentiment_violence_sv.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Swedish swedish_sentiment_violence BertForSequenceClassification from RecordedFuture +author: John Snow Labs +name: swedish_sentiment_violence +date: 2023-11-01 +tags: [bert, sv, open_source, sequence_classification, onnx] +task: Text Classification +language: sv +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `swedish_sentiment_violence` is a Swedish model originally trained by RecordedFuture. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swedish_sentiment_violence_sv_5.1.4_3.4_1698809830262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swedish_sentiment_violence_sv_5.1.4_3.4_1698809830262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("swedish_sentiment_violence","sv")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("swedish_sentiment_violence","sv") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|swedish_sentiment_violence| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sv| +|Size:|467.4 MB| + +## References + +https://huggingface.co/RecordedFuture/Swedish-Sentiment-Violence \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-synteract_en.md b/docs/_posts/ahmedlone127/2023-11-01-synteract_en.md new file mode 100644 index 000000000000..79be023c2617 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-synteract_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English synteract BertForSequenceClassification from GleghornLab +author: John Snow Labs +name: synteract +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `synteract` is an English model originally trained by GleghornLab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/synteract_en_5.1.4_3.4_1698813025062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/synteract_en_5.1.4_3.4_1698813025062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("synteract","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("synteract","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|synteract| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/GleghornLab/SYNTERACT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tda_bert_english_cola_en.md b/docs/_posts/ahmedlone127/2023-11-01-tda_bert_english_cola_en.md new file mode 100644 index 000000000000..9b986ccaacd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tda_bert_english_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tda_bert_english_cola BertForSequenceClassification from iproskurina +author: John Snow Labs +name: tda_bert_english_cola +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tda_bert_english_cola` is an English model originally trained by iproskurina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tda_bert_english_cola_en_5.1.4_3.4_1698814246031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tda_bert_english_cola_en_5.1.4_3.4_1698814246031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tda_bert_english_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tda_bert_english_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tda_bert_english_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/iproskurina/tda-bert-en-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tda_itabert_ita_cola_it.md b/docs/_posts/ahmedlone127/2023-11-01-tda_itabert_ita_cola_it.md new file mode 100644 index 000000000000..c20d748d7795 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tda_itabert_ita_cola_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian tda_itabert_ita_cola BertForSequenceClassification from iproskurina +author: John Snow Labs +name: tda_itabert_ita_cola +date: 2023-11-01 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tda_itabert_ita_cola` is an Italian model originally trained by iproskurina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tda_itabert_ita_cola_it_5.1.4_3.4_1698827284432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tda_itabert_ita_cola_it_5.1.4_3.4_1698827284432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tda_itabert_ita_cola","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tda_itabert_ita_cola","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tda_itabert_ita_cola| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|411.8 MB| + +## References + +https://huggingface.co/iproskurina/tda-itabert-ita-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-test_trainer_louislian2341_en.md b/docs/_posts/ahmedlone127/2023-11-01-test_trainer_louislian2341_en.md new file mode 100644 index 000000000000..5e1d82470c6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-test_trainer_louislian2341_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English test_trainer_louislian2341 BertForSequenceClassification from louislian2341 +author: John Snow Labs +name: test_trainer_louislian2341 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_louislian2341` is an English model originally trained by louislian2341. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_louislian2341_en_5.1.4_3.4_1698815677470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_louislian2341_en_5.1.4_3.4_1698815677470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_louislian2341","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("test_trainer_louislian2341","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_louislian2341| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/louislian2341/test-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-text_classification_hyosyung_highschool_en.md b/docs/_posts/ahmedlone127/2023-11-01-text_classification_hyosyung_highschool_en.md new file mode 100644 index 000000000000..1e6866e4c3b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-text_classification_hyosyung_highschool_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_classification_hyosyung_highschool BertForSequenceClassification from logan221111 +author: John Snow Labs +name: text_classification_hyosyung_highschool +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_hyosyung_highschool` is an English model originally trained by logan221111. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_hyosyung_highschool_en_5.1.4_3.4_1698812056421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_hyosyung_highschool_en_5.1.4_3.4_1698812056421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_hyosyung_highschool","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("text_classification_hyosyung_highschool","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_hyosyung_highschool| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.5 MB| + +## References + +https://huggingface.co/logan221111/text_classification_Hyosyung_HighSchool \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-text_complexity_classification_de.md b/docs/_posts/ahmedlone127/2023-11-01-text_complexity_classification_de.md new file mode 100644 index 000000000000..6e9f27e62676 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-text_complexity_classification_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German text_complexity_classification BertForSequenceClassification from krupper +author: John Snow Labs +name: text_complexity_classification +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_complexity_classification` is a German model originally trained by krupper. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_complexity_classification_de_5.1.4_3.4_1698837848183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_complexity_classification_de_5.1.4_3.4_1698837848183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("text_complexity_classification","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("text_complexity_classification","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_complexity_classification| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|412.0 MB| + +## References + +https://huggingface.co/krupper/text-complexity-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-thestats2_en.md b/docs/_posts/ahmedlone127/2023-11-01-thestats2_en.md new file mode 100644 index 000000000000..9b2e2ac8decb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-thestats2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English thestats2 BertForSequenceClassification from circulartext +author: John Snow Labs +name: thestats2 +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thestats2` is an English model originally trained by circulartext. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thestats2_en_5.1.4_3.4_1698814054362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thestats2_en_5.1.4_3.4_1698814054362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("thestats2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("thestats2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thestats2| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/circulartext/thestats2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-thucnews_en.md b/docs/_posts/ahmedlone127/2023-11-01-thucnews_en.md new file mode 100644 index 000000000000..e6da8752539a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-thucnews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English thucnews BertForSequenceClassification from shed-e +author: John Snow Labs +name: thucnews +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thucnews` is an English model originally trained by shed-e. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thucnews_en_5.1.4_3.4_1698817046621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thucnews_en_5.1.4_3.4_1698817046621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("thucnews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("thucnews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thucnews| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|224.0 MB| + +## References + +https://huggingface.co/shed-e/thucnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tijdelijk_en.md b/docs/_posts/ahmedlone127/2023-11-01-tijdelijk_en.md new file mode 100644 index 000000000000..350b4537caed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tijdelijk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tijdelijk BertForSequenceClassification from jairwaal +author: John Snow Labs +name: tijdelijk +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tijdelijk` is an English model originally trained by jairwaal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tijdelijk_en_5.1.4_3.4_1698815550244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tijdelijk_en_5.1.4_3.4_1698815550244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tijdelijk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tijdelijk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tijdelijk| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.0 MB| + +## References + +https://huggingface.co/jairwaal/tijdelijk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tiny_bert_jdc_en.md b/docs/_posts/ahmedlone127/2023-11-01-tiny_bert_jdc_en.md new file mode 100644 index 000000000000..b154709745ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tiny_bert_jdc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_bert_jdc BertForSequenceClassification from tkuye +author: John Snow Labs +name: tiny_bert_jdc +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_bert_jdc` is an English model originally trained by tkuye. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_bert_jdc_en_5.1.4_3.4_1698822477272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_bert_jdc_en_5.1.4_3.4_1698822477272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_jdc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_bert_jdc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_bert_jdc| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/tkuye/tiny-bert-jdc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_hf_internal_testing_en.md b/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_hf_internal_testing_en.md new file mode 100644 index 000000000000..4d2fea707900 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_hf_internal_testing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_random_bertforsequenceclassification_hf_internal_testing BertForSequenceClassification from hf-internal-testing +author: John Snow Labs +name: tiny_random_bertforsequenceclassification_hf_internal_testing +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_bertforsequenceclassification_hf_internal_testing` is an English model originally trained by hf-internal-testing. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_bertforsequenceclassification_hf_internal_testing_en_5.1.4_3.4_1698801041961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_bertforsequenceclassification_hf_internal_testing_en_5.1.4_3.4_1698801041961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_bertforsequenceclassification_hf_internal_testing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_bertforsequenceclassification_hf_internal_testing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_bertforsequenceclassification_hf_internal_testing| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|353.9 KB| + +## References + +https://huggingface.co/hf-internal-testing/tiny-random-BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_ydshieh_en.md b/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_ydshieh_en.md new file mode 100644 index 000000000000..40a094ba0134 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tiny_random_bertforsequenceclassification_ydshieh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_random_bertforsequenceclassification_ydshieh BertForSequenceClassification from ydshieh +author: John Snow Labs +name: tiny_random_bertforsequenceclassification_ydshieh +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_bertforsequenceclassification_ydshieh` is an English model originally trained by ydshieh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_bertforsequenceclassification_ydshieh_en_5.1.4_3.4_1698861919854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_bertforsequenceclassification_ydshieh_en_5.1.4_3.4_1698861919854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_bertforsequenceclassification_ydshieh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tiny_random_bertforsequenceclassification_ydshieh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_bertforsequenceclassification_ydshieh| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|353.8 KB| + +## References + +https://huggingface.co/ydshieh/tiny-random-BertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tinybert_29_med_intents_en.md b/docs/_posts/ahmedlone127/2023-11-01-tinybert_29_med_intents_en.md new file mode 100644 index 000000000000..719d312901f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tinybert_29_med_intents_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tinybert_29_med_intents BertForSequenceClassification from m-aliabbas1 +author: John Snow Labs +name: tinybert_29_med_intents +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert_29_med_intents` is an English model originally trained by m-aliabbas1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tinybert_29_med_intents_en_5.1.4_3.4_1698863442183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tinybert_29_med_intents_en_5.1.4_3.4_1698863442183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tinybert_29_med_intents","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tinybert_29_med_intents","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tinybert_29_med_intents| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/m-aliabbas1/tinybert_29_med_intents \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-told_breton_binary_samoan_bertimbau_pt.md b/docs/_posts/ahmedlone127/2023-11-01-told_breton_binary_samoan_bertimbau_pt.md new file mode 100644 index 000000000000..004dfde64512 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-told_breton_binary_samoan_bertimbau_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese told_breton_binary_samoan_bertimbau BertForSequenceClassification from alexandreteles +author: John Snow Labs +name: told_breton_binary_samoan_bertimbau +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `told_breton_binary_samoan_bertimbau` is a Portuguese model originally trained by alexandreteles. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/told_breton_binary_samoan_bertimbau_pt_5.1.4_3.4_1698872487360.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/told_breton_binary_samoan_bertimbau_pt_5.1.4_3.4_1698872487360.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("told_breton_binary_samoan_bertimbau","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("told_breton_binary_samoan_bertimbau","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|told_breton_binary_samoan_bertimbau| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|408.2 MB| + +## References + +https://huggingface.co/alexandreteles/told_br_binary_sm_bertimbau \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-tool_choose_ko.md b/docs/_posts/ahmedlone127/2023-11-01-tool_choose_ko.md new file mode 100644 index 000000000000..e211afb4999c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-tool_choose_ko.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Korean tool_choose BertForSequenceClassification from hohorong +author: John Snow Labs +name: tool_choose +date: 2023-11-01 +tags: [bert, ko, open_source, sequence_classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tool_choose` is a Korean model originally trained by hohorong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tool_choose_ko_5.1.4_3.4_1698811538754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tool_choose_ko_5.1.4_3.4_1698811538754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("tool_choose","ko")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("tool_choose","ko") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tool_choose| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|667.3 MB| + +## References + +https://huggingface.co/hohorong/tool_choose \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-topic_classification_20ng_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-topic_classification_20ng_bert_en.md new file mode 100644 index 000000000000..f9bc52db67e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-topic_classification_20ng_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_classification_20ng_bert BertForSequenceClassification from dsmitran +author: John Snow Labs +name: topic_classification_20ng_bert +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_classification_20ng_bert` is an English model originally trained by dsmitran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_classification_20ng_bert_en_5.1.4_3.4_1698861485981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_classification_20ng_bert_en_5.1.4_3.4_1698861485981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("topic_classification_20ng_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("topic_classification_20ng_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_classification_20ng_bert| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| + +## References + +https://huggingface.co/dsmitran/topic-classification-20ng-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-topic_xdistil_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-01-topic_xdistil_uncased_en.md new file mode 100644 index 000000000000..f9ccf217ffca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-topic_xdistil_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_xdistil_uncased BertForSequenceClassification from hakonmh +author: John Snow Labs +name: topic_xdistil_uncased +date: 2023-11-01 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_xdistil_uncased` is an English model originally trained by hakonmh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_xdistil_uncased_en_5.1.4_3.4_1698835252023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_xdistil_uncased_en_5.1.4_3.4_1698835252023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("topic_xdistil_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("topic_xdistil_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_xdistil_uncased| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|124.1 MB| + +## References + +https://huggingface.co/hakonmh/topic-xdistil-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-toutiao_zh.md b/docs/_posts/ahmedlone127/2023-11-01-toutiao_zh.md new file mode 100644 index 000000000000..83463eaf927f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-toutiao_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese toutiao BertForSequenceClassification from myml +author: John Snow Labs +name: toutiao +date: 2023-11-01 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toutiao` is a Chinese model originally trained by myml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toutiao_zh_5.1.4_3.4_1698810694229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toutiao_zh_5.1.4_3.4_1698810694229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("toutiao","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toutiao","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toutiao| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|383.3 MB| + +## References + +https://huggingface.co/myml/toutiao \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-toxic_bert_german_de.md b/docs/_posts/ahmedlone127/2023-11-01-toxic_bert_german_de.md new file mode 100644 index 000000000000..7ae0b707e280 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-toxic_bert_german_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German toxic_bert_german BertForSequenceClassification from ankekat1000 +author: John Snow Labs +name: toxic_bert_german +date: 2023-11-01 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_bert_german` is a German model originally trained by ankekat1000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_bert_german_de_5.1.4_3.4_1698827688688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_bert_german_de_5.1.4_3.4_1698827688688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("toxic_bert_german","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxic_bert_german","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_bert_german| +|Compatibility:|Spark NLP 5.1.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|409.1 MB| + +## References + +https://huggingface.co/ankekat1000/toxic-bert-german \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-01-toxicitymodelpt_pt.md b/docs/_posts/ahmedlone127/2023-11-01-toxicitymodelpt_pt.md new file mode 100644 index 000000000000..496b83832a5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-01-toxicitymodelpt_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese toxicitymodelpt BertForSequenceClassification from nicholasKluge +author: John Snow Labs +name: toxicitymodelpt +date: 2023-11-01 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.1.4 +spark_version: 3.4 +supported: true +engine: onnx +annotator: BertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicitymodelpt` is a Portuguese model originally trained by nicholasKluge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicitymodelpt_pt_5.1.4_3.4_1698869804077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicitymodelpt_pt_5.1.4_3.4_1698869804077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("toxicitymodelpt","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("toxicitymodelpt","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|toxicitymodelpt|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|pt|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/nicholasKluge/ToxicityModelPT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-trac2020_all_c_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2023-11-01-trac2020_all_c_bert_base_multilingual_uncased_xx.md
new file mode 100644
index 000000000000..656544a249cf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-trac2020_all_c_bert_base_multilingual_uncased_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual trac2020_all_c_bert_base_multilingual_uncased BertForSequenceClassification from socialmediaie
+author: John Snow Labs
+name: trac2020_all_c_bert_base_multilingual_uncased
+date: 2023-11-01
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trac2020_all_c_bert_base_multilingual_uncased` is a multilingual model originally trained by socialmediaie.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trac2020_all_c_bert_base_multilingual_uncased_xx_5.1.4_3.4_1698815393542.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trac2020_all_c_bert_base_multilingual_uncased_xx_5.1.4_3.4_1698815393542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_all_c_bert_base_multilingual_uncased","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("trac2020_all_c_bert_base_multilingual_uncased","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|trac2020_all_c_bert_base_multilingual_uncased|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|627.7 MB|
+
+## References
+
+https://huggingface.co/socialmediaie/TRAC2020_ALL_C_bert-base-multilingual-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-trading_ai_en.md b/docs/_posts/ahmedlone127/2023-11-01-trading_ai_en.md
new file mode 100644
index 000000000000..bc1f6b337f2a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-trading_ai_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English trading_ai BertForSequenceClassification from papepipopu
+author: John Snow Labs
+name: trading_ai
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trading_ai` is an English model originally trained by papepipopu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trading_ai_en_5.1.4_3.4_1698861489258.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trading_ai_en_5.1.4_3.4_1698861489258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("trading_ai","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("trading_ai","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|trading_ai|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/papepipopu/trading_ai
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-trainer_projet7_en.md b/docs/_posts/ahmedlone127/2023-11-01-trainer_projet7_en.md
new file mode 100644
index 000000000000..c09a1f617f96
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-trainer_projet7_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English trainer_projet7 BertForSequenceClassification from ChrisX42
+author: John Snow Labs
+name: trainer_projet7
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trainer_projet7` is an English model originally trained by ChrisX42.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trainer_projet7_en_5.1.4_3.4_1698813818877.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trainer_projet7_en_5.1.4_3.4_1698813818877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("trainer_projet7","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("trainer_projet7","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|trainer_projet7|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/ChrisX42/trainer_projet7
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-tunibert_ar.md b/docs/_posts/ahmedlone127/2023-11-01-tunibert_ar.md
new file mode 100644
index 000000000000..df3d40a98f22
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-tunibert_ar.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Arabic tunibert BertForSequenceClassification from AhmedBou
+author: John Snow Labs
+name: tunibert
+date: 2023-11-01
+tags: [bert, ar, open_source, sequence_classification, onnx]
+task: Text Classification
+language: ar
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tunibert` is an Arabic model originally trained by AhmedBou.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tunibert_ar_5.1.4_3.4_1698803949530.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tunibert_ar_5.1.4_3.4_1698803949530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("tunibert","ar")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("tunibert","ar")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tunibert|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|ar|
+|Size:|414.2 MB|
+
+## References
+
+https://huggingface.co/AhmedBou/TuniBert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-turkish_bert_uncased_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-01-turkish_bert_uncased_sentiment_en.md
new file mode 100644
index 000000000000..89fa35bfa42d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-turkish_bert_uncased_sentiment_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English turkish_bert_uncased_sentiment BertForSequenceClassification from yigitbekir
+author: John Snow Labs
+name: turkish_bert_uncased_sentiment
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_bert_uncased_sentiment` is an English model originally trained by yigitbekir.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_bert_uncased_sentiment_en_5.1.4_3.4_1698840984177.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_bert_uncased_sentiment_en_5.1.4_3.4_1698840984177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("turkish_bert_uncased_sentiment","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("turkish_bert_uncased_sentiment","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|turkish_bert_uncased_sentiment|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|414.6 MB|
+
+## References
+
+https://huggingface.co/yigitbekir/turkish-bert-uncased-sentiment
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-tydiqa_boolean_question_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-tydiqa_boolean_question_classifier_en.md
new file mode 100644
index 000000000000..7649c761f2ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-tydiqa_boolean_question_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English tydiqa_boolean_question_classifier BertForSequenceClassification from PrimeQA
+author: John Snow Labs
+name: tydiqa_boolean_question_classifier
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tydiqa_boolean_question_classifier` is an English model originally trained by PrimeQA.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tydiqa_boolean_question_classifier_en_5.1.4_3.4_1698814284898.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tydiqa_boolean_question_classifier_en_5.1.4_3.4_1698814284898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("tydiqa_boolean_question_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("tydiqa_boolean_question_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tydiqa_boolean_question_classifier|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|667.3 MB|
+
+## References
+
+https://huggingface.co/PrimeQA/tydiqa-boolean-question-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-udot_forunf_mk1_en.md b/docs/_posts/ahmedlone127/2023-11-01-udot_forunf_mk1_en.md
new file mode 100644
index 000000000000..453448afecc2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-udot_forunf_mk1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English udot_forunf_mk1 BertForSequenceClassification from parthsolanke
+author: John Snow Labs
+name: udot_forunf_mk1
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `udot_forunf_mk1` is an English model originally trained by parthsolanke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/udot_forunf_mk1_en_5.1.4_3.4_1698843898830.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/udot_forunf_mk1_en_5.1.4_3.4_1698843898830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("udot_forunf_mk1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("udot_forunf_mk1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|udot_forunf_mk1|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|260.7 MB|
+
+## References
+
+https://huggingface.co/parthsolanke/udot-forunf-mk1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-udot_mk1_en.md b/docs/_posts/ahmedlone127/2023-11-01-udot_mk1_en.md
new file mode 100644
index 000000000000..124ce18440da
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-udot_mk1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English udot_mk1 BertForSequenceClassification from parthsolanke
+author: John Snow Labs
+name: udot_mk1
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `udot_mk1` is an English model originally trained by parthsolanke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/udot_mk1_en_5.1.4_3.4_1698843386032.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/udot_mk1_en_5.1.4_3.4_1698843386032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("udot_mk1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("udot_mk1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|udot_mk1|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|260.8 MB|
+
+## References
+
+https://huggingface.co/parthsolanke/udot-mk1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-udot_mk2_en.md b/docs/_posts/ahmedlone127/2023-11-01-udot_mk2_en.md
new file mode 100644
index 000000000000..cfb9e8d0ee64
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-udot_mk2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English udot_mk2 BertForSequenceClassification from parthsolanke
+author: John Snow Labs
+name: udot_mk2
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `udot_mk2` is an English model originally trained by parthsolanke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/udot_mk2_en_5.1.4_3.4_1698814251155.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/udot_mk2_en_5.1.4_3.4_1698814251155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("udot_mk2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("udot_mk2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|udot_mk2|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|260.7 MB|
+
+## References
+
+https://huggingface.co/parthsolanke/udot-mk2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_en.md b/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_en.md
new file mode 100644
index 000000000000..00a73319968a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English unbias_classification_bert BertForSequenceClassification from newsmediabias
+author: John Snow Labs
+name: unbias_classification_bert
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbias_classification_bert` is an English model originally trained by newsmediabias.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unbias_classification_bert_en_5.1.4_3.4_1698804751622.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unbias_classification_bert_en_5.1.4_3.4_1698804751622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classification_bert","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classification_bert","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|unbias_classification_bert|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/newsmediabias/UnBIAS-classification-bert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_old_en.md b/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_old_en.md
new file mode 100644
index 000000000000..e19f2b436d3a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-unbias_classification_bert_old_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English unbias_classification_bert_old BertForSequenceClassification from newsmediabias
+author: John Snow Labs
+name: unbias_classification_bert_old
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbias_classification_bert_old` is an English model originally trained by newsmediabias.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unbias_classification_bert_old_en_5.1.4_3.4_1698814803910.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unbias_classification_bert_old_en_5.1.4_3.4_1698814803910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classification_bert_old","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classification_bert_old","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|unbias_classification_bert_old|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/newsmediabias/UnBIAS-classification-bert-old
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-unbias_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-01-unbias_classifier_en.md
new file mode 100644
index 000000000000..ea21ebcb4bdb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-unbias_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English unbias_classifier BertForSequenceClassification from newsmediabias
+author: John Snow Labs
+name: unbias_classifier
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbias_classifier` is an English model originally trained by newsmediabias.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unbias_classifier_en_5.1.4_3.4_1698861807970.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unbias_classifier_en_5.1.4_3.4_1698861807970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("unbias_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|unbias_classifier|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/newsmediabias/UnBIAS-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-unicausal_seq_baseline_en.md b/docs/_posts/ahmedlone127/2023-11-01-unicausal_seq_baseline_en.md
new file mode 100644
index 000000000000..a2b65b2ce1a0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-unicausal_seq_baseline_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English unicausal_seq_baseline BertForSequenceClassification from tanfiona
+author: John Snow Labs
+name: unicausal_seq_baseline
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unicausal_seq_baseline` is an English model originally trained by tanfiona.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unicausal_seq_baseline_en_5.1.4_3.4_1698826339240.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unicausal_seq_baseline_en_5.1.4_3.4_1698826339240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("unicausal_seq_baseline","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("unicausal_seq_baseline","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|unicausal_seq_baseline|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|405.9 MB|
+
+## References
+
+https://huggingface.co/tanfiona/unicausal-seq-baseline
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-urdu_abusive_muril_ur.md b/docs/_posts/ahmedlone127/2023-11-01-urdu_abusive_muril_ur.md
new file mode 100644
index 000000000000..181a56a4f17a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-urdu_abusive_muril_ur.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Urdu urdu_abusive_muril BertForSequenceClassification from Hate-speech-CNERG
+author: John Snow Labs
+name: urdu_abusive_muril
+date: 2023-11-01
+tags: [bert, ur, open_source, sequence_classification, onnx]
+task: Text Classification
+language: ur
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urdu_abusive_muril` is an Urdu model originally trained by Hate-speech-CNERG.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urdu_abusive_muril_ur_5.1.4_3.4_1698821719938.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urdu_abusive_muril_ur_5.1.4_3.4_1698821719938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("urdu_abusive_muril","ur")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("urdu_abusive_muril","ur")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|urdu_abusive_muril|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|ur|
+|Size:|892.6 MB|
+
+## References
+
+https://huggingface.co/Hate-speech-CNERG/urdu-abusive-MuRIL
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-vietnamese_content_cls_vi.md b/docs/_posts/ahmedlone127/2023-11-01-vietnamese_content_cls_vi.md
new file mode 100644
index 000000000000..046f31df0f89
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-vietnamese_content_cls_vi.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Vietnamese vietnamese_content_cls BertForSequenceClassification from vietdata
+author: John Snow Labs
+name: vietnamese_content_cls
+date: 2023-11-01
+tags: [bert, vi, open_source, sequence_classification, onnx]
+task: Text Classification
+language: vi
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vietnamese_content_cls` is a Vietnamese model originally trained by vietdata.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vietnamese_content_cls_vi_5.1.4_3.4_1698806971239.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vietnamese_content_cls_vi_5.1.4_3.4_1698806971239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("vietnamese_content_cls","vi")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("vietnamese_content_cls","vi")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|vietnamese_content_cls|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|vi|
+|Size:|501.5 MB|
+
+## References
+
+https://huggingface.co/vietdata/vietnamese-content-cls
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-xuyuan_trial_sentiment_bert_chinese_en.md b/docs/_posts/ahmedlone127/2023-11-01-xuyuan_trial_sentiment_bert_chinese_en.md
new file mode 100644
index 000000000000..6a7c849b759c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-xuyuan_trial_sentiment_bert_chinese_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English xuyuan_trial_sentiment_bert_chinese BertForSequenceClassification from touch20032003
+author: John Snow Labs
+name: xuyuan_trial_sentiment_bert_chinese
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xuyuan_trial_sentiment_bert_chinese` is an English model originally trained by touch20032003.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xuyuan_trial_sentiment_bert_chinese_en_5.1.4_3.4_1698810003208.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xuyuan_trial_sentiment_bert_chinese_en_5.1.4_3.4_1698810003208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("xuyuan_trial_sentiment_bert_chinese","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("xuyuan_trial_sentiment_bert_chinese","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|xuyuan_trial_sentiment_bert_chinese|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|383.4 MB|
+
+## References
+
+https://huggingface.co/touch20032003/xuyuan-trial-sentiment-bert-chinese
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-yt_list_en.md b/docs/_posts/ahmedlone127/2023-11-01-yt_list_en.md
new file mode 100644
index 000000000000..5f4532405f31
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-yt_list_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English yt_list BertForSequenceClassification from focia
+author: John Snow Labs
+name: yt_list
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yt_list` is an English model originally trained by focia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yt_list_en_5.1.4_3.4_1698860934022.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yt_list_en_5.1.4_3.4_1698860934022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("yt_list","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("yt_list","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|yt_list|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/focia/yt-list
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-01-yt_negativity_en.md b/docs/_posts/ahmedlone127/2023-11-01-yt_negativity_en.md
new file mode 100644
index 000000000000..8ea8ddf535cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-01-yt_negativity_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English yt_negativity BertForSequenceClassification from focia
+author: John Snow Labs
+name: yt_negativity
+date: 2023-11-01
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.1.4
+spark_version: 3.4
+supported: true
+engine: onnx
+annotator: BertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yt_negativity` is an English model originally trained by focia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yt_negativity_en_5.1.4_3.4_1698816056064.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yt_negativity_en_5.1.4_3.4_1698816056064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = BertForSequenceClassification.pretrained("yt_negativity","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = BertForSequenceClassification.pretrained("yt_negativity","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|yt_negativity|
+|Compatibility:|Spark NLP 5.1.4+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|409.4 MB|
+
+## References
+
+https://huggingface.co/focia/yt-negativity
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_chemical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_chemical_en.md
new file mode 100644
index 000000000000..e558fa336232
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_chemical_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bent_pubmedbert_ner_chemical BertForTokenClassification from pruas
+author: John Snow Labs
+name: bent_pubmedbert_ner_chemical
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_chemical` is an English model originally trained by pruas.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_chemical_en_5.2.0_3.0_1699314054977.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_chemical_en_5.2.0_3.0_1699314054977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_chemical","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bent_pubmedbert_ner_chemical", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_chemical|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.0 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMedBERT-NER-Chemical
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_disease_en.md
new file mode 100644
index 000000000000..86932b12de7d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_disease_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bent_pubmedbert_ner_disease BertForTokenClassification from pruas
+author: John Snow Labs
+name: bent_pubmedbert_ner_disease
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_disease` is an English model originally trained by pruas.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_disease_en_5.2.0_3.0_1699314054888.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_disease_en_5.2.0_3.0_1699314054888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_disease","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bent_pubmedbert_ner_disease", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_disease|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMedBERT-NER-Disease
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_gene_en.md b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_gene_en.md
new file mode 100644
index 000000000000..29a505f38376
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bent_pubmedbert_ner_gene_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bent_pubmedbert_ner_gene BertForTokenClassification from pruas
+author: John Snow Labs
+name: bent_pubmedbert_ner_gene
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_gene` is an English model originally trained by pruas.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_gene_en_5.2.0_3.0_1699304365196.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_gene_en_5.2.0_3.0_1699304365196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_gene","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bent_pubmedbert_ner_gene", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_gene|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.0 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMedBERT-NER-Gene
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_addresses_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_addresses_en.md
new file mode 100644
index 000000000000..f27aa8f38a7a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_addresses_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_addresses BertForTokenClassification from ctrlbuzz
+author: John Snow Labs
+name: bert_addresses
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_addresses` is an English model originally trained by ctrlbuzz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_addresses_en_5.2.0_3.0_1699304551042.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_addresses_en_5.2.0_3.0_1699304551042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_addresses","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_addresses", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_addresses| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ctrlbuzz/bert-addresses \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_base_multilingual_cased_masakhaner_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_base_multilingual_cased_masakhaner_xx.md new file mode 100644 index 000000000000..4dd59671af4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_base_multilingual_cased_masakhaner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_masakhaner BertForTokenClassification from Davlan +author: John Snow Labs +name: bert_base_multilingual_cased_masakhaner +date: 2023-11-06 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_masakhaner` is a multilingual model originally trained by Davlan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_masakhaner_xx_5.2.0_3.0_1699306245905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_masakhaner_xx_5.2.0_3.0_1699306245905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_masakhaner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_masakhaner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_masakhaner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Davlan/bert-base-multilingual-cased-masakhaner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_italian_cased_ner_it.md b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_cased_ner_it.md new file mode 100644 index 000000000000..824127f46052 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_cased_ner_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_italian_cased_ner BertForTokenClassification from osiria +author: John Snow Labs +name: bert_italian_cased_ner +date: 2023-11-06 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_italian_cased_ner` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_italian_cased_ner_it_5.2.0_3.0_1699303842218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_italian_cased_ner_it_5.2.0_3.0_1699303842218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_italian_cased_ner","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_italian_cased_ner", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_italian_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|409.0 MB| + +## References + +https://huggingface.co/osiria/bert-italian-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_italian_finetuned_ner_it.md b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_finetuned_ner_it.md new file mode 100644 index 000000000000..d11875f09155 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_finetuned_ner_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_italian_finetuned_ner BertForTokenClassification from nickprock +author: John Snow Labs +name: bert_italian_finetuned_ner +date: 2023-11-06 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_italian_finetuned_ner` is an Italian model originally trained by nickprock. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_italian_finetuned_ner_it_5.2.0_3.0_1699307848390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_italian_finetuned_ner_it_5.2.0_3.0_1699307848390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_italian_finetuned_ner","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_italian_finetuned_ner", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_italian_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|409.7 MB| + +## References + +https://huggingface.co/nickprock/bert-italian-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_italian_uncased_ner_it.md b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_uncased_ner_it.md new file mode 100644 index 000000000000..4445da7b9773 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_italian_uncased_ner_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_italian_uncased_ner BertForTokenClassification from osiria +author: John Snow Labs +name: bert_italian_uncased_ner +date: 2023-11-06 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_italian_uncased_ner` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_italian_uncased_ner_it_5.2.0_3.0_1699304734543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_italian_uncased_ner_it_5.2.0_3.0_1699304734543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_italian_uncased_ner","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_italian_uncased_ner", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_italian_uncased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|407.1 MB| + +## References + +https://huggingface.co/osiria/bert-italian-uncased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aditya22_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aditya22_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..b61df6e6af06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aditya22_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from aditya22) +author: John Snow Labs +name: bert_ner_aditya22_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `aditya22`. + +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_aditya22_bert_finetuned_ner_en_5.2.0_3.0_1699284189318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_aditya22_bert_finetuned_ner_en_5.2.0_3.0_1699284189318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_aditya22_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_aditya22_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_aditya22").predict("""PUT YOUR STRING HERE""") +``` +
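The pipeline above emits one label per token in the `ner` column. A common follow-up step is merging consecutive tokens into entity chunks. The sketch below does that in plain Python, assuming the labels use IOB-style prefixes such as `B-PER` / `I-PER`, which this card does not state explicitly. `group_entities` and the sample tokens are hypothetical and written only for illustration; inside a Spark NLP pipeline the `NerConverter` annotator is typically used for this step instead.

```python
from typing import List, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive B-/I- tagged tokens into (text, label) chunks."""
    chunks = []
    words, label = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, tag_label = tag.partition("-")
        # An I- tag extends the open chunk only when the label matches.
        continues = prefix == "I" and tag_label == label
        if continues:
            words.append(token)
            continue
        # Anything else closes the open chunk, if there is one.
        if words:
            chunks.append((" ".join(words), label))
        words, label = [], None
        # B- starts a new chunk; a stray I- is treated leniently as a start.
        if prefix in ("B", "I"):
            words, label = [token], tag_label
    if words:
        chunks.append((" ".join(words), label))
    return chunks


# Hypothetical tokens and tags, not real model output.
tokens = ["John", "Snow", "works", "at", "Acme", "in", "Paris"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC"]

print(group_entities(tokens, tags))
```

Treating a stray `I-` tag as the start of a new chunk is a design choice made here for robustness; a stricter variant could drop such tokens.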
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_aditya22_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/aditya22/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ag_based_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ag_based_ner_en.md new file mode 100644 index 000000000000..23c2a51ce6c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ag_based_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Wanjiru) +author: John Snow Labs +name: bert_ner_ag_based_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_based_ner` is an English model originally trained by `Wanjiru`. + +## Predicted Entities + +`ITEM`, `REGION`, `METRIC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ag_based_ner_en_5.2.0_3.0_1699283645796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ag_based_ner_en_5.2.0_3.0_1699283645796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ag_based_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ag_based_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.by_wanjiru").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ag_based_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/Wanjiru/ag_based_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_agro_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_agro_ner_en.md new file mode 100644 index 000000000000..b1226bb611ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_agro_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from gauravnuti) +author: John Snow Labs +name: bert_ner_agro_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `agro-ner` is an English model originally trained by `gauravnuti`. + +## Predicted Entities + +`ITEM`, `REGION`, `METRIC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_agro_ner_en_5.2.0_3.0_1699283913171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_agro_ner_en_5.2.0_3.0_1699283913171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_agro_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_agro_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_gauravnuti").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_agro_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/gauravnuti/agro-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_airi_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_airi_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..a492a39d09f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_airi_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from airi) +author: John Snow Labs +name: bert_ner_airi_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `airi`. + +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_airi_bert_finetuned_ner_en_5.2.0_3.0_1699282984146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_airi_bert_finetuned_ner_en_5.2.0_3.0_1699282984146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_airi_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_airi_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.airi..by_airi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_airi_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/airi/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ajgp_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ajgp_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..6d163ccd77fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ajgp_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ajgp_bert_finetuned_ner BertForTokenClassification from AJGP +author: John Snow Labs +name: bert_ner_ajgp_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ajgp_bert_finetuned_ner` is an English model originally trained by AJGP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ajgp_bert_finetuned_ner_en_5.2.0_3.0_1699270662024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ajgp_bert_finetuned_ner_en_5.2.0_3.0_1699270662024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ajgp_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ajgp_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ajgp_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/AJGP/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alekseykorshuk_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alekseykorshuk_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..de6fcf7fa51d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alekseykorshuk_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_alekseykorshuk_bert_finetuned_ner BertForTokenClassification from AlekseyKorshuk +author: John Snow Labs +name: bert_ner_alekseykorshuk_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_alekseykorshuk_bert_finetuned_ner` is an English model originally trained by AlekseyKorshuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_alekseykorshuk_bert_finetuned_ner_en_5.2.0_3.0_1699270653499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_alekseykorshuk_bert_finetuned_ner_en_5.2.0_3.0_1699270653499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier reads +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alekseykorshuk_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier reads +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_alekseykorshuk_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_alekseykorshuk_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/AlekseyKorshuk/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..e520231ff186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_alexander_learn_bert_finetuned_ner_accelerate BertForTokenClassification from Alexander-Learn +author: John Snow Labs +name: bert_ner_alexander_learn_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_alexander_learn_bert_finetuned_ner_accelerate` is an English model originally trained by Alexander-Learn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699270645066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_alexander_learn_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699270645066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alexander_learn_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_alexander_learn_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_alexander_learn_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Alexander-Learn/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ae84d5a6cf3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexander_learn_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_alexander_learn_bert_finetuned_ner BertForTokenClassification from Alexander-Learn +author: John Snow Labs +name: bert_ner_alexander_learn_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_alexander_learn_bert_finetuned_ner` is an English model originally trained by Alexander-Learn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_alexander_learn_bert_finetuned_ner_en_5.2.0_3.0_1699270657984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_alexander_learn_bert_finetuned_ner_en_5.2.0_3.0_1699270657984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alexander_learn_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_alexander_learn_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_alexander_learn_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Alexander-Learn/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexanderpeter_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexanderpeter_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..9f0ed17def7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alexanderpeter_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_alexanderpeter_bert_finetuned_ner BertForTokenClassification from AlexanderPeter +author: John Snow Labs +name: bert_ner_alexanderpeter_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_alexanderpeter_bert_finetuned_ner` is an English model originally trained by AlexanderPeter. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_alexanderpeter_bert_finetuned_ner_en_5.2.0_3.0_1699270877112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_alexanderpeter_bert_finetuned_ner_en_5.2.0_3.0_1699270877112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alexanderpeter_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_alexanderpeter_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_alexanderpeter_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/AlexanderPeter/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alwaysgetbetter_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alwaysgetbetter_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..313f02d1fcce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_alwaysgetbetter_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from alwaysgetbetter) +author: John Snow Labs +name: bert_ner_alwaysgetbetter_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `alwaysgetbetter`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_alwaysgetbetter_bert_finetuned_ner_en_5.2.0_3.0_1699284454261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_alwaysgetbetter_bert_finetuned_ner_en_5.2.0_3.0_1699284454261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alwaysgetbetter_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_alwaysgetbetter_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_alwaysgetbetter").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_alwaysgetbetter_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/alwaysgetbetter/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amasi_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amasi_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..534e3b46644e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amasi_wikineural_multilingual_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from amasi) +author: John Snow Labs +name: bert_ner_amasi_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural-multilingual-ner` is an English model originally trained by `amasi`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_amasi_wikineural_multilingual_ner_en_5.2.0_3.0_1699282412379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_amasi_wikineural_multilingual_ner_en_5.2.0_3.0_1699282412379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amasi_wikineural_multilingual_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amasi_wikineural_multilingual_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikineural.multilingual.by_amasi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_amasi_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/amasi/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amir36_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amir36_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..140145a2306c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amir36_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from amir36) +author: John Snow Labs +name: bert_ner_amir36_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `amir36`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_amir36_bert_finetuned_ner_en_5.2.0_3.0_1699284193765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_amir36_bert_finetuned_ner_en_5.2.0_3.0_1699284193765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amir36_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amir36_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_amir36").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_amir36_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/amir36/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amrita03_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amrita03_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..b849e4386ec3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_amrita03_wikineural_multilingual_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from amrita03) +author: John Snow Labs +name: bert_ner_amrita03_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural-multilingual-ner` is an English model originally trained by `amrita03`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_amrita03_wikineural_multilingual_ner_en_5.2.0_3.0_1699284702214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_amrita03_wikineural_multilingual_ner_en_5.2.0_3.0_1699284702214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amrita03_wikineural_multilingual_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_amrita03_wikineural_multilingual_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikineural.multilingual.by_amrita03").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_amrita03_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/amrita03/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aneela_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aneela_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..6fa9e9b3af22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_aneela_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_aneela_bert_finetuned_ner BertForTokenClassification from Aneela +author: John Snow Labs +name: bert_ner_aneela_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_aneela_bert_finetuned_ner` is an English model originally trained by Aneela. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_aneela_bert_finetuned_ner_en_5.2.0_3.0_1699271082614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_aneela_bert_finetuned_ner_en_5.2.0_3.0_1699271082614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_aneela_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_aneela_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_aneela_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Aneela/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_anery_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_anery_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..326e2799e362 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_anery_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_anery_bert_finetuned_ner BertForTokenClassification from Anery +author: John Snow Labs +name: bert_ner_anery_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_anery_bert_finetuned_ner` is an English model originally trained by Anery. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_anery_bert_finetuned_ner_en_5.2.0_3.0_1699271277614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_anery_bert_finetuned_ner_en_5.2.0_3.0_1699271277614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_anery_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_anery_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_anery_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Anery/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_animalthemuppet_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_animalthemuppet_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..d067ddb99223 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_animalthemuppet_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from animalthemuppet) +author: John Snow Labs +name: bert_ner_animalthemuppet_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `animalthemuppet`. + +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_animalthemuppet_bert_finetuned_ner_en_5.2.0_3.0_1699282678624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_animalthemuppet_bert_finetuned_ner_en_5.2.0_3.0_1699282678624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_animalthemuppet_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_animalthemuppet_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_animalthemuppet").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_animalthemuppet_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/animalthemuppet/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_arabert_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_arabert_ner_ar.md new file mode 100644 index 000000000000..8f3800e33951 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_arabert_ner_ar.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Arabic Named Entity Recognition (from abdusahmbzuai) +author: John Snow Labs +name: bert_ner_arabert_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, ar, open_source, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `arabert-ner` is an Arabic model originally trained by `abdusahmbzuai`. + +## Predicted Entities + +`ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_arabert_ner_ar_5.2.0_3.0_1699285067114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_arabert_ner_ar_5.2.0_3.0_1699285067114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ +.setInputCol("text") \ +.setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ +.setInputCols("sentence") \ +.setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_arabert_ner","ar") \ +.setInputCols(["sentence", "token"]) \ +.setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") +.setInputCols(Array("document")) +.setOutputCol("sentence") + +val tokenizer = new Tokenizer() +.setInputCols(Array("sentence")) +.setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_arabert_ner","ar") +.setInputCols(Array("sentence", "token")) +.setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.ner.arabert_ner").predict("""أنا أحب الشرارة NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_arabert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|504.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/abdusahmbzuai/arabert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_archeobertje_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_archeobertje_ner_en.md new file mode 100644 index 000000000000..a496112d3f3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_archeobertje_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_archeobertje_ner BertForTokenClassification from alexbrandsen +author: John Snow Labs +name: bert_ner_archeobertje_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_archeobertje_ner` is an English model originally trained by alexbrandsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_archeobertje_ner_en_5.2.0_3.0_1699271484539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_archeobertje_ner_en_5.2.0_3.0_1699271484539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_archeobertje_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_archeobertje_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_archeobertje_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.5 MB| + +## References + +https://huggingface.co/alexbrandsen/ArcheoBERTje-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..e0d7e2bab817 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from artemis13fowl) +author: John Snow Labs +name: bert_ner_artemis13fowl_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `artemis13fowl`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_artemis13fowl_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699282963671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_artemis13fowl_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699282963671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_artemis13fowl_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_artemis13fowl_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.accelerate.by_artemis13fowl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_artemis13fowl_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/artemis13fowl/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..8de7a051017d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_artemis13fowl_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from artemis13fowl) +author: John Snow Labs +name: bert_ner_artemis13fowl_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `artemis13fowl`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_artemis13fowl_bert_finetuned_ner_en_5.2.0_3.0_1699284459080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_artemis13fowl_bert_finetuned_ner_en_5.2.0_3.0_1699284459080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_artemis13fowl_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_artemis13fowl_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.artemis13fowl.by_artemis13fowl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_artemis13fowl_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/artemis13fowl/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ashwathgojo234_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ashwathgojo234_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..a08782792682 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ashwathgojo234_wikineural_multilingual_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ashwathgojo234) +author: John Snow Labs +name: bert_ner_ashwathgojo234_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural-multilingual-ner` is an English model originally trained by `ashwathgojo234`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ashwathgojo234_wikineural_multilingual_ner_en_5.2.0_3.0_1699284708521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ashwathgojo234_wikineural_multilingual_ner_en_5.2.0_3.0_1699284708521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ashwathgojo234_wikineural_multilingual_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ashwathgojo234_wikineural_multilingual_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikineural.multilingual.by_ashwathgojo234").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ashwathgojo234_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ashwathgojo234/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_prodigy_10_3362554_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_prodigy_10_3362554_en.md new file mode 100644 index 000000000000..e48e8cbe021a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_prodigy_10_3362554_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English Named Entity Recognition (from abhishek) +author: John Snow Labs +name: bert_ner_autonlp_prodigy_10_3362554 +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `autonlp-prodigy-10-3362554` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`LOCATION`, `PERSON`, `ORG`, `PRODUCT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_prodigy_10_3362554_en_5.2.0_3.0_1699285552337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_prodigy_10_3362554_en_5.2.0_3.0_1699285552337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autonlp_prodigy_10_3362554","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autonlp_prodigy_10_3362554","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.prodigy").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_autonlp_prodigy_10_3362554| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-prodigy-10-3362554 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_nepal_bhasa_5k_557515810_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_nepal_bhasa_5k_557515810_en.md new file mode 100644 index 000000000000..97c59fb75dfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_nepal_bhasa_5k_557515810_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_autonlp_tele_nepal_bhasa_5k_557515810 BertForTokenClassification from kSaluja +author: John Snow Labs +name: bert_ner_autonlp_tele_nepal_bhasa_5k_557515810 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_autonlp_tele_nepal_bhasa_5k_557515810` is an English model originally trained by kSaluja. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_tele_nepal_bhasa_5k_557515810_en_5.2.0_3.0_1699283291210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_tele_nepal_bhasa_5k_557515810_en_5.2.0_3.0_1699283291210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autonlp_tele_nepal_bhasa_5k_557515810","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_autonlp_tele_nepal_bhasa_5k_557515810", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_autonlp_tele_nepal_bhasa_5k_557515810| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kSaluja/autonlp-tele_new_5k-557515810 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_red_data_model_585716433_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_red_data_model_585716433_en.md new file mode 100644 index 000000000000..e47484e6b073 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autonlp_tele_red_data_model_585716433_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English Named Entity Recognition (from kSaluja) +author: John Snow Labs +name: bert_ner_autonlp_tele_red_data_model_585716433 +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `autonlp-tele_red_data_model-585716433` is an English model originally trained by `kSaluja`. 
+ +## Predicted Entities + +`TARGET`, `SUGGESTIONTYPE`, `CALLTYPE`, `INSTRUMENT`, `BUYPRICE`, `HOLDINGPERIOD`, `STOPLOSS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_tele_red_data_model_585716433_en_5.2.0_3.0_1699283792922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_autonlp_tele_red_data_model_585716433_en_5.2.0_3.0_1699283792922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autonlp_tele_red_data_model_585716433","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autonlp_tele_red_data_model_585716433","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tele_red.by_ksaluja").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_autonlp_tele_red_data_model_585716433| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kSaluja/autonlp-tele_red_data_model-585716433 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autotrain_acronym_identification_7324788_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autotrain_acronym_identification_7324788_en.md new file mode 100644 index 000000000000..a5e7beebe2c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_autotrain_acronym_identification_7324788_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from lewtun) +author: John Snow Labs +name: bert_ner_autotrain_acronym_identification_7324788 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-acronym-identification-7324788` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`long`, `short` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_autotrain_acronym_identification_7324788_en_5.2.0_3.0_1699284085005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_autotrain_acronym_identification_7324788_en_5.2.0_3.0_1699284085005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autotrain_acronym_identification_7324788","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_autotrain_acronym_identification_7324788","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_autotrain_acronym_identification_7324788| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/lewtun/autotrain-acronym-identification-7324788 +- https://paperswithcode.com/sota?task=Token+Classification&dataset=acronym_identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_awilli_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_awilli_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..a52fc0050797 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_awilli_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from awilli) +author: John Snow Labs +name: bert_ner_awilli_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `awilli`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_awilli_bert_finetuned_ner_en_5.2.0_3.0_1699283264167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_awilli_bert_finetuned_ner_en_5.2.0_3.0_1699283264167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_awilli_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_awilli_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_awilli").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_awilli_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/awilli/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_balamurugan1603_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_balamurugan1603_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..b2bcf91c94bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_balamurugan1603_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from balamurugan1603) +author: John Snow Labs +name: bert_ner_balamurugan1603_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `balamurugan1603`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_balamurugan1603_bert_finetuned_ner_en_5.2.0_3.0_1699283543486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_balamurugan1603_bert_finetuned_ner_en_5.2.0_3.0_1699283543486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_balamurugan1603_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_balamurugan1603_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_balamurugan1603").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_balamurugan1603_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/balamurugan1603/bert-finetuned-ner +- https://github.com/balamurugan1603/Named-Entity-Recognition-using-Tranformers/blob/main/named-entity-recognition-using-transfer-learning.ipynb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_baseline_bertv3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_baseline_bertv3_en.md new file mode 100644 index 000000000000..a6517d8e32a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_baseline_bertv3_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from brad1141) +author: John Snow Labs +name: bert_ner_baseline_bertv3 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baseline_bertv3` is an English model originally trained by `brad1141`. 
+ +## Predicted Entities + +`Position`, `Lead`, `Claim`, `Rebuttal`, `Concluding Statement`, `Evidence`, `Counterclaim` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_baseline_bertv3_en_5.2.0_3.0_1699284937227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_baseline_bertv3_en_5.2.0_3.0_1699284937227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_baseline_bertv3","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_baseline_bertv3","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.by_brad1141").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_baseline_bertv3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/brad1141/baseline_bertv3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_batya66_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_batya66_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..3808b518cf92 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_batya66_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from batya66) +author: John Snow Labs +name: bert_ner_batya66_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `batya66`. + +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_batya66_bert_finetuned_ner_en_5.2.0_3.0_1699285160966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_batya66_bert_finetuned_ner_en_5.2.0_3.0_1699285160966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_batya66_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_batya66_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_batya66").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_batya66_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/batya66/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedpubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedpubmedbert_en.md new file mode 100644 index 000000000000..5b8504e2c89d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedpubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc2gm_gene_imbalancedpubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc2gm_gene_imbalancedpubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc2gm_gene_imbalancedpubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_imbalancedpubmedbert_en_5.2.0_3.0_1699271680381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_imbalancedpubmedbert_en_5.2.0_3.0_1699271680381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc2gm_gene_imbalancedpubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc2gm_gene_imbalancedpubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc2gm_gene_imbalancedpubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC2GM-Gene_ImbalancedPubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased_en.md new file mode 100644 index 000000000000..99a395dc526e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased_en_5.2.0_3.0_1699271892513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased_en_5.2.0_3.0_1699271892513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc2gm_gene_imbalancedscibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC2GM-Gene_Imbalancedscibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_pubmedbert_en.md new file mode 100644 index 000000000000..1a94c1ffdcf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc2gm_gene_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc2gm_gene_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc2gm_gene_modified_pubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_modified_pubmedbert_en_5.2.0_3.0_1699270871971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_modified_pubmedbert_en_5.2.0_3.0_1699270871971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc2gm_gene_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc2gm_gene_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc2gm_gene_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC2GM-Gene-Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..96a3ebf2237c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc2gm_gene_modified_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc2gm_gene_modified_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc2gm_gene_modified_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc2gm_gene_modified_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699270887496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc2gm_gene_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699270887496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc2gm_gene_modified_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc2gm_gene_modified_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc2gm_gene_modified_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC2GM-Gene-Modified_scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_chem_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_chem_pubmedbert_en.md new file mode 100644 index 000000000000..dbe7e4ff0bb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_chem_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_chem_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_chem_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_chem_pubmedbert` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_chem_pubmedbert_en_5.2.0_3.0_1699271459302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_chem_pubmedbert_en_5.2.0_3.0_1699271459302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_chem_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_chem_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_chem_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_CHEM_PubmedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biobert_v1.1_en.md new file mode 100644 index 000000000000..c7c5ad9cfbd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_modified_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_modified_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_modified_biobert_v1.1` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_biobert_v1.1_en_5.2.0_3.0_1699271684661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_biobert_v1.1_en_5.2.0_3.0_1699271684661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_modified_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_modified_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_modified_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_Modified-biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..d3d319753d8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271647471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271647471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_modified_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_Modified_BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md new file mode 100644 index 000000000000..ba4612079881 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_en_5.2.0_3.0_1699271880315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_en_5.2.0_3.0_1699271880315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_modified_bluebert_pubmed_uncased_l_12_h_768_a_12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_Modified-bluebert_pubmed_uncased_L-12_H-768_A-12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_pubmedbert_small_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_pubmedbert_small_en.md new file mode 100644 index 000000000000..7aa9c2b696e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_pubmedbert_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_modified_pubmedbert_small BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_modified_pubmedbert_small +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_modified_pubmedbert_small` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_pubmedbert_small_en_5.2.0_3.0_1699271999583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_pubmedbert_small_en_5.2.0_3.0_1699271999583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_modified_pubmedbert_small","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_modified_pubmedbert_small", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_modified_pubmedbert_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-modified-PubmedBert_small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_scibert_scivocab_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_scibert_scivocab_uncased_en.md new file mode 100644 index 000000000000..1ae5d68f1d91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_modified_scibert_scivocab_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_modified_scibert_scivocab_uncased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_modified_scibert_scivocab_uncased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_modified_scibert_scivocab_uncased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_scibert_scivocab_uncased_en_5.2.0_3.0_1699273014223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_modified_scibert_scivocab_uncased_en_5.2.0_3.0_1699273014223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_modified_scibert_scivocab_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_modified_scibert_scivocab_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_modified_scibert_scivocab_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_Modified-scibert_scivocab_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biobert_v1.1_en.md new file mode 100644 index 000000000000..69c425a81214 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_biobert_v1.1` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_biobert_v1.1_en_5.2.0_3.0_1699272091400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_biobert_v1.1_en_5.2.0_3.0_1699272091400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-Original-biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..2044c1051d12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271894218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271894218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4_Original-BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md new file mode 100644 index 000000000000..b75549f66928 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12_en_5.2.0_3.0_1699272091362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12_en_5.2.0_3.0_1699272091362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_bluebert_pubmed_uncased_l_12_h_768_a_12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-Original-bluebert_pubmed_uncased_L-12_H-768_A-12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_en.md new file mode 100644 index 000000000000..06fb4861eca6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_pubmedbert` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_pubmedbert_en_5.2.0_3.0_1699272279631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_pubmedbert_en_5.2.0_3.0_1699272279631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-original-PubmedBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_small_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_small_en.md new file mode 100644 index 000000000000..46ea4ee3dcc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_pubmedbert_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_pubmedbert_small BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_pubmedbert_small +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_pubmedbert_small` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_pubmedbert_small_en_5.2.0_3.0_1699271640545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_pubmedbert_small_en_5.2.0_3.0_1699271640545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_pubmedbert_small","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_pubmedbert_small", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_pubmedbert_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-original-PubmedBert_small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_scibert_scivocab_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_scibert_scivocab_uncased_en.md new file mode 100644 index 000000000000..80e567967ee1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4_original_scibert_scivocab_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4_original_scibert_scivocab_uncased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4_original_scibert_scivocab_uncased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4_original_scibert_scivocab_uncased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_scibert_scivocab_uncased_en_5.2.0_3.0_1699271827890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4_original_scibert_scivocab_uncased_en_5.2.0_3.0_1699271827890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4_original_scibert_scivocab_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4_original_scibert_scivocab_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4_original_scibert_scivocab_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4-Original-scibert_scivocab_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_384_en.md new file mode 100644 index 000000000000..a8fadc215ff2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_biobert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_biobert_384_en_5.2.0_3.0_1699272076197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_biobert_384_en_5.2.0_3.0_1699272076197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_512_en.md new file mode 100644 index 000000000000..765507ac58db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_biobert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_biobert_512_en_5.2.0_3.0_1699270857924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_biobert_512_en_5.2.0_3.0_1699270857924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_384_en.md new file mode 100644 index 000000000000..befe11e25b8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_bluebert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_bluebert_384_en_5.2.0_3.0_1699270658502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_bluebert_384_en_5.2.0_3.0_1699270658502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_512_en.md new file mode 100644 index 000000000000..7c9b32c1a0e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_bluebert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_bluebert_512_en_5.2.0_3.0_1699270885638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_bluebert_512_en_5.2.0_3.0_1699270885638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_pubmedbert_384_en.md new file mode 100644 index 000000000000..ed7edc007d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_pubmedbert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699271049403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699271049403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_384_en.md new file mode 100644 index 000000000000..d726bbabf8e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_scibert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_scibert_384_en_5.2.0_3.0_1699271087624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_scibert_384_en_5.2.0_3.0_1699271087624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_512_en.md new file mode 100644 index 000000000000..350a6b6c46e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_modified_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_modified_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_modified_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_modified_scibert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_scibert_512_en_5.2.0_3.0_1699271093334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_modified_scibert_512_en_5.2.0_3.0_1699271093334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_modified_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_modified_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_modified_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Modified-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_biobert_512_en.md new file mode 100644 index 000000000000..14e9c3b0400f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_biobert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_biobert_512_en_5.2.0_3.0_1699272245029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_biobert_512_en_5.2.0_3.0_1699272245029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_384_en.md new file mode 100644 index 000000000000..1f77b04c2a54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_bluebert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_bluebert_384_en_5.2.0_3.0_1699272422077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_bluebert_384_en_5.2.0_3.0_1699272422077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_512_en.md new file mode 100644 index 000000000000..e595068f38dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_bluebert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_bluebert_512_en_5.2.0_3.0_1699271057009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_bluebert_512_en_5.2.0_3.0_1699271057009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_384_en.md new file mode 100644 index 000000000000..2e0e828961d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_pubmedbert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_pubmedbert_384_en_5.2.0_3.0_1699272607827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_pubmedbert_384_en_5.2.0_3.0_1699272607827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..d0110bd8e73f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_pubmedbert_512_en_5.2.0_3.0_1699271292362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_pubmedbert_512_en_5.2.0_3.0_1699271292362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_384_en.md new file mode 100644 index 000000000000..8e5d25c77772 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_scibert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_scibert_384_en_5.2.0_3.0_1699271288390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_scibert_384_en_5.2.0_3.0_1699271288390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_512_en.md new file mode 100644 index 000000000000..a74c50055e94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_chem_original_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_chem_original_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_chem_original_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_chem_original_scibert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_scibert_512_en_5.2.0_3.0_1699272789113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_chem_original_scibert_512_en_5.2.0_3.0_1699272789113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_chem_original_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_chem_original_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_chem_original_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Chem-Original-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1_en.md new file mode 100644 index 000000000000..66b534f2c304 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from ghadeermobasher) +author: John Snow Labs +name: bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bc4chemd-imbalanced-biobert-base-casesd-v1.1` is an English model originally trained by `ghadeermobasher`.
+ +## Predicted Entities + +`Chemical` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1_en_5.2.0_3.0_1699285462210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1_en_5.2.0_3.0_1699285462210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.chemical.base_imbalanced").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_imbalanced_biobert_base_casesd_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ghadeermobasher/bc4chemd-imbalanced-biobert-base-casesd-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedpubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedpubmedbert_en.md new file mode 100644 index 000000000000..6f1ed769646f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedpubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_imbalancedpubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_imbalancedpubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_imbalancedpubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalancedpubmedbert_en_5.2.0_3.0_1699271488566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalancedpubmedbert_en_5.2.0_3.0_1699271488566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_imbalancedpubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_imbalancedpubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_imbalancedpubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD_ImbalancedPubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedscibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedscibert_scivocab_cased_en.md new file mode 100644 index 000000000000..c358ca8f0ae6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_imbalancedscibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_imbalancedscibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_imbalancedscibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_imbalancedscibert_scivocab_cased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalancedscibert_scivocab_cased_en_5.2.0_3.0_1699271264127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_imbalancedscibert_scivocab_cased_en_5.2.0_3.0_1699271264127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_imbalancedscibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_imbalancedscibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_imbalancedscibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD_Imbalancedscibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmed_clinical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmed_clinical_en.md new file mode 100644 index 000000000000..108ad3017a3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmed_clinical_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_modified_pubmed_clinical BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_modified_pubmed_clinical +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_modified_pubmed_clinical` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_pubmed_clinical_en_5.2.0_3.0_1699271493242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_pubmed_clinical_en_5.2.0_3.0_1699271493242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_modified_pubmed_clinical","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_modified_pubmed_clinical", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_modified_pubmed_clinical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Modified_pubmed_clinical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmedbert_en.md new file mode 100644 index 000000000000..cb02c1242476 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_modified_pubmedbert` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_pubmedbert_en_5.2.0_3.0_1699271259751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_pubmedbert_en_5.2.0_3.0_1699271259751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..659be625ef4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_modified_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_modified_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_modified_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_modified_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699271454378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699271454378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_modified_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_modified_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_modified_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Modified_scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_original_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_original_biobert_384_en.md new file mode 100644 index 000000000000..4d10483f7afa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc4chemd_original_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc4chemd_original_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc4chemd_original_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc4chemd_original_biobert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_original_biobert_384_en_5.2.0_3.0_1699271675645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc4chemd_original_biobert_384_en_5.2.0_3.0_1699271675645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc4chemd_original_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc4chemd_original_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
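The archive name in the download link for this model appears to pack the same fields as the front matter: model name, language, Spark NLP version, Spark version, and a trailing number that reads as a millisecond Unix timestamp. A small sketch for splitting it, assuming that naming pattern holds (it is inferred from the links in these model cards, not from a documented contract). `ModelArchive` and `parse_archive_name` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    name: str
    language: str
    sparknlp_version: str
    spark_version: str
    built_at: datetime


def parse_archive_name(file_name: str) -> ModelArchive:
    """Split '<name>_<lang>_<sparknlp>_<spark>_<millis>.zip' into its parts."""
    stem = file_name[:-4] if file_name.endswith(".zip") else file_name

    # Model names contain underscores themselves, so split from the right.
    name, language, sparknlp_version, spark_version, millis = stem.rsplit("_", 4)

    # Assumption: the trailing number is milliseconds since the Unix epoch.
    built_at = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return ModelArchive(name, language, sparknlp_version, spark_version, built_at)
```

For `bert_ner_bc4chemd_original_biobert_384_en_5.2.0_3.0_1699271675645.zip` this yields language `en`, Spark NLP `5.2.0`, Spark `3.0`, and a build date of 2023-11-06, which matches the `date` field in the front matter.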
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc4chemd_original_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC4CHEMD-Original-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cd_chem_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cd_chem_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..cc44b7d332f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cd_chem_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cd_chem_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cd_chem_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cd_chem_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cd_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699274833523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cd_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699274833523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cd_chem_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cd_chem_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cd_chem_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CD-Chem-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..e7732d81ad81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271832947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699271832947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem2_imbalanced_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem2-imbalanced-BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..fac025e09caf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699272464114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699272464114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem2_modified_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem2-Modified_BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_384_en.md new file mode 100644 index 000000000000..49ef156db282 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_biobert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_384_en_5.2.0_3.0_1699272191056.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_384_en_5.2.0_3.0_1699272191056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_512_en.md new file mode 100644 index 000000000000..bb35f23e8e49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_biobert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_512_en_5.2.0_3.0_1699272025115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_512_en_5.2.0_3.0_1699272025115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_large_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_large_cased_en.md new file mode 100644 index 000000000000..79890bf9c367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_large_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_biobert_large_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_biobert_large_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_biobert_large_cased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_large_cased_en_5.2.0_3.0_1699272789103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_large_cased_en_5.2.0_3.0_1699272789103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_biobert_large_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_biobert_large_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_biobert_large_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_biobert-large-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest_en.md new file mode 100644 index 000000000000..be7d2b82e931 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest_en_5.2.0_3.0_1699272658333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest_en_5.2.0_3.0_1699272658333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_biobert_v1.1_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_biobert-v1.1_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_512_en.md new file mode 100644 index 000000000000..54c47852e024 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_bluebert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_bluebert_512_en_5.2.0_3.0_1699272391082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_bluebert_512_en_5.2.0_3.0_1699272391082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..f6b3c4ae4f01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699272396685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699272396685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_3_en.md new file mode 100644 index 000000000000..38e4e41e3f51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_pubmed_abstract_3 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_pubmed_abstract_3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_pubmed_abstract_3` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_abstract_3_en_5.2.0_3.0_1699272849932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_abstract_3_en_5.2.0_3.0_1699272849932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_pubmed_abstract_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_pubmed_abstract_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_pubmed_abstract_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_pubmed_abstract_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest_en.md new file mode 100644 index 000000000000..e2d6f6e877b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest_en_5.2.0_3.0_1699273249110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest_en_5.2.0_3.0_1699273249110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_pubmed_abstract_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_pubmed_abstract_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_full_3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_full_3_en.md new file mode 100644 index 000000000000..8683e08d68bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmed_full_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_pubmed_full_3 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_pubmed_full_3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_pubmed_full_3` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_full_3_en_5.2.0_3.0_1699273041573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmed_full_3_en_5.2.0_3.0_1699273041573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_pubmed_full_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_pubmed_full_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_pubmed_full_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_pubmed_full_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmedbert_384_en.md new file mode 100644 index 000000000000..d0365c661d6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_pubmedbert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699272581256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699272581256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_384_en.md new file mode 100644 index 000000000000..d1b12e45dbed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_scibert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_384_en_5.2.0_3.0_1699272278010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_384_en_5.2.0_3.0_1699272278010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_512_en.md new file mode 100644 index 000000000000..a1c56c64108b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_scibert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_512_en_5.2.0_3.0_1699272218296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_512_en_5.2.0_3.0_1699272218296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest_en.md new file mode 100644 index 000000000000..c158cd8d722f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest_en_5.2.0_3.0_1699272470586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest_en_5.2.0_3.0_1699272470586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_modified_scibert_scivocab_uncased_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Modified_scibert_scivocab_uncased_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_384_en.md new file mode 100644 index 000000000000..9fbdf5d99309 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_biobert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_biobert_384_en_5.2.0_3.0_1699273253315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_biobert_384_en_5.2.0_3.0_1699273253315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_512_en.md new file mode 100644 index 000000000000..ad578a28bdb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_biobert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_biobert_512_en_5.2.0_3.0_1699273465922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_biobert_512_en_5.2.0_3.0_1699273465922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_384_en.md new file mode 100644 index 000000000000..77d36bffda25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_bluebert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_bluebert_384_en_5.2.0_3.0_1699272576665.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_bluebert_384_en_5.2.0_3.0_1699272576665.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_512_en.md new file mode 100644 index 000000000000..e30e9fdf91d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_bluebert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_bluebert_512_en_5.2.0_3.0_1699272780116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_bluebert_512_en_5.2.0_3.0_1699272780116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..0f19eb34fb4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_pubmedbert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_pubmedbert_512_en_5.2.0_3.0_1699273683823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_pubmedbert_512_en_5.2.0_3.0_1699273683823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
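+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +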
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_384_en.md new file mode 100644 index 000000000000..68ed53764fe9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_scibert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_scibert_384_en_5.2.0_3.0_1699273006930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_scibert_384_en_5.2.0_3.0_1699273006930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
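+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +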
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_512_en.md new file mode 100644 index 000000000000..90b8b7315390 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chem_original_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chem_original_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chem_original_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chem_original_scibert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_scibert_512_en_5.2.0_3.0_1699272658520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chem_original_scibert_512_en_5.2.0_3.0_1699272658520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
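+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chem_original_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chem_original_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +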
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chem_original_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chem-Original-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2_en.md new file mode 100644 index 000000000000..37fd79b6df73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2_en_5.2.0_3.0_1699272850975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2_en_5.2.0_3.0_1699272850975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_disease_balanced_biobert_base_cased_v1.2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-Disease-balanced-biobert-base-cased-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext_en.md new file mode 100644 index 000000000000..f8c81357a852 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext_en_5.2.0_3.0_1699273006951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext_en_5.2.0_3.0_1699273006951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_disease_balanced_biomednlp_pubmedbert_base_uncased_abstract_fulltext| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-Disease-balanced-BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert_en.md new file mode 100644 index 000000000000..61a806ce50a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert_en_5.2.0_3.0_1699273249356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert_en_5.2.0_3.0_1699273249356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
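+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +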
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_disease_balanced_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-Disease-balanced-pubmedbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext_en.md new file mode 100644 index 000000000000..bbdd85aac129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext_en_5.2.0_3.0_1699273884858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext_en_5.2.0_3.0_1699273884858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_disease_balanced_sapbert_from_pubmedbert_fulltext| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-Disease-balanced-SapBERT-from-PubMedBERT-fulltext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..a004f9ed8cee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274075895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274075895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_disease_balanced_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-Disease-balanced-scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_en.md new file mode 100644 index 000000000000..af14af849f36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_biobert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_biobert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_biobert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_biobert_en_5.2.0_3.0_1699273436029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_biobert_en_5.2.0_3.0_1699273436029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
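+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_biobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_biobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +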
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_biobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-biobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest_en.md new file mode 100644 index 000000000000..457e69aac0f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest_en_5.2.0_3.0_1699273444414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest_en_5.2.0_3.0_1699273444414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import BertForTokenClassification, Tokenizer + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_biobert_v1.1_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-biobert-v1.1_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..e519490f8619 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699274253537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699274253537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
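+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +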
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest_en.md new file mode 100644 index 000000000000..032df80232af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest_en_5.2.0_3.0_1699273235277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest_en_5.2.0_3.0_1699273235277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
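+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +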
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_base_uncased_abstract_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-PubMedBERT-base-uncased-abstract_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_en.md new file mode 100644 index 000000000000..2a075cd0b823 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_pubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_en_5.2.0_3.0_1699273674731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_pubmedbert_en_5.2.0_3.0_1699273674731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
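+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +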
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-pubmedbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..d1b086f78e5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699273050637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699273050637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
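+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +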
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical_Imbalanced-scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest_en.md new file mode 100644 index 000000000000..bf8d8cdc743f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest_en_5.2.0_3.0_1699273476501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest_en_5.2.0_3.0_1699273476501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
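+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +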
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_imbalanced_scibert_scivocab_uncased_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical-imbalanced-scibert_scivocab_uncased_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_pubmedbert_en.md new file mode 100644 index 000000000000..555d1fd77211 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_modified_pubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_modified_pubmedbert_en_5.2.0_3.0_1699273471127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_modified_pubmedbert_en_5.2.0_3.0_1699273471127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
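+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +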
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical_Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..73b90b253997 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699273240022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699273240022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
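+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +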
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_chemical_modified_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Chemical_Modified_scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_balancedpubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_balancedpubmedbert_en.md new file mode 100644 index 000000000000..7b54f8be6da6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_balancedpubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_balancedpubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_balancedpubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_balancedpubmedbert` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_balancedpubmedbert_en_5.2.0_3.0_1699273879347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_balancedpubmedbert_en_5.2.0_3.0_1699273879347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
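+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_balancedpubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_balancedpubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +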
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_balancedpubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-balancedPubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1_en.md new file mode 100644 index 000000000000..fb443d5feaf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1_en_5.2.0_3.0_1699274225547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1_en_5.2.0_3.0_1699274225547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
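+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +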
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_imbalanced_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-imbalanced-biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..5b007709af65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699274050674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699274050674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
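+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes data is a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +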
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_imbalanced_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-imbalanced-BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..6934702c59b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699274654243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699274654243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_imbalanced_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-imbalanced-bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased_en.md new file mode 100644 index 000000000000..18745fb94301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased_en_5.2.0_3.0_1699274239287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased_en_5.2.0_3.0_1699274239287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_imbalanced_scibert_scivocab_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-imbalanced-scibert_scivocab_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biobert_v1.1_en.md new file mode 100644 index 000000000000..f39369b4b9f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_biobert_v1.1` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_biobert_v1.1_en_5.2.0_3.0_1699274446300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_biobert_v1.1_en_5.2.0_3.0_1699274446300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-Modified_biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..b7fe20f12f1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699273649579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699273649579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-Modified_BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..74b447374d10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699273828582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699273828582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-Modified_bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_pubmedbert_en.md new file mode 100644 index 000000000000..5db1c18cf2b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_pubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_pubmedbert_en_5.2.0_3.0_1699273683103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_pubmedbert_en_5.2.0_3.0_1699273683103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease_Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..370e74c91fab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699274040713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased_en_5.2.0_3.0_1699274040713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease_Modified_scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased_en.md new file mode 100644 index 000000000000..1b746f12a02e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased_en_5.2.0_3.0_1699273644723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased_en_5.2.0_3.0_1699273644723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_disease_modified_scibert_scivocab_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Disease-Modified_scibert_scivocab_uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2_en.md new file mode 100644 index 000000000000..7e71625b1e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2_en_5.2.0_3.0_1699273859980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2_en_5.2.0_3.0_1699273859980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_imbalanced_biobert_base_cased_v1.2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Imbalanced-biobert-base-cased-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_pubmedbert_en.md new file mode 100644 index 000000000000..4e825bfe2736 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_imbalanced_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_imbalanced_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_imbalanced_pubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_pubmedbert_en_5.2.0_3.0_1699274428579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_pubmedbert_en_5.2.0_3.0_1699274428579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_imbalanced_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_imbalanced_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_imbalanced_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Imbalanced-PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext_en.md new file mode 100644 index 000000000000..e7b0a34e369b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext_en_5.2.0_3.0_1699273829751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext_en_5.2.0_3.0_1699273829751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_imbalanced_sapbert_from_pubmedbert_fulltext| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Imbalanced-SapBERT-from-PubMedBERT-fulltext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..ca9ea8adbb85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274659277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274659277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bc5cdr_imbalanced_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BC5CDR-Imbalanced-scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bchem4_modified_biobert_v1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bchem4_modified_biobert_v1_en.md new file mode 100644 index 000000000000..3d34fdf2280a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bchem4_modified_biobert_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bchem4_modified_biobert_v1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bchem4_modified_biobert_v1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bchem4_modified_biobert_v1` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bchem4_modified_biobert_v1_en_5.2.0_3.0_1699274045512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bchem4_modified_biobert_v1_en_5.2.0_3.0_1699274045512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bchem4_modified_biobert_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bchem4_modified_biobert_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bchem4_modified_biobert_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BCHEM4-Modified-BioBERT-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_catalan_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_catalan_ner_ar.md new file mode 100644 index 000000000000..dc71108dc299 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_catalan_ner_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_ner_bert_base_arabic_camelbert_catalan_ner BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_ner_bert_base_arabic_camelbert_catalan_ner +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_arabic_camelbert_catalan_ner` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_catalan_ner_ar_5.2.0_3.0_1699284928701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_catalan_ner_ar_5.2.0_3.0_1699284928701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_catalan_ner","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_arabic_camelbert_catalan_ner", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_arabic_camelbert_catalan_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.6 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-ca-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_danish_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_danish_ner_ar.md new file mode 100644 index 000000000000..df92f4b6335a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_danish_ner_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_ner_bert_base_arabic_camelbert_danish_ner BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_ner_bert_base_arabic_camelbert_danish_ner +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_arabic_camelbert_danish_ner` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_danish_ner_ar_5.2.0_3.0_1699285748425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_danish_ner_ar_5.2.0_3.0_1699285748425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_danish_ner","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_arabic_camelbert_danish_ner", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_arabic_camelbert_danish_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.8 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-da-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_mix_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_mix_ner_ar.md new file mode 100644 index 000000000000..5e844559a1dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_mix_ner_ar.md @@ -0,0 +1,119 @@ +--- +layout: model +title: Arabic Named Entity Recognition (Modern Standard Arabic-MSA, Dialectal Arabic-DA and Classical Arabic-CA) +author: John Snow Labs +name: bert_ner_bert_base_arabic_camelbert_mix_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, ar, open_source, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-base-arabic-camelbert-mix-ner` is an Arabic model originally trained by `CAMeL-Lab`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PERS`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_mix_ner_ar_5.2.0_3.0_1699286605823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_mix_ner_ar_5.2.0_3.0_1699286605823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ +.setInputCol("text") \ +.setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ +.setInputCols("sentence") \ +.setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_mix_ner","ar") \ +.setInputCols(["sentence", "token"]) \ +.setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") +.setInputCols(Array("document")) +.setOutputCol("sentence") + +val tokenizer = new Tokenizer() +.setInputCols(Array("sentence")) +.setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_mix_ner","ar") +.setInputCols(Array("sentence", "token")) +.setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.ner.arabic_camelbert_mix_ner").predict("""أنا أحب الشرارة NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_arabic_camelbert_mix_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-ner +- https://camel.abudhabi.nyu.edu/anercorp/ +- https://arxiv.org/abs/2103.06678 +- https://github.com/CAMeL-Lab/CAMeLBERT +- https://github.com/CAMeL-Lab/camel_tools +- https://github.com/CAMeL-Lab/camel_tools \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_msa_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_msa_ner_ar.md new file mode 100644 index 000000000000..e1ffc875df4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_arabic_camelbert_msa_ner_ar.md @@ -0,0 +1,119 @@ +--- +layout: model +title: Arabic Named Entity Recognition (Modern Standard Arabic-MSA) +author: John Snow Labs +name: bert_ner_bert_base_arabic_camelbert_msa_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, ar, open_source, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-base-arabic-camelbert-msa-ner` is an Arabic model originally trained by `CAMeL-Lab`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PERS`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_msa_ner_ar_5.2.0_3.0_1699285984142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_arabic_camelbert_msa_ner_ar_5.2.0_3.0_1699285984142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ +.setInputCol("text") \ +.setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ +.setInputCols("sentence") \ +.setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_msa_ner","ar") \ +.setInputCols(["sentence", "token"]) \ +.setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") +.setInputCols(Array("document")) +.setOutputCol("sentence") + +val tokenizer = new Tokenizer() +.setInputCols(Array("sentence")) +.setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_arabic_camelbert_msa_ner","ar") +.setInputCols(Array("sentence", "token")) +.setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.ner.arabic_camelbert_msa_ner").predict("""أنا أحب الشرارة NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_arabic_camelbert_msa_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-msa-ner +- https://camel.abudhabi.nyu.edu/anercorp/ +- https://arxiv.org/abs/2103.06678 +- https://github.com/CAMeL-Lab/CAMeLBERT +- https://github.com/CAMeL-Lab/camel_tools +- https://github.com/CAMeL-Lab/camel_tools \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_chunking_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_chunking_en.md new file mode 100644 index 000000000000..d615677eac45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_chunking_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from QCRI) +author: John Snow Labs +name: bert_ner_bert_base_cased_chunking +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-chunking` is an English model originally trained by `QCRI`. 
+ +## Predicted Entities + +`VP`, `ADVP`, `UCP`, `ADJP`, `LST`, `PRT`, `INTJ`, `SBAR`, `CONJP`, `NP`, `PP` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_cased_chunking_en_5.2.0_3.0_1699285172718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_cased_chunking_en_5.2.0_3.0_1699285172718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_cased_chunking","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_cased_chunking","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_cased_chunking| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/QCRI/bert-base-cased-chunking \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_semitic_languages_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_semitic_languages_en.md new file mode 100644 index 000000000000..9d5937d069e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_cased_semitic_languages_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_cased_semitic_languages BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_ner_bert_base_cased_semitic_languages +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_cased_semitic_languages` is an English model originally trained by QCRI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_cased_semitic_languages_en_5.2.0_3.0_1699286156337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_cased_semitic_languages_en_5.2.0_3.0_1699286156337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_cased_semitic_languages","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_cased_semitic_languages", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_cased_semitic_languages| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/QCRI/bert-base-cased-sem \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner_nl.md new file mode 100644 index 000000000000..7b082d27c97d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner_nl.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Dutch BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner +date: 2023-11-06 +tags: [bert, ner, open_source, nl, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-dutch-cased-finetuned-conll2002-ner` is a Dutch model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`misc`, `per`, `org`, `loc` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner_nl_5.2.0_3.0_1699286868952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner_nl_5.2.0_3.0_1699286868952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.bert.conll.cased_base_finetuned").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_dutch_cased_finetuned_conll2002_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|406.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-conll2002-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_sonar_ner_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_sonar_ner_nl.md new file mode 100644 index 000000000000..4779ae3c5994 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_sonar_ner_nl.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Dutch BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_dutch_cased_finetuned_sonar_ner +date: 2023-11-06 +tags: [bert, ner, open_source, nl, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-dutch-cased-finetuned-sonar-ner` is a Dutch model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`pro`, `per`, `misc`, `loc`, `eve`, `org` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_sonar_ner_nl_5.2.0_3.0_1699285459434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_sonar_ner_nl_5.2.0_3.0_1699285459434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_sonar_ner","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_sonar_ner","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.bert.cased_base_finetuned").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_dutch_cased_finetuned_sonar_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|406.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-sonar-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner_nl.md new file mode 100644 index 000000000000..55d8b0a90730 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner_nl.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Dutch BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner +date: 2023-11-06 +tags: [bert, ner, open_source, nl, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-dutch-cased-finetuned-udlassy-ner` is a Dutch model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`TIME`, `WORK_OF_ART`, `FAC`, `NORP`, `PERCENT`, `DATE`, `PRODUCT`, `LANGUAGE`, `CARDINAL`, `EVENT`, `MONEY`, `LAW`, `QUANTITY`, `GPE`, `ORDINAL`, `ORG`, `PERSON`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner_nl_5.2.0_3.0_1699284199218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner_nl_5.2.0_3.0_1699284199218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.bert.cased_base_finetuned.by_wietsedv").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_dutch_cased_finetuned_udlassy_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|406.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-udlassy-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_de.md new file mode 100644 index 000000000000..f614d134a4c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_de.md @@ -0,0 +1,114 @@ +--- +layout: model +title: German BertForTokenClassification Base Cased model (from domischwimmbeck) +author: John Snow Labs +name: bert_ner_bert_base_german_cased_20000_ner +date: 2023-11-06 +tags: [bert, ner, open_source, de, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-20000-ner` is a German model originally trained by `domischwimmbeck`. 
+ +## Predicted Entities + +`PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_20000_ner_de_5.2.0_3.0_1699285737918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_20000_ner_de_5.2.0_3.0_1699285737918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_20000_ner","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_20000_ner","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.bert.cased_base").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_german_cased_20000_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|406.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/domischwimmbeck/bert-base-german-cased-20000-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_uncased_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_uncased_de.md new file mode 100644 index 000000000000..c828469b2d68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_20000_ner_uncased_de.md @@ -0,0 +1,114 @@ +--- +layout: model +title: German BertForTokenClassification Base Uncased model (from domischwimmbeck) +author: John Snow Labs +name: bert_ner_bert_base_german_cased_20000_ner_uncased +date: 2023-11-06 +tags: [bert, ner, open_source, de, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-20000-ner-uncased` is a German model originally trained by `domischwimmbeck`. 
+ +## Predicted Entities + +`PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_20000_ner_uncased_de_5.2.0_3.0_1699286426973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_20000_ner_uncased_de_5.2.0_3.0_1699286426973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_20000_ner_uncased","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_20000_ner_uncased","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.bert.uncased_base").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_german_cased_20000_ner_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.9 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/domischwimmbeck/bert-base-german-cased-20000-ner-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_fine_tuned_ner_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_fine_tuned_ner_de.md new file mode 100644 index 000000000000..585921c7c11c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_fine_tuned_ner_de.md @@ -0,0 +1,115 @@ +--- +layout: model +title: German BertForTokenClassification Base Cased model (from domischwimmbeck) +author: John Snow Labs +name: bert_ner_bert_base_german_cased_fine_tuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, de, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-fine-tuned-ner` is a German model originally trained by `domischwimmbeck`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `OTH` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_fine_tuned_ner_de_5.2.0_3.0_1699285743061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_fine_tuned_ner_de_5.2.0_3.0_1699285743061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_fine_tuned_ner","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_fine_tuned_ner","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.bert.cased_base.by_domischwimmbeck").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_german_cased_fine_tuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|406.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/domischwimmbeck/bert-base-german-cased-fine-tuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=germa_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_own_data_ner_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_own_data_ner_de.md new file mode 100644 index 000000000000..8aaf6d79db3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_german_cased_own_data_ner_de.md @@ -0,0 +1,114 @@ +--- +layout: model +title: German BertForTokenClassification Base Cased model (from domischwimmbeck) +author: John Snow Labs +name: bert_ner_bert_base_german_cased_own_data_ner +date: 2023-11-06 +tags: [bert, ner, open_source, de, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-german-cased-own-data-ner` is a German model originally trained by `domischwimmbeck`. 
+ +## Predicted Entities + +`PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_own_data_ner_de_5.2.0_3.0_1699286002849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_german_cased_own_data_ner_de_5.2.0_3.0_1699286002849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_own_data_ner","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_german_cased_own_data_ner","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.bert.own_data.cased_base.by_domischwimmbeck").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_german_cased_own_data_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|406.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/domischwimmbeck/bert-base-german-cased-own-data-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_akdeniz27_hu.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_akdeniz27_hu.md new file mode 100644 index 000000000000..5648fedcf93f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_akdeniz27_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian bert_ner_bert_base_hungarian_cased_ner_akdeniz27 BertForTokenClassification from akdeniz27 +author: John Snow Labs +name: bert_ner_bert_base_hungarian_cased_ner_akdeniz27 +date: 2023-11-06 +tags: [bert, hu, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_hungarian_cased_ner_akdeniz27` is a Hungarian model originally trained by akdeniz27. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_hungarian_cased_ner_akdeniz27_hu_5.2.0_3.0_1699286179353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_hungarian_cased_ner_akdeniz27_hu_5.2.0_3.0_1699286179353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_hungarian_cased_ner_akdeniz27","hu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_hungarian_cased_ner_akdeniz27", "hu") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_hungarian_cased_ner_akdeniz27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hu| +|Size:|412.5 MB| + +## References + +https://huggingface.co/akdeniz27/bert-base-hungarian-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_fdominik98_hu.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_fdominik98_hu.md new file mode 100644 index 000000000000..c227fb6593db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_hungarian_cased_ner_fdominik98_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian bert_ner_bert_base_hungarian_cased_ner_fdominik98 BertForTokenClassification from fdominik98 +author: John Snow Labs +name: bert_ner_bert_base_hungarian_cased_ner_fdominik98 +date: 2023-11-06 +tags: [bert, hu, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_hungarian_cased_ner_fdominik98` is a Hungarian model originally trained by fdominik98. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_hungarian_cased_ner_fdominik98_hu_5.2.0_3.0_1699286635287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_hungarian_cased_ner_fdominik98_hu_5.2.0_3.0_1699286635287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_hungarian_cased_ner_fdominik98","hu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_hungarian_cased_ner_fdominik98", "hu") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_hungarian_cased_ner_fdominik98| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hu| +|Size:|412.5 MB| + +## References + +https://huggingface.co/fdominik98/bert-base-hu-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_indonesian_ner_id.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_indonesian_ner_id.md new file mode 100644 index 000000000000..3ff21bc6d664 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_indonesian_ner_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian bert_ner_bert_base_indonesian_ner BertForTokenClassification from cahya +author: John Snow Labs +name: bert_ner_bert_base_indonesian_ner +date: 2023-11-06 +tags: [bert, id, open_source, token_classification, onnx] +task: Named Entity Recognition +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_indonesian_ner` is an Indonesian model originally trained by cahya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_indonesian_ner_id_5.2.0_3.0_1699286388251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_indonesian_ner_id_5.2.0_3.0_1699286388251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_indonesian_ner","id") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_indonesian_ner", "id") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_indonesian_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|id| +|Size:|412.7 MB| + +## References + +https://huggingface.co/cahya/bert-base-indonesian-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_irish_cased_v1_finetuned_ner_ga.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_irish_cased_v1_finetuned_ner_ga.md new file mode 100644 index 000000000000..869334fbf678 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_irish_cased_v1_finetuned_ner_ga.md @@ -0,0 +1,115 @@ +--- +layout: model +title: Irish BertForTokenClassification Base Cased model (from jimregan) +author: John Snow Labs +name: bert_ner_bert_base_irish_cased_v1_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, ga, onnx] +task: Named Entity Recognition +language: ga +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-irish-cased-v1-finetuned-ner` is an Irish model originally trained by `jimregan`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_irish_cased_v1_finetuned_ner_ga_5.2.0_3.0_1699286896975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_irish_cased_v1_finetuned_ner_ga_5.2.0_3.0_1699286896975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_irish_cased_v1_finetuned_ner","ga") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Is breá liom Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_irish_cased_v1_finetuned_ner","ga") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Is breá liom Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ga.ner.bert.wikiann.cased_base_finetuned").predict("""Is breá liom Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_irish_cased_v1_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ga| +|Size:|406.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/jimregan/bert-base-irish-cased-v1-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=wikiann \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner_en.md new file mode 100644 index 000000000000..38634e783f59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-conll2002-ner` is an English model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`misc`, `per`, `org`, `loc` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner_en_5.2.0_3.0_1699286738280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner_en_5.2.0_3.0_1699286738280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased_multilingual_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_multilingual_cased_finetuned_conll2002_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-multilingual-cased-finetuned-conll2002-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner_en.md new file mode 100644 index 000000000000..59c5529148b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-sonar-ner` is an English model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`pro`, `per`, `misc`, `loc`, `eve`, `org` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner_en_5.2.0_3.0_1699287134901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner_en_5.2.0_3.0_1699287134901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.cased_multilingual_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_multilingual_cased_finetuned_sonar_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-multilingual-cased-finetuned-sonar-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner_en.md new file mode 100644 index 000000000000..e2989f4b9945 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from wietsedv) +author: John Snow Labs +name: bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-udlassy-ner` is an English model originally trained by `wietsedv`. 
+ +## Predicted Entities + +`TIME`, `WORK_OF_ART`, `FAC`, `NORP`, `PERCENT`, `DATE`, `PRODUCT`, `LANGUAGE`, `CARDINAL`, `EVENT`, `MONEY`, `LAW`, `QUANTITY`, `GPE`, `ORDINAL`, `ORG`, `PERSON`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner_en_5.2.0_3.0_1699287234669.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner_en_5.2.0_3.0_1699287234669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.cased_multilingual_base_finetuned.by_wietsedv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_multilingual_cased_finetuned_udlassy_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/wietsedv/bert-base-multilingual-cased-finetuned-udlassy-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_ner_hrl_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_ner_hrl_ar.md new file mode 100644 index 000000000000..a9bcc10fda15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_ner_hrl_ar.md @@ -0,0 +1,123 @@ +--- +layout: model +title: Arabic Named Entity Recognition (from Davlan) +author: John Snow Labs +name: bert_ner_bert_base_multilingual_cased_ner_hrl +date: 2023-11-06 +tags: [bert, ner, token_classification, ar, open_source, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-base-multilingual-cased-ner-hrl` is an Arabic model originally trained by `Davlan`. 
+ +## Predicted Entities + +`DATE`, `LOC`, `ORG`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_ner_hrl_ar_5.2.0_3.0_1699287585517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_ner_hrl_ar_5.2.0_3.0_1699287585517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ +.setInputCol("text") \ +.setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ +.setInputCols(["document"])\ +.setOutputCol("sentence") + +tokenizer = Tokenizer() \ +.setInputCols("sentence") \ +.setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_ner_hrl","ar") \ +.setInputCols(["sentence", "token"]) \ +.setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["أنا أحب الشرارة NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() +.setInputCol("text") +.setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") +.setInputCols(Array("document")) +.setOutputCol("sentence") + +val tokenizer = new Tokenizer() +.setInputCols(Array("sentence")) +.setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_ner_hrl","ar") +.setInputCols(Array("sentence", "token")) +.setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("أنا أحب الشرارة NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.ner.multilingual_cased_ner_hrl").predict("""أنا أحب الشرارة NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_multilingual_cased_ner_hrl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Davlan/bert-base-multilingual-cased-ner-hrl +- https://camel.abudhabi.nyu.edu/anercorp/ +- https://www.clips.uantwerpen.be/conll2003/ner/ +- https://www.clips.uantwerpen.be/conll2003/ner/ +- https://www.clips.uantwerpen.be/conll2002/ner/ +- https://github.com/EuropeanaNewspapers/ner-corpora/tree/master/enp_FR.bnf.bio +- https://ontotext.fbk.eu/icab.html +- https://github.com/LUMII-AILab/FullStack/tree/master/NamedEntities +- https://www.clips.uantwerpen.be/conll2002/ner/ +- https://github.com/davidsbatista/NER-datasets/tree/master/Portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_semitic_languages_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_semitic_languages_english_en.md new file mode 100644 index 000000000000..6696247525e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_multilingual_cased_semitic_languages_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_multilingual_cased_semitic_languages_english BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_ner_bert_base_multilingual_cased_semitic_languages_english +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained 
BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_multilingual_cased_semitic_languages_english` is an English model originally trained by QCRI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_semitic_languages_english_en_5.2.0_3.0_1699287886874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_multilingual_cased_semitic_languages_english_en_5.2.0_3.0_1699287886874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_multilingual_cased_semitic_languages_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_multilingual_cased_semitic_languages_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_multilingual_cased_semitic_languages_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.2 MB| + +## References + +https://huggingface.co/QCRI/bert-base-multilingual-cased-sem-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_en.md new file mode 100644 index 000000000000..d3bc1a0aef52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_ner BertForTokenClassification from dslim +author: John Snow Labs +name: bert_ner_bert_base_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_ner` is an English model originally trained by dslim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_en_5.2.0_3.0_1699283745489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_en_5.2.0_3.0_1699283745489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/dslim/bert-base-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_finetuned_ner_isu_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_finetuned_ner_isu_en.md new file mode 100644 index 000000000000..07308f94eb5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_finetuned_ner_isu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_ner_finetuned_ner_isu BertForTokenClassification from mcdzwil +author: John Snow Labs +name: bert_ner_bert_base_ner_finetuned_ner_isu +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_ner_finetuned_ner_isu` is an English model originally trained by mcdzwil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_finetuned_ner_isu_en_5.2.0_3.0_1699283923478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_finetuned_ner_isu_en_5.2.0_3.0_1699283923478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_ner_finetuned_ner_isu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_ner_finetuned_ner_isu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_ner_finetuned_ner_isu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mcdzwil/bert-base-NER-finetuned-ner-ISU \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_uncased_en.md new file mode 100644 index 000000000000..d584730dcf42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_ner_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_ner_uncased BertForTokenClassification from dslim +author: John Snow Labs +name: bert_ner_bert_base_ner_uncased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_ner_uncased` is an English model originally trained by dslim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_uncased_en_5.2.0_3.0_1699286352688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_ner_uncased_en_5.2.0_3.0_1699286352688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_ner_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_ner_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_ner_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/dslim/bert-base-NER-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_portuguese_archive_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_portuguese_archive_pt.md new file mode 100644 index 000000000000..f111925261bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_portuguese_archive_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_ner_bert_base_portuguese_archive BertForTokenClassification from lfcc +author: John Snow Labs +name: bert_ner_bert_base_portuguese_archive +date: 2023-11-06 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_portuguese_archive` is a Portuguese model originally trained by lfcc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_portuguese_archive_pt_5.2.0_3.0_1699284396779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_portuguese_archive_pt_5.2.0_3.0_1699284396779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_portuguese_archive","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_portuguese_archive", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_portuguese_archive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| + +## References + +https://huggingface.co/lfcc/bert-base-pt-archive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical_es.md new file mode 100644 index 000000000000..b1cf2b8478a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical BertForTokenClassification from fmmolina +author: John Snow Labs +name: bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical +date: 2023-11-06 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical` is a Castilian, Spanish model originally trained by fmmolina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical_es_5.2.0_3.0_1699287676165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical_es_5.2.0_3.0_1699287676165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_spanish_wwm_uncased_finetuned_ner_medical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.7 MB| + +## References + +https://huggingface.co/fmmolina/bert-base-spanish-wwm-uncased-finetuned-NER-medical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_swedish_cased_neriob_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_swedish_cased_neriob_sv.md new file mode 100644 index 000000000000..5d613e643cd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_swedish_cased_neriob_sv.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Swedish BertForTokenClassification Base Cased model (from KBLab) +author: John Snow Labs +name: bert_ner_bert_base_swedish_cased_neriob +date: 2023-11-06 +tags: [bert, ner, open_source, sv, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-neriob` is a Swedish model originally trained by `KBLab`. 
+ +## Predicted Entities + +`PER`, `LOC`, `LOCORG`, `EVN`, `TME`, `WRK`, `MSR`, `OBJ`, `PRSWRK`, `OBJORG`, `ORG`, `ORGPRS`, `LOCPRS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_swedish_cased_neriob_sv_5.2.0_3.0_1699288022037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_swedish_cased_neriob_sv_5.2.0_3.0_1699288022037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_swedish_cased_neriob","sv") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_swedish_cased_neriob","sv") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.ner.bert.cased_base.neriob.by_kblab").predict("""Jag älskar Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_swedish_cased_neriob| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/KBLab/bert-base-swedish-cased-neriob \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_0.6_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_0.6_en.md new file mode 100644 index 000000000000..1bd774d7fbae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_0.6_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from ricardo-filho) +author: John Snow Labs +name: bert_ner_bert_base_tcm_0.6 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_tcm_0.6` is an English model originally trained by `ricardo-filho`. 
+ +## Predicted Entities + +`VALOR_OBJETO`, `NUMERO_EXERCICIO`, `CRITERIO_JULGAMENTO`, `MODALIDADE_LICITACAO`, `OBJETO_LICITACAO`, `DATA_SESSAO` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tcm_0.6_en_5.2.0_3.0_1699284660937.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tcm_0.6_en_5.2.0_3.0_1699284660937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tcm_0.6","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tcm_0.6","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.tcm_0.6.by_ricardo_filho").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_tcm_0.6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_base_tcm_0.6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_teste_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_teste_en.md new file mode 100644 index 000000000000..f7e765c9268b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tcm_teste_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from ricardo-filho) +author: John Snow Labs +name: bert_ner_bert_base_tcm_teste +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_tcm_teste` is an English model originally trained by `ricardo-filho`. 
+ +## Predicted Entities + +`VALOR_OBJETO`, `NUMERO_EXERCICIO`, `CRITERIO_JULGAMENTO`, `MODALIDADE_LICITACAO`, `OBJETO_LICITACAO`, `DATA_SESSAO` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tcm_teste_en_5.2.0_3.0_1699284886370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tcm_teste_en_5.2.0_3.0_1699284886370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tcm_teste","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tcm_teste","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.tcm_teste.by_ricardo_filho").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_tcm_teste| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_base_tcm_teste \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_pretrained_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_pretrained_tr.md new file mode 100644 index 000000000000..d5f08cf809e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_pretrained_tr.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Turkish BertForTokenClassification Base Cased model (from beyhan) +author: John Snow Labs +name: bert_ner_bert_base_turkish_ner_cased_pretrained +date: 2023-11-06 +tags: [bert, ner, open_source, tr, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-ner-cased-pretrained` is a Turkish model originally trained by `beyhan`. 
+ +## Predicted Entities + +`LOC`, `U-ORG`, `PER`, `U-LOC`, `L-ORG`, `U-PER`, `ORG`, `L-LOC`, `L-PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_turkish_ner_cased_pretrained_tr_5.2.0_3.0_1699287757355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_turkish_ner_cased_pretrained_tr_5.2.0_3.0_1699287757355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_turkish_ner_cased_pretrained","tr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_turkish_ner_cased_pretrained","tr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.ner.bert.cased_base.by_beyhan").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_turkish_ner_cased_pretrained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/beyhan/bert-base-turkish-ner-cased-pretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_tr.md new file mode 100644 index 000000000000..9ace6672bc48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_turkish_ner_cased_tr.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Turkish Named Entity Recognition (from savasy) +author: John Snow Labs +name: bert_ner_bert_base_turkish_ner_cased +date: 2023-11-06 +tags: [bert, ner, token_classification, tr, open_source, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-base-turkish-ner-cased` is a Turkish model originally trained by `savasy`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_turkish_ner_cased_tr_5.2.0_3.0_1699285123857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_turkish_ner_cased_tr_5.2.0_3.0_1699285123857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_turkish_ner_cased","tr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_turkish_ner_cased","tr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.ner.bert.cased_base.by_savasy").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_turkish_ner_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/savasy/bert-base-turkish-ner-cased +- https://schweter.eu/storage/turkish-bert-wikiann/$file +- https://github.com/stefan-it/turkish-bert/files/4558187/nerdata.txt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tweetner_2020_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tweetner_2020_en.md new file mode 100644 index 000000000000..f0c704e80c8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_tweetner_2020_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from tner) +author: John Snow Labs +name: bert_ner_bert_base_tweetner_2020 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-tweetner-2020` is an English model originally trained by `tner`. 
+ +## Predicted Entities + +`product`, `creative_work`, `event`, `person`, `corporation`, `group`, `location` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tweetner_2020_en_5.2.0_3.0_1699288281740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_tweetner_2020_en_5.2.0_3.0_1699288281740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tweetner_2020","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_tweetner_2020","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tweet.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_tweetner_2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tner/bert-base-tweetner-2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_clinical_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_clinical_ner_en.md new file mode 100644 index 000000000000..baa491c3b818 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_clinical_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Base Uncased model (from samrawal) +author: John Snow Labs +name: bert_ner_bert_base_uncased_clinical_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased_clinical-ner` is an English model originally trained by `samrawal`. 
+ +## Predicted Entities + +`treatment`, `problem`, `test` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_clinical_ner_en_5.2.0_3.0_1699288511788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_clinical_ner_en_5.2.0_3.0_1699288511788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_uncased_clinical_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_uncased_clinical_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.clinical.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_uncased_clinical_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/samrawal/bert-base-uncased_clinical-ner +- https://n2c2.dbmi.hms.harvard.edu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_kinyarwanda_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_kinyarwanda_en.md new file mode 100644 index 000000000000..ef7ad38084b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_kinyarwanda_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_uncased_kinyarwanda BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_bert_base_uncased_kinyarwanda +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_uncased_kinyarwanda` is an English model originally trained by arnolfokam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_kinyarwanda_en_5.2.0_3.0_1699285997600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_kinyarwanda_en_5.2.0_3.0_1699285997600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer produces the "token" column that the classifier consumes +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_uncased_kinyarwanda","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer produces the "token" column that the classifier consumes +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_uncased_kinyarwanda", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_uncased_kinyarwanda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/arnolfokam/bert-base-uncased-kin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_nigerian_pidgin_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_nigerian_pidgin_en.md new file mode 100644 index 000000000000..35ac0dd981f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_nigerian_pidgin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_base_uncased_nigerian_pidgin BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_bert_base_uncased_nigerian_pidgin +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_uncased_nigerian_pidgin` is an English model originally trained by arnolfokam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_nigerian_pidgin_en_5.2.0_3.0_1699288440527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_nigerian_pidgin_en_5.2.0_3.0_1699288440527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer produces the "token" column that the classifier consumes +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_uncased_nigerian_pidgin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer produces the "token" column that the classifier consumes +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_uncased_nigerian_pidgin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_uncased_nigerian_pidgin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/arnolfokam/bert-base-uncased-pcm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_swahili_macrolanguage_sw.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_swahili_macrolanguage_sw.md new file mode 100644 index 000000000000..676fb66eae91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_base_uncased_swahili_macrolanguage_sw.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swahili (macrolanguage) bert_ner_bert_base_uncased_swahili_macrolanguage BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_bert_base_uncased_swahili_macrolanguage +date: 2023-11-06 +tags: [bert, sw, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sw +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_base_uncased_swahili_macrolanguage` is a Swahili (macrolanguage) model originally trained by arnolfokam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_swahili_macrolanguage_sw_5.2.0_3.0_1699286179382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_base_uncased_swahili_macrolanguage_sw_5.2.0_3.0_1699286179382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer produces the "token" column that the classifier consumes +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_base_uncased_swahili_macrolanguage","sw") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer produces the "token" column that the classifier consumes +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_base_uncased_swahili_macrolanguage", "sw") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_base_uncased_swahili_macrolanguage| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sw| +|Size:|403.7 MB| + +## References + +https://huggingface.co/arnolfokam/bert-base-uncased-swa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_degree_major_ner_1000_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_degree_major_ner_1000_en.md new file mode 100644 index 000000000000..9a9f03858158 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_degree_major_ner_1000_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from pkushiqiang) +author: John Snow Labs +name: bert_ner_bert_degree_major_ner_1000 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-degree-major-ner-1000` is an English model originally trained by `pkushiqiang`. + +## Predicted Entities + +`degree`, `major` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_degree_major_ner_1000_en_5.2.0_3.0_1699286418066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_degree_major_ner_1000_en_5.2.0_3.0_1699286418066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_degree_major_ner_1000","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_degree_major_ner_1000","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.degree_major_ner_1000.by_pkushiqiang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_degree_major_ner_1000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/pkushiqiang/bert-degree-major-ner-1000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_dnrti_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_dnrti_en.md new file mode 100644 index 000000000000..ef0f576c44cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_dnrti_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_dnrti BertForTokenClassification from varsha12 +author: John Snow Labs +name: bert_ner_bert_dnrti +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_dnrti` is an English model originally trained by varsha12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_dnrti_en_5.2.0_3.0_1699274422209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_dnrti_en_5.2.0_3.0_1699274422209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer produces the "token" column that the classifier consumes +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_dnrti","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer produces the "token" column that the classifier consumes +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_dnrti", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_dnrti| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/varsha12/BERT_DNRTI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ehsan_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ehsan_ner_accelerate_en.md new file mode 100644 index 000000000000..de8e5760b96d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ehsan_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from EhsanYB) +author: John Snow Labs +name: bert_ner_bert_ehsan_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-ehsan-ner-accelerate` is an English model originally trained by `EhsanYB`. + +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ehsan_ner_accelerate_en_5.2.0_3.0_1699285345341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ehsan_ner_accelerate_en_5.2.0_3.0_1699285345341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ehsan_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ehsan_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_ehsanyb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_ehsan_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/EhsanYB/bert-ehsan-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ades_model_1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ades_model_1_en.md new file mode 100644 index 000000000000..b7f9d02c50ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ades_model_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_finetuned_ades_model_1 BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: bert_ner_bert_finetuned_ades_model_1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_finetuned_ades_model_1` is an English model originally trained by ajtamayoh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ades_model_1_en_5.2.0_3.0_1699286687992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ades_model_1_en_5.2.0_3.0_1699286687992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer produces the "token" column that the classifier consumes +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ades_model_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer produces the "token" column that the classifier consumes +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_finetuned_ades_model_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ades_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ajtamayoh/bert-finetuned-ADEs_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_comp2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_comp2_en.md new file mode 100644 index 000000000000..0d996104a936 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_comp2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from brad1141) +author: John Snow Labs +name: bert_ner_bert_finetuned_comp2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-comp2` is an English model originally trained by `brad1141`. + +## Predicted Entities + +`Concluding Statement`, `Position`, `Lead`, `Rebuttal`, `Evidence`, `Counterclaim`, `Claim` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_comp2_en_5.2.0_3.0_1699288704991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_comp2_en_5.2.0_3.0_1699288704991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_comp2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_comp2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_brad1141").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_comp2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/brad1141/bert-finetuned-comp2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_filler_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_filler_2_en.md new file mode 100644 index 000000000000..dca73ebb2df9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_filler_2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from rdchambers) +author: John Snow Labs +name: bert_ner_bert_finetuned_filler_2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-filler-2` is an English model originally trained by `rdchambers`. + +## Predicted Entities + +`Null`, `Filler` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_filler_2_en_5.2.0_3.0_1699285723143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_filler_2_en_5.2.0_3.0_1699285723143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_filler_2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_filler_2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_rdchambers").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_filler_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/rdchambers/bert-finetuned-filler-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_0_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_0_en.md new file mode 100644 index 000000000000..f6e84aabf4fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_0_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Salvatore) +author: John Snow Labs +name: bert_ner_bert_finetuned_mutation_recognition_0 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mutation-recognition-0` is an English model originally trained by `Salvatore`. 
+ +## Predicted Entities + +`SNP`, `ProteinMutation`, `DNAMutation` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_0_en_5.2.0_3.0_1699286939559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_0_en_5.2.0_3.0_1699286939559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_0","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_0","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.mutation_recognition_0.by_salvatore").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_mutation_recognition_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Salvatore/bert-finetuned-mutation-recognition-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_1_en.md new file mode 100644 index 000000000000..00aad0ac5595 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_1_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Salvatore) +author: John Snow Labs +name: bert_ner_bert_finetuned_mutation_recognition_1 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mutation-recognition-1` is an English model originally trained by `Salvatore`. 
+ +## Predicted Entities + +`SNP`, `ProteinMutation`, `DNAMutation` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_1_en_5.2.0_3.0_1699289221752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_1_en_5.2.0_3.0_1699289221752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_1","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_1","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.mutation_recognition_1.by_salvatore").predict("""PUT YOUR STRING HERE""") +``` +
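The Download and S3 links on these cards appear to share one archive naming pattern: model name, language, Spark NLP version, Spark version, and a numeric suffix. The sketch below splits a file name along that pattern. The pattern is inferred from the links in these cards rather than from any documented convention, and reading the suffix as an epoch timestamp in milliseconds is likewise an assumption; `parse_archive_name` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Inferred layout: <name>_<lang>_<sparknlp version>_<spark version>_<suffix>.zip
ARCHIVE_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<timestamp>\d+)\.zip$"
)


def parse_archive_name(filename: str) -> dict:
    """Split a model archive file name into its assumed components."""
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    parts = match.groupdict()
    # Assumption: the numeric suffix is epoch time in milliseconds.
    parts["uploaded"] = datetime.fromtimestamp(
        int(parts["timestamp"]) / 1000, tz=timezone.utc
    )
    return parts


info = parse_archive_name(
    "bert_ner_bert_finetuned_mutation_recognition_1_en_5.2.0_3.0_1699289221752.zip"
)
print(info["name"], info["lang"], info["sparknlp"], info["spark"], info["uploaded"].date())
```

Under that reading, the suffix for this archive falls on the same day as the `date` field in the card's front matter, which is consistent with the assumption but does not confirm it.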
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_mutation_recognition_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Salvatore/bert-finetuned-mutation-recognition-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_2_en.md new file mode 100644 index 000000000000..3be822ccda6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Salvatore) +author: John Snow Labs +name: bert_ner_bert_finetuned_mutation_recognition_2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mutation-recognition-2` is an English model originally trained by `Salvatore`. 
+ +## Predicted Entities + +`SNP`, `ProteinMutation`, `DNAMutation` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_2_en_5.2.0_3.0_1699288948606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_2_en_5.2.0_3.0_1699288948606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.mutation_recognition_2.by_salvatore").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_mutation_recognition_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Salvatore/bert-finetuned-mutation-recognition-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_3_en.md new file mode 100644 index 000000000000..9264919ac144 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_3_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Salvatore) +author: John Snow Labs +name: bert_ner_bert_finetuned_mutation_recognition_3 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mutation-recognition-3` is an English model originally trained by `Salvatore`. 
+ +## Predicted Entities + +`SNP`, `ProteinMutation`, `DNAMutation` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_3_en_5.2.0_3.0_1699288342397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_3_en_5.2.0_3.0_1699288342397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_3","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_3","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.mutation_recognition_3.by_salvatore").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_mutation_recognition_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Salvatore/bert-finetuned-mutation-recognition-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_4_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_4_en.md new file mode 100644 index 000000000000..567ae730a700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_mutation_recognition_4_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Salvatore) +author: John Snow Labs +name: bert_ner_bert_finetuned_mutation_recognition_4 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-mutation-recognition-4` is an English model originally trained by `Salvatore`. 
+ +## Predicted Entities + +`SNP`, `ProteinMutation`, `DNAMutation` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_4_en_5.2.0_3.0_1699289226587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_mutation_recognition_4_en_5.2.0_3.0_1699289226587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_4","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_mutation_recognition_4","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.mutation_recognition_4.by_salvatore").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_mutation_recognition_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Salvatore/bert-finetuned-mutation-recognition-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner1_en.md new file mode 100644 index 000000000000..efefe8d6ffaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner1_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Wende) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner1 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner1` is an English model originally trained by `Wende`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner1_en_5.2.0_3.0_1699289487602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner1_en_5.2.0_3.0_1699289487602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner1","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner1","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned_v2.by_Wende").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Wende/bert-finetuned-ner1 +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner2_en.md new file mode 100644 index 000000000000..3c2dad5357bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Lamine) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner2` is an English model originally trained by `Lamine`. 
+ +## Predicted Entities + +`geo`, `org`, `tim`, `gpe`, `per` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner2_en_5.2.0_3.0_1699289789019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner2_en_5.2.0_3.0_1699289789019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.sourcerecognition.v2.by_lamine").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lamine/bert-finetuned-ner2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner3_en.md new file mode 100644 index 000000000000..a6a3160e1326 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner3_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Ghost1) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner3 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner3` is an English model originally trained by `Ghost1`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner3_en_5.2.0_3.0_1699287206516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner3_en_5.2.0_3.0_1699287206516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner3","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner3","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_Ghost1").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Ghost1/bert-finetuned-ner3 +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_chinese_zh.md new file mode 100644 index 000000000000..c11f9f9c9a35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_chinese_zh.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Chinese BertForTokenClassification Cased model (from Yip) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_chinese +date: 2023-11-06 +tags: [bert, ner, open_source, zh, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-chinese` is a Chinese model originally trained by `Yip`. 
+ +## Predicted Entities + +`company`, `name`, `position`, `movie`, `organization`, `scene`, `game`, `address`, `0`, `book`, `government` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_chinese_zh_5.2.0_3.0_1699286378509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_chinese_zh_5.2.0_3.0_1699286378509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_chinese","zh") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_chinese","zh") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.ner.bert.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Yip/bert-finetuned-ner-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..1401996a5a43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from caotianyu1996) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner` is an English model originally trained by `caotianyu1996`. + +## Predicted Entities + +`Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_en_5.2.0_3.0_1699285947988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_en_5.2.0_3.0_1699285947988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_caotianyu1996").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/caotianyu1996/bert_finetuned_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_sourcerecognition_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_sourcerecognition_en.md new file mode 100644 index 000000000000..098ecf60c9e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_sourcerecognition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_finetuned_ner_sourcerecognition BertForTokenClassification from Lamine +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_sourcerecognition +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_finetuned_ner_sourcerecognition` is an English model originally trained by Lamine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_sourcerecognition_en_5.2.0_3.0_1699286124620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_sourcerecognition_en_5.2.0_3.0_1699286124620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_sourcerecognition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_finetuned_ner_sourcerecognition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_sourcerecognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Lamine/bert-finetuned-ner_SourceRecognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart_sv.md new file mode 100644 index 000000000000..54f9e43f1d8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart_sv.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Swedish BertForTokenClassification Small Cased model (from Nonzerophilip) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart +date: 2023-11-06 +tags: [bert, ner, open_source, sv, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner_swedish_small_set_health_and_standart` is a Swedish model originally trained by `Nonzerophilip`. 
+ +## Predicted Entities + +`PER`, `ORG`, `LOC`, `HEALTH`, `relation`, `PHARMA_DRUGS` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart_sv_5.2.0_3.0_1699288719693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart_sv_5.2.0_3.0_1699288719693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart","sv") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart","sv") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.ner.bert.small_finetuned").predict("""Jag älskar Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_swedish_small_set_health_and_standart| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Nonzerophilip/bert-finetuned-ner_swedish_small_set_health_and_standart \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_large_set_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_large_set_sv.md new file mode 100644 index 000000000000..cb20ddae9abb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_large_set_sv.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Swedish BertForTokenClassification Large Cased model (from Nonzerophilip) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_swedish_test_large_set +date: 2023-11-06 +tags: [bert, ner, open_source, sv, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner_swedish_test_large_set` is a Swedish model originally trained by `Nonzerophilip`. 
+ +## Predicted Entities + +`MISC`, `inst`, `person`, `NAN`, `place` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_large_set_sv_5.2.0_3.0_1699288994399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_large_set_sv_5.2.0_3.0_1699288994399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_test_large_set","sv") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_test_large_set","sv") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.ner.bert.large_finetuned").predict("""Jag älskar Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_swedish_test_large_set| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Nonzerophilip/bert-finetuned-ner_swedish_test_large_set \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_numb_2_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_numb_2_sv.md new file mode 100644 index 000000000000..ed494845aacb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_numb_2_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_ner_bert_finetuned_ner_swedish_test_numb_2 BertForTokenClassification from Nonzerophilip +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_swedish_test_numb_2 +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_finetuned_ner_swedish_test_numb_2` is a Swedish model originally trained by Nonzerophilip.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_numb_2_sv_5.2.0_3.0_1699289976673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_numb_2_sv_5.2.0_3.0_1699289976673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_test_numb_2","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_finetuned_ner_swedish_test_numb_2", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_swedish_test_numb_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| + +## References + +https://huggingface.co/Nonzerophilip/bert-finetuned-ner_swedish_test_NUMb_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_sv.md new file mode 100644 index 000000000000..cd8985bacb77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_ner_swedish_test_sv.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Swedish BertForTokenClassification Cased model (from Nonzerophilip) +author: John Snow Labs +name: bert_ner_bert_finetuned_ner_swedish_test +date: 2023-11-06 +tags: [bert, ner, open_source, sv, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner_swedish_test` is a Swedish model originally trained by `Nonzerophilip`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_sv_5.2.0_3.0_1699286684416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_ner_swedish_test_sv_5.2.0_3.0_1699286684416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_test","sv") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Jag älskar Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_ner_swedish_test","sv") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Jag älskar Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.ner.bert.finetuned").predict("""Jag älskar Spark NLP""") +``` +
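The pipeline above emits one label per token in the `ner` column. A minimal post-processing sketch follows, assuming the labels carry IOB-style prefixes such as `B-PER` and `I-PER` (the model card lists only the bare entity names, so the prefix scheme is an assumption), with a hypothetical helper name `group_iob_entities` that is not part of Spark NLP:

```python
from typing import List, Optional, Tuple


def group_iob_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level IOB tags into (entity_text, label) pairs.

    Hypothetical helper, not a Spark NLP API. Assumes tags look like
    "B-PER", "I-PER", or "O".
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" and label:
            # A new entity starts; close any open one first.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], label
        elif prefix == "I" and label and label == current_label:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match an open entity:
            # close the open entity (if any) and skip this token.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities
```

With Spark NLP, the two input lists would typically come from the `token.result` and `ner.result` arrays of a collected row; the tags used to try the helper should be taken from actual model output rather than assumed.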
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_ner_swedish_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Nonzerophilip/bert-finetuned-ner_swedish_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_protagonist_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_protagonist_en.md new file mode 100644 index 000000000000..79b156884c88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_finetuned_protagonist_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from airi) +author: John Snow Labs +name: bert_ner_bert_finetuned_protagonist +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-protagonist` is an English model originally trained by `airi`.
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_protagonist_en_5.2.0_3.0_1699289429567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_finetuned_protagonist_en_5.2.0_3.0_1699289429567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_protagonist","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_finetuned_protagonist","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.protagonist.by_airi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_finetuned_protagonist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/airi/bert-finetuned-protagonist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_german_ner_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_german_ner_de.md new file mode 100644 index 000000000000..dcc10cf71d02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_german_ner_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_ner_bert_german_ner BertForTokenClassification from fhswf +author: John Snow Labs +name: bert_ner_bert_german_ner +date: 2023-11-06 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_german_ner` is a German model originally trained by fhswf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_german_ner_de_5.2.0_3.0_1699288005409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_german_ner_de_5.2.0_3.0_1699288005409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_german_ner","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_german_ner", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_german_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.9 MB| + +## References + +https://huggingface.co/fhswf/bert_de_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_keyword_extractor_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_keyword_extractor_en.md new file mode 100644 index 000000000000..63fa0f0653fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_keyword_extractor_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from yanekyuk) +author: John Snow Labs +name: bert_ner_bert_keyword_extractor +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-keyword-extractor` is an English model originally trained by `yanekyuk`. + +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_keyword_extractor_en_5.2.0_3.0_1699286944350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_keyword_extractor_en_5.2.0_3.0_1699286944350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_keyword_extractor","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_keyword_extractor","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_yanekyuk").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_keyword_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yanekyuk/bert-keyword-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_cased_finetuned_ner_en.md new file mode 100644 index 000000000000..f0ba1c9ce21f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_cased_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Large Cased model (from dpalominop) +author: John Snow Labs +name: bert_ner_bert_large_cased_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-ner` is an English model originally trained by `dpalominop`.
+ +## Predicted Entities + +`OCC`, `DIS`, `RES` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_cased_finetuned_ner_en_5.2.0_3.0_1699288071995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_cased_finetuned_ner_en_5.2.0_3.0_1699288071995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_cased_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_cased_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.cased_large_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_large_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dpalominop/bert-large-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_tweetner_2020_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_tweetner_2020_en.md new file mode 100644 index 000000000000..5ec096f0d7dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_tweetner_2020_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Large Cased model (from tner) +author: John Snow Labs +name: bert_ner_bert_large_tweetner_2020 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-tweetner-2020` is an English model originally trained by `tner`. 
+ +## Predicted Entities + +`corporation`, `product`, `location`, `person`, `creative_work`, `group`, `event` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_tweetner_2020_en_5.2.0_3.0_1699289944517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_tweetner_2020_en_5.2.0_3.0_1699289944517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_tweetner_2020","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_tweetner_2020","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tweet.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_large_tweetner_2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tner/bert-large-tweetner-2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_finetuned_ner_en.md new file mode 100644 index 000000000000..a939933164ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English Named Entity Recognition (from Jorgeutd) +author: John Snow Labs +name: bert_ner_bert_large_uncased_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-large-uncased-finetuned-ner` is an English model originally trained by `Jorgeutd`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_uncased_finetuned_ner_en_5.2.0_3.0_1699287462496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_uncased_finetuned_ner_en_5.2.0_3.0_1699287462496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_uncased_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_uncased_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.uncased_large_finetuned").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_large_uncased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Jorgeutd/bert-large-uncased-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_med_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_med_ner_en.md new file mode 100644 index 000000000000..afc07624bdcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_large_uncased_med_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Large Uncased model (from samrawal) +author: John Snow Labs +name: bert_ner_bert_large_uncased_med_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased_med-ner` is an English model originally trained by `samrawal`. 
+ +## Predicted Entities + +`do`, `mo`, `f`, `m`, `du`, `r` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_uncased_med_ner_en_5.2.0_3.0_1699288074969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_large_uncased_med_ner_en_5.2.0_3.0_1699288074969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_uncased_med_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_large_uncased_med_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.uncased_large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_large_uncased_med_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/samrawal/bert-large-uncased_med-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_english_vera_pro_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_english_vera_pro_en.md new file mode 100644 index 000000000000..7e50446bc010 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_english_vera_pro_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_mention_english_vera_pro BertForTokenClassification from vera-pro +author: John Snow Labs +name: bert_ner_bert_mention_english_vera_pro +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_mention_english_vera_pro` is an English model originally trained by vera-pro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_english_vera_pro_en_5.2.0_3.0_1699288592892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_english_vera_pro_en_5.2.0_3.0_1699288592892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
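+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_mention_english_vera_pro","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# Placeholder input; replace with your own text +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_mention_english_vera_pro", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// Placeholder input; replace with your own text +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +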
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_mention_english_vera_pro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/vera-pro/bert-mention-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_french_vera_pro_fr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_french_vera_pro_fr.md new file mode 100644 index 000000000000..34f137a6d706 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_french_vera_pro_fr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: French bert_ner_bert_mention_french_vera_pro BertForTokenClassification from vera-pro +author: John Snow Labs +name: bert_ner_bert_mention_french_vera_pro +date: 2023-11-06 +tags: [bert, fr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_mention_french_vera_pro` is a French model originally trained by vera-pro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_french_vera_pro_fr_5.2.0_3.0_1699288386379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_french_vera_pro_fr_5.2.0_3.0_1699288386379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
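+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_mention_french_vera_pro","fr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# Placeholder input; replace with your own text +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_mention_french_vera_pro", "fr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// Placeholder input; replace with your own text +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +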
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_mention_french_vera_pro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fr| +|Size:|665.1 MB| + +## References + +https://huggingface.co/vera-pro/bert-mention-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_german_vera_pro_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_german_vera_pro_de.md new file mode 100644 index 000000000000..b613a87d7349 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mention_german_vera_pro_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_ner_bert_mention_german_vera_pro BertForTokenClassification from vera-pro +author: John Snow Labs +name: bert_ner_bert_mention_german_vera_pro +date: 2023-11-06 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_mention_german_vera_pro` is a German model originally trained by vera-pro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_german_vera_pro_de_5.2.0_3.0_1699288386584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mention_german_vera_pro_de_5.2.0_3.0_1699288386584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
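+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_mention_german_vera_pro","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# Placeholder input; replace with your own text +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_mention_german_vera_pro", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// Placeholder input; replace with your own text +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +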
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_mention_german_vera_pro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|665.1 MB| + +## References + +https://huggingface.co/vera-pro/bert-mention-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mt4ts_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mt4ts_en.md new file mode 100644 index 000000000000..1527b623f0ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_mt4ts_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bert_mt4ts BertForTokenClassification from kevinjesse +author: John Snow Labs +name: bert_ner_bert_mt4ts +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_mt4ts` is an English model originally trained by kevinjesse. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mt4ts_en_5.2.0_3.0_1699286187357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_mt4ts_en_5.2.0_3.0_1699286187357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
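+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_mt4ts","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# Placeholder input; replace with your own text +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_mt4ts", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// Placeholder input; replace with your own text +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +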
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_mt4ts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|549.8 MB| + +## References + +https://huggingface.co/kevinjesse/bert-MT4TS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_conll2002_nld_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_conll2002_nld_en.md new file mode 100644 index 000000000000..c75d51db5adf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_conll2002_nld_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from proycon) +author: John Snow Labs +name: bert_ner_bert_ner_cased_conll2002_nld +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-ner-cased-conll2002-nld` is an English model originally trained by `proycon`. + +## Predicted Entities + +`misc`, `org`, `per`, `loc` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_cased_conll2002_nld_en_5.2.0_3.0_1699288665724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_cased_conll2002_nld_en_5.2.0_3.0_1699288665724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_cased_conll2002_nld","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_cased_conll2002_nld","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_ner_cased_conll2002_nld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/proycon/bert-ner-cased-conll2002-nld \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_sonar1_nld_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_sonar1_nld_en.md new file mode 100644 index 000000000000..8a9c9973be2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_cased_sonar1_nld_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from proycon) +author: John Snow Labs +name: bert_ner_bert_ner_cased_sonar1_nld +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-ner-cased-sonar1-nld` is an English model originally trained by `proycon`. 
+ +## Predicted Entities + +`misc`, `org`, `eve`, `pro`, `loc`, `per` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_cased_sonar1_nld_en_5.2.0_3.0_1699290244203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_cased_sonar1_nld_en_5.2.0_3.0_1699290244203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_cased_sonar1_nld","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_cased_sonar1_nld","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_ner_cased_sonar1_nld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/proycon/bert-ner-cased-sonar1-nld \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_i2b2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_i2b2_en.md new file mode 100644 index 000000000000..07363e8e2c0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_ner_i2b2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from connorboyle) +author: John Snow Labs +name: bert_ner_bert_ner_i2b2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-ner-i2b2` is an English model originally trained by `connorboyle`. 
+ +## Predicted Entities + +`STATE`, `ORGANIZATION`, `BIOID`, `HEALTHPLAN`, `PATIENT`, `COUNTRY`, `AGE`, `FAX`, `LOCATION`, `PHONE`, `IDNUM`, `DOCTOR`, `URL`, `DEVICE`, `STREET`, `DATE`, `ZIP`, `CITY`, `EMAIL`, `MEDICALRECORD`, `USERNAME`, `HOSPITAL`, `PROFESSION` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_i2b2_en_5.2.0_3.0_1699290313908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_ner_i2b2_en_5.2.0_3.0_1699290313908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_i2b2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_ner_i2b2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_connorboyle").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_ner_i2b2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/connorboyle/bert-ner-i2b2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_arman_fa.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_arman_fa.md new file mode 100644 index 000000000000..8a85947a609c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_arman_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian bert_ner_bert_persian_farsi_base_uncased_ner_arman BertForTokenClassification from HooshvareLab +author: John Snow Labs +name: bert_ner_bert_persian_farsi_base_uncased_ner_arman +date: 2023-11-06 +tags: [bert, fa, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_persian_farsi_base_uncased_ner_arman` is a Persian model originally trained by HooshvareLab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_base_uncased_ner_arman_fa_5.2.0_3.0_1699288728231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_base_uncased_ner_arman_fa_5.2.0_3.0_1699288728231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_persian_farsi_base_uncased_ner_arman","fa") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_persian_farsi_base_uncased_ner_arman", "fa") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_persian_farsi_base_uncased_ner_arman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fa| +|Size:|606.5 MB| + +## References + +https://huggingface.co/HooshvareLab/bert-fa-base-uncased-ner-arman \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_peyma_fa.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_peyma_fa.md new file mode 100644 index 000000000000..afd53a37430d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_base_uncased_ner_peyma_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian bert_ner_bert_persian_farsi_base_uncased_ner_peyma BertForTokenClassification from HooshvareLab +author: John Snow Labs +name: bert_ner_bert_persian_farsi_base_uncased_ner_peyma +date: 2023-11-06 +tags: [bert, fa, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_persian_farsi_base_uncased_ner_peyma` is a Persian model originally trained by HooshvareLab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_base_uncased_ner_peyma_fa_5.2.0_3.0_1699288961564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_base_uncased_ner_peyma_fa_5.2.0_3.0_1699288961564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_persian_farsi_base_uncased_ner_peyma","fa") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_persian_farsi_base_uncased_ner_peyma", "fa") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_persian_farsi_base_uncased_ner_peyma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fa| +|Size:|606.6 MB| + +## References + +https://huggingface.co/HooshvareLab/bert-fa-base-uncased-ner-peyma \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_zwnj_base_ner_fa.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_zwnj_base_ner_fa.md new file mode 100644 index 000000000000..e657c585faf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_persian_farsi_zwnj_base_ner_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian bert_ner_bert_persian_farsi_zwnj_base_ner BertForTokenClassification from HooshvareLab +author: John Snow Labs +name: bert_ner_bert_persian_farsi_zwnj_base_ner +date: 2023-11-06 +tags: [bert, fa, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bert_persian_farsi_zwnj_base_ner` is a Persian model originally trained by HooshvareLab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_zwnj_base_ner_fa_5.2.0_3.0_1699285522969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_persian_farsi_zwnj_base_ner_fa_5.2.0_3.0_1699285522969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_persian_farsi_zwnj_base_ner","fa") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bert_persian_farsi_zwnj_base_ner", "fa") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_persian_farsi_zwnj_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fa| +|Size:|441.6 MB| + +## References + +https://huggingface.co/HooshvareLab/bert-fa-zwnj-base-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_small_finetuned_typo_detection_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_small_finetuned_typo_detection_en.md new file mode 100644 index 000000000000..171b565bef55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_small_finetuned_typo_detection_en.md @@ -0,0 +1,117 @@ +--- +layout: model +title: English Named Entity Recognition (from mrm8488) +author: John Snow Labs +name: bert_ner_bert_small_finetuned_typo_detection +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-small-finetuned-typo-detection` is an English model originally trained by `mrm8488`. + +## Predicted Entities + +`typo`, `ok` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_small_finetuned_typo_detection_en_5.2.0_3.0_1699290367344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_small_finetuned_typo_detection_en_5.2.0_3.0_1699290367344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_small_finetuned_typo_detection","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_small_finetuned_typo_detection","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.small_finetuned").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_small_finetuned_typo_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|41.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mrm8488/bert-small-finetuned-typo-detection +- https://github.com/mhagiwara/github-typo-corpus +- https://github.com/mhagiwara/github-typo-corpus +- https://twitter.com/mrm8488 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_spanish_cased_finetuned_ner_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_spanish_cased_finetuned_ner_es.md new file mode 100644 index 000000000000..a9ce8c143c5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_spanish_cased_finetuned_ner_es.md @@ -0,0 +1,118 @@ +--- +layout: model +title: Spanish Named Entity Recognition (from mrm8488) +author: John Snow Labs +name: bert_ner_bert_spanish_cased_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, es, open_source, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `bert-spanish-cased-finetuned-ner` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_spanish_cased_finetuned_ner_es_5.2.0_3.0_1699290603269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_spanish_cased_finetuned_ner_es_5.2.0_3.0_1699290603269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_spanish_cased_finetuned_ner","es") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_spanish_cased_finetuned_ner","es") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.ner.bert.cased_finetuned").predict("""Amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_spanish_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-ner +- https://www.kaggle.com/nltkdata/conll-corpora +- https://github.com/dccuchile/beto +- https://www.kaggle.com/nltkdata/conll-corpora +- https://twitter.com/mrm8488 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_split_title_org_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_split_title_org_en.md new file mode 100644 index 000000000000..7ae136c1447c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_split_title_org_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from pkushiqiang) +author: John Snow Labs +name: bert_ner_bert_split_title_org +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-split-title-org` is an English model originally trained by `pkushiqiang`. 
+ +## Predicted Entities + +`org`, `jbttl_extra`, `degree`, `major`, `job_title` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_split_title_org_en_5.2.0_3.0_1699290907094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_split_title_org_en_5.2.0_3.0_1699290907094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_split_title_org","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_split_title_org","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.split_title_org.by_pkushiqiang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_split_title_org| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/pkushiqiang/bert-split-title-org \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_en.md new file mode 100644 index 000000000000..dfb7d9132f26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Aleksandar) +author: John Snow Labs +name: bert_ner_bert_srb_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-srb-ner` is an English model originally trained by `Aleksandar`. + +## Predicted Entities + +`org`, `per`, `loc` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_srb_ner_en_5.2.0_3.0_1699288860558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_srb_ner_en_5.2.0_3.0_1699288860558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_srb_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_srb_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikiann.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_srb_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Aleksandar/bert-srb-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_setimes_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_setimes_en.md new file mode 100644 index 000000000000..7126b8356d39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_srb_ner_setimes_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Aleksandar) +author: John Snow Labs +name: bert_ner_bert_srb_ner_setimes +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-srb-ner-setimes` is an English model originally trained by `Aleksandar`. + +## Predicted Entities + +`misc`, `deriv`, `org`, `loc`, `per` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_srb_ner_setimes_en_5.2.0_3.0_1699289138381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_srb_ner_setimes_en_5.2.0_3.0_1699289138381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_srb_ner_setimes","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_srb_ner_setimes","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_aleksandar").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_srb_ner_setimes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Aleksandar/bert-srb-ner-setimes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_tiny_chinese_ner_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_tiny_chinese_ner_zh.md new file mode 100644 index 000000000000..f57e2cf72e0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_tiny_chinese_ner_zh.md @@ -0,0 +1,117 @@ +--- +layout: model +title: Chinese BertForTokenClassification Tiny Cased model (from ckiplab) +author: John Snow Labs +name: bert_ner_bert_tiny_chinese_ner +date: 2023-11-06 +tags: [bert, ner, open_source, zh, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-chinese-ner` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`E-WORK_OF_ART`, `E-PRODUCT`, `S-PERCENT`, `E-EVENT`, `S-WORK_OF_ART`, `E-PERSON`, `MONEY`, `S-CARDINAL`, `E-LAW`, `PRODUCT`, `S-GPE`, `S-LANGUAGE`, `E-ORDINAL`, `S-MONEY`, `E-MONEY`, `QUANTITY`, `GPE`, `S-PERSON`, `EVENT`, `S-ORG`, `E-LOC`, `S-QUANTITY`, `PERCENT`, `E-TIME`, `CARDINAL`, `S-EVENT`, `NORP`, `S-LOC`, `WORK_OF_ART`, `E-PERCENT`, `DATE`, `S-PRODUCT`, `S-LAW`, `E-LANGUAGE`, `ORG`, `ORDINAL`, `FAC`, `TIME`, `LANGUAGE`, `LOC`, `E-NORP`, `E-QUANTITY`, `PERSON`, `E-GPE`, `E-ORG`, `S-ORDINAL`, `S-DATE`, `S-FAC`, `E-FAC`, `S-NORP`, `E-DATE`, `LAW`, `S-TIME`, `E-CARDINAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_tiny_chinese_ner_zh_5.2.0_3.0_1699288809737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_tiny_chinese_ner_zh_5.2.0_3.0_1699288809737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_tiny_chinese_ner","zh") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_tiny_chinese_ner","zh") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.ner.bert.tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_tiny_chinese_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|43.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-tiny-chinese-ner +- https://github.com/ckiplab/ckip-transformers +- https://muyang.pro +- https://ckip.iis.sinica.edu.tw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_title_org_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_title_org_en.md new file mode 100644 index 000000000000..a7a92b2abf02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bert_title_org_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from pkushiqiang) +author: John Snow Labs +name: bert_ner_bert_title_org +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-title-org` is an English model originally trained by `pkushiqiang`. 
+ +## Predicted Entities + +`major`, `org`, `job_title`, `degree` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bert_title_org_en_5.2.0_3.0_1699290598864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bert_title_org_en_5.2.0_3.0_1699290598864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_title_org","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bert_title_org","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.title_org.by_pkushiqiang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bert_title_org| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/pkushiqiang/bert-title-org \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_base_lener_breton_luciano_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_base_lener_breton_luciano_pt.md new file mode 100644 index 000000000000..8f460182593e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_base_lener_breton_luciano_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_ner_bertimbau_base_lener_breton_luciano BertForTokenClassification from Luciano +author: John Snow Labs +name: bert_ner_bertimbau_base_lener_breton_luciano +date: 2023-11-06 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bertimbau_base_lener_breton_luciano` is a Portuguese model originally trained by Luciano. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bertimbau_base_lener_breton_luciano_pt_5.2.0_3.0_1699289631856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bertimbau_base_lener_breton_luciano_pt_5.2.0_3.0_1699289631856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bertimbau_base_lener_breton_luciano","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bertimbau_base_lener_breton_luciano", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bertimbau_base_lener_breton_luciano| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Luciano/bertimbau-base-lener_br \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_large_lener_breton_luciano_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_large_lener_breton_luciano_pt.md new file mode 100644 index 000000000000..b04f7d0da73c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bertimbau_large_lener_breton_luciano_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_ner_bertimbau_large_lener_breton_luciano BertForTokenClassification from Luciano +author: John Snow Labs +name: bert_ner_bertimbau_large_lener_breton_luciano +date: 2023-11-06 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bertimbau_large_lener_breton_luciano` is a Portuguese model originally trained by Luciano. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bertimbau_large_lener_breton_luciano_pt_5.2.0_3.0_1699289482461.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bertimbau_large_lener_breton_luciano_pt_5.2.0_3.0_1699289482461.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bertimbau_large_lener_breton_luciano","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bertimbau_large_lener_breton_luciano", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bertimbau_large_lener_breton_luciano| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|1.2 GB| + +## References + +https://huggingface.co/Luciano/bertimbau-large-lener_br \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bgc_accession_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bgc_accession_en.md new file mode 100644 index 000000000000..97466930cdbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bgc_accession_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Maaly) +author: John Snow Labs +name: bert_ner_bgc_accession +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bgc-accession` is an English model originally trained by `Maaly`. + +## Predicted Entities + +`bgc` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bgc_accession_en_5.2.0_3.0_1699289917120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bgc_accession_en_5.2.0_3.0_1699289917120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bgc_accession","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bgc_accession","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.bgc_accession.by_maaly").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bgc_accession| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Maaly/bgc-accession +- https://gitlab.com/maaly7/emerald_bgcs_annotations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bigbio_mtl_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bigbio_mtl_en.md new file mode 100644 index 000000000000..f8de5679a54f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bigbio_mtl_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from bigscience-biomedical) +author: John Snow Labs +name: bert_ner_bigbio_mtl +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bigbio-mtl` is an English model originally trained by `bigscience-biomedical`. 
+ +## Predicted Entities + +`medmentions_full_ner:B-T085)`, `pdr_EAE:Theme)`, `bionlp_shared_task_2009_ner:I-Entity)`, `pcr_ner:B-Herb)`, `gnormplus_ner:I-Gene)`, `bionlp_st_2013_cg_EAE:Participant)`, `pubmed_qa_labeled_fold0_CLF:yes)`, `bionlp_st_2013_gro_ner:B-Ribosome)`, `anat_em_ner:O)`, `seth_corpus_RE:Equals)`, `chemprot_RE:CPR:10)`, `medmentions_full_ner:B-T102)`, `medmentions_full_ner:I-T171)`, `medmentions_full_ner:I-T082)`, `bionlp_st_2013_cg_ED:B-Positive_regulation)`, `anat_em_ner:B-Multi-tissue_structure)`, `hprd50_ner:O)`, `bionlp_st_2013_gro_ner:B-OxidativeStress)`, `mlee_ED:I-Transcription)`, `cellfinder_ner:I-GeneProtein)`, `chia_ner:B-Reference_point)`, `medmentions_full_ner:B-T015)`, `ncbi_disease_ner:B-CompositeMention)`, `bionlp_st_2013_gro_ner:I-RNAPolymerase)`, `bionlp_st_2013_gro_ner:B-Virus)`, `bionlp_st_2013_gro_ED:B-Pathway)`, `medmentions_full_ner:B-T025)`, `chebi_nactem_abstr_ann1_ner:B-Metabolite)`, `bio_sim_verb_sts:7)`, `bionlp_st_2013_gro_ED:B-Maintenance)`, `medmentions_full_ner:I-T129)`, `scai_disease_ner:B-DISEASE)`, `chemprot_RE:CPR:9)`, `biorelex_ner:B-chemical)`, `bionlp_st_2013_gro_ED:I-TranscriptionOfGene)`, `bionlp_st_2013_gro_ED:I-BindingOfProteinToProteinBindingSiteOfProtein)`, `bionlp_st_2013_cg_ner:B-Amino_acid)`, `pubmed_qa_labeled_fold0_CLF:maybe)`, `bionlp_st_2013_gro_ner:I-Sequence)`, `pico_extraction_ner:O)`, `bc5cdr_ner:B-Chemical)`, `bionlp_st_2013_pc_ner:B-Simple_chemical)`, `bionlp_st_2011_id_ED:B-Gene_expression)`, `an_em_ner:B-Developing_anatomical_structure)`, `bionlp_st_2019_bb_ner:I-Phenotype)`, `genia_term_corpus_ner:B-DNA_family_or_group)`, `medmentions_st21pv_ner:I-T204)`, `bionlp_st_2013_gro_ner:B-bZIP)`, `bionlp_st_2013_gro_ner:I-Eukaryote)`, `bionlp_st_2013_pc_ner:I-Complex)`, `mlee_ner:I-Cell)`, `bionlp_shared_task_2009_ED:I-Localization)`, `hprd50_ner:I-protein)`, `mantra_gsc_en_patents_ner:B-PHYS)`, `bionlp_st_2013_gro_ED:B-RegulationOfGeneExpression)`, `medmentions_full_ner:B-T020)`, 
`genia_term_corpus_ner:B-ANDprotein_moleculeprotein_molecule)`, `bionlp_shared_task_2009_EAE:AtLoc)`, `genia_term_corpus_ner:B-protein_molecule)`, `bionlp_st_2013_gro_ner:B-Agonist)`, `mantra_gsc_en_medline_ner:B-PHEN)`, `medmentions_full_ner:B-T030)`, `biorelex_ner:I-RNA-family)`, `medmentions_full_ner:B-T169)`, `ddi_corpus_ner:B-BRAND)`, `medmentions_full_ner:B-T087)`, `genia_term_corpus_ner:I-nucleotide)`, `bionlp_st_2013_gro_ED:I-CellCyclePhaseTransition)`, `mantra_gsc_en_medline_ner:B-DEVI)`, `tmvar_v1_ner:O)`, `bionlp_st_2013_gro_ED:I-CellularComponentOrganizationAndBiogenesis)`, `bioscope_abstracts_ner:B-speculation)`, `ebm_pico_ner:B-Outcome_Adverse-effects)`, `bionlp_shared_task_2009_EAE:Site)`, `mantra_gsc_en_medline_ner:B-PHYS)`, `bionlp_st_2013_gro_ner:I-Lipid)`, `genia_term_corpus_ner:I-ANDprotein_substructureprotein_substructure)`, `medmentions_st21pv_ner:B-T007)`, `bionlp_st_2013_cg_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Organism)`, `bc5cdr_ner:O)`, `bionlp_st_2011_id_EAE:Site)`, `bionlp_st_2013_gro_ner:I-NucleicAcid)`, `medmentions_full_ner:I-T040)`, `bionlp_st_2013_gro_ED:B-BindingOfProteinToProteinBindingSiteOfProtein)`, `mlee_ED:I-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-ExpressionProfiling)`, `medmentions_full_ner:I-T044)`, `mantra_gsc_en_emea_ner:I-DEVI)`, `chia_ner:I-Person)`, `ebm_pico_ner:B-Intervention_Pharmacological)`, `scai_disease_ner:O)`, `medmentions_full_ner:I-T121)`, `bionlp_st_2011_epi_ner:I-Entity)`, `mantra_gsc_en_emea_ner:I-ANAT)`, `genia_term_corpus_ner:B-cell_component)`, `bionlp_st_2019_bb_RE:Lives_In)`, `bionlp_st_2013_gro_ED:B-CatabolicPathway)`, `mantra_gsc_en_medline_ner:B-ANAT)`, `medmentions_full_ner:I-T065)`, `bionlp_st_2013_gro_ner:B-TranscriptionCofactor)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfDNA)`, `pdr_EAE:Cause)`, `anat_em_ner:I-Developing_anatomical_structure)`, `anat_em_ner:B-Cancer)`, `bionlp_st_2013_pc_ED:B-Gene_expression)`, 
`genia_term_corpus_ner:I-ORDNA_domain_or_regionDNA_domain_or_region)`, `scai_disease_ner:I-ADVERSE)`, `bionlp_st_2013_cg_ED:B-Dephosphorylation)`, `bionlp_st_2013_gro_ED:I-Heterodimerization)`, `mlee_ED:B-Catabolism)`, `biorelex_ner:I-protein-isoform)`, `bionlp_shared_task_2009_COREF:None)`, `bionlp_st_2013_gro_ED:B-RNASplicing)`, `bionlp_st_2013_gro_EAE:hasPatient)`, `mantra_gsc_en_medline_ner:I-ANAT)`, `medmentions_full_ner:I-T015)`, `bionlp_st_2013_pc_EAE:Product)`, `bionlp_st_2013_pc_EAE:AtLoc)`, `bionlp_st_2013_gro_ED:B-ProteinTargeting)`, `cellfinder_ner:B-CellComponent)`, `mantra_gsc_en_medline_ner:I-DISO)`, `bionlp_st_2013_gro_ED:I-Translation)`, `bionlp_st_2013_gro_ner:I-Prokaryote)`, `genia_term_corpus_ner:I-lipid)`, `bionlp_st_2013_pc_ED:B-Deacetylation)`, `biorelex_ner:B-RNA)`, `scai_chemical_ner:B-FAMILY)`, `bionlp_st_2013_gro_ED:I-Pathway)`, `bionlp_st_2013_gro_ner:B-ProteinIdentification)`, `bionlp_st_2011_ge_ner:O)`, `mlee_ner:B-Protein_domain_or_region)`, `bionlp_st_2011_id_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelixTF)`, `bionlp_st_2013_gro_ner:I-Chromatin)`, `mlee_ED:I-Binding)`, `mirna_ner:B-Relation_Trigger)`, `bionlp_st_2013_gro_ner:B-Nucleotide)`, `linnaeus_ner:I-species)`, `medmentions_full_ner:I-T024)`, `verspoor_2013_ner:I-body-part)`, `bionlp_st_2011_epi_EAE:Sidechain)`, `bionlp_st_2013_gro_ner:I-ReporterGeneConstruction)`, `bionlp_st_2013_gro_ner:B-DNAFragment)`, `bionlp_st_2013_gro_ner:B-PositiveTranscriptionRegulator)`, `medmentions_full_ner:I-T049)`, `medmentions_full_ner:I-T025)`, `verspoor_2013_ner:I-gene)`, `bionlp_st_2019_bb_RE:Exhibits)`, `bionlp_st_2013_cg_ED:B-Gene_expression)`, `bionlp_st_2013_ge_ner:O)`, `mlee_ner:I-Developing_anatomical_structure)`, `mlee_ED:B-Positive_regulation)`, `bionlp_st_2013_gro_ED:B-FormationOfTranscriptionInitiationComplex)`, `bionlp_st_2011_ge_ner:B-Entity)`, `ddi_corpus_ner:I-GROUP)`, `medmentions_full_ner:I-T017)`, `bionlp_st_2013_gro_ED:I-Mutation)`, 
`bionlp_st_2011_id_EAE:AtLoc)`, `bionlp_st_2011_ge_ED:B-Regulation)`, `bionlp_st_2011_ge_EAE:Theme)`, `bionlp_st_2013_gro_ner:I-ExperimentalMethod)`, `bionlp_st_2013_gro_ner:B-HMGTF)`, `chemdner_ner:B-Chemical)`, `ehr_rel_sts:1)`, `medmentions_full_ner:I-T196)`, `bioscope_papers_ner:B-negation)`, `bionlp_shared_task_2009_ED:I-Negative_regulation)`, `bionlp_st_2013_pc_ED:B-Phosphorylation)`, `biorelex_RE:bind)`, `bioinfer_ner:B-Protein_complex)`, `scai_chemical_ner:I-TRIVIALVAR)`, `bionlp_shared_task_2009_ED:I-Binding)`, `bionlp_st_2011_rel_ner:I-Entity)`, `anat_em_ner:B-Tissue)`, `bionlp_st_2013_cg_ED:I-Remodeling)`, `bionlp_st_2013_cg_ner:I-Cell)`, `medmentions_full_ner:I-T074)`, `sciq_SEQ:None)`, `mantra_gsc_en_medline_ner:I-PROC)`, `bionlp_st_2011_id_ED:I-Negative_regulation)`, `bionlp_st_2013_gro_ner:I-Agonist)`, `chia_ner:I-Reference_point)`, `medmentions_full_ner:B-T024)`, `bionlp_st_2013_gro_ner:B-Histone)`, `chia_ner:I-Negation)`, `lll_RE:None)`, `ncbi_disease_ner:I-DiseaseClass)`, `bionlp_st_2013_gro_ner:I-Chromosome)`, `scai_disease_ner:B-ADVERSE)`, `medmentions_full_ner:B-T130)`, `bionlp_st_2011_epi_ED:B-Catalysis)`, `bionlp_st_2011_epi_ner:O)`, `mlee_EAE:AtLoc)`, `bionlp_st_2013_gro_ED:B-RegulationOfPathway)`, `genia_term_corpus_ner:I-RNA_family_or_group)`, `biosses_sts:8)`, `bionlp_st_2013_gro_ner:I-MolecularFunction)`, `verspoor_2013_ner:B-gene)`, `an_em_ner:I-Cell)`, `bionlp_st_2011_id_ED:B-Localization)`, `bionlp_st_2011_ge_EAE:Site)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomainTF)`, `bionlp_st_2013_gro_EAE:hasAgent)`, `bionlp_st_2013_gro_ner:B-DNARegion)`, `bionlp_shared_task_2009_ED:O)`, `mlee_EAE:Cause)`, `bionlp_st_2011_epi_ED:B-Ubiquitination)`, `bionlp_st_2013_gro_ED:I-GeneExpression)`, `bionlp_st_2013_gro_ner:I-CatalyticActivity)`, `anat_em_ner:B-Anatomical_system)`, `lll_RE:genic_interaction)`, `bionlp_st_2013_gro_ner:B-Nucleus)`, `bionlp_st_2013_ge_ED:B-Acetylation)`, `ebm_pico_ner:B-Intervention_Educational)`, 
`medmentions_st21pv_ner:B-T005)`, `mlee_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-OrganicChemical)`, `medmentions_full_ner:I-T022)`, `gnormplus_ner:B-FamilyName)`, `bionlp_st_2013_gro_ED:I-NegativeRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:I-ChromosomalDNA)`, `anat_em_ner:B-Cell)`, `bionlp_st_2013_gro_ner:I-TranscriptionCofactor)`, `chia_ner:I-Observation)`, `bioscope_abstracts_ner:I-negation)`, `medmentions_full_ner:I-T089)`, `bionlp_st_2013_gro_ner:B-AP2EREBPRelatedDomain)`, `bionlp_st_2013_gro_ner:I-ComplexMolecularEntity)`, `bionlp_st_2013_gro_ner:B-Lipid)`, `mlee_ED:B-Death)`, `biorelex_ner:I-gene)`, `bionlp_st_2011_id_ED:I-Positive_regulation)`, `medmentions_st21pv_ner:B-T058)`, `bionlp_st_2011_id_ED:O)`, `biorelex_ner:B-protein-region)`, `bionlp_st_2011_id_ED:B-Regulation)`, `verspoor_2013_RE:relatedTo)`, `bionlp_st_2011_id_ED:I-Gene_expression)`, `genia_term_corpus_ner:B-cell_line)`, `bionlp_st_2013_gro_ner:B-UpstreamRegulatorySequence)`, `genia_term_corpus_ner:B-polynucleotide)`, `genia_term_corpus_ner:I-cell_component)`, `medmentions_full_ner:B-T013)`, `bionlp_st_2011_ge_COREF:None)`, `ebm_pico_ner:B-Participant_Sample-size)`, `bionlp_st_2013_gro_ED:B-RNAMetabolism)`, `bionlp_st_2013_gro_ner:I-RNA)`, `ddi_corpus_RE:EFFECT)`, `medmentions_st21pv_ner:B-T031)`, `bionlp_st_2013_cg_ner:I-Immaterial_anatomical_entity)`, `ebm_pico_ner:I-Intervention_Physical)`, `bionlp_st_2013_gro_ner:B-MolecularStructure)`, `bionlp_st_2013_gro_ED:B-GeneExpression)`, `bionlp_st_2013_pc_ner:B-Complex)`, `medmentions_full_ner:I-T090)`, `medmentions_st21pv_ner:I-T005)`, `bionlp_st_2013_gro_ED:B-ProteinTransport)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomainTF)`, `bionlp_st_2013_gro_ner:I-CpGIsland)`, `bionlp_st_2013_gro_ner:B-AminoAcid)`, `bionlp_st_2013_gro_ED:B-SPhase)`, `bionlp_st_2011_epi_COREF:None)`, `bionlp_st_2013_pc_ner:I-Cellular_component)`, `genia_term_corpus_ner:B-ANDDNA_domain_or_regionDNA_domain_or_region)`, 
`bionlp_st_2013_gro_ner:B-Chromosome)`, `medmentions_full_ner:I-T010)`, `bionlp_st_2013_gro_ner:I-OxidativeStress)`, `bionlp_st_2013_cg_ner:I-Anatomical_system)`, `bionlp_st_2013_gro_ED:B-BindingOfTFToTFBindingSiteOfDNA)`, `medmentions_st21pv_ner:I-T062)`, `medmentions_full_ner:B-T081)`, `scai_chemical_ner:B-PARTIUPAC)`, `bionlp_st_2013_gro_ner:I-RibosomalRNA)`, `verspoor_2013_ner:O)`, `bionlp_st_2011_epi_ED:B-Methylation)`, `bionlp_shared_task_2009_ner:B-Entity)`, `bionlp_st_2013_pc_ED:B-Transport)`, `bio_sim_verb_sts:3)`, `bionlp_st_2013_gro_ED:I-Elongation)`, `medmentions_full_ner:B-T058)`, `biorelex_ner:B-protein)`, `mantra_gsc_en_patents_ner:B-DEVI)`, `bionlp_st_2013_gro_ner:I-BasicDomain)`, `medmentions_full_ner:I-T071)`, `bionlp_st_2013_gro_ED:I-DevelopmentalProcess)`, `bionlp_st_2013_cg_ED:B-Catabolism)`, `mlee_ED:B-Growth)`, `mlee_EAE:Theme)`, `ebm_pico_ner:I-Intervention_Surgical)`, `bionlp_st_2011_ge_ner:I-Entity)`, `an_em_ner:I-Organ)`, `bionlp_st_2013_ge_ED:B-Positive_regulation)`, `iepa_RE:PPI)`, `bionlp_st_2013_gro_ner:B-PhysicalContinuant)`, `chemprot_RE:CPR:4)`, `bionlp_st_2011_id_EAE:Theme)`, `bionlp_st_2013_cg_ED:B-Amino_acid_catabolism)`, `genia_term_corpus_ner:B-other_name)`, `medmentions_full_ner:I-T130)`, `bionlp_st_2011_id_ED:I-Process)`, `mantra_gsc_en_patents_ner:O)`, `bionlp_st_2013_pc_ED:B-Ubiquitination)`, `medmentions_full_ner:B-T018)`, `bionlp_st_2011_id_EAE:ToLoc)`, `bionlp_st_2013_cg_ner:B-Organism)`, `medmentions_full_ner:B-T014)`, `bionlp_st_2013_pc_ED:I-Activation)`, `mlee_ED:I-Death)`, `medmentions_full_ner:I-T047)`, `bionlp_st_2011_ge_EAE:ToLoc)`, `bionlp_st_2013_cg_ED:I-Gene_expression)`, `bionlp_st_2013_gro_ner:B-AntisenseRNA)`, `bionlp_st_2013_gro_ner:B-ProteinCodingDNARegion)`, `bionlp_st_2013_gro_ED:I-BindingOfTFToTFBindingSiteOfDNA)`, `bionlp_st_2013_pc_ED:B-Methylation)`, `bionlp_st_2013_gro_ED:B-GeneMutation)`, `mlee_EAE:None)`, `bionlp_shared_task_2009_EAE:CSite)`, `chebi_nactem_fullpaper_ner:I-Protein)`, 
`genia_term_corpus_ner:I-multi_cell)`, `bionlp_st_2013_cg_ED:B-Cell_division)`, `ncbi_disease_ner:B-DiseaseClass)`, `bionlp_st_2013_gro_ner:I-Gene)`, `ebm_pico_ner:B-Intervention_Surgical)`, `medmentions_full_ner:B-T042)`, `medmentions_full_ner:I-T051)`, `cellfinder_ner:B-GeneProtein)`, `bionlp_st_2011_id_COREF:None)`, `biorelex_ner:I-brand)`, `bionlp_st_2013_gro_ner:B-CatalyticActivity)`, `chebi_nactem_abstr_ann1_ner:I-Biological_Activity)`, `bionlp_st_2013_gro_ED:B-OrganismalProcess)`, `bionlp_st_2013_gro_EAE:hasAgent2)`, `chebi_nactem_abstr_ann1_ner:I-Species)`, `bionlp_st_2013_pc_ED:B-Deubiquitination)`, `bionlp_st_2013_gro_ner:I-GeneProduct)`, `mayosrs_sts:6)`, `anat_em_ner:B-Immaterial_anatomical_entity)`, `bio_sim_verb_sts:1)`, `bionlp_st_2011_epi_ner:B-Entity)`, `medmentions_full_ner:I-T169)`, `bionlp_st_2013_gro_ner:B-bZIPTF)`, `mlee_ner:B-Immaterial_anatomical_entity)`, `an_em_RE:None)`, `verspoor_2013_ner:B-Physiology)`, `sciq_SEQ:answer)`, `cellfinder_ner:I-CellType)`, `mlee_RE:frag)`, `medmentions_st21pv_ner:I-T103)`, `ddi_corpus_RE:None)`, `bionlp_st_2013_gro_ner:I-AntisenseRNA)`, `medmentions_st21pv_ner:I-T091)`, `bionlp_st_2011_epi_EAE:Cause)`, `bionlp_st_2013_gro_ED:I-BindingToRNA)`, `bionlp_st_2013_gro_ED:I-PositiveRegulationOfTranscription)`, `bionlp_st_2013_pc_COREF:coref)`, `medmentions_full_ner:I-T067)`, `medmentions_full_ner:B-T005)`, `bionlp_st_2013_gro_ED:I-CellularMetabolicProcess)`, `bionlp_st_2011_epi_ED:B-Acetylation)`, `osiris_ner:B-variant)`, `ncbi_disease_ner:O)`, `spl_adr_200db_train_ner:I-DrugClass)`, `mantra_gsc_en_patents_ner:I-CHEM)`, `bionlp_st_2013_gro_ED:B-CellHomeostasis)`, `mayosrs_sts:2)`, `mirna_ner:I-Species)`, `bionlp_st_2013_cg_ED:B-Reproduction)`, `medmentions_full_ner:I-T102)`, `medmentions_st21pv_ner:I-T033)`, `medmentions_full_ner:B-T097)`, `bionlp_st_2013_pc_ED:I-Negative_regulation)`, `bionlp_st_2013_gro_ED:B-Dimerization)`, `ebm_pico_ner:I-Participant_Age)`, `medmentions_full_ner:B-T095)`, 
`bionlp_st_2013_gro_ED:B-RegulationOfProcess)`, `medmentions_full_ner:B-T002)`, `bionlp_st_2013_gro_ED:B-Binding)`, `bionlp_st_2013_gro_ED:B-BindingOfProtein)`, `verspoor_2013_ner:I-Concepts_Ideas)`, `bionlp_st_2011_epi_ner:I-Protein)`, `ddi_corpus_ner:O)`, `bionlp_st_2013_gro_ED:I-RNAMetabolism)`, `an_em_ner:I-Multi-tissue_structure)`, `medmentions_full_ner:B-T062)`, `genia_term_corpus_ner:I-ANDDNA_family_or_groupDNA_family_or_group)`, `medmentions_full_ner:I-T080)`, `ebm_pico_ner:B-Outcome_Physical)`, `medmentions_st21pv_ner:B-T103)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactor)`, `chia_ner:I-Qualifier)`, `genia_term_corpus_ner:B-protein_domain_or_region)`, `bionlp_st_2013_gro_ED:B-IntraCellularTransport)`, `bionlp_st_2013_gro_ner:I-ThreeDimensionalMolecularStructure)`, `bionlp_st_2013_gro_ner:I-TranscriptionCoactivator)`, `an_em_ner:I-Immaterial_anatomical_entity)`, `chebi_nactem_fullpaper_ner:I-Chemical)`, `mantra_gsc_en_emea_ner:B-PROC)`, `biosses_sts:5)`, `bionlp_st_2013_cg_ner:B-Cancer)`, `genia_term_corpus_ner:B-BUT_NOTother_nameother_name)`, `bionlp_st_2013_gro_ED:I-CellDivision)`, `bionlp_st_2013_gro_ED:I-TranscriptionTermination)`, `bionlp_st_2013_cg_ED:B-Acetylation)`, `mlee_ED:I-Localization)`, `ehr_rel_sts:2)`, `biorelex_ner:I-protein-DNA-complex)`, `bionlp_st_2011_id_COREF:coref)`, `bioinfer_RE:None)`, `nlm_gene_ner:B-Gene)`, `medmentions_full_ner:B-T104)`, `biosses_sts:6)`, `bionlp_st_2013_gro_ner:B-ReporterGene)`, `biosses_sts:1)`, `biorelex_ner:I-organism)`, `chia_ner:B-Value)`, `cellfinder_ner:B-Anatomy)`, `bionlp_st_2013_gro_ED:I-RegulatoryProcess)`, `verspoor_2013_ner:B-body-part)`, `bionlp_st_2013_gro_ED:I-Localization)`, `biorelex_ner:B-RNA-family)`, `ebm_pico_ner:B-Intervention_Control)`, `bionlp_st_2013_cg_ED:B-Binding)`, `bionlp_st_2013_gro_ED:B-BindingOfProteinToDNA)`, `bionlp_st_2013_ge_EAE:Cause)`, `chemprot_RE:CPR:3)`, `chia_RE:Has_mood)`, `pico_extraction_ner:I-outcome)`, `medmentions_st21pv_ner:B-T074)`, 
`bionlp_st_2013_cg_ner:I-Amino_acid)`, `bionlp_st_2013_cg_ED:B-Protein_processing)`, `bionlp_st_2013_cg_ED:B-Regulation)`, `medmentions_full_ner:B-T197)`, `bionlp_st_2013_gro_ED:I-NegativeRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_cg_ED:I-Transcription)`, `bionlp_st_2013_ge_ED:B-Gene_expression)`, `mantra_gsc_en_patents_ner:I-PHYS)`, `bionlp_st_2013_gro_ner:B-NucleicAcid)`, `bionlp_st_2013_gro_ED:B-CellDivision)`, `medmentions_st21pv_ner:I-T017)`, `bionlp_st_2011_id_EAE:CSite)`, `medmentions_full_ner:I-T046)`, `medmentions_full_ner:B-T204)`, `bionlp_st_2013_pc_ED:I-Dissociation)`, `spl_adr_200db_train_ner:B-Negation)`, `bionlp_st_2013_gro_ED:I-MetabolicPathway)`, `bionlp_st_2013_ge_ED:B-Regulation)`, `nlm_gene_ner:B-GENERIF)`, `verspoor_2013_ner:I-Disorder)`, `bionlp_st_2013_gro_ner:I-ReporterGene)`, `bionlp_st_2013_gro_ner:B-Vitamin)`, `bionlp_st_2013_cg_ner:B-Immaterial_anatomical_entity)`, `bionlp_st_2013_pc_ED:B-Acetylation)`, `chia_ner:B-Visit)`, `mantra_gsc_en_medline_ner:I-OBJC)`, `mayosrs_sts:8)`, `bionlp_st_2013_cg_ner:I-DNA_domain_or_region)`, `osiris_ner:B-gene)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressor)`, `bionlp_st_2013_cg_ED:I-Regulation)`, `bionlp_st_2013_gro_ner:I-RNAMolecule)`, `bionlp_st_2011_ge_ner:I-Protein)`, `mlee_ED:I-Regulation)`, `mlee_COREF:coref)`, `bionlp_st_2013_cg_ED:B-Metastasis)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelix)`, `bioinfer_ner:I-Gene)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivatorActivity)`, `medmentions_full_ner:I-T131)`, `genia_term_corpus_ner:B-protein_family_or_group)`, `linnaeus_filtered_ner:I-species)`, `medmentions_st21pv_ner:I-T168)`, `medmentions_full_ner:B-T123)`, `genia_term_corpus_ner:B-cell_type)`, `chebi_nactem_fullpaper_ner:B-Chemical)`, `ddi_corpus_ner:I-DRUG_N)`, `scai_chemical_ner:I-FAMILY)`, `bionlp_st_2013_gro_ner:I-Locus)`, `biorelex_ner:B-DNA)`, `mlee_EAE:FromLoc)`, `mlee_ED:B-Synthesis)`, `bionlp_st_2013_pc_ED:I-Inactivation)`, `bionlp_st_2013_gro_EAE:hasPatient2)`, 
`bionlp_st_2013_gro_ner:B-Transcript)`, `anat_em_ner:B-Organ)`, `chebi_nactem_abstr_ann1_ner:I-Spectral_Data)`, `anat_em_ner:I-Organism_substance)`, `spl_adr_200db_train_ner:B-DrugClass)`, `bionlp_st_2013_gro_ED:I-Splicing)`, `bionlp_st_2013_pc_ED:B-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-ProteinSubunit)`, `bionlp_st_2013_gro_ED:B-ResponseToChemicalStimulus)`, `bionlp_st_2013_gro_ner:B-MutantGene)`, `bionlp_st_2013_pc_ED:B-Binding)`, `bionlp_st_2019_bb_ner:B-Phenotype)`, `bionlp_st_2013_gro_ED:B-CellMotility)`, `diann_iber_eval_en_ner:I-Neg)`, `mantra_gsc_en_medline_ner:B-DISO)`, `mlee_ED:I-Growth)`, `ddi_corpus_ner:B-DRUG_N)`, `biorelex_ner:B-protein-domain)`, `bionlp_st_2013_gro_ner:B-Eukaryote)`, `ncbi_disease_ner:I-CompositeMention)`, `chebi_nactem_fullpaper_ner:I-Spectral_Data)`, `seth_corpus_ner:I-SNP)`, `bionlp_st_2013_gro_ED:B-Elongation)`, `bionlp_st_2013_cg_ner:B-Organ)`, `hprd50_ner:B-protein)`, `biorelex_ner:I-DNA)`, `bionlp_st_2013_gro_ED:I-CellDeath)`, `bionlp_st_2013_cg_ner:I-Organism_subdivision)`, `bionlp_st_2013_cg_ED:B-Planned_process)`, `bionlp_st_2013_cg_ner:B-Cellular_component)`, `bionlp_st_2013_pc_ner:B-Cellular_component)`, `bionlp_st_2019_bb_ner:B-Microorganism)`, `ddi_corpus_RE:INT)`, `medmentions_st21pv_ner:B-T038)`, `cellfinder_ner:B-CellLine)`, `bioinfer_ner:I-GeneproteinRNA)`, `bionlp_shared_task_2009_EAE:None)`, `bionlp_st_2011_id_ner:I-Chemical)`, `bionlp_st_2013_gro_ED:B-BindingOfTranscriptionFactorToDNA)`, `bionlp_st_2011_id_ED:B-Protein_catabolism)`, `bionlp_st_2013_cg_ED:B-Cell_differentiation)`, `bionlp_shared_task_2009_ED:B-Negative_regulation)`, `bionlp_st_2013_cg_ED:B-Ubiquitination)`, `nlm_gene_ner:O)`, `bionlp_st_2013_pc_ED:I-Regulation)`, `bionlp_st_2013_gro_ED:I-CellFateDetermination)`, `biorelex_ner:I-mutation)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorBindingSiteOfDNA)`, `mantra_gsc_en_emea_ner:I-LIVB)`, `biorelex_COREF:None)`, `bionlp_st_2013_gro_ED:I-CellHomeostasis)`, 
`bionlp_st_2013_gro_ner:B-PhysicalContact)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactor)`, `medmentions_full_ner:B-T167)`, `medmentions_st21pv_ner:B-T091)`, `seth_corpus_ner:I-Gene)`, `bionlp_st_2013_gro_ED:I-ProteinCatabolism)`, `ebm_pico_ner:O)`, `bionlp_st_2011_ge_COREF:coref)`, `bionlp_st_2013_gro_ner:I-bHLHTF)`, `mlee_ner:B-Organ)`, `bionlp_st_2013_gro_ED:B-BindingToMolecularEntity)`, `pdr_ED:I-Cause_of_disease)`, `bionlp_st_2011_epi_ED:B-Glycosylation)`, `medmentions_full_ner:B-T031)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorComplex)`, `biorelex_ner:B-disease)`, `chebi_nactem_fullpaper_ner:I-Biological_Activity)`, `medmentions_st21pv_ner:I-T092)`, `bionlp_st_2013_cg_COREF:coref)`, `medmentions_full_ner:B-T168)`, `pcr_ner:I-Chemical)`, `mlee_ED:B-Dissociation)`, `genia_relation_corpus_RE:None)`, `medmentions_full_ner:B-T092)`, `genia_term_corpus_ner:I-ANDDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_gro_ED:I-FormationOfProteinDNAComplex)`, `mlee_ED:B-Development)`, `medmentions_full_ner:I-T032)`, `bionlp_st_2013_gro_ED:I-RNASplicing)`, `medmentions_full_ner:I-T167)`, `genia_term_corpus_ner:B-protein_NA)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivator)`, `bionlp_st_2013_ge_ner:B-Entity)`, `chemprot_RE:CPR:5)`, `bionlp_shared_task_2009_ED:I-Transcription)`, `an_em_ner:B-Multi-tissue_structure)`, `minimayosrs_sts:2)`, `chia_ner:I-Measurement)`, `chia_RE:Has_temporal)`, `bionlp_shared_task_2009_EAE:Cause)`, `bionlp_st_2013_gro_ED:B-RegulationOfTranscription)`, `biorelex_ner:B-protein-DNA-complex)`, `cellfinder_ner:I-CellComponent)`, `bionlp_st_2013_gro_ED:B-MolecularInteraction)`, `bionlp_st_2013_cg_ED:B-Transcription)`, `medmentions_full_ner:I-UnknownType)`, `mlee_EAE:Site)`, `bionlp_st_2013_gro_ED:I-Homodimerization)`, `bionlp_st_2013_gro_ner:I-Phenotype)`, `chemprot_ner:I-GENE-N)`, `nlm_gene_ner:B-Other)`, `biorelex_ner:B-reagent)`, `genia_term_corpus_ner:B-ANDDNA_family_or_groupDNA_family_or_group)`, `medmentions_full_ner:I-T019)`, 
`bionlp_st_2013_gro_ner:B-DNABindingSite)`, `nlmchem_ner:O)`, `biorelex_ner:B-organism)`, `chebi_nactem_abstr_ann1_ner:B-Spectral_Data)`, `bionlp_st_2013_cg_ner:I-Multi-tissue_structure)`, `ebm_pico_ner:I-Outcome_Mental)`, `medmentions_full_ner:B-T010)`, `scai_disease_ner:I-DISEASE)`, `mantra_gsc_en_medline_ner:I-GEOG)`, `scai_chemical_ner:B-IUPAC)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfProtein)`, `chebi_nactem_fullpaper_ner:O)`, `verspoor_2013_ner:B-mutation)`, `biorelex_ner:B-protein-isoform)`, `chemprot_ner:I-GENE-Y)`, `bionlp_st_2013_cg_EAE:CSite)`, `medmentions_full_ner:I-T095)`, `bionlp_st_2013_gro_ED:B-ResponseProcess)`, `mirna_ner:I-Diseases)`, `bionlp_st_2013_gro_ner:I-DNABindingSite)`, `an_em_ner:O)`, `biorelex_ner:O)`, `seth_corpus_RE:AssociatedTo)`, `mlee_EAE:Participant)`, `mlee_ED:B-Negative_regulation)`, `bioscope_abstracts_ner:B-negation)`, `chebi_nactem_fullpaper_ner:I-Metabolite)`, `bionlp_st_2011_epi_ED:B-Demethylation)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressorActivity)`, `bionlp_shared_task_2009_ner:O)`, `bionlp_shared_task_2009_EAE:Theme)`, `mlee_ED:B-Protein_processing)`, `medmentions_full_ner:B-T029)`, `medmentions_st21pv_ner:I-T058)`, `bionlp_st_2011_ge_ner:B-Protein)`, `bionlp_st_2013_ge_ner:B-Protein)`, `scicite_TEXT:background)`, `medmentions_full_ner:I-T029)`, `bionlp_st_2013_ge_ED:B-Negative_regulation)`, `genia_term_corpus_ner:B-ANDcell_typecell_type)`, `bionlp_st_2013_gro_ner:I-Tissue)`, `genia_term_corpus_ner:I-protein_substructure)`, `bionlp_st_2013_gro_ner:I-TranslationFactor)`, `scai_chemical_ner:B-SUM)`, `bionlp_st_2011_ge_ED:I-Gene_expression)`, `minimayosrs_sts:5)`, `medmentions_full_ner:B-T082)`, `bionlp_st_2011_epi_ED:B-Dehydroxylation)`, `genia_term_corpus_ner:B-mono_cell)`, `bionlp_st_2013_gro_ner:B-DNA)`, `medmentions_full_ner:I-T200)`, `medmentions_full_ner:I-T114)`, `ncbi_disease_ner:I-Modifier)`, `bionlp_st_2013_cg_EAE:Theme)`, `medmentions_full_ner:B-T079)`, 
`bionlp_st_2013_gro_ner:B-ComplexOfProteinAndRNA)`, `genetaggold_ner:B-NEWGENE)`, `mlee_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_ED:B-PositiveRegulation)`, `medmentions_full_ner:B-T196)`, `bio_sim_verb_sts:4)`, `bionlp_st_2013_gro_ner:B-Microorganism)`, `bionlp_st_2013_pc_ED:I-Binding)`, `biorelex_ner:B-process)`, `bionlp_st_2013_gro_RE:encodes)`, `biorelex_ner:B-fusion-protein)`, `mirna_ner:I-Non-Specific_miRNAs)`, `biorelex_ner:B-amino-acid)`, `bionlp_st_2013_ge_ED:I-Protein_catabolism)`, `bioinfer_ner:I-DNA_family_or_group)`, `mlee_COREF:None)`, `bionlp_st_2013_cg_ED:I-Positive_regulation)`, `mlee_ED:B-DNA_methylation)`, `bionlp_st_2013_gro_ner:I-Chemical)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfProtein)`, `mantra_gsc_en_patents_ner:I-DEVI)`, `bionlp_st_2013_gro_ED:B-CellGrowth)`, `mantra_gsc_en_medline_ner:O)`, `medmentions_full_ner:B-T043)`, `chemprot_RE:CPR:7)`, `bionlp_st_2013_gro_ED:B-Heterodimerization)`, `chia_ner:I-Value)`, `medmentions_full_ner:B-T046)`, `medmentions_full_ner:I-T048)`, `bionlp_st_2013_cg_EAE:Site)`, `gnormplus_ner:O)`, `chemprot_ner:B-GENE-Y)`, `bionlp_st_2013_gro_ED:I-SignalingPathway)`, `scicite_TEXT:result)`, `bionlp_st_2011_id_ner:I-Regulon-operon)`, `bionlp_st_2013_gro_ED:B-BindingOfDNABindingDomainOfProteinToDNA)`, `cellfinder_ner:I-CellLine)`, `ebm_pico_ner:I-Outcome_Adverse-effects)`, `medmentions_full_ner:I-T116)`, `bionlp_st_2013_gro_ner:I-DNABindingDomainOfProtein)`, `genia_term_corpus_ner:I-protein_domain_or_region)`, `bionlp_st_2013_gro_ner:B-Nucleosome)`, `medmentions_st21pv_ner:B-T168)`, `chemprot_ner:B-CHEMICAL)`, `bionlp_st_2013_gro_ED:I-CatabolicPathway)`, `bioinfer_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-bZIPTF)`, `genia_term_corpus_ner:B-body_part)`, `mirna_ner:I-GenesProteins)`, `chebi_nactem_abstr_ann1_ner:B-Protein)`, `an_em_ner:B-Organ)`, `bionlp_st_2013_ge_ED:I-Negative_regulation)`, `genia_term_corpus_ner:B-ANDprotein_family_or_groupprotein_family_or_group)`, 
`biorelex_ner:I-process)`, `mlee_ner:B-Tissue)`, `medmentions_full_ner:B-T041)`, `mlee_ner:I-Tissue)`, `bionlp_st_2013_gro_RE:hasFunction)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorActivity)`, `bionlp_st_2011_ge_ED:B-Negative_regulation)`, `biorelex_ner:B-protein-family)`, `bionlp_st_2011_epi_ED:I-Deacetylation)`, `ebm_pico_ner:I-Participant_Condition)`, `genia_term_corpus_ner:B-DNA_domain_or_region)`, `medmentions_full_ner:B-T125)`, `bionlp_st_2013_gro_ED:B-DevelopmentalProcess)`, `bionlp_st_2013_ge_ED:I-Ubiquitination)`, `bionlp_st_2013_gro_ED:B-Cleavage)`, `bionlp_st_2013_gro_ner:I-TATAbox)`, `bionlp_st_2013_cg_ner:B-Gene_or_gene_product)`, `cellfinder_ner:O)`, `bionlp_st_2013_gro_ED:B-CellularComponentOrganizationAndBiogenesis)`, `bionlp_st_2013_ge_ED:I-Regulation)`, `bionlp_st_2013_gro_ner:I-MutatedProtein)`, `bionlp_st_2013_gro_ner:I-bZIP)`, `spl_adr_200db_train_ner:O)`, `bionlp_st_2013_gro_ner:B-LivingEntity)`, `bionlp_st_2011_ge_ED:B-Protein_catabolism)`, `bionlp_st_2013_pc_ED:B-Conversion)`, `mantra_gsc_en_medline_ner:B-CHEM)`, `medmentions_full_ner:I-T026)`, `chebi_nactem_abstr_ann1_ner:I-Protein)`, `medmentions_full_ner:I-T085)`, `bionlp_st_2013_cg_ner:I-Organism_substance)`, `medmentions_full_ner:I-T045)`, `medmentions_full_ner:B-T067)`, `tmvar_v1_ner:B-SNP)`, `biorelex_ner:I-drug)`, `bionlp_st_2013_gro_ner:B-ExperimentalMethod)`, `bionlp_st_2013_cg_ED:I-Cell_death)`, `bionlp_st_2013_pc_ED:B-Hydroxylation)`, `bionlp_st_2013_gro_ner:B-ReporterGeneConstruction)`, `bionlp_st_2013_gro_ED:B-CellularDevelopmentalProcess)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivator)`, `bionlp_st_2013_gro_ED:I-CellCycle)`, `mantra_gsc_en_emea_ner:B-LIVB)`, `verspoor_2013_ner:B-disease)`, `mantra_gsc_en_patents_ner:B-PROC)`, `bc5cdr_ner:I-Chemical)`, `medmentions_full_ner:I-T056)`, `nlm_gene_ner:I-STARGENE)`, `medmentions_full_ner:B-T050)`, `scai_chemical_ner:B-TRIVIALVAR)`, `bionlp_st_2013_gro_ner:B-MolecularFunction)`, `medmentions_full_ner:B-T090)`, 
`bionlp_st_2013_pc_EAE:Theme)`, `bionlp_st_2013_gro_ED:B-CellCyclePhaseTransition)`, `chebi_nactem_fullpaper_ner:I-Species)`, `medmentions_full_ner:B-T170)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomain)`, `medmentions_full_ner:B-T060)`, `mlee_ED:I-Development)`, `medmentions_full_ner:I-T060)`, `bionlp_st_2013_gro_ner:B-Cell)`, `medmentions_full_ner:I-T037)`, `bionlp_st_2013_gro_ED:B-CellDeath)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelix)`, `bionlp_st_2013_gro_ner:B-InorganicChemical)`, `medmentions_full_ner:B-T037)`, `bionlp_st_2013_cg_ner:B-Organism_subdivision)`, `genia_term_corpus_ner:B-RNA_NA)`, `bionlp_st_2013_cg_ED:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ED:B-CellDifferentiation)`, `genia_term_corpus_ner:I-DNA_molecule)`, `bionlp_st_2013_gro_ED:B-IntraCellularProcess)`, `bionlp_st_2013_gro_ner:I-MessengerRNA)`, `bionlp_st_2013_pc_ED:B-Pathway)`, `medmentions_full_ner:I-T086)`, `bionlp_st_2013_ge_ED:I-Transcription)`, `bionlp_st_2019_bb_ner:O)`, `medmentions_full_ner:I-T001)`, `minimayosrs_sts:6)`, `medmentions_full_ner:I-T020)`, `an_em_RE:Part-of)`, `bionlp_shared_task_2009_ner:I-Protein)`, `an_em_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Spliceosome)`, `chebi_nactem_fullpaper_ner:B-Species)`, `mirna_ner:O)`, `bioinfer_RE:PPI)`, `bionlp_st_2013_cg_ner:B-Protein_domain_or_region)`, `anat_em_ner:B-Organism_substance)`, `bionlp_st_2013_gro_ED:I-IntraCellularProcess)`, `bioscope_papers_ner:I-speculation)`, `ddi_corpus_ner:B-DRUG)`, `medmentions_full_ner:I-T078)`, `bionlp_st_2013_gro_ner:I-HMGTF)`, `medmentions_full_ner:B-T053)`, `bionlp_st_2013_gro_ner:B-HomeoBox)`, `minimayosrs_sts:3)`, `mlee_ner:B-Multi-tissue_structure)`, `biosses_sts:4)`, `mlee_ED:I-Gene_expression)`, `medmentions_full_ner:B-T004)`, `chia_ner:I-Drug)`, `bionlp_st_2013_gro_ner:B-FusionOfGeneWithReporterGene)`, `genia_term_corpus_ner:I-cell_line)`, `ddi_corpus_RE:ADVISE)`, `bioscope_abstracts_ner:I-speculation)`, `chebi_nactem_abstr_ann1_ner:I-Metabolite)`, 
`bionlp_st_2013_gro_ner:I-ExpressionProfiling)`, `medmentions_full_ner:B-T016)`, `bionlp_st_2013_gro_ner:I-Holoenzyme)`, `bionlp_st_2013_gro_ED:B-TranscriptionTermination)`, `bionlp_st_2013_cg_ner:I-Organ)`, `tmvar_v1_ner:B-DNAMutation)`, `bionlp_st_2013_ge_EAE:CSite)`, `genia_term_corpus_ner:B-RNA_substructure)`, `medmentions_full_ner:I-T170)`, `medmentions_full_ner:B-T093)`, `genia_term_corpus_ner:I-inorganic)`, `bionlp_st_2013_gro_ner:B-bHLH)`, `mlee_ED:B-Cell_proliferation)`, `bionlp_st_2013_gro_RE:hasPart)`, `bionlp_st_2013_cg_ED:B-Pathway)`, `bionlp_st_2013_gro_ner:B-BasicDomain)`, `bionlp_st_2013_gro_ED:I-PositiveRegulationOfGeneExpression)`, `mayosrs_sts:4)`, `medmentions_st21pv_ner:B-T037)`, `an_em_ner:B-Anatomical_system)`, `bionlp_st_2013_gro_ner:B-Conformation)`, `bionlp_st_2013_gro_ner:I-GeneRegion)`, `bionlp_st_2013_gro_ED:I-PosttranslationalModification)`, `genia_term_corpus_ner:I-RNA_NA)`, `bionlp_st_2011_ge_EAE:Cause)`, `medmentions_full_ner:B-T019)`, `medmentions_full_ner:I-T069)`, `scai_chemical_ner:B-TRIVIAL)`, `bionlp_st_2013_ge_ED:I-Protein_modification)`, `bionlp_st_2013_pc_ED:B-Degradation)`, `mlee_ner:B-Gene_or_gene_product)`, `bionlp_st_2013_gro_ED:I-Phosphorylation)`, `biosses_sts:3)`, `mlee_ED:B-Acetylation)`, `mlee_ED:I-Negative_regulation)`, `bionlp_st_2013_ge_ED:B-Protein_catabolism)`, `bionlp_st_2013_gro_ner:B-Promoter)`, `bionlp_shared_task_2009_ED:I-Phosphorylation)`, `medmentions_full_ner:B-T195)`, `bionlp_st_2013_cg_ED:I-Binding)`, `bionlp_st_2011_id_ner:I-Organism)`, `medmentions_full_ner:I-T073)`, `bionlp_st_2013_gro_ner:I-OrganicChemical)`, `ebm_pico_ner:B-Participant_Age)`, `verspoor_2013_ner:B-Concepts_Ideas)`, `biosses_sts:2)`, `bionlp_st_2013_cg_ED:B-Remodeling)`, `bionlp_st_2013_gro_ner:B-tRNA)`, `medmentions_full_ner:I-T043)`, `an_em_COREF:None)`, `bionlp_st_2011_epi_ED:B-Hydroxylation)`, `mlee_ner:I-Immaterial_anatomical_entity)`, `bionlp_st_2013_ge_ED:B-Ubiquitination)`, `medmentions_full_ner:B-T065)`, 
`bionlp_st_2019_bb_RE:None)`, `bionlp_st_2013_gro_ED:B-CellAging)`, `mlee_ED:B-Phosphorylation)`, `bionlp_st_2013_gro_ED:I-PositiveRegulationOfTranscriptionOfGene)`, `ebm_pico_ner:I-Participant_Sample-size)`, `biorelex_COREF:coref)`, `bionlp_shared_task_2009_ED:I-Protein_catabolism)`, `bionlp_st_2013_gro_ner:I-DNAMolecule)`, `bionlp_st_2013_gro_ner:I-Enzyme)`, `genia_term_corpus_ner:I-protein_family_or_group)`, `genia_term_corpus_ner:I-ANDprotein_moleculeprotein_molecule)`, `biorelex_ner:B-gene)`, `bionlp_st_2013_gro_ED:I-ProteinTransport)`, `bionlp_st_2013_gro_ED:B-MolecularProcess)`, `chebi_nactem_abstr_ann1_ner:O)`, `bionlp_st_2013_gro_ED:B-BindingOfProteinToProteinBindingSiteOfDNA)`, `chemprot_RE:None)`, `bionlp_st_2013_pc_ner:O)`, `mayosrs_sts:7)`, `bionlp_st_2013_pc_ED:B-Negative_regulation)`, `bionlp_st_2013_gro_ner:B-Sequence)`, `medmentions_full_ner:B-T103)`, `bionlp_st_2013_gro_ner:B-Gene)`, `chia_ner:B-Observation)`, `chia_ner:B-Scope)`, `an_em_COREF:coref)`, `ebm_pico_ner:B-Participant_Sex)`, `mlee_ED:B-Regulation)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndDNA)`, `bionlp_st_2013_gro_ner:B-Phenotype)`, `verspoor_2013_ner:I-age)`, `medmentions_full_ner:B-T120)`, `bionlp_st_2011_epi_ED:B-Deacetylation)`, `bionlp_st_2013_gro_ner:B-Tissue)`, `bionlp_st_2013_gro_ner:B-MolecularEntity)`, `bionlp_st_2013_ge_ED:I-Binding)`, `biorelex_ner:I-peptide)`, `medmentions_st21pv_ner:I-T097)`, `iepa_RE:None)`, `medmentions_full_ner:B-T001)`, `bionlp_shared_task_2009_ED:I-Regulation)`, `bionlp_st_2013_gro_ner:B-FusionProtein)`, `medmentions_full_ner:I-T194)`, `biorelex_ner:B-cell)`, `medmentions_full_ner:I-T096)`, `chebi_nactem_fullpaper_ner:I-Chemical_Structure)`, `medmentions_full_ner:I-T018)`, `medmentions_full_ner:B-T201)`, `chia_RE:None)`, `medmentions_full_ner:B-T054)`, `biorelex_RE:None)`, `ebm_pico_ner:I-Intervention_Pharmacological)`, `bionlp_st_2013_gro_ED:I-CellDifferentiation)`, `bionlp_st_2013_cg_ED:I-Cell_proliferation)`, 
`bionlp_st_2013_gro_EAE:hasPatient4)`, `bionlp_st_2011_id_EAE:Participant)`, `bionlp_st_2013_gro_ner:B-Substrate)`, `bionlp_st_2011_ge_ED:B-Transcription)`, `verspoor_2013_ner:B-cohort-patient)`, `ebm_pico_ner:B-Outcome_Other)`, `biorelex_ner:B-protein-motif)`, `bionlp_st_2013_gro_ner:B-Ion)`, `mlee_ED:B-Translation)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomain)`, `ebm_pico_ner:B-Participant_Condition)`, `bionlp_st_2011_ge_ED:B-Phosphorylation)`, `nlm_gene_ner:I-Gene)`, `bionlp_st_2013_gro_ner:B-Locus)`, `bionlp_st_2013_gro_ner:B-SecondMessenger)`, `bionlp_st_2013_cg_ED:B-Infection)`, `bionlp_st_2011_epi_EAE:Contextgene)`, `chia_ner:B-Drug)`, `bionlp_st_2019_bb_ner:I-Habitat)`, `bionlp_shared_task_2009_COREF:coref)`, `bionlp_st_2013_gro_ner:I-MolecularEntity)`, `mlee_ner:B-Cellular_component)`, `genia_term_corpus_ner:B-other_organic_compound)`, `bionlp_st_2013_gro_ED:I-CellAdhesion)`, `anat_em_ner:B-Cellular_component)`, `bionlp_st_2013_gro_ED:B-ProteinMetabolism)`, `seth_corpus_ner:B-SNP)`, `pcr_ner:O)`, `bionlp_st_2013_gro_ED:I-CellCyclePhase)`, `mlee_ner:B-DNA_domain_or_region)`, `mantra_gsc_en_emea_ner:B-PHYS)`, `bionlp_st_2013_cg_ner:B-Multi-tissue_structure)`, `genia_term_corpus_ner:I-virus)`, `bionlp_shared_task_2009_ED:I-Positive_regulation)`, `medmentions_full_ner:I-T122)`, `mantra_gsc_en_patents_ner:B-DISO)`, `bionlp_st_2013_gro_ner:B-Heterochromatin)`, `genia_term_corpus_ner:O)`, `mlee_ED:I-Positive_regulation)`, `an_em_ner:B-Cell)`, `bionlp_st_2013_cg_ner:B-Simple_chemical)`, `bionlp_st_2013_gro_ner:I-Peptide)`, `chemprot_RE:CPR:6)`, `chebi_nactem_abstr_ann1_ner:B-Chemical)`, `genia_term_corpus_ner:I-cell_type)`, `genia_term_corpus_ner:I-other_name)`, `bionlp_st_2013_cg_EAE:FromLoc)`, `bionlp_st_2013_gro_ner:B-RNAMolecule)`, `bionlp_st_2013_gro_ner:B-SequenceHomologyAnalysis)`, `medmentions_full_ner:I-T042)`, `tmvar_v1_ner:B-ProteinMutation)`, `pdr_ner:O)`, `bionlp_st_2013_gro_ED:B-MetabolicPathway)`, `medmentions_full_ner:I-T057)`, 
`bionlp_st_2011_ge_EAE:CSite)`, `bionlp_st_2013_gro_ED:B-BindingToProtein)`, `verspoor_2013_ner:B-size)`, `mlee_ED:B-Transcription)`, `bionlp_st_2013_gro_ner:I-BindingSiteOfProtein)`, `bionlp_st_2011_id_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:I-Ribosome)`, `verspoor_2013_ner:B-Phenomena)`, `medmentions_st21pv_ner:B-T017)`, `medmentions_full_ner:B-T028)`, `chia_ner:B-Temporal)`, `chia_ner:I-Temporal)`, `biorelex_ner:B-assay)`, `bionlp_st_2013_cg_ED:I-Pathway)`, `genia_term_corpus_ner:B-tissue)`, `nlmchem_ner:I-Chemical)`, `mirna_ner:I-Specific_miRNAs)`, `bionlp_st_2013_cg_ED:B-Negative_regulation)`, `medmentions_full_ner:I-T012)`, `mlee_ner:B-Organism_substance)`, `bionlp_st_2013_gro_ner:B-TranscriptionCoactivator)`, `genia_term_corpus_ner:I-tissue)`, `genia_term_corpus_ner:B-amino_acid_monomer)`, `mantra_gsc_en_patents_ner:I-ANAT)`, `medmentions_st21pv_ner:I-T082)`, `mantra_gsc_en_emea_ner:B-DEVI)`, `bionlp_st_2013_gro_RE:None)`, `medmentions_full_ner:I-T052)`, `bionlp_st_2011_ge_ED:I-Phosphorylation)`, `mqp_sts:3)`, `bionlp_st_2013_cg_ED:B-Glycosylation)`, `an_em_ner:B-Immaterial_anatomical_entity)`, `bionlp_st_2013_gro_ner:B-Chemical)`, `bionlp_st_2013_gro_ED:B-GeneSilencing)`, `bionlp_shared_task_2009_ED:B-Transcription)`, `genia_term_corpus_ner:B-other_artificial_source)`, `medmentions_full_ner:B-T072)`, `mantra_gsc_en_medline_ner:B-GEOG)`, `mirna_ner:B-Specific_miRNAs)`, `medmentions_full_ner:B-T190)`, `medmentions_full_ner:I-T031)`, `bionlp_st_2013_gro_ED:B-TranscriptionInitiation)`, `bionlp_st_2013_gro_ner:I-DoubleStrandDNA)`, `bionlp_st_2013_gro_ED:B-Translation)`, `scai_chemical_ner:I-IUPAC)`, `chemdner_ner:O)`, `bionlp_st_2013_gro_ED:B-G1Phase)`, `genia_term_corpus_ner:B-peptide)`, `bionlp_st_2013_gro_ED:B-PosttranslationalModification)`, `bionlp_st_2011_epi_EAE:Site)`, `an_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_cg_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_EAE:hasPatient3)`, `bionlp_st_2013_gro_ner:B-MessengerRNA)`, 
`medmentions_full_ner:B-T171)`, `bionlp_st_2013_ge_EAE:Theme2)`, `bionlp_st_2013_gro_ner:B-RNA)`, `genia_term_corpus_ner:I-amino_acid_monomer)`, `an_em_ner:B-Organism_substance)`, `bionlp_st_2013_gro_ED:I-RNAProcessing)`, `genia_term_corpus_ner:I-body_part)`, `medmentions_full_ner:B-T052)`, `chia_ner:B-Procedure)`, `bionlp_st_2013_gro_ner:B-Prokaryote)`, `bionlp_st_2011_ge_ED:I-Positive_regulation)`, `medmentions_full_ner:I-T061)`, `genia_term_corpus_ner:B-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:B-T096)`, `bionlp_st_2013_cg_ED:B-DNA_demethylation)`, `bionlp_st_2011_epi_ED:B-Deubiquitination)`, `medmentions_full_ner:B-T038)`, `medmentions_full_ner:I-T109)`, `bionlp_st_2013_gro_ED:I-SPhase)`, `bionlp_st_2013_gro_ner:I-EukaryoticCell)`, `pdr_ner:I-Plant)`, `bionlp_st_2013_gro_ED:I-Binding)`, `medmentions_full_ner:I-T092)`, `mantra_gsc_en_medline_ner:I-CHEM)`, `bionlp_st_2011_id_ED:B-Phosphorylation)`, `bionlp_st_2013_cg_ED:I-Metabolism)`, `bionlp_st_2013_gro_ED:B-PositiveRegulationOfGeneExpression)`, `chebi_nactem_fullpaper_ner:B-Biological_Activity)`, `ncbi_disease_ner:B-SpecificDisease)`, `mlee_ner:B-Organism)`, `medmentions_full_ner:B-T063)`, `bionlp_st_2013_cg_ED:B-Glycolysis)`, `medmentions_full_ner:I-T168)`, `medmentions_full_ner:I-T064)`, `bionlp_st_2013_gro_ner:B-DNAMolecule)`, `mlee_ED:B-Binding)`, `bioscope_abstracts_ner:O)`, `biorelex_ner:B-protein-complex)`, `bionlp_st_2013_gro_EAE:None)`, `mantra_gsc_en_medline_ner:I-PHEN)`, `bionlp_st_2013_cg_ner:B-Pathological_formation)`, `mlee_ED:I-Cell_proliferation)`, `bionlp_st_2013_pc_ner:I-Simple_chemical)`, `anat_em_ner:I-Cancer)`, `an_em_ner:I-Anatomical_system)`, `medmentions_full_ner:I-T072)`, `bionlp_st_2013_gro_ner:B-ProteinComplex)`, `bionlp_st_2013_gro_ED:I-NegativeRegulationOfGeneExpression)`, `bio_sim_verb_sts:2)`, `bionlp_st_2013_gro_ner:B-DoubleStrandDNA)`, `medmentions_full_ner:I-T066)`, `pdr_ED:B-Treatment_of_disease)`, `seth_corpus_ner:O)`, `bionlp_st_2013_ge_EAE:ToLoc)`, 
`bionlp_st_2013_gro_ED:B-Localization)`, `bionlp_st_2013_gro_ner:I-Exon)`, `medmentions_full_ner:B-T070)`, `biorelex_ner:I-experiment-tag)`, `medmentions_full_ner:B-T068)`, `medmentions_full_ner:I-T034)`, `cellfinder_ner:B-Species)`, `biorelex_ner:I-protein-RNA-complex)`, `medmentions_st21pv_ner:I-T201)`, `biosses_sts:0)`, `bionlp_st_2013_cg_ner:B-Organism_substance)`, `bionlp_st_2013_gro_ner:I-FusionGene)`, `genia_term_corpus_ner:B-protein_complex)`, `mantra_gsc_en_emea_ner:B-DISO)`, `bionlp_st_2013_gro_ED:I-RegulationOfGeneExpression)`, `medmentions_full_ner:I-T125)`, `bionlp_st_2013_ge_ner:I-Entity)`, `bionlp_st_2011_rel_ner:B-Entity)`, `medmentions_st21pv_ner:I-T031)`, `medmentions_full_ner:B-T099)`, `bionlp_st_2013_gro_ner:B-TATAbox)`, `bionlp_st_2013_gro_ner:I-BindingAssay)`, `bionlp_st_2019_bb_ner:I-Microorganism)`, `medmentions_full_ner:I-T059)`, `medmentions_full_ner:B-T114)`, `medmentions_st21pv_ner:I-T022)`, `bionlp_st_2013_pc_ED:B-Inactivation)`, `spl_adr_200db_train_ner:B-Factor)`, `bionlp_st_2013_gro_ner:B-Function)`, `bionlp_st_2013_gro_ner:B-GeneRegion)`, `medmentions_full_ner:I-T033)`, `bionlp_st_2013_cg_COREF:None)`, `bionlp_st_2013_gro_ner:B-HMG)`, `bionlp_shared_task_2009_ED:B-Binding)`, `bionlp_st_2013_gro_ner:B-Operon)`, `chemprot_ner:I-CHEMICAL)`, `ebm_pico_ner:I-Outcome_Pain)`, `medmentions_full_ner:I-T053)`, `bionlp_st_2013_gro_ner:B-Protein)`, `ebm_pico_ner:I-Outcome_Physical)`, `biorelex_ner:I-organelle)`, `verspoor_2013_ner:I-cohort-patient)`, `genia_term_corpus_ner:I-ANDprotein_family_or_groupprotein_family_or_group)`, `genia_term_corpus_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfDNA)`, `bionlp_st_2013_ge_ED:B-Protein_modification)`, `bionlp_st_2011_epi_ED:B-Dephosphorylation)`, `bionlp_st_2013_gro_ner:B-RNAPolymerase)`, `an_em_ner:I-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:B-CellComponent)`, `biorelex_ner:I-chemical)`, `bionlp_st_2013_gro_ED:B-Mutation)`, `gnormplus_ner:B-DomainMotif)`, 
`bionlp_st_2013_gro_ner:B-Peptide)`, `bionlp_st_2013_pc_ED:B-Translation)`, `biorelex_ner:B-tissue)`, `bionlp_st_2011_ge_EAE:AtLoc)`, `biorelex_ner:I-RNA)`, `bionlp_st_2013_pc_ED:B-Regulation)`, `pico_extraction_ner:B-participant)`, `chia_RE:Has_qualifier)`, `chia_ner:I-Visit)`, `medmentions_full_ner:I-T008)`, `bionlp_st_2013_ge_ED:B-Phosphorylation)`, `medmentions_full_ner:I-T016)`, `pdr_ner:I-Disease)`, `pdr_ED:B-Cause_of_disease)`, `verspoor_2013_RE:has)`, `verspoor_2013_ner:I-ethnicity)`, `bionlp_st_2013_pc_EAE:Participant)`, `genia_term_corpus_ner:I-protein_NA)`, `ehr_rel_sts:7)`, `medmentions_full_ner:I-T079)`, `bionlp_st_2013_gro_ner:I-SmallInterferingRNA)`, `bionlp_st_2013_cg_ED:O)`, `pico_extraction_ner:I-intervention)`, `biorelex_ner:I-protein-domain)`, `chebi_nactem_abstr_ann1_ner:I-Chemical)`, `medmentions_full_ner:I-T011)`, `bionlp_st_2013_gro_ED:B-RegulationOfFunction)`, `mlee_ner:O)`, `mqp_sts:1)`, `bioscope_papers_ner:O)`, `chia_RE:Has_scope)`, `an_em_ner:I-Pathological_formation)`, `bc5cdr_ner:B-Disease)`, `gnormplus_ner:I-DomainMotif)`, `bionlp_st_2013_gro_ner:I-OpenReadingFrame)`, `mlee_ner:I-Cellular_component)`, `medmentions_full_ner:I-T195)`, `spl_adr_200db_train_ner:B-AdverseReaction)`, `bionlp_st_2011_ge_ED:B-Positive_regulation)`, `muchmore_en_ner:O)`, `bionlp_st_2013_gro_ner:I-Promoter)`, `bionlp_st_2013_gro_EAE:hasPatient5)`, `bionlp_st_2013_gro_ner:I-RegulatoryDNARegion)`, `bionlp_st_2013_gro_ner:I-RuntLikeDomain)`, `bionlp_st_2013_cg_ED:B-Carcinogenesis)`, `medmentions_full_ner:B-T040)`, `medmentions_full_ner:I-T103)`, `medmentions_st21pv_ner:I-T037)`, `mlee_EAE:ToLoc)`, `mlee_EAE:Instrument)`, `medmentions_full_ner:B-T008)`, `ebm_pico_ner:B-Intervention_Psychological)`, `bionlp_st_2013_gro_ner:B-Stress)`, `biorelex_ner:B-protein-RNA-complex)`, `bionlp_st_2013_gro_ED:B-RNAProcessing)`, `bionlp_st_2013_gro_ED:B-SignalingPathway)`, `genia_term_corpus_ner:B-multi_cell)`, `bionlp_st_2013_gro_ner:B-ChromosomalDNA)`, 
`anat_em_ner:I-Cellular_component)`, `spl_adr_200db_train_ner:I-Negation)`, `medmentions_full_ner:I-T087)`, `bionlp_st_2013_ge_ED:B-Deacetylation)`, `bionlp_st_2013_gro_ner:B-RegulatoryDNARegion)`, `ebm_pico_ner:B-Outcome_Pain)`, `bionlp_st_2011_ge_EAE:None)`, `bionlp_st_2013_gro_ED:I-RNABiosynthesis)`, `bionlp_st_2013_gro_ner:I-HomeoboxTF)`, `mantra_gsc_en_patents_ner:I-LIVB)`, `bionlp_st_2013_gro_ner:I-UpstreamRegulatorySequence)`, `ddi_corpus_ner:I-DRUG)`, `bionlp_st_2011_ge_ED:O)`, `mantra_gsc_en_medline_ner:B-OBJC)`, `bionlp_st_2013_gro_ED:I-ProteinBiosynthesis)`, `mayosrs_sts:3)`, `linnaeus_filtered_ner:O)`, `chia_RE:Has_multiplier)`, `bionlp_st_2011_ge_ED:B-Localization)`, `medmentions_full_ner:B-T116)`, `bionlp_st_2013_cg_EAE:ToLoc)`, `cellfinder_ner:B-CellType)`, `medmentions_full_ner:B-T007)`, `ehr_rel_sts:3)`, `anat_em_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-MutantProtein)`, `bionlp_st_2013_gro_ED:B-NegativeRegulationOfGeneExpression)`, `chemprot_ner:B-GENE-N)`, `mlee_ED:B-Blood_vessel_development)`, `medmentions_full_ner:I-T077)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressorActivity)`, `biorelex_ner:B-brand)`, `medmentions_full_ner:B-T091)`, `bionlp_st_2011_id_ED:B-Positive_regulation)`, `ebm_pico_ner:B-Outcome_Mental)`, `bionlp_st_2013_gro_ner:B-EukaryoticCell)`, `bionlp_st_2013_pc_ED:I-Positive_regulation)`, `genia_term_corpus_ner:I-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:I-T184)`, `bionlp_st_2011_id_ner:B-Protein)`, `mayosrs_sts:1)`, `mantra_gsc_en_patents_ner:B-CHEM)`, `mlee_ED:B-Ubiquitination)`, `biorelex_ner:B-mutation)`, `mantra_gsc_en_medline_ner:I-DEVI)`, `bionlp_st_2013_ge_ED:I-Positive_regulation)`, `linnaeus_ner:O)`, `bionlp_st_2013_gro_ner:B-Enzyme)`, `medmentions_st21pv_ner:B-T201)`, `medmentions_full_ner:B-T056)`, `bionlp_st_2011_id_EAE:Cause)`, `bionlp_st_2013_gro_ED:B-BindingToRNA)`, `verspoor_2013_ner:B-Disorder)`, `tmvar_v1_ner:I-DNAMutation)`, `mantra_gsc_en_patents_ner:B-OBJC)`, 
`medmentions_full_ner:B-T073)`, `bionlp_st_2013_gro_ED:I-CellularProcess)`, `bionlp_st_2013_gro_ED:I-NegativeRegulation)`, `anat_em_ner:I-Tissue)`, `bioinfer_ner:I-Individual_protein)`, `medmentions_full_ner:B-T191)`, `cellfinder_ner:I-Anatomy)`, `chia_ner:I-Scope)`, `ncbi_disease_ner:B-Modifier)`, `bionlp_st_2013_cg_ED:I-Growth)`, `medmentions_st21pv_ner:B-T082)`, `bionlp_st_2013_gro_ED:I-GeneSilencing)`, `mlee_ED:B-Pathway)`, `bionlp_st_2013_cg_ner:I-Cellular_component)`, `medmentions_full_ner:I-T054)`, `chia_ner:B-Condition)`, `verspoor_2013_ner:B-ethnicity)`, `genia_term_corpus_ner:I-carbohydrate)`, `mlee_ner:B-Developing_anatomical_structure)`, `medmentions_full_ner:B-T012)`, `bionlp_st_2013_gro_ner:I-AP2EREBPRelatedDomain)`, `bionlp_st_2013_gro_ED:B-Silencing)`, `mayosrs_sts:5)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorComplex)`, `genia_term_corpus_ner:B-ANDprotein_substructureprotein_substructure)`, `bionlp_shared_task_2009_ED:B-Regulation)`, `medmentions_full_ner:B-T064)`, `bionlp_st_2013_cg_ner:I-Tissue)`, `bionlp_st_2013_gro_ner:B-Intron)`, `bionlp_st_2013_cg_ED:I-Catabolism)`, `mlee_ED:B-Localization)`, `genia_term_corpus_ner:I-DNA_domain_or_region)`, `chia_ner:B-Device)`, `medmentions_full_ner:B-T026)`, `genia_term_corpus_ner:B-carbohydrate)`, `nlmchem_ner:B-Chemical)`, `bionlp_st_2013_gro_ED:B-Disease)`, `anat_em_ner:I-Immaterial_anatomical_entity)`, `genia_term_corpus_ner:B-DNA_molecule)`, `medmentions_full_ner:I-T007)`, `bionlp_st_2013_gro_ner:I-DNAFragment)`, `genia_term_corpus_ner:I-RNA_domain_or_region)`, `bionlp_st_2013_gro_ner:B-MutatedProtein)`, `ebm_pico_ner:I-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-ProteinCodingRegion)`, `ebm_pico_ner:I-Intervention_Educational)`, `genia_term_corpus_ner:B-ANDcell_linecell_line)`, `spl_adr_200db_train_ner:I-AdverseReaction)`, `bionlp_st_2013_ge_EAE:Site)`, `bionlp_st_2013_cg_ED:I-Cell_transformation)`, `genia_term_corpus_ner:B-protein_substructure)`, `chia_ner:B-Mood)`, 
`bionlp_st_2013_gro_ED:I-Transport)`, `bionlp_st_2011_ge_ED:I-Negative_regulation)`, `medmentions_full_ner:I-T058)`, `biorelex_ner:B-parameter)`, `medmentions_st21pv_ner:O)`, `bionlp_st_2013_ge_ED:O)`, `bionlp_st_2013_pc_EAE:ToLoc)`, `cellfinder_ner:I-Species)`, `medmentions_full_ner:B-T069)`, `bionlp_st_2013_gro_ED:B-TranscriptionOfGene)`, `chia_ner:I-Condition)`, `mirna_ner:I-Relation_Trigger)`, `bionlp_st_2013_gro_ED:B-FormationOfProteinDNAComplex)`, `bionlp_st_2013_gro_ner:I-InorganicChemical)`, `bionlp_st_2011_id_ner:B-Entity)`, `bionlp_st_2013_gro_ner:B-PrimaryStructure)`, `an_em_ner:I-Cellular_component)`, `medmentions_full_ner:B-T021)`, `mlee_ner:B-Anatomical_system)`, `bionlp_st_2013_pc_ED:B-Localization)`, `chebi_nactem_fullpaper_ner:B-Spectral_Data)`, `mlee_EAE:CSite)`, `bionlp_st_2013_cg_ED:I-Negative_regulation)`, `mlee_ED:I-Breakdown)`, `bionlp_shared_task_2009_ED:B-Localization)`, `bionlp_shared_task_2009_ED:B-Phosphorylation)`, `medmentions_st21pv_ner:I-T170)`, `pico_extraction_ner:I-participant)`, `bionlp_st_2013_cg_ED:B-Breakdown)`, `bionlp_st_2013_gro_ner:I-Nucleotide)`, `chia_ner:B-Person)`, `medmentions_full_ner:B-T194)`, `chia_RE:Subsumes)`, `mlee_ED:B-Metabolism)`, `medmentions_full_ner:I-T099)`, `bionlp_st_2013_gro_ner:I-Protein)`, `an_em_ner:B-Tissue)`, `bioscope_papers_ner:B-speculation)`, `medmentions_st21pv_ner:B-T170)`, `bionlp_st_2013_gro_ED:B-ExperimentalIntervention)`, `bionlp_st_2011_epi_ED:I-Glycosylation)`, `mlee_ED:B-Gene_expression)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorActivity)`, `bionlp_st_2011_epi_ED:B-Phosphorylation)`, `mlee_ED:B-Breakdown)`, `mlee_RE:None)`, `bionlp_st_2013_pc_ED:B-Dephosphorylation)`, `mlee_ner:B-Organism_subdivision)`, `bionlp_st_2013_cg_EAE:Cause)`, `bionlp_st_2013_gro_ner:B-RNAPolymeraseII)`, `medmentions_st21pv_ner:B-T098)`, `bionlp_st_2013_ge_ED:I-Phosphorylation)`, `chia_RE:Has_negation)`, `spl_adr_200db_train_ner:I-Factor)`, `bionlp_st_2013_gro_ED:I-OrganismalProcess)`, 
`bionlp_shared_task_2009_ED:B-Protein_catabolism)`, `verspoor_2013_ner:I-mutation)`, `bionlp_st_2013_gro_ED:B-Phosphorylation)`, `bionlp_st_2013_ge_EAE:Site2)`, `medmentions_full_ner:B-T129)`, `seth_corpus_ner:B-RS)`, `ebm_pico_ner:I-Participant_Sex)`, `genia_term_corpus_ner:I-protein_molecule)`, `medmentions_full_ner:B-T192)`, `bionlp_st_2013_pc_EAE:None)`, `medmentions_full_ner:I-T094)`, `bionlp_st_2013_ge_ED:I-Gene_expression)`, `bionlp_st_2013_cg_ED:B-Mutation)`, `medmentions_st21pv_ner:B-T033)`, `mlee_ner:B-Drug_or_compound)`, `medmentions_full_ner:B-T061)`, `pcr_ner:I-Herb)`, `bionlp_st_2013_gro_ner:I-MolecularStructure)`, `bionlp_st_2013_cg_ED:I-Development)`, `medmentions_full_ner:B-T032)`, `bionlp_st_2013_pc_ED:B-Dissociation)`, `bionlp_st_2013_pc_ED:I-Localization)`, `genia_term_corpus_ner:B-nucleotide)`, `ebm_pico_ner:B-Outcome_Mortality)`, `bionlp_st_2011_rel_ner:O)`, `bionlp_st_2013_gro_ner:I-Cell)`, `medmentions_full_ner:I-T014)`, `mantra_gsc_en_emea_ner:B-ANAT)`, `medmentions_full_ner:I-T055)`, `medmentions_full_ner:B-T101)`, `bionlp_st_2013_gro_ED:I-RegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressor)`, `bionlp_st_2013_gro_ED:B-ProteinBiosynthesis)`, `biorelex_ner:I-cell)`, `verspoor_2013_RE:None)`, `bionlp_st_2013_cg_ED:I-Blood_vessel_development)`, `genia_term_corpus_ner:I-ANDcell_linecell_line)`, `bionlp_st_2011_id_ED:B-Transcription)`, `medmentions_full_ner:I-T204)`, `tmvar_v1_ner:I-SNP)`, `chia_RE:Has_value)`, `biorelex_ner:I-protein-family)`, `bionlp_st_2013_cg_ED:B-Death)`, `biorelex_ner:I-experimental-construct)`, `mantra_gsc_en_medline_ner:I-PHYS)`, `genia_term_corpus_ner:B-)`, `medmentions_full_ner:I-T203)`, `bionlp_st_2013_gro_ED:B-CellAdhesion)`, `bionlp_st_2013_gro_ner:B-TranslationFactor)`, `ebm_pico_ner:I-Intervention_Control)`, `bionlp_st_2011_ge_ED:I-Protein_catabolism)`, `bionlp_st_2013_gro_ner:B-BetaScaffoldDomain_WithMinorGrooveContacts)`, `bionlp_st_2013_gro_ED:I-BindingOfTFToTFBindingSiteOfProtein)`, 
`genia_term_corpus_ner:I-atom)`, `scai_chemical_ner:B-)`, `bionlp_st_2013_gro_ner:I-Stress)`, `bionlp_st_2013_pc_ED:I-Pathway)`, `bionlp_st_2011_epi_ED:I-Catalysis)`, `mlee_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Exon)`, `medmentions_full_ner:I-T083)`, `bionlp_st_2013_cg_ED:B-Translation)`, `chia_ner:B-Measurement)`, `bionlp_st_2011_id_ner:B-Regulon-operon)`, `pdr_ED:I-Treatment_of_disease)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivatorActivity)`, `bionlp_st_2011_epi_ED:I-DNA_methylation)`, `osiris_ner:I-gene)`, `bionlp_st_2013_cg_ner:O)`, `pdr_ner:B-Plant)`, `bionlp_st_2013_gro_ED:B-PositiveRegulationOfTranscription)`, `mantra_gsc_en_patents_ner:B-ANAT)`, `medmentions_full_ner:I-T101)`, `ncbi_disease_ner:I-SpecificDisease)`, `medmentions_full_ner:B-T034)`, `linnaeus_filtered_ner:B-species)`, `bionlp_st_2011_ge_ED:B-Binding)`, `bionlp_st_2013_gro_ner:I-Histone)`, `bionlp_st_2013_cg_ED:I-Carcinogenesis)`, `medmentions_full_ner:I-T192)`, `medmentions_full_ner:B-T080)`, `bionlp_st_2013_ge_EAE:None)`, `bionlp_st_2013_gro_ner:B-BindingSiteOfProtein)`, `bionlp_st_2013_gro_ner:B-TranscriptionCorepressor)`, `ehr_rel_sts:4)`, `mlee_ner:I-Gene_or_gene_product)`, `ddi_corpus_RE:MECHANISM)`, `bionlp_st_2011_ge_ED:I-Localization)`, `bionlp_st_2013_gro_ED:I-CellularDevelopmentalProcess)`, `medmentions_full_ner:B-T098)`, `genia_term_corpus_ner:B-protein_subunit)`, `mantra_gsc_en_emea_ner:I-PROC)`, `bionlp_st_2013_gro_ner:I-ProteinCodingDNARegion)`, `scicite_TEXT:method)`, `bionlp_st_2013_gro_ner:I-CellComponent)`, `genia_term_corpus_ner:I-peptide)`, `medmentions_full_ner:B-T100)`, `bionlp_st_2013_pc_EAE:Cause)`, `medmentions_full_ner:B-T049)`, `bionlp_st_2013_gro_ED:B-Transport)`, `scai_chemical_ner:O)`, `medmentions_full_ner:B-T083)`, `diann_iber_eval_en_ner:I-Disability)`, `bionlp_st_2013_pc_ED:I-Translation)`, `medmentions_full_ner:I-T039)`, `anat_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:I-Ligand)`, `bionlp_st_2013_cg_ED:B-Metabolism)`, 
`bionlp_st_2013_pc_ED:I-Phosphorylation)`, `bionlp_st_2011_id_ner:O)`, `mantra_gsc_en_patents_ner:B-PHEN)`, `bionlp_st_2013_gro_ner:I-Nucleus)`, `biorelex_ner:I-fusion-protein)`, `bionlp_st_2013_gro_ED:B-Affecting)`, `bionlp_st_2013_gro_ner:I-ComplexOfProteinAndRNA)`, `bionlp_st_2013_gro_ED:B-Methylation)`, `bionlp_st_2013_gro_ner:I-NuclearReceptor)`, `bionlp_st_2013_gro_ED:B-Mitosis)`, `bionlp_st_2013_gro_ED:I-PositiveRegulation)`, `bionlp_st_2013_gro_ED:B-ModificationOfMolecularEntity)`, `pdr_ED:O)`, `bionlp_st_2013_cg_ner:B-Cell)`, `chia_RE:OR)`, `bionlp_st_2013_cg_ner:I-Gene_or_gene_product)`, `bionlp_st_2013_gro_ner:B-Holoenzyme)`, `bionlp_shared_task_2009_EAE:ToLoc)`, `verspoor_2013_ner:I-disease)`, `biorelex_ner:I-tissue)`, `muchmore_en_ner:B-umlsterm)`, `bionlp_st_2013_gro_ED:B-NegativeRegulationOfTranscriptionByTranscriptionRepressor)`, `ehr_rel_sts:5)`, `bionlp_shared_task_2009_ner:B-Protein)`, `mantra_gsc_en_patents_ner:B-LIVB)`, `medmentions_st21pv_ner:I-T038)`, `bionlp_st_2013_gro_ner:B-TranscriptionRegulator)`, `medmentions_full_ner:O)`, `medmentions_full_ner:I-T002)`, `bionlp_st_2013_gro_ner:I-DNARegion)`, `medmentions_full_ner:B-T089)`, `bionlp_st_2013_gro_ED:I-BindingToProtein)`, `bionlp_st_2013_cg_EAE:AtLoc)`, `medmentions_full_ner:B-T077)`, `mirna_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-TranscriptionRegulator)`, `bionlp_st_2013_gro_ner:I-tRNA)`, `bionlp_st_2013_gro_ner:I-Operon)`, `bionlp_st_2011_epi_ED:B-Deglycosylation)`, `chemprot_ner:O)`, `mlee_ner:I-Multi-tissue_structure)`, `genia_term_corpus_ner:B-AND_NOTcell_typecell_type)`, `medmentions_full_ner:I-T023)`, `medmentions_full_ner:B-T094)`, `chemprot_RE:CPR:1)`, `mlee_ED:B-Planned_process)`, `scai_chemical_ner:B-ABBREVIATION)`, `bionlp_st_2013_gro_ner:B-HomeoboxTF)`, `bionlp_st_2011_id_ED:B-Process)`, `bionlp_st_2013_gro_ner:I-Virus)`, `genia_term_corpus_ner:B-atom)`, `bionlp_st_2013_gro_RE:fromSpecies)`, `bionlp_st_2011_id_ED:B-Binding)`, `bionlp_st_2011_id_EAE:None)`, 
`medmentions_full_ner:B-T203)`, `bionlp_st_2013_gro_ner:B-ThreeDimensionalMolecularStructure)`, `muchmore_en_ner:I-umlsterm)`, `bionlp_st_2013_cg_ner:I-Developing_anatomical_structure)`, `bionlp_st_2013_pc_EAE:FromLoc)`, `genetaggold_ner:I-NEWGENE)`, `bionlp_st_2013_ge_EAE:Theme)`, `bionlp_st_2013_gro_ner:I-Attenuator)`, `nlm_gene_ner:I-Other)`, `medmentions_full_ner:B-T109)`, `osiris_ner:I-variant)`, `chia_ner:I-Mood)`, `medmentions_full_ner:I-T068)`, `minimayosrs_sts:4)`, `bionlp_st_2013_gro_ED:B-CellCyclePhase)`, `bionlp_st_2019_bb_ner:B-Habitat)`, `medmentions_full_ner:I-T097)`, `ehr_rel_sts:6)`, `bionlp_st_2011_epi_ED:I-Methylation)`, `bioinfer_ner:I-Protein_family_or_group)`, `medmentions_st21pv_ner:I-T098)`, `bionlp_st_2013_gro_ner:I-BetaScaffoldDomain_WithMinorGrooveContacts)`, `medmentions_full_ner:B-T047)`, `mlee_ED:B-Dephosphorylation)`, `mantra_gsc_en_emea_ner:I-PHYS)`, `pdr_ner:B-Disease)`, `genia_term_corpus_ner:I-)`, `chemdner_ner:I-Chemical)`, `bionlp_st_2013_gro_ED:B-PositiveRegulationOfTranscriptionOfGene)`, `mlee_ner:I-Protein_domain_or_region)`, `medmentions_full_ner:I-T104)`, `medmentions_full_ner:B-T039)`, `bio_sim_verb_sts:5)`, `chebi_nactem_abstr_ann1_ner:B-Biological_Activity)`, `bionlp_st_2011_epi_ED:I-DNA_demethylation)`, `nlm_gene_ner:I-GENERIF)`, `bionlp_st_2013_gro_ED:B-NegativeRegulationOfTranscription)`, `mantra_gsc_en_emea_ner:I-PHEN)`, `chebi_nactem_fullpaper_ner:B-Chemical_Structure)`, `genia_term_corpus_ner:B-RNA_molecule)`, `mlee_ner:B-Cell)`, `chia_ner:B-Qualifier)`, `bionlp_shared_task_2009_ED:B-Gene_expression)`, `bionlp_st_2013_gro_ner:I-Vitamin)`, `medmentions_full_ner:I-T013)`, `ehr_rel_sts:8)`, `medmentions_full_ner:I-T030)`, `diann_iber_eval_en_ner:O)`, `an_em_RE:frag)`, `genia_term_corpus_ner:I-DNA_substructure)`, `bionlp_st_2013_pc_EAE:Site)`, `genia_term_corpus_ner:I-ANDprotein_complexprotein_complex)`, `bionlp_st_2013_gro_ED:I-TranscriptionInitiation)`, `bionlp_st_2013_gro_ner:B-Polymerase)`, 
`medmentions_full_ner:I-T004)`, `bionlp_st_2013_gro_ED:B-NegativeRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_gro_ner:B-FusionGene)`, `bionlp_st_2011_ge_ED:I-Binding)`, `bionlp_st_2013_cg_ner:B-DNA_domain_or_region)`, `chia_ner:B-Negation)`, `bionlp_st_2013_gro_ner:I-FusionProtein)`, `minimayosrs_sts:8)`, `chebi_nactem_fullpaper_ner:B-Protein)`, `bionlp_st_2013_gro_ner:B-Enhancer)`, `bionlp_st_2013_gro_ED:B-NegativeRegulation)`, `medmentions_full_ner:I-T041)`, `mantra_gsc_en_emea_ner:O)`, `biorelex_ner:I-protein-motif)`, `bionlp_st_2011_epi_COREF:coref)`, `medmentions_full_ner:I-T093)`, `medmentions_full_ner:B-T200)`, `bionlp_st_2013_gro_ner:B-OpenReadingFrame)`, `bionlp_st_2013_cg_ED:I-Localization)`, `bionlp_st_2013_cg_ner:B-Tissue)`, `bionlp_st_2013_pc_COREF:None)`, `medmentions_full_ner:I-T123)`, `mlee_ED:O)`, `bionlp_st_2013_gro_ner:O)`, `bionlp_st_2013_gro_ner:B-ComplexMolecularEntity)`, `bionlp_st_2013_pc_ED:B-Transcription)`, `anat_em_ner:B-Pathological_formation)`, `diann_iber_eval_en_ner:B-Neg)`, `bionlp_st_2013_ge_ner:I-Protein)`, `scai_chemical_ner:I-TRIVIAL)`, `bionlp_st_2013_gro_ner:B-RibosomalRNA)`, `an_em_ner:B-Organism_subdivision)`, `mlee_ED:I-Remodeling)`, `genia_term_corpus_ner:B-RNA_domain_or_region)`, `bionlp_st_2013_gro_ner:B-BindingAssay)`, `medmentions_full_ner:B-T017)`, `mlee_ED:I-Translation)`, `bionlp_st_2013_gro_ner:B-CpGIsland)`, `bionlp_st_2013_pc_ner:I-Gene_or_gene_product)`, `bionlp_st_2013_gro_ner:I-HMG)`, `bionlp_st_2013_gro_ED:B-FormationOfTranscriptionFactorComplex)`, `mlee_ner:I-Organism_substance)`, `medmentions_full_ner:I-T075)`, `nlm_gene_ner:B-Domain)`, `anat_em_ner:I-Anatomical_system)`, `medmentions_full_ner:B-T057)`, `bionlp_st_2013_gro_ner:I-SecondMessenger)`, `bionlp_st_2013_gro_ner:B-GeneProduct)`, `ebm_pico_ner:I-Outcome_Other)`, `bionlp_st_2013_gro_ED:B-ProteinModification)`, `bionlp_st_2013_gro_ED:B-Modification)`, `bioinfer_ner:B-Protein_family_or_group)`, `medmentions_full_ner:B-T059)`, 
`bionlp_st_2013_gro_ner:B-Ligand)`, `gnormplus_ner:I-FamilyName)`, `mantra_gsc_en_emea_ner:B-CHEM)`, `bionlp_st_2013_gro_ED:I-CellGrowth)`, `genia_term_corpus_ner:B-DNA_NA)`, `mantra_gsc_en_medline_ner:B-LIVB)`, `verspoor_2013_ner:B-gender)`, `bio_sim_verb_sts:6)`, `spl_adr_200db_train_ner:B-Severity)`, `bionlp_st_2013_cg_ED:I-Breakdown)`, `ddi_corpus_ner:I-BRAND)`, `medmentions_st21pv_ner:B-T097)`, `biorelex_ner:B-experimental-construct)`, `bionlp_st_2013_ge_ED:B-Transcription)`, `chia_ner:I-Multiplier)`, `bionlp_st_2013_gro_ner:I-DNA)`, `geokhoj_v1_TEXT:0)`, `bionlp_st_2013_gro_RE:locatedIn)`, `genia_term_corpus_ner:B-virus)`, `bionlp_st_2013_gro_ner:I-SequenceHomologyAnalysis)`, `bionlp_st_2013_gro_ED:B-RegulatoryProcess)`, `bionlp_st_2013_pc_ED:B-Activation)`, `anat_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-RuntLikeDomain)`, `bioinfer_ner:I-Protein_complex)`, `bionlp_st_2013_gro_ED:I-Increase)`, `anat_em_ner:I-Cell)`, `medmentions_full_ner:B-T131)`, `bionlp_st_2013_gro_ner:B-ProteinDomain)`, `bionlp_st_2013_gro_ner:I-ProteinCodingRegion)`, `bionlp_st_2013_gro_ner:I-PrimaryStructure)`, `seth_corpus_RE:None)`, `genia_term_corpus_ner:I-mono_cell)`, `bioscope_papers_ner:I-negation)`, `genia_term_corpus_ner:I-other_artificial_source)`, `medmentions_full_ner:I-T098)`, `bionlp_st_2013_gro_ner:I-Enhancer)`, `bionlp_st_2013_gro_ner:I-PositiveTranscriptionRegulator)`, `genia_term_corpus_ner:I-polynucleotide)`, `bionlp_st_2011_ge_ED:B-Gene_expression)`, `medmentions_full_ner:B-T121)`, `bionlp_st_2011_id_ED:I-Transcription)`, `biorelex_ner:I-protein-region)`, `chebi_nactem_fullpaper_ner:B-Metabolite)`, `diann_iber_eval_en_ner:B-Disability)`, `bionlp_st_2013_cg_ED:B-Dissociation)`, `medmentions_st21pv_ner:B-T204)`, `genia_term_corpus_ner:I-protein_subunit)`, `medmentions_full_ner:B-T023)`, `bionlp_st_2013_gro_ED:B-Splicing)`, `bionlp_st_2013_gro_ED:I-Silencing)`, `biorelex_ner:B-peptide)`, `bionlp_st_2013_gro_ED:B-BindingOfTFToTFBindingSiteOfProtein)`, 
`biorelex_ner:I-assay)`, `medmentions_full_ner:B-T048)`, `an_em_ner:I-Organism_substance)`, `bionlp_st_2013_gro_ner:I-Function)`, `spl_adr_200db_train_ner:B-Animal)`, `genia_term_corpus_ner:I-DNA_NA)`, `medmentions_full_ner:I-T070)`, `mlee_ner:I-Anatomical_system)`, `bioinfer_ner:B-Individual_protein)`, `biorelex_ner:B-organelle)`, `verspoor_2013_ner:I-Physiology)`, `bionlp_st_2013_gro_ner:I-ProteinComplex)`, `genia_term_corpus_ner:I-RNA_molecule)`, `mlee_ner:I-DNA_domain_or_region)`, `mlee_ED:I-Pathway)`, `bionlp_st_2013_gro_ED:B-ActivationOfProcess)`, `pico_extraction_ner:B-outcome)`, `minimayosrs_sts:7)`, `medmentions_full_ner:I-T038)`, `verspoor_2013_ner:I-size)`, `ebm_pico_ner:B-Intervention_Other)`, `bionlp_st_2013_gro_ED:B-RNABiosynthesis)`, `bionlp_st_2013_cg_ner:I-Simple_chemical)`, `mantra_gsc_en_medline_ner:I-LIVB)`, `seth_corpus_ner:B-Gene)`, `biorelex_ner:I-reagent)`, `bionlp_st_2013_cg_ED:B-Phosphorylation)`, `bionlp_st_2013_gro_ner:B-Attenuator)`, `pdr_EAE:None)`, `bionlp_st_2011_epi_ED:B-DNA_methylation)`, `bionlp_st_2013_cg_ED:I-Translation)`, `bionlp_st_2013_gro_ED:B-Transcription)`, `medmentions_st21pv_ner:I-T074)`, `bionlp_st_2013_gro_ED:B-ProteinCatabolism)`, `bionlp_st_2013_gro_ED:B-Growth)`, `chia_RE:AND)`, `bionlp_st_2013_pc_ED:I-Transcription)`, `medmentions_full_ner:I-T191)`, `medmentions_full_ner:I-T028)`, `bionlp_st_2013_cg_ED:I-Glycolysis)`, `bionlp_st_2013_ge_ED:B-Localization)`, `mlee_ner:I-Organ)`, `medmentions_full_ner:B-T033)`, `ebm_pico_ner:I-Intervention_Other)`, `bionlp_st_2013_gro_ner:B-NuclearReceptor)`, `genia_term_corpus_ner:B-ANDprotein_complexprotein_complex)`, `an_em_ner:B-Cellular_component)`, `medmentions_full_ner:I-T100)`, `geokhoj_v1_TEXT:1)`, `genia_term_corpus_ner:I-BUT_NOTother_nameother_name)`, `bionlp_st_2013_cg_ED:B-Cell_death)`, `gnormplus_ner:B-Gene)`, `genia_term_corpus_ner:I-RNA_substructure)`, `medmentions_full_ner:I-T190)`, `bionlp_st_2013_gro_ED:B-Homodimerization)`, `medmentions_full_ner:B-T051)`, 
`genia_term_corpus_ner:B-lipid)`, `bioinfer_ner:B-GeneproteinRNA)`, `bioinfer_ner:B-Gene)`, `medmentions_full_ner:B-T184)`, `anat_em_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelixTF)`, `bionlp_st_2013_cg_ner:I-Protein_domain_or_region)`, `genia_term_corpus_ner:I-other_organic_compound)`, `bionlp_st_2013_gro_ner:B-SmallInterferingRNA)`, `bionlp_st_2013_cg_ED:B-Growth)`, `bionlp_st_2013_cg_ED:B-Synthesis)`, `chia_RE:Has_index)`, `chia_ner:I-Device)`, `ddi_corpus_ner:B-GROUP)`, `bionlp_shared_task_2009_ED:I-Gene_expression)`, `bionlp_st_2013_gro_ner:B-MutantProtein)`, `genia_term_corpus_ner:B-DNA_substructure)`, `biorelex_ner:I-disease)`, `biorelex_ner:I-amino-acid)`, `medmentions_full_ner:B-T127)`, `ebm_pico_ner:I-Intervention_Psychological)`, `mlee_ED:I-Planned_process)`, `pubmed_qa_labeled_fold0_CLF:no)`, `mlee_ner:I-Drug_or_compound)`, `medmentions_full_ner:I-T185)`, `minimayosrs_sts:1)`, `bionlp_st_2011_epi_ED:B-DNA_demethylation)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorBindingSiteOfDNA)`, `bionlp_st_2013_gro_ED:I-ResponseProcess)`, `medmentions_full_ner:I-T201)`, `bionlp_st_2011_ge_ED:I-Transcription)`, `bionlp_st_2013_cg_ED:I-Mutation)`, `tmvar_v1_ner:I-ProteinMutation)`, `medmentions_full_ner:I-T063)`, `verspoor_2013_ner:I-Phenomena)`, `bionlp_st_2011_id_ED:B-Negative_regulation)`, `chemprot_RE:CPR:2)`, `bionlp_st_2013_gro_ner:B-ProteinSubunit)`, `medmentions_full_ner:B-T011)`, `genia_term_corpus_ner:I-ANDother_nameother_name)`, `an_em_ner:I-Tissue)`, `bionlp_st_2013_gro_ner:B-bHLHTF)`, `pico_extraction_ner:B-intervention)`, `bionlp_st_2013_gro_ED:B-Increase)`, `mlee_ner:I-Organism)`, `mantra_gsc_en_emea_ner:I-CHEM)`, `bionlp_st_2013_cg_ner:I-Organism)`, `bionlp_st_2013_gro_ner:I-ProteinDomain)`, `medmentions_full_ner:B-T185)`, `mantra_gsc_en_patents_ner:I-PROC)`, `medmentions_full_ner:I-T120)`, `bionlp_st_2013_gro_ED:B-CellularMetabolicProcess)`, `scai_chemical_ner:I-ABBREVIATION)`, 
`bionlp_st_2013_cg_ED:I-Planned_process)`, `bionlp_st_2013_cg_ner:B-Anatomical_system)`, `chia_ner:I-Procedure)`, `genia_term_corpus_ner:I-ANDcell_typecell_type)`, `scai_chemical_ner:I-)`, `biorelex_ner:B-experiment-tag)`, `genia_term_corpus_ner:B-ORDNA_domain_or_regionDNA_domain_or_region)`, `medmentions_full_ner:B-T044)`, `mirna_ner:B-Non-Specific_miRNAs)`, `mlee_ED:B-Cell_division)`, `bionlp_st_2011_id_ner:I-Entity)`, `bionlp_st_2013_cg_ED:B-Cell_proliferation)`, `bionlp_st_2011_epi_EAE:None)`, `bionlp_st_2013_cg_ED:B-DNA_methylation)`, `bionlp_st_2013_gro_ED:O)`, `bionlp_st_2013_gro_ED:B-Producing)`, `bionlp_st_2013_cg_EAE:Instrument)`, `bionlp_st_2013_gro_ED:B-Stabilization)`, `pcr_ner:B-Chemical)`, `bionlp_st_2013_cg_ED:B-Development)`, `ebm_pico_ner:B-Intervention_Physical)`, `bionlp_st_2011_ge_ED:I-Regulation)`, `bionlp_st_2013_pc_ED:B-Demethylation)`, `bionlp_st_2011_epi_ner:B-Protein)`, `chemprot_RE:CPR:0)`, `medmentions_full_ner:B-T055)`, `bionlp_st_2013_gro_ED:B-Decrease)`, `spl_adr_200db_train_ner:I-Severity)`, `bionlp_st_2013_gro_ner:I-Ion)`, `bionlp_st_2013_pc_ner:B-Gene_or_gene_product)`, `genia_term_corpus_ner:B-inorganic)`, `chia_ner:O)`, `linnaeus_ner:B-species)`, `biorelex_ner:I-protein)`, `mantra_gsc_en_medline_ner:B-PROC)`, `medmentions_full_ner:B-T078)`, `medmentions_full_ner:I-T062)`, `medmentions_full_ner:I-T081)`, `mantra_gsc_en_emea_ner:B-PHEN)`, `medmentions_st21pv_ner:B-T022)`, `bc5cdr_ner:I-Disease)`, `chia_ner:B-Multiplier)`, `bionlp_st_2013_gro_ner:I-bHLH)`, `bionlp_st_2013_gro_ED:B-CellularProcess)`, `bionlp_st_2013_gro_ED:B-Acetylation)`, `genia_term_corpus_ner:B-RNA_family_or_group)`, `bionlp_st_2013_gro_ED:I-IntraCellularTransport)`, `bionlp_st_2013_gro_ner:B-Chromatin)`, `bionlp_st_2013_ge_ED:B-Binding)`, `bionlp_st_2013_gro_ner:I-AminoAcid)`, `bionlp_st_2013_gro_ED:B-CellFateDetermination)`, `medmentions_full_ner:I-T091)`, `medmentions_full_ner:B-T066)`, `medmentions_full_ner:B-T022)`, `genetaggold_ner:O)`, 
`medmentions_full_ner:B-T074)`, `bionlp_st_2013_pc_ED:I-Gene_expression)`, `bionlp_st_2013_gro_ED:I-Disease)`, `biosses_sts:7)`, `medmentions_full_ner:B-T071)`, `medmentions_full_ner:B-T086)`, `biorelex_ner:I-protein-complex)`, `mlee_ED:B-Remodeling)`, `medmentions_st21pv_ner:I-T007)`, `bionlp_st_2011_id_ED:I-Regulation)`, `biorelex_ner:B-drug)`, `bionlp_st_2013_gro_ED:I-Transcription)`, `bionlp_st_2011_epi_EAE:Theme)`, `mantra_gsc_en_patents_ner:I-DISO)`, `anat_em_ner:I-Organ)`, `scai_chemical_ner:I-PARTIUPAC)`, `bionlp_st_2013_cg_ED:I-Metastasis)`, `medmentions_full_ner:I-T197)`, `bionlp_st_2013_pc_ED:O)`, `medmentions_st21pv_ner:B-T092)`, `bionlp_shared_task_2009_ED:B-Positive_regulation)`, `medmentions_full_ner:B-T045)`, `chemprot_RE:CPR:8)`, `bionlp_st_2013_cg_ED:B-Localization)`, `nlm_gene_ner:I-Domain)`, `verspoor_2013_ner:B-age)`, `bionlp_st_2011_epi_ED:O)`, `chebi_nactem_abstr_ann1_ner:B-Species)`, `medmentions_full_ner:B-T122)`, `bionlp_st_2011_id_ner:I-Protein)`, `bionlp_st_2013_gro_ED:I-BindingOfProteinToDNA)`, `bionlp_st_2013_gro_ner:I-RNAPolymeraseII)`, `medmentions_full_ner:I-T050)`, `genia_term_corpus_ner:B-ANDother_nameother_name)`, `nlm_gene_ner:B-STARGENE)`, `bionlp_st_2013_gro_ED:B-BindingOfMolecularEntity)`, `mirna_ner:B-GenesProteins)`, `scai_chemical_ner:B-MODIFIER)`, `mantra_gsc_en_emea_ner:B-OBJC)`, `mirna_ner:B-Diseases)`, `bionlp_st_2013_cg_ED:I-Death)`, `mantra_gsc_en_emea_ner:I-DISO)`, `bionlp_st_2013_gro_ED:I-Decrease)`, `bionlp_st_2013_gro_ner:B-DNABindingDomainOfProtein)`, `bioinfer_ner:O)`, `anat_em_ner:I-Multi-tissue_structure)`, `osiris_ner:O)`, `bionlp_st_2013_cg_EAE:None)`, `medmentions_st21pv_ner:B-T062)`, `medmentions_full_ner:B-T075)`, `genia_term_corpus_ner:I-AND_NOTcell_typecell_type)`, `bionlp_st_2013_gro_ED:B-CellCycle)`, `medmentions_full_ner:B-UnknownType)`, `bionlp_st_2013_cg_ner:I-Cancer)`, `medmentions_full_ner:I-T005)`, `genia_term_corpus_ner:I-protein_complex)`, `bionlp_st_2013_cg_ED:B-Cell_transformation)` + 
+{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bigbio_mtl_en_5.2.0_3.0_1699290919040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bigbio_mtl_en_5.2.0_3.0_1699290919040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bigbio_mtl","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bigbio_mtl","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_bigscience_biomedical").predict("""PUT YOUR STRING HERE""") +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_bigbio_mtl|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|128|
+
+## References
+
+- https://huggingface.co/bigscience-biomedical/bigbio-mtl
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros_xx.md
new file mode 100644
index 000000000000..34b4c76151bc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros BertForTokenClassification from StivenLancheros
+author: John Snow Labs
+name: bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros
+date: 2023-11-06
+tags: [bert, xx, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros` is a multilingual model originally trained by StivenLancheros.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros_xx_5.2.0_3.0_1699289847969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros_xx_5.2.0_3.0_1699289847969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
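+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a tokenizer stage is needed
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros","xx") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark NLP session named `spark`
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._ // assumes an active SparkSession named `spark`
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a tokenizer stage is needed
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros", "xx")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+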
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biobert_base_cased_v1.2_finetuned_ner_concat_craft_spanish_stivenlancheros|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|xx|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/StivenLancheros/biobert-base-cased-v1.2-finetuned-ner-Concat_CRAFT_es
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english_xx.md
new file mode 100644
index 000000000000..f3a7234037dd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english BertForTokenClassification from StivenLancheros
+author: John Snow Labs
+name: bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english
+date: 2023-11-06
+tags: [bert, xx, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english` is a multilingual model originally trained by StivenLancheros.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english_xx_5.2.0_3.0_1699289677819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english_xx_5.2.0_3.0_1699289677819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
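+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a tokenizer stage is needed
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english","xx") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark NLP session named `spark`
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._ // assumes an active SparkSession named `spark`
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a tokenizer stage is needed
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english", "xx")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+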
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_english|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|xx|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/StivenLancheros/biobert-base-cased-v1.2-finetuned-ner-CRAFT_English
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros_xx.md
new file mode 100644
index 000000000000..80e61e8b8446
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros BertForTokenClassification from StivenLancheros
+author: John Snow Labs
+name: bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros
+date: 2023-11-06
+tags: [bert, xx, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros` is a multilingual model originally trained by StivenLancheros.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros_xx_5.2.0_3.0_1699279334393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros_xx_5.2.0_3.0_1699279334393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
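+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a tokenizer stage is needed
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros","xx") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark NLP session named `spark`
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._ // assumes an active SparkSession named `spark`
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a tokenizer stage is needed
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros", "xx")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+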
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_base_cased_v1.2_finetuned_ner_craft_spanish_english_stivenlancheros| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|403.7 MB| + +## References + +https://huggingface.co/StivenLancheros/Biobert-base-cased-v1.2-finetuned-ner-CRAFT_es_en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_chemical_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_chemical_ner_en.md new file mode 100644 index 000000000000..c9567e87763b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_chemical_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from alvaroalon2) +author: John Snow Labs +name: bert_ner_biobert_chemical_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_chemical_ner` is an English model originally trained by `alvaroalon2`. + +## Predicted Entities + +`CHEMICAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_chemical_ner_en_5.2.0_3.0_1699291347294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_chemical_ner_en_5.2.0_3.0_1699291347294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_chemical_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_chemical_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.chemical.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_chemical_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/alvaroalon2/biobert_chemical_ner +- https://github.com/librairy/bio-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_genetic_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_genetic_ner_en.md new file mode 100644 index 000000000000..9287980ea28a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_genetic_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from alvaroalon2) +author: John Snow Labs +name: bert_ner_biobert_genetic_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_genetic_ner` is an English model originally trained by `alvaroalon2`. + +## Predicted Entities + +`GENETIC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_genetic_ner_en_5.2.0_3.0_1699291599788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_genetic_ner_en_5.2.0_3.0_1699291599788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_genetic_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_genetic_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_genetic_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/alvaroalon2/biobert_genetic_ner +- https://github.com/librairy/bio-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ncbi_disease_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ncbi_disease_ner_en.md new file mode 100644 index 000000000000..4d0d25b0ae66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ncbi_disease_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ugaray96) +author: John Snow Labs +name: bert_ner_biobert_ncbi_disease_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_ncbi_disease_ner` is an English model originally trained by `ugaray96`. 
+ +## Predicted Entities + +`No Disease`, `Disease Continuation`, `Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ncbi_disease_ner_en_5.2.0_3.0_1699291798160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ncbi_disease_ner_en_5.2.0_3.0_1699291798160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_ncbi_disease_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_ncbi_disease_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.ncbi.disease.by_ugaray96").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_ncbi_disease_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/ugaray96/biobert_ncbi_disease_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_bc2gm_corpus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_bc2gm_corpus_en.md new file mode 100644 index 000000000000..743d73b5f5dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_bc2gm_corpus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biobert_ner_bc2gm_corpus BertForTokenClassification from drAbreu +author: John Snow Labs +name: bert_ner_biobert_ner_bc2gm_corpus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biobert_ner_bc2gm_corpus` is an English model originally trained by drAbreu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ner_bc2gm_corpus_en_5.2.0_3.0_1699288986237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ner_bc2gm_corpus_en_5.2.0_3.0_1699288986237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer provides the `token` column required by the classifier +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_ner_bc2gm_corpus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer provides the `token` column required by the classifier +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biobert_ner_bc2gm_corpus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_ner_bc2gm_corpus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/drAbreu/bioBERT-NER-BC2GM_corpus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_ncbi_disease_en.md new file mode 100644 index 000000000000..8e49147f63fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_ner_ncbi_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biobert_ner_ncbi_disease BertForTokenClassification from drAbreu +author: John Snow Labs +name: bert_ner_biobert_ner_ncbi_disease +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biobert_ner_ncbi_disease` is an English model originally trained by drAbreu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ner_ncbi_disease_en_5.2.0_3.0_1699291144471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_ner_ncbi_disease_en_5.2.0_3.0_1699291144471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# Tokenizer provides the `token` column required by the classifier +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_ner_ncbi_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// Tokenizer provides the `token` column required by the classifier +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biobert_ner_ncbi_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_ner_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/drAbreu/bioBERT-NER-NCBI_disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_en.md new file mode 100644 index 000000000000..afd7350ea5d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fidukm34) +author: John Snow Labs +name: bert_ner_biobert_v1.1_pubmed_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_v1.1_pubmed-finetuned-ner` is an English model originally trained by `fidukm34`. + +## Predicted Entities + +`Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_v1.1_pubmed_finetuned_ner_en_5.2.0_3.0_1699289492921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_v1.1_pubmed_finetuned_ner_en_5.2.0_3.0_1699289492921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_v1.1_pubmed_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_v1.1_pubmed_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.pubmed.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_v1.1_pubmed_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/fidukm34/biobert_v1.1_pubmed-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..b4c95407b43b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fidukm34) +author: John Snow Labs +name: bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_v1.1_pubmed-finetuned-ner-finetuned-ner` is an English model originally trained by `fidukm34`. 
+ +## Predicted Entities + +`Begin`, `Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1699290236029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1699290236029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.pubmed.finetuned.by_fidukm34").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biobert_v1.1_pubmed_finetuned_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/fidukm34/biobert_v1.1_pubmed-finetuned-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_bc2gm_en.md new file mode 100644 index 000000000000..ff3d4e3163e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_bc2gm_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from bioformers) +author: John Snow Labs +name: bert_ner_bioformer_cased_v1.0_bc2gm +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer-cased-v1.0-bc2gm` is an English model originally trained by `bioformers`. 
+ +## Predicted Entities + +`bio` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bioformer_cased_v1.0_bc2gm_en_5.2.0_3.0_1699292053667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bioformer_cased_v1.0_bc2gm_en_5.2.0_3.0_1699292053667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bioformer_cased_v1.0_bc2gm","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bioformer_cased_v1.0_bc2gm","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bioformer.bc2gm.cased").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bioformer_cased_v1.0_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/bioformers/bioformer-cased-v1.0-bc2gm +- https://doi.org/10.1186/gb-2008-9-s2-s2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_ncbi_disease_en.md new file mode 100644 index 000000000000..dc3895ac28b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bioformer_cased_v1.0_ncbi_disease_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from bioformers) +author: John Snow Labs +name: bert_ner_bioformer_cased_v1.0_ncbi_disease +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer-cased-v1.0-ncbi-disease` is an English model originally trained by `bioformers`. 
+ +## Predicted Entities + +`bio` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bioformer_cased_v1.0_ncbi_disease_en_5.2.0_3.0_1699290311864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bioformer_cased_v1.0_ncbi_disease_en_5.2.0_3.0_1699290311864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bioformer_cased_v1.0_ncbi_disease","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bioformer_cased_v1.0_ncbi_disease","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bioformer.ncbi.cased_disease").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bioformer_cased_v1.0_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/bioformers/bioformer-cased-v1.0-ncbi-disease +- https://doi.org/10.1016/j.jbi.2013.12.006 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biomuppet_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biomuppet_en.md new file mode 100644 index 000000000000..e964339ab282 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biomuppet_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from leonweber) +author: John Snow Labs +name: bert_ner_biomuppet +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomuppet` is an English model originally trained by `leonweber`. 
+ +## Predicted Entities + +`medmentions_full_ner:B-T085)`, `bionlp_st_2013_gro_ner:B-Ribosome)`, `chemdner_TEXT:MESH:D013830)`, `anat_em_ner:O)`, `cellfinder_ner:I-GeneProtein)`, `ncbi_disease_ner:B-CompositeMention)`, `bionlp_st_2013_gro_ner:B-Virus)`, `medmentions_full_ner:I-T129)`, `scai_disease_ner:B-DISEASE)`, `biorelex_ner:B-chemical)`, `chemdner_TEXT:MESH:D011166)`, `medmentions_st21pv_ner:I-T204)`, `chemdner_TEXT:MESH:D008345)`, `bionlp_st_2013_gro_NER:B-RegulationOfFunction)`, `mlee_ner:I-Cell)`, `bionlp_st_2013_gro_NER:I-RNABiosynthesis)`, `biorelex_ner:I-RNA-family)`, `bionlp_st_2013_gro_NER:B-ResponseToChemicalStimulus)`, `bionlp_st_2011_epi_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D003035)`, `chemdner_TEXT:MESH:D013440)`, `chemdner_TEXT:MESH:D037341)`, `chemdner_TEXT:MESH:D009532)`, `chemdner_TEXT:MESH:D019216)`, `chemdner_TEXT:MESH:D036701)`, `chemdner_TEXT:MESH:D011107)`, `bionlp_st_2013_cg_NER:B-Translation)`, `genia_term_corpus_ner:B-cell_component)`, `medmentions_full_ner:I-T065)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfDNA)`, `anat_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D000225)`, `genia_term_corpus_ner:I-ORDNA_domain_or_regionDNA_domain_or_region)`, `medmentions_full_ner:I-T015)`, `chemdner_TEXT:MESH:D008239)`, `bionlp_st_2013_cg_NER:I-Binding)`, `bionlp_st_2013_cg_NER:B-Amino_acid_catabolism)`, `cellfinder_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:I-MetabolicPathway)`, `bionlp_st_2013_gro_ner:B-ProteinIdentification)`, `bionlp_st_2011_ge_ner:O)`, `bionlp_st_2011_id_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelixTF)`, `mirna_ner:B-Relation_Trigger)`, `bionlp_st_2011_ge_NER:B-Regulation)`, `bionlp_st_2013_cg_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008055)`, `chemdner_TEXT:MESH:D009944)`, `verspoor_2013_ner:I-gene)`, `bionlp_st_2013_ge_ner:O)`, `chemdner_TEXT:MESH:D003907)`, `mlee_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D010569)`, `mlee_NER:I-Growth)`, 
`chemdner_TEXT:MESH:D036145)`, `medmentions_full_ner:I-T196)`, `ehr_rel_sts:1)`, `bionlp_st_2013_gro_NER:B-CellularComponentOrganizationAndBiogenesis)`, `chemdner_TEXT:MESH:D009285)`, `bionlp_st_2013_gro_NER:B-ProteinMetabolism)`, `chemdner_TEXT:MESH:D016718)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:I-T074)`, `chemdner_TEXT:MESH:D000432)`, `bionlp_st_2013_gro_NER:I-CellFateDetermination)`, `chia_ner:I-Reference_point)`, `bionlp_st_2013_gro_ner:B-Histone)`, `lll_RE:None)`, `scai_disease_ner:B-ADVERSE)`, `medmentions_full_ner:B-T130)`, `bionlp_st_2013_gro_NER:I-CellCyclePhaseTransition)`, `chemdner_TEXT:MESH:D000480)`, `chemdner_TEXT:MESH:D001556)`, `bionlp_st_2013_gro_ner:B-Nucleus)`, `bionlp_st_2013_gro_ner:B-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D007854)`, `chemdner_TEXT:MESH:D009499)`, `genia_term_corpus_ner:B-polynucleotide)`, `bionlp_st_2013_gro_NER:I-Transcription)`, `chemdner_TEXT:MESH:D007213)`, `bionlp_st_2013_ge_NER:B-Regulation)`, `bionlp_st_2011_epi_NER:B-DNA_methylation)`, `medmentions_st21pv_ner:B-T031)`, `bionlp_st_2013_ge_NER:I-Gene_expression)`, `chemdner_TEXT:MESH:D007651)`, `bionlp_st_2013_gro_NER:B-OrganismalProcess)`, `bionlp_st_2011_epi_COREF:None)`, `medmentions_st21pv_ner:I-T062)`, `chemdner_TEXT:MESH:D002047)`, `chemdner_TEXT:MESH:D012822)`, `mantra_gsc_en_patents_ner:B-DEVI)`, `medmentions_full_ner:I-T071)`, `chemdner_TEXT:MESH:D013739)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfGeneExpression)`, `genia_term_corpus_ner:B-other_name)`, `medmentions_full_ner:B-T018)`, `chemdner_TEXT:MESH:D015242)`, `bionlp_st_2013_cg_NER:O)`, `chemdner_TEXT:MESH:D019469)`, `ncbi_disease_ner:B-DiseaseClass)`, `ebm_pico_ner:B-Intervention_Surgical)`, `chemdner_TEXT:MESH:D011422)`, `chemdner_TEXT:MESH:D002112)`, `chemdner_TEXT:MESH:D005682)`, `anat_em_ner:B-Immaterial_anatomical_entity)`, `bionlp_st_2011_epi_ner:B-Entity)`, `medmentions_full_ner:I-T169)`, `mlee_ner:B-Immaterial_anatomical_entity)`, 
`verspoor_2013_ner:B-Physiology)`, `cellfinder_ner:I-CellType)`, `chemdner_TEXT:MESH:D011122)`, `chemdner_TEXT:MESH:D010622)`, `chemdner_TEXT:MESH:D017378)`, `bionlp_st_2011_ge_RE:Theme)`, `chemdner_TEXT:MESH:D000431)`, `medmentions_full_ner:I-T102)`, `medmentions_full_ner:B-T097)`, `chemdner_TEXT:MESH:D007529)`, `chemdner_TEXT:MESH:D045265)`, `chemdner_TEXT:MESH:D005971)`, `an_em_ner:I-Multi-tissue_structure)`, `genia_term_corpus_ner:I-ANDDNA_family_or_groupDNA_family_or_group)`, `medmentions_full_ner:I-T080)`, `chemdner_TEXT:MESH:D002207)`, `chia_ner:I-Qualifier)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionByTranscriptionRepressor)`, `an_em_ner:I-Immaterial_anatomical_entity)`, `biosses_sts:5)`, `chemdner_TEXT:MESH:D000079963)`, `chemdner_TEXT:MESH:D013196)`, `ehr_rel_sts:2)`, `chemdner_TEXT:MESH:D006152)`, `bionlp_st_2013_gro_NER:B-RegulationOfProcess)`, `mlee_NER:I-Development)`, `medmentions_full_ner:B-T197)`, `bionlp_st_2013_gro_ner:B-NucleicAcid)`, `medmentions_st21pv_ner:I-T017)`, `medmentions_full_ner:I-T046)`, `medmentions_full_ner:B-T204)`, `bionlp_st_2013_gro_NER:B-CellularDevelopmentalProcess)`, `bionlp_st_2013_cg_ner:B-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D014212)`, `bionlp_st_2013_cg_NER:B-Protein_processing)`, `chemdner_TEXT:MESH:D008926)`, `chia_ner:B-Visit)`, `bionlp_st_2011_ge_NER:B-Negative_regulation)`, `mantra_gsc_en_medline_ner:I-OBJC)`, `mlee_RE:FromLoc)`, `bionlp_st_2013_gro_ner:I-RNAMolecule)`, `chemdner_TEXT:MESH:D014812)`, `linnaeus_filtered_ner:I-species)`, `chebi_nactem_fullpaper_ner:B-Chemical)`, `bionlp_st_2011_ge_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:B-MutantGene)`, `chemdner_TEXT:MESH:D014859)`, `bionlp_st_2019_bb_ner:B-Phenotype)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfDNA)`, `diann_iber_eval_en_ner:I-Neg)`, `ddi_corpus_ner:B-DRUG_N)`, `bionlp_st_2013_cg_ner:B-Organ)`, `chemdner_TEXT:MESH:D009320)`, `bionlp_st_2013_cg_ner:I-Organism_subdivision)`, 
`bionlp_st_2013_cg_ner:B-Cellular_component)`, `chemdner_TEXT:MESH:D003188)`, `chemdner_TEXT:MESH:D001241)`, `chemdner_TEXT:MESH:D004811)`, `bioinfer_ner:I-GeneproteinRNA)`, `chemdner_TEXT:MESH:D002248)`, `bionlp_shared_task_2009_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D000143)`, `chemdner_TEXT:MESH:D007099)`, `nlm_gene_ner:O)`, `chemdner_TEXT:MESH:D005485)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorBindingSiteOfDNA)`, `bionlp_st_2013_gro_ner:B-PhysicalContact)`, `medmentions_full_ner:B-T167)`, `medmentions_st21pv_ner:B-T091)`, `seth_corpus_ner:I-Gene)`, `bionlp_st_2011_ge_COREF:coref)`, `bionlp_st_2011_ge_NER:B-Gene_expression)`, `medmentions_full_ner:B-T031)`, `genia_relation_corpus_RE:None)`, `genia_term_corpus_ner:I-ANDDNA_domain_or_regionDNA_domain_or_region)`, `chemdner_TEXT:MESH:D014970)`, `bionlp_st_2013_gro_NER:B-Mutation)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivator)`, `chemdner_TEXT:MESH:D002217)`, `chemdner_TEXT:MESH:D003367)`, `medmentions_full_ner:I-UnknownType)`, `chemdner_TEXT:MESH:D002998)`, `bionlp_st_2013_gro_ner:I-Phenotype)`, `genia_term_corpus_ner:B-ANDDNA_family_or_groupDNA_family_or_group)`, `hprd50_RE:PPI)`, `chemdner_TEXT:MESH:D002118)`, `scai_chemical_ner:B-IUPAC)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfProtein)`, `verspoor_2013_ner:B-mutation)`, `chemdner_TEXT:MESH:D011719)`, `chemdner_TEXT:MESH:D013729)`, `bionlp_shared_task_2009_ner:O)`, `chemdner_TEXT:MESH:D005840)`, `chemdner_TEXT:MESH:D009287)`, `medmentions_full_ner:B-T029)`, `chemdner_TEXT:MESH:D037742)`, `medmentions_full_ner:I-T200)`, `chemdner_TEXT:MESH:D012503)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndRNA)`, `mirna_ner:I-Non-Specific_miRNAs)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfProtein)`, `bionlp_st_2013_pc_NER:B-Deacetylation)`, `chemprot_RE:CPR:7)`, `chia_ner:I-Value)`, `medmentions_full_ner:I-T048)`, `chemprot_ner:B-GENE-Y)`, `bionlp_st_2013_cg_NER:B-Reproduction)`, `bionlp_st_2011_id_ner:I-Regulon-operon)`, 
`ebm_pico_ner:I-Outcome_Adverse-effects)`, `bioinfer_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-bZIPTF)`, `mirna_ner:I-GenesProteins)`, `biorelex_ner:I-process)`, `chemdner_TEXT:MESH:D001555)`, `genia_term_corpus_ner:B-DNA_domain_or_region)`, `cellfinder_ner:O)`, `bionlp_st_2013_gro_ner:I-MutatedProtein)`, `bionlp_st_2013_gro_NER:I-CellularComponentOrganizationAndBiogenesis)`, `spl_adr_200db_train_ner:O)`, `medmentions_full_ner:I-T026)`, `chemdner_TEXT:MESH:D013619)`, `bionlp_st_2013_gro_NER:I-BindingToRNA)`, `biorelex_ner:I-drug)`, `bionlp_st_2013_pc_NER:B-Translation)`, `mantra_gsc_en_emea_ner:B-LIVB)`, `mantra_gsc_en_patents_ner:B-PROC)`, `bionlp_st_2013_pc_NER:B-Binding)`, `bionlp_st_2013_gro_NER:B-ModificationOfMolecularEntity)`, `bionlp_st_2013_cg_NER:I-Cell_transformation)`, `scai_chemical_ner:B-TRIVIALVAR)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_NER:I-TranscriptionInitiation)`, `chemdner_TEXT:MESH:D010907)`, `bionlp_st_2013_gro_ner:B-InorganicChemical)`, `bionlp_st_2013_pc_RE:None)`, `chemdner_TEXT:MESH:D002922)`, `chemdner_TEXT:MESH:D010743)`, `bionlp_st_2019_bb_ner:O)`, `medmentions_full_ner:I-T001)`, `chemdner_TEXT:MESH:D001381)`, `bionlp_shared_task_2009_ner:I-Protein)`, `bionlp_st_2013_gro_ner:B-Spliceosome)`, `bionlp_st_2013_gro_ner:I-HMGTF)`, `minimayosrs_sts:3)`, `ddi_corpus_RE:ADVISE)`, `mlee_NER:B-Dissociation)`, `bionlp_st_2013_gro_ner:I-Holoenzyme)`, `chemdner_TEXT:MESH:D001552)`, `bionlp_st_2013_gro_ner:B-bHLH)`, `chemdner_TEXT:MESH:D000109)`, `chemdner_TEXT:MESH:D013449)`, `bionlp_st_2013_gro_ner:I-GeneRegion)`, `medmentions_full_ner:B-T019)`, `scai_chemical_ner:B-TRIVIAL)`, `mlee_ner:B-Gene_or_gene_product)`, `biosses_sts:3)`, `bionlp_st_2013_cg_NER:I-Pathway)`, `bionlp_st_2011_id_ner:I-Organism)`, `bionlp_st_2013_gro_ner:B-tRNA)`, `chemdner_TEXT:MESH:D013109)`, `mlee_ner:I-Immaterial_anatomical_entity)`, `medmentions_full_ner:B-T065)`, `ebm_pico_ner:I-Participant_Sample-size)`, `mlee_RE:AtLoc)`, 
`genia_term_corpus_ner:I-protein_family_or_group)`, `chemdner_TEXT:MESH:D002444)`, `chemdner_TEXT:MESH:D063388)`, `mlee_NER:B-Translation)`, `chemdner_TEXT:MESH:D007052)`, `bionlp_st_2013_gro_ner:B-Gene)`, `chia_ner:B-Scope)`, `bionlp_st_2013_ge_NER:I-Positive_regulation)`, `chemdner_TEXT:MESH:D007785)`, `medmentions_st21pv_ner:I-T097)`, `iepa_RE:None)`, `medmentions_full_ner:B-T001)`, `medmentions_full_ner:I-T194)`, `chemdner_TEXT:MESH:D047309)`, `bionlp_st_2013_gro_ner:B-Substrate)`, `chemdner_TEXT:MESH:D002186)`, `ebm_pico_ner:B-Outcome_Other)`, `bionlp_st_2013_gro_NER:I-OrganismalProcess)`, `bionlp_st_2013_gro_ner:B-Ion)`, `bionlp_st_2013_gro_NER:I-ProteinBiosynthesis)`, `chia_ner:B-Drug)`, `bionlp_st_2013_gro_ner:I-MolecularEntity)`, `anat_em_ner:B-Cellular_component)`, `bionlp_st_2013_cg_ner:B-Multi-tissue_structure)`, `medmentions_full_ner:I-T122)`, `an_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D011564)`, `bionlp_st_2013_gro_NER:B-Splicing)`, `bionlp_st_2013_cg_NER:I-Metabolism)`, `bionlp_st_2013_pc_NER:B-Activation)`, `bionlp_st_2013_gro_ner:I-BindingSiteOfProtein)`, `bionlp_st_2011_id_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:I-Ribosome)`, `nlmchem_ner:I-Chemical)`, `mirna_ner:I-Specific_miRNAs)`, `medmentions_full_ner:I-T012)`, `bionlp_st_2013_gro_NER:B-IntraCellularTransport)`, `mlee_RE:Instrument)`, `bionlp_st_2011_id_NER:I-Transcription)`, `mantra_gsc_en_patents_ner:I-ANAT)`, `an_em_ner:B-Immaterial_anatomical_entity)`, `scai_chemical_ner:I-IUPAC)`, `bionlp_st_2011_epi_NER:B-Deubiquitination)`, `chemdner_TEXT:MESH:D007295)`, `bionlp_st_2011_ge_NER:B-Binding)`, `bionlp_st_2013_pc_NER:B-Localization)`, `chia_ner:B-Procedure)`, `medmentions_full_ner:I-T109)`, `chemdner_TEXT:MESH:D002791)`, `mantra_gsc_en_medline_ner:I-CHEM)`, `chebi_nactem_fullpaper_ner:B-Biological_Activity)`, `ncbi_disease_ner:B-SpecificDisease)`, `medmentions_full_ner:B-T063)`, `chemdner_TEXT:MESH:D016595)`, `bionlp_st_2011_id_NER:B-Transcription)`, `bionlp_st_2013_gro_ner:B-DNAMolecule)`, 
`mlee_NER:B-Protein_processing)`, `biorelex_ner:B-protein-complex)`, `anat_em_ner:I-Cancer)`, `bionlp_st_2013_cg_RE:AtLoc)`, `medmentions_full_ner:I-T072)`, `bio_sim_verb_sts:2)`, `seth_corpus_ner:O)`, `medmentions_full_ner:B-T070)`, `biorelex_ner:I-experiment-tag)`, `chemdner_TEXT:MESH:D020126)`, `biorelex_ner:I-protein-RNA-complex)`, `bionlp_st_2013_pc_NER:I-Phosphorylation)`, `medmentions_st21pv_ner:I-T201)`, `genia_term_corpus_ner:B-protein_complex)`, `medmentions_full_ner:I-T125)`, `bionlp_st_2013_ge_ner:I-Entity)`, `chemdner_TEXT:MESH:D054659)`, `bionlp_st_2013_pc_RE:ToLoc)`, `medmentions_full_ner:B-T099)`, `bionlp_st_2013_gro_NER:B-Binding)`, `medmentions_full_ner:B-T114)`, `spl_adr_200db_train_ner:B-Factor)`, `mlee_RE:CSite)`, `bionlp_st_2013_gro_ner:B-HMG)`, `bionlp_st_2013_gro_ner:B-Operon)`, `bionlp_st_2013_ge_NER:I-Protein_catabolism)`, `ebm_pico_ner:I-Outcome_Pain)`, `bionlp_st_2013_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D000880)`, `ebm_pico_ner:I-Outcome_Physical)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D006160)`, `gnormplus_ner:B-DomainMotif)`, `medmentions_full_ner:I-T016)`, `pdr_ner:I-Disease)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfProtein)`, `chemdner_TEXT:MESH:D002264)`, `genia_term_corpus_ner:I-protein_NA)`, `bionlp_shared_task_2009_NER:I-Negative_regulation)`, `medmentions_full_ner:I-T011)`, `bionlp_st_2013_gro_NER:I-CellularMetabolicProcess)`, `mqp_sts:1)`, `an_em_ner:I-Pathological_formation)`, `bionlp_st_2011_epi_NER:B-Deacetylation)`, `bionlp_st_2013_pc_RE:Theme)`, `medmentions_full_ner:I-T103)`, `bionlp_st_2011_epi_NER:B-Methylation)`, `ebm_pico_ner:B-Intervention_Psychological)`, `bionlp_st_2013_gro_ner:B-Stress)`, `genia_term_corpus_ner:B-multi_cell)`, `bionlp_st_2013_cg_NER:B-Positive_regulation)`, `anat_em_ner:I-Cellular_component)`, `spl_adr_200db_train_ner:I-Negation)`, `chemdner_TEXT:MESH:D000605)`, `mlee_RE:Cause)`, `bionlp_st_2013_gro_ner:B-RegulatoryDNARegion)`, 
`bionlp_st_2013_gro_ner:I-HomeoboxTF)`, `bionlp_st_2013_gro_NER:I-GeneSilencing)`, `ddi_corpus_ner:I-DRUG)`, `bionlp_st_2013_cg_NER:I-Growth)`, `mantra_gsc_en_medline_ner:B-OBJC)`, `mayosrs_sts:3)`, `bionlp_st_2013_gro_NER:B-RNAProcessing)`, `cellfinder_ner:B-CellType)`, `medmentions_full_ner:B-T007)`, `chemprot_ner:B-GENE-N)`, `biorelex_ner:B-brand)`, `ebm_pico_ner:B-Outcome_Mental)`, `bionlp_st_2013_gro_NER:B-RegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-EukaryoticCell)`, `genia_term_corpus_ner:I-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:I-T184)`, `bionlp_st_2013_gro_NER:B-RegulatoryProcess)`, `bionlp_st_2011_id_NER:B-Negative_regulation)`, `bionlp_st_2013_cg_NER:I-Development)`, `cellfinder_ner:I-Anatomy)`, `chia_ner:B-Condition)`, `chemdner_TEXT:MESH:D003065)`, `medmentions_full_ner:B-T012)`, `bionlp_st_2011_id_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorComplex)`, `bionlp_st_2013_cg_NER:I-Carcinogenesis)`, `medmentions_full_ner:B-T064)`, `medmentions_full_ner:B-T026)`, `nlmchem_ner:B-Chemical)`, `genia_term_corpus_ner:I-RNA_domain_or_region)`, `ebm_pico_ner:I-Intervention_Educational)`, `genia_term_corpus_ner:B-ANDcell_linecell_line)`, `genia_term_corpus_ner:B-protein_substructure)`, `bionlp_st_2013_gro_NER:I-ProteinTransport)`, `bionlp_st_2013_cg_NER:B-DNA_demethylation)`, `medmentions_full_ner:I-T058)`, `biorelex_ner:B-parameter)`, `chemdner_TEXT:MESH:D013006)`, `mirna_ner:I-Relation_Trigger)`, `bionlp_st_2013_gro_ner:B-PrimaryStructure)`, `bionlp_st_2013_gro_NER:I-Phosphorylation)`, `chemdner_TEXT:MESH:D003911)`, `pico_extraction_ner:I-participant)`, `chemdner_TEXT:MESH:D010938)`, `chia_ner:B-Person)`, `an_em_ner:B-Tissue)`, `medmentions_st21pv_ner:B-T170)`, `chemdner_TEXT:MESH:D013936)`, `chemdner_TEXT:MESH:D001080)`, `mlee_RE:None)`, `chemdner_TEXT:MESH:D013669)`, `chemdner_TEXT:MESH:D009943)`, `spl_adr_200db_train_ner:I-Factor)`, `chemdner_TEXT:MESH:D044004)`, `ebm_pico_ner:I-Participant_Sex)`, 
`chemdner_TEXT:MESH:D000409)`, `bionlp_st_2013_cg_NER:B-Cell_division)`, `medmentions_st21pv_ner:B-T033)`, `pcr_ner:I-Herb)`, `chemdner_TEXT:MESH:D020112)`, `bionlp_st_2013_pc_NER:B-Gene_expression)`, `bionlp_st_2011_rel_ner:O)`, `chemdner_TEXT:MESH:D008610)`, `bionlp_st_2013_gro_NER:B-BindingOfDNABindingDomainOfProteinToDNA)`, `bionlp_st_2013_gro_ner:I-Cell)`, `medmentions_full_ner:I-T055)`, `bionlp_st_2013_pc_NER:I-Negative_regulation)`, `chia_RE:Has_value)`, `tmvar_v1_ner:I-SNP)`, `biorelex_ner:I-experimental-construct)`, `genia_term_corpus_ner:B-)`, `chemdner_TEXT:MESH:D053978)`, `bionlp_st_2013_gro_ner:I-Stress)`, `mlee_ner:B-Pathological_formation)`, `bionlp_st_2013_cg_ner:O)`, `chemdner_TEXT:MESH:D007631)`, `chemdner_TEXT:MESH:D011084)`, `medmentions_full_ner:B-T080)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-TranscriptionCorepressor)`, `ehr_rel_sts:4)`, `mlee_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D003474)`, `medmentions_full_ner:B-T098)`, `scicite_TEXT:method)`, `medmentions_full_ner:B-T100)`, `chemdner_TEXT:MESH:D011849)`, `medmentions_full_ner:I-T039)`, `anat_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:I-Nucleus)`, `mlee_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:I-NuclearReceptor)`, `bionlp_st_2013_ge_RE:None)`, `chemdner_TEXT:MESH:D019483)`, `bionlp_st_2013_cg_ner:B-Cell)`, `bionlp_st_2013_gro_ner:B-Holoenzyme)`, `bionlp_st_2011_epi_NER:I-Methylation)`, `bionlp_shared_task_2009_ner:B-Protein)`, `medmentions_st21pv_ner:I-T038)`, `bionlp_st_2013_gro_ner:I-DNARegion)`, `bionlp_st_2013_gro_NER:I-CellCyclePhase)`, `bionlp_st_2013_gro_ner:I-tRNA)`, `mlee_ner:I-Multi-tissue_structure)`, `chemprot_ner:O)`, `medmentions_full_ner:B-T094)`, `bionlp_st_2013_gro_RE:fromSpecies)`, `bionlp_st_2013_gro_NER:O)`, `bionlp_st_2013_gro_NER:B-Acetylation)`, `bioinfer_ner:I-Protein_family_or_group)`, `medmentions_st21pv_ner:I-T098)`, `pdr_ner:B-Disease)`, `chemdner_ner:I-Chemical)`, 
`bionlp_st_2013_cg_NER:B-Negative_regulation)`, `chebi_nactem_fullpaper_ner:B-Chemical_Structure)`, `bionlp_st_2011_ge_NER:I-Negative_regulation)`, `diann_iber_eval_en_ner:O)`, `bionlp_shared_task_2009_NER:I-Binding)`, `mlee_NER:I-Cell_proliferation)`, `chebi_nactem_fullpaper_ner:B-Protein)`, `bionlp_st_2013_gro_NER:B-Phosphorylation)`, `bionlp_st_2011_epi_COREF:coref)`, `medmentions_full_ner:B-T200)`, `bionlp_st_2013_cg_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000082)`, `chemdner_TEXT:MESH:D037201)`, `bionlp_st_2013_gro_ner:B-ComplexMolecularEntity)`, `bionlp_st_2011_ge_RE:ToLoc)`, `diann_iber_eval_en_ner:B-Neg)`, `bionlp_st_2013_gro_ner:B-RibosomalRNA)`, `bionlp_shared_task_2009_NER:I-Protein_catabolism)`, `chemdner_TEXT:MESH:D016912)`, `medmentions_full_ner:B-T017)`, `bionlp_st_2013_gro_ner:B-CpGIsland)`, `mlee_ner:I-Organism_substance)`, `medmentions_full_ner:I-T075)`, `bionlp_st_2013_gro_ner:I-SecondMessenger)`, `bioinfer_ner:B-Protein_family_or_group)`, `bionlp_st_2013_cg_NER:I-Negative_regulation)`, `mantra_gsc_en_emea_ner:B-CHEM)`, `genia_term_corpus_ner:B-DNA_NA)`, `chemdner_TEXT:MESH:D057888)`, `chemdner_TEXT:MESH:D006495)`, `chemdner_TEXT:MESH:D006575)`, `geokhoj_v1_TEXT:0)`, `bionlp_st_2013_gro_RE:locatedIn)`, `genia_term_corpus_ner:B-virus)`, `bionlp_st_2013_gro_ner:B-RuntLikeDomain)`, `medmentions_full_ner:B-T131)`, `bionlp_st_2013_gro_ner:I-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D015525)`, `genia_term_corpus_ner:I-mono_cell)`, `chemdner_TEXT:MESH:D007840)`, `medmentions_full_ner:I-T098)`, `chemdner_TEXT:MESH:D009930)`, `genia_term_corpus_ner:I-polynucleotide)`, `biorelex_ner:I-protein-region)`, `bionlp_st_2011_id_NER:I-Process)`, `bionlp_st_2013_gro_NER:I-CellularProcess)`, `medmentions_full_ner:B-T023)`, `chemdner_TEXT:MESH:D008942)`, `medmentions_full_ner:I-T070)`, `biorelex_ner:B-organelle)`, `bionlp_st_2013_gro_NER:I-Decrease)`, `verspoor_2013_ner:I-size)`, `chemdner_TEXT:MESH:D002945)`, `ebm_pico_ner:B-Intervention_Other)`, 
`bionlp_st_2013_cg_ner:I-Simple_chemical)`, `chemdner_TEXT:MESH:D008751)`, `chia_RE:AND)`, `medmentions_full_ner:I-T028)`, `ebm_pico_ner:I-Intervention_Other)`, `chemdner_TEXT:MESH:D005472)`, `chemdner_TEXT:MESH:D005070)`, `gnormplus_ner:B-Gene)`, `medmentions_full_ner:I-T190)`, `mlee_NER:B-Breakdown)`, `bioinfer_ner:B-GeneproteinRNA)`, `bioinfer_ner:B-Gene)`, `chemdner_TEXT:MESH:D006835)`, `chemdner_TEXT:MESH:D004298)`, `chemdner_TEXT:MESH:D002951)`, `chia_ner:I-Device)`, `bionlp_st_2013_pc_NER:B-Conversion)`, `bionlp_shared_task_2009_NER:I-Transcription)`, `mlee_NER:B-DNA_methylation)`, `pubmed_qa_labeled_fold0_CLF:no)`, `minimayosrs_sts:1)`, `chemdner_TEXT:MESH:D002166)`, `chemdner_TEXT:MESH:D005934)`, `bionlp_st_2013_gro_NER:B-CatabolicPathway)`, `tmvar_v1_ner:I-ProteinMutation)`, `verspoor_2013_ner:I-Phenomena)`, `medmentions_full_ner:B-T011)`, `chemdner_TEXT:MESH:D001218)`, `medmentions_full_ner:B-T185)`, `mantra_gsc_en_patents_ner:I-PROC)`, `medmentions_full_ner:I-T120)`, `chia_ner:I-Procedure)`, `genia_term_corpus_ner:I-ANDcell_typecell_type)`, `bionlp_st_2011_id_ner:I-Entity)`, `pcr_ner:B-Chemical)`, `bionlp_st_2013_gro_NER:B-PositiveRegulation)`, `mlee_RE:Theme)`, `bionlp_st_2011_epi_ner:B-Protein)`, `medmentions_full_ner:B-T055)`, `spl_adr_200db_train_ner:I-Severity)`, `bionlp_st_2013_gro_ner:I-Ion)`, `bionlp_st_2011_id_RE:Cause)`, `bc5cdr_ner:I-Disease)`, `bionlp_st_2013_gro_ner:I-bHLH)`, `chemdner_TEXT:MESH:D001058)`, `bionlp_st_2013_gro_ner:I-AminoAcid)`, `bionlp_st_2011_epi_NER:B-Phosphorylation)`, `medmentions_full_ner:B-T086)`, `chemdner_TEXT:MESH:D004441)`, `medmentions_st21pv_ner:I-T007)`, `biorelex_ner:B-drug)`, `mantra_gsc_en_patents_ner:I-DISO)`, `medmentions_full_ner:I-T197)`, `bionlp_st_2011_ge_RE:AtLoc)`, `bionlp_st_2013_gro_NER:B-MolecularProcess)`, `bionlp_st_2011_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionInitiationComplex)`, `bionlp_st_2011_ge_NER:I-Binding)`, `mirna_ner:B-GenesProteins)`, 
`mirna_ner:B-Diseases)`, `mantra_gsc_en_emea_ner:I-DISO)`, `anat_em_ner:I-Multi-tissue_structure)`, `bioinfer_ner:O)`, `chemdner_TEXT:MESH:D017673)`, `bionlp_st_2013_gro_NER:B-Methylation)`, `genia_term_corpus_ner:I-AND_NOTcell_typecell_type)`, `bionlp_st_2013_cg_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:B-Carcinogenesis)`, `chemdner_TEXT:MESH:D009543)`, `gnormplus_ner:I-Gene)`, `bionlp_st_2013_cg_RE:Participant)`, `chemdner_TEXT:MESH:D019804)`, `seth_corpus_RE:Equals)`, `medmentions_full_ner:I-T082)`, `hprd50_ner:O)`, `bionlp_st_2013_gro_ner:B-OxidativeStress)`, `chemdner_TEXT:MESH:D014227)`, `bio_sim_verb_sts:7)`, `bionlp_st_2011_ge_NER:I-Protein_catabolism)`, `bionlp_st_2011_ge_NER:B-Localization)`, `chemdner_TEXT:MESH:D001224)`, `chemdner_TEXT:MESH:D009842)`, `bionlp_st_2013_cg_ner:B-Amino_acid)`, `bionlp_st_2013_gro_NER:B-CellCyclePhase)`, `chemdner_TEXT:MESH:D002245)`, `bionlp_st_2013_ge_NER:I-Ubiquitination)`, `bionlp_st_2013_cg_NER:I-Cell_death)`, `pico_extraction_ner:O)`, `chemdner_TEXT:MESH:D000596)`, `chemdner_TEXT:MESH:D000638)`, `an_em_ner:B-Developing_anatomical_structure)`, `bionlp_st_2019_bb_ner:I-Phenotype)`, `bionlp_st_2013_gro_NER:I-CellDeath)`, `mantra_gsc_en_patents_ner:B-PHYS)`, `chemdner_TEXT:MESH:D009705)`, `genia_term_corpus_ner:B-protein_molecule)`, `mantra_gsc_en_medline_ner:B-PHEN)`, `bionlp_st_2013_gro_NER:I-PosttranslationalModification)`, `ddi_corpus_ner:B-BRAND)`, `mantra_gsc_en_medline_ner:B-DEVI)`, `mlee_NER:I-Planned_process)`, `tmvar_v1_ner:O)`, `bionlp_st_2011_ge_NER:I-Phosphorylation)`, `genia_term_corpus_ner:I-ANDprotein_substructureprotein_substructure)`, `medmentions_st21pv_ner:B-T007)`, `bionlp_st_2013_cg_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-NucleicAcid)`, `medmentions_full_ner:I-T044)`, `chia_ner:I-Person)`, `chemdner_TEXT:MESH:D016572)`, `scai_disease_ner:O)`, `bionlp_st_2013_gro_ner:B-TranscriptionCofactor)`, `chemdner_TEXT:MESH:D002762)`, 
`chemdner_TEXT:MESH:D011685)`, `chemdner_TEXT:MESH:D005031)`, `scai_disease_ner:I-ADVERSE)`, `biorelex_ner:I-protein-isoform)`, `bionlp_shared_task_2009_COREF:None)`, `genia_term_corpus_ner:I-lipid)`, `biorelex_ner:B-RNA)`, `chemdner_TEXT:MESH:D018020)`, `scai_chemical_ner:B-FAMILY)`, `chemdner_TEXT:MESH:D017382)`, `chemdner_TEXT:MESH:D006027)`, `chemdner_TEXT:MESH:D018942)`, `medmentions_full_ner:I-T024)`, `chemdner_TEXT:MESH:D008050)`, `bionlp_st_2013_cg_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D019342)`, `chemdner_TEXT:MESH:D008774)`, `bionlp_st_2011_ge_RE:CSite)`, `bionlp_st_2013_gro_ner:B-HMGTF)`, `chemdner_ner:B-Chemical)`, `bioscope_papers_ner:B-negation)`, `biorelex_RE:bind)`, `bioinfer_ner:B-Protein_complex)`, `bionlp_st_2011_epi_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_NER:I-RegulationOfTranscription)`, `chemdner_TEXT:MESH:D011134)`, `bionlp_st_2011_rel_ner:I-Entity)`, `mantra_gsc_en_medline_ner:I-PROC)`, `ncbi_disease_ner:I-DiseaseClass)`, `chemdner_TEXT:MESH:D014315)`, `bionlp_st_2013_gro_ner:I-Chromosome)`, `chemdner_TEXT:MESH:D000639)`, `chemdner_TEXT:MESH:D005740)`, `bionlp_st_2013_gro_ner:I-MolecularFunction)`, `verspoor_2013_ner:B-gene)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomainTF)`, `bionlp_st_2013_gro_ner:B-DNARegion)`, `ebm_pico_ner:B-Intervention_Educational)`, `medmentions_st21pv_ner:B-T005)`, `medmentions_full_ner:I-T022)`, `gnormplus_ner:B-FamilyName)`, `bionlp_st_2011_epi_RE:Contextgene)`, `bionlp_st_2013_pc_NER:B-Demethylation)`, `chia_ner:I-Observation)`, `medmentions_full_ner:I-T089)`, `bionlp_st_2013_gro_ner:I-ComplexMolecularEntity)`, `bionlp_st_2013_gro_ner:B-Lipid)`, `biorelex_ner:I-gene)`, `chemdner_TEXT:MESH:D003300)`, `chemdner_TEXT:MESH:D008903)`, `verspoor_2013_RE:relatedTo)`, `bionlp_st_2011_epi_NER:I-DNA_methylation)`, `genia_term_corpus_ner:I-cell_component)`, `bionlp_st_2011_ge_COREF:None)`, `ebm_pico_ner:B-Participant_Sample-size)`, `chemdner_TEXT:MESH:D043823)`, `chemdner_TEXT:MESH:D004958)`, 
`bionlp_st_2013_gro_ner:I-RNA)`, `chemdner_TEXT:MESH:D006150)`, `bionlp_st_2013_gro_ner:B-MolecularStructure)`, `chemdner_TEXT:MESH:D007457)`, `bionlp_st_2013_gro_ner:I-OxidativeStress)`, `scai_chemical_ner:B-PARTIUPAC)`, `mlee_NER:I-Blood_vessel_development)`, `bionlp_shared_task_2009_ner:B-Entity)`, `bionlp_st_2013_ge_RE:CSite)`, `medmentions_full_ner:B-T058)`, `chemdner_TEXT:MESH:D000628)`, `ebm_pico_ner:I-Intervention_Surgical)`, `an_em_ner:I-Organ)`, `bionlp_st_2013_gro_NER:B-Increase)`, `iepa_RE:PPI)`, `mlee_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D014284)`, `chemdner_TEXT:MESH:D014260)`, `bionlp_st_2011_epi_NER:I-Glycosylation)`, `bionlp_st_2013_gro_NER:B-BindingToProtein)`, `bionlp_st_2013_gro_NER:B-BindingToRNA)`, `medmentions_full_ner:I-T047)`, `bionlp_st_2013_gro_NER:B-Localization)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfGeneExpression)`, `medmentions_full_ner:I-T051)`, `bionlp_st_2011_id_COREF:None)`, `chemdner_TEXT:MESH:D011744)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToDNA)`, `bionlp_st_2013_gro_ner:B-CatalyticActivity)`, `chebi_nactem_abstr_ann1_ner:I-Biological_Activity)`, `bio_sim_verb_sts:1)`, `chemdner_TEXT:MESH:D012402)`, `bionlp_st_2013_gro_ner:B-bZIPTF)`, `chemdner_TEXT:MESH:D003913)`, `bionlp_shared_task_2009_RE:Site)`, `bionlp_st_2013_gro_ner:I-AntisenseRNA)`, `bionlp_st_2013_gro_NER:B-ProteinTargeting)`, `bionlp_st_2013_gro_NER:B-GeneExpression)`, `bionlp_st_2013_cg_NER:I-Blood_vessel_development)`, `mantra_gsc_en_patents_ner:I-CHEM)`, `mayosrs_sts:2)`, `chemdner_TEXT:MESH:D001645)`, `bionlp_st_2011_ge_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Acetylation)`, `medmentions_full_ner:B-T002)`, `verspoor_2013_ner:I-Concepts_Ideas)`, `hprd50_RE:None)`, `ddi_corpus_ner:O)`, `chemdner_TEXT:MESH:D014131)`, `ebm_pico_ner:B-Outcome_Physical)`, `medmentions_st21pv_ner:B-T103)`, `chemdner_TEXT:MESH:D016650)`, `mlee_NER:B-Cell_proliferation)`, `bionlp_st_2013_gro_ner:I-TranscriptionCoactivator)`, 
`chebi_nactem_fullpaper_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013256)`, `biorelex_ner:I-protein-DNA-complex)`, `chemdner_TEXT:MESH:D008767)`, `bioinfer_RE:None)`, `nlm_gene_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-ReporterGene)`, `biosses_sts:1)`, `chemdner_TEXT:MESH:D000493)`, `chemdner_TEXT:MESH:D011374)`, `ebm_pico_ner:B-Intervention_Control)`, `bionlp_st_2013_pc_NER:I-Pathway)`, `chemprot_RE:CPR:3)`, `bionlp_st_2013_cg_ner:I-Amino_acid)`, `chemdner_TEXT:MESH:D005557)`, `bionlp_st_2011_ge_RE:Site)`, `bionlp_st_2013_pc_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-Elongation)`, `bionlp_st_2011_ge_NER:I-Localization)`, `spl_adr_200db_train_ner:B-Negation)`, `chemdner_TEXT:MESH:D010455)`, `nlm_gene_ner:B-GENERIF)`, `mlee_RE:Site)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D017953)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscription)`, `osiris_ner:B-gene)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressor)`, `medmentions_full_ner:I-T131)`, `genia_term_corpus_ner:B-protein_family_or_group)`, `genia_term_corpus_ner:B-cell_type)`, `chemdner_TEXT:MESH:D013759)`, `chemdner_TEXT:MESH:D002247)`, `scai_chemical_ner:I-FAMILY)`, `chemdner_TEXT:MESH:D006020)`, `biorelex_ner:B-DNA)`, `chebi_nactem_abstr_ann1_ner:I-Spectral_Data)`, `mantra_gsc_en_medline_ner:B-DISO)`, `chemdner_TEXT:MESH:D019829)`, `ncbi_disease_ner:I-CompositeMention)`, `chemdner_TEXT:MESH:D013876)`, `chebi_nactem_fullpaper_ner:I-Spectral_Data)`, `biorelex_ner:I-DNA)`, `chemdner_TEXT:MESH:D005492)`, `chemdner_TEXT:MESH:D011810)`, `chemdner_TEXT:MESH:D008563)`, `chemdner_TEXT:MESH:D015735)`, `bionlp_st_2019_bb_ner:B-Microorganism)`, `ddi_corpus_RE:INT)`, `medmentions_st21pv_ner:B-T038)`, `bionlp_st_2013_gro_NER:B-CellCyclePhaseTransition)`, `cellfinder_ner:B-CellLine)`, `pdr_RE:Cause)`, `chemdner_TEXT:MESH:D011433)`, `chemdner_TEXT:MESH:D011720)`, `chemdner_TEXT:MESH:D020156)`, `ebm_pico_ner:O)`, `mlee_ner:B-Organ)`, `chemdner_TEXT:MESH:D012721)`, 
`chebi_nactem_fullpaper_ner:I-Biological_Activity)`, `bionlp_st_2013_cg_COREF:coref)`, `chemdner_TEXT:MESH:D006918)`, `medmentions_full_ner:B-T092)`, `genia_term_corpus_ner:B-protein_NA)`, `bionlp_st_2013_ge_ner:B-Entity)`, `an_em_ner:B-Multi-tissue_structure)`, `chia_ner:I-Measurement)`, `chia_RE:Has_temporal)`, `bionlp_st_2011_id_NER:B-Protein_catabolism)`, `bionlp_st_2013_gro_NER:B-CellAdhesion)`, `bionlp_st_2013_gro_ner:B-DNABindingSite)`, `biorelex_ner:B-organism)`, `scai_disease_ner:I-DISEASE)`, `bionlp_st_2013_gro_ner:I-DNABindingSite)`, `chemdner_TEXT:MESH:D016607)`, `chemdner_TEXT:MESH:D030421)`, `bionlp_st_2013_pc_NER:I-Binding)`, `medmentions_full_ner:I-T029)`, `chemdner_TEXT:MESH:D001569)`, `genia_term_corpus_ner:B-ANDcell_typecell_type)`, `scai_chemical_ner:B-SUM)`, `chemdner_TEXT:MESH:D007656)`, `medmentions_full_ner:B-T082)`, `chemdner_TEXT:MESH:D009525)`, `medmentions_full_ner:B-T079)`, `bionlp_st_2013_cg_NER:B-Synthesis)`, `biorelex_ner:B-process)`, `bionlp_st_2013_ge_RE:Theme)`, `chemdner_TEXT:MESH:D012825)`, `chemdner_TEXT:MESH:D005462)`, `bionlp_st_2013_cg_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-CellCycle)`, `cellfinder_ner:I-CellLine)`, `bionlp_st_2013_gro_ner:I-DNABindingDomainOfProtein)`, `medmentions_st21pv_ner:B-T168)`, `genia_term_corpus_ner:B-body_part)`, `genia_term_corpus_ner:B-ANDprotein_family_or_groupprotein_family_or_group)`, `mlee_ner:B-Tissue)`, `mlee_NER:I-Localization)`, `medmentions_full_ner:B-T125)`, `bionlp_st_2013_cg_NER:B-Infection)`, `chebi_nactem_abstr_ann1_ner:I-Protein)`, `chemdner_TEXT:MESH:D009570)`, `medmentions_full_ner:I-T045)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivator)`, `verspoor_2013_ner:B-disease)`, `medmentions_full_ner:I-T056)`, `medmentions_full_ner:B-T050)`, `bionlp_st_2013_gro_ner:B-MolecularFunction)`, `medmentions_full_ner:B-T060)`, `bionlp_st_2013_gro_ner:B-Cell)`, `medmentions_full_ner:I-T060)`, `bionlp_st_2013_pc_NER:I-Gene_expression)`, `genia_term_corpus_ner:B-RNA_NA)`, 
`bionlp_st_2013_gro_ner:I-MessengerRNA)`, `medmentions_full_ner:I-T086)`, `an_em_RE:Part-of)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_gro_NER:I-Splicing)`, `bioinfer_RE:PPI)`, `bioscope_papers_ner:I-speculation)`, `bionlp_st_2013_gro_ner:B-HomeoBox)`, `medmentions_full_ner:B-T004)`, `chia_ner:I-Drug)`, `bionlp_st_2013_gro_ner:B-FusionOfGeneWithReporterGene)`, `genia_term_corpus_ner:I-cell_line)`, `chebi_nactem_abstr_ann1_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-ExpressionProfiling)`, `chemdner_TEXT:MESH:D004390)`, `medmentions_full_ner:B-T016)`, `bionlp_st_2013_cg_NER:B-Growth)`, `medmentions_full_ner:I-T170)`, `medmentions_full_ner:B-T093)`, `genia_term_corpus_ner:I-inorganic)`, `mlee_NER:B-Planned_process)`, `bionlp_st_2013_gro_RE:hasPart)`, `bionlp_st_2013_gro_ner:B-BasicDomain)`, `chemdner_TEXT:MESH:D050091)`, `medmentions_st21pv_ner:B-T037)`, `chemdner_TEXT:MESH:D011522)`, `bionlp_st_2013_ge_NER:B-Deacetylation)`, `chemdner_TEXT:MESH:D004008)`, `chemdner_TEXT:MESH:D013972)`, `bionlp_st_2013_gro_NER:B-SignalingPathway)`, `bionlp_st_2013_gro_ner:B-Promoter)`, `chemdner_TEXT:MESH:D012701)`, `an_em_COREF:None)`, `bionlp_st_2019_bb_RE:None)`, `mlee_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-Translation)`, `chemdner_TEXT:MESH:D013453)`, `genia_term_corpus_ner:I-ANDprotein_moleculeprotein_molecule)`, `chemdner_TEXT:MESH:D002746)`, `chebi_nactem_abstr_ann1_ner:O)`, `bionlp_st_2013_pc_ner:O)`, `mayosrs_sts:7)`, `bionlp_st_2013_cg_NER:B-Pathway)`, `verspoor_2013_ner:I-age)`, `biorelex_ner:I-peptide)`, `medmentions_full_ner:I-T096)`, `chebi_nactem_fullpaper_ner:I-Chemical_Structure)`, `chemdner_TEXT:MESH:D007211)`, `medmentions_full_ner:I-T018)`, `medmentions_full_ner:B-T201)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:B-T054)`, `ebm_pico_ner:I-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D010672)`, `chemdner_TEXT:MESH:D004492)`, 
`chemdner_TEXT:MESH:D008094)`, `chemdner_TEXT:MESH:D002227)`, `chemdner_TEXT:MESH:D009553)`, `bionlp_st_2013_gro_NER:I-ResponseProcess)`, `chemdner_TEXT:MESH:D006046)`, `ebm_pico_ner:B-Participant_Condition)`, `nlm_gene_ner:I-Gene)`, `bionlp_st_2019_bb_ner:I-Habitat)`, `bionlp_shared_task_2009_COREF:coref)`, `chemdner_TEXT:MESH:D005640)`, `mantra_gsc_en_emea_ner:B-PHYS)`, `mantra_gsc_en_patents_ner:B-DISO)`, `bionlp_st_2013_gro_ner:B-Heterochromatin)`, `bionlp_st_2013_gro_NER:I-CellCycle)`, `bionlp_st_2013_cg_NER:I-Cell_proliferation)`, `bionlp_st_2013_cg_ner:B-Simple_chemical)`, `genia_term_corpus_ner:I-cell_type)`, `chemdner_TEXT:MESH:D003553)`, `bionlp_st_2013_ge_RE:Theme2)`, `tmvar_v1_ner:B-ProteinMutation)`, `chemdner_TEXT:MESH:D012717)`, `chemdner_TEXT:MESH:D026121)`, `chemdner_TEXT:MESH:D008687)`, `bionlp_st_2013_gro_NER:I-TranscriptionTermination)`, `medmentions_full_ner:B-T028)`, `biorelex_ner:B-assay)`, `genia_term_corpus_ner:B-tissue)`, `chemdner_TEXT:MESH:D009173)`, `bionlp_st_2013_gro_ner:B-TranscriptionCoactivator)`, `genia_term_corpus_ner:B-amino_acid_monomer)`, `mantra_gsc_en_emea_ner:B-DEVI)`, `bionlp_st_2013_gro_NER:B-Growth)`, `chemdner_TEXT:MESH:D017374)`, `genia_term_corpus_ner:B-other_artificial_source)`, `medmentions_full_ner:B-T072)`, `bionlp_st_2013_gro_NER:B-CellGrowth)`, `bionlp_st_2013_gro_ner:I-DoubleStrandDNA)`, `chemdner_ner:O)`, `bionlp_shared_task_2009_NER:I-Localization)`, `bionlp_st_2013_gro_NER:B-RegulationOfPathway)`, `genia_term_corpus_ner:I-amino_acid_monomer)`, `bionlp_st_2013_gro_NER:I-SPhase)`, `an_em_ner:B-Organism_substance)`, `medmentions_full_ner:B-T052)`, `genia_term_corpus_ner:B-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:B-T096)`, `chemdner_TEXT:MESH:D056831)`, `chemdner_TEXT:MESH:D010755)`, `pdr_NER:I-Cause_of_disease)`, `mlee_NER:B-Phosphorylation)`, `medmentions_full_ner:I-T064)`, `chemdner_TEXT:MESH:D005978)`, `mantra_gsc_en_medline_ner:I-PHEN)`, `bionlp_st_2013_cg_ner:B-Pathological_formation)`, 
`bionlp_st_2013_gro_NER:B-Modification)`, `bionlp_st_2013_gro_ner:B-ProteinComplex)`, `bionlp_st_2013_gro_ner:B-DoubleStrandDNA)`, `medmentions_full_ner:B-T068)`, `medmentions_full_ner:I-T034)`, `bionlp_st_2011_epi_NER:B-Catalysis)`, `biosses_sts:0)`, `bionlp_st_2013_cg_ner:B-Organism_substance)`, `chemdner_TEXT:MESH:D055549)`, `bionlp_st_2013_cg_NER:B-Glycolysis)`, `chemdner_TEXT:MESH:D001761)`, `chemdner_TEXT:MESH:D011728)`, `bionlp_st_2013_gro_ner:B-Function)`, `medmentions_full_ner:I-T033)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T053)`, `bionlp_st_2013_gro_ner:B-Protein)`, `genia_term_corpus_ner:I-ANDprotein_family_or_groupprotein_family_or_group)`, `bionlp_st_2013_gro_NER:I-CatabolicPathway)`, `biorelex_ner:I-chemical)`, `chemdner_TEXT:MESH:D013185)`, `biorelex_ner:I-RNA)`, `chemdner_TEXT:MESH:D009838)`, `medmentions_full_ner:I-T008)`, `chemdner_TEXT:MESH:D002104)`, `bionlp_st_2013_gro_NER:B-RNABiosynthesis)`, `verspoor_2013_ner:I-ethnicity)`, `bionlp_st_2013_gro_ner:I-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D026023)`, `mlee_ner:O)`, `bionlp_st_2013_gro_NER:I-CellHomeostasis)`, `bionlp_st_2013_pc_NER:B-Pathway)`, `gnormplus_ner:I-DomainMotif)`, `bionlp_st_2013_gro_ner:I-OpenReadingFrame)`, `bionlp_st_2013_gro_NER:I-RegulationOfGeneExpression)`, `muchmore_en_ner:O)`, `chemdner_TEXT:MESH:D000911)`, `bionlp_st_2011_epi_NER:B-DNA_demethylation)`, `bionlp_st_2013_gro_ner:I-RuntLikeDomain)`, `chemdner_TEXT:MESH:D010748)`, `medmentions_full_ner:B-T008)`, `biorelex_ner:B-protein-RNA-complex)`, `bionlp_st_2013_cg_NER:I-Planned_process)`, `chemdner_TEXT:MESH:D014867)`, `mantra_gsc_en_patents_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Silencing)`, `chemdner_TEXT:MESH:D015306)`, `chemdner_TEXT:MESH:D001679)`, `bionlp_shared_task_2009_NER:I-Positive_regulation)`, `linnaeus_filtered_ner:O)`, `chia_RE:Has_multiplier)`, `medmentions_full_ner:B-T116)`, `bionlp_shared_task_2009_NER:B-Positive_regulation)`, 
`anat_em_ner:B-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D011137)`, `chemdner_TEXT:MESH:D048271)`, `chemdner_TEXT:MESH:D003975)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressorActivity)`, `bionlp_st_2011_id_ner:B-Protein)`, `bionlp_st_2013_gro_NER:I-Mutation)`, `chemdner_TEXT:MESH:D001572)`, `mantra_gsc_en_patents_ner:B-CHEM)`, `mantra_gsc_en_medline_ner:I-DEVI)`, `bionlp_st_2013_gro_ner:B-Enzyme)`, `medmentions_full_ner:B-T056)`, `mantra_gsc_en_patents_ner:B-OBJC)`, `medmentions_full_ner:B-T073)`, `anat_em_ner:I-Tissue)`, `chemdner_TEXT:MESH:D047310)`, `chia_ner:I-Scope)`, `ncbi_disease_ner:B-Modifier)`, `medmentions_st21pv_ner:B-T082)`, `medmentions_full_ner:I-T054)`, `genia_term_corpus_ner:I-carbohydrate)`, `bionlp_st_2013_cg_RE:Theme)`, `chemdner_TEXT:MESH:D009538)`, `chemdner_TEXT:MESH:D008691)`, `genia_term_corpus_ner:B-ANDprotein_substructureprotein_substructure)`, `bionlp_st_2013_cg_ner:I-Tissue)`, `chia_ner:B-Device)`, `chemdner_TEXT:MESH:D002784)`, `medmentions_full_ner:I-T007)`, `bionlp_st_2013_gro_ner:I-DNAFragment)`, `mlee_RE:ToLoc)`, `spl_adr_200db_train_ner:I-AdverseReaction)`, `bionlp_st_2013_cg_NER:B-Catabolism)`, `chemdner_TEXT:MESH:D013779)`, `bionlp_st_2013_pc_NER:B-Regulation)`, `bionlp_st_2013_gro_NER:I-Disease)`, `chia_ner:I-Condition)`, `chemdner_TEXT:MESH:D012370)`, `bionlp_st_2013_ge_NER:O)`, `bionlp_st_2013_pc_NER:B-Deubiquitination)`, `bionlp_st_2013_pc_NER:I-Translation)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_cg_NER:B-DNA_methylation)`, `bioscope_papers_ner:B-speculation)`, `chemdner_TEXT:MESH:D018130)`, `bionlp_st_2013_gro_ner:B-RNAPolymeraseII)`, `medmentions_st21pv_ner:B-T098)`, `bionlp_st_2013_gro_NER:B-Elongation)`, `bionlp_st_2013_pc_RE:Cause)`, `seth_corpus_ner:B-RS)`, `bionlp_st_2013_ge_RE:ToLoc)`, `chemdner_TEXT:MESH:D000538)`, `medmentions_full_ner:B-T192)`, `medmentions_full_ner:B-T061)`, `medmentions_full_ner:B-T032)`, `bionlp_st_2013_gro_NER:B-Transport)`, 
`medmentions_full_ner:I-T014)`, `chemdner_TEXT:MESH:D004137)`, `medmentions_full_ner:B-T101)`, `bionlp_st_2013_gro_NER:B-Transcription)`, `bionlp_st_2013_pc_NER:B-Transport)`, `medmentions_full_ner:I-T203)`, `ebm_pico_ner:I-Intervention_Control)`, `genia_term_corpus_ner:I-atom)`, `chemdner_TEXT:MESH:D014230)`, `osiris_ner:I-gene)`, `mantra_gsc_en_patents_ner:B-ANAT)`, `ncbi_disease_ner:I-SpecificDisease)`, `bionlp_st_2013_gro_NER:I-CellGrowth)`, `chemdner_TEXT:MESH:D001205)`, `chemdner_TEXT:MESH:D016627)`, `genia_term_corpus_ner:B-protein_subunit)`, `bionlp_st_2013_gro_ner:I-CellComponent)`, `medmentions_full_ner:B-T049)`, `scai_chemical_ner:O)`, `chemdner_TEXT:MESH:D010840)`, `chemdner_TEXT:MESH:D008694)`, `mantra_gsc_en_patents_ner:B-PHEN)`, `bionlp_st_2013_cg_RE:Cause)`, `chemdner_TEXT:MESH:D012293)`, `bionlp_st_2013_gro_NER:B-Homodimerization)`, `chemdner_TEXT:MESH:D008070)`, `chia_RE:OR)`, `bionlp_st_2013_cg_ner:I-Gene_or_gene_product)`, `verspoor_2013_ner:I-disease)`, `muchmore_en_ner:B-umlsterm)`, `chemdner_TEXT:MESH:D011794)`, `medmentions_full_ner:I-T002)`, `chemdner_TEXT:MESH:D007649)`, `genia_term_corpus_ner:B-AND_NOTcell_typecell_type)`, `medmentions_full_ner:I-T023)`, `chemprot_RE:CPR:1)`, `chemdner_TEXT:MESH:D001786)`, `bionlp_st_2013_gro_ner:B-HomeoboxTF)`, `bionlp_st_2013_cg_ner:I-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-Attenuator)`, `bionlp_st_2019_bb_ner:B-Habitat)`, `chemdner_TEXT:MESH:D017931)`, `medmentions_full_ner:B-T047)`, `chemdner_TEXT:MESH:D006886)`, `genia_term_corpus_ner:I-)`, `medmentions_full_ner:B-T039)`, `chemdner_TEXT:MESH:D004220)`, `bionlp_st_2013_pc_RE:FromLoc)`, `nlm_gene_ner:I-GENERIF)`, `bionlp_st_2013_ge_NER:I-Protein_modification)`, `genia_term_corpus_ner:B-RNA_molecule)`, `chemdner_TEXT:MESH:D006854)`, `chemdner_TEXT:MESH:D006493)`, `chia_ner:B-Qualifier)`, `medmentions_full_ner:I-T013)`, `ehr_rel_sts:8)`, `an_em_RE:frag)`, `genia_term_corpus_ner:I-DNA_substructure)`, `chemdner_TEXT:MESH:D063065)`, 
`genia_term_corpus_ner:I-ANDprotein_complexprotein_complex)`, `bionlp_st_2013_pc_NER:I-Dissociation)`, `medmentions_full_ner:I-T004)`, `bionlp_st_2013_cg_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D010069)`, `bionlp_st_2013_gro_NER:I-Homodimerization)`, `chemdner_TEXT:MESH:D006147)`, `medmentions_full_ner:I-T041)`, `bionlp_st_2011_id_NER:B-Regulation)`, `bionlp_st_2013_gro_ner:O)`, `chemdner_TEXT:MESH:D008623)`, `bionlp_st_2013_ge_ner:I-Protein)`, `scai_chemical_ner:I-TRIVIAL)`, `an_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-BindingAssay)`, `bionlp_st_2013_gro_ner:I-HMG)`, `anat_em_ner:I-Anatomical_system)`, `chemdner_TEXT:MESH:D015034)`, `mlee_NER:B-Catabolism)`, `mantra_gsc_en_medline_ner:B-LIVB)`, `ddi_corpus_ner:I-BRAND)`, `chia_ner:I-Multiplier)`, `bionlp_st_2013_gro_ner:I-SequenceHomologyAnalysis)`, `seth_corpus_RE:None)`, `bionlp_st_2013_cg_NER:B-Binding)`, `bioscope_papers_ner:I-negation)`, `chemdner_TEXT:MESH:D008741)`, `chemdner_TEXT:MESH:D052998)`, `chemdner_TEXT:MESH:D005227)`, `chemdner_TEXT:MESH:D009828)`, `spl_adr_200db_train_ner:B-Animal)`, `chemdner_TEXT:MESH:D010616)`, `bionlp_st_2013_gro_ner:I-ProteinComplex)`, `pico_extraction_ner:B-outcome)`, `mlee_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D007093)`, `bionlp_st_2013_gro_NER:I-RNAProcessing)`, `bionlp_st_2013_gro_RE:hasAgent2)`, `biorelex_ner:I-reagent)`, `medmentions_st21pv_ner:I-T074)`, `bionlp_st_2013_gro_NER:B-BindingOfMolecularEntity)`, `chemdner_TEXT:MESH:D008911)`, `medmentions_full_ner:B-T033)`, `genia_term_corpus_ner:B-ANDprotein_complexprotein_complex)`, `medmentions_full_ner:I-T100)`, `chemdner_TEXT:MESH:D019259)`, `genia_term_corpus_ner:I-BUT_NOTother_nameother_name)`, `geokhoj_v1_TEXT:1)`, `bionlp_st_2013_cg_RE:Site)`, `medmentions_full_ner:B-T184)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelixTF)`, `bionlp_st_2013_cg_ner:I-Protein_domain_or_region)`, `genia_term_corpus_ner:I-other_organic_compound)`, `chemdner_TEXT:MESH:D010793)`, 
`bionlp_st_2011_id_NER:B-Phosphorylation)`, `chemdner_TEXT:MESH:D002482)`, `bionlp_st_2013_cg_NER:B-Breakdown)`, `biorelex_ner:I-disease)`, `genia_term_corpus_ner:B-DNA_substructure)`, `bionlp_st_2013_gro_RE:hasPatient)`, `medmentions_full_ner:B-T127)`, `medmentions_full_ner:I-T185)`, `bionlp_shared_task_2009_RE:AtLoc)`, `medmentions_full_ner:I-T201)`, `chemdner_TEXT:MESH:D005290)`, `mlee_NER:I-Breakdown)`, `medmentions_full_ner:I-T063)`, `chemdner_TEXT:MESH:D017964)`, `an_em_ner:I-Tissue)`, `mlee_ner:I-Organism)`, `mantra_gsc_en_emea_ner:I-CHEM)`, `bionlp_st_2013_cg_ner:B-Anatomical_system)`, `genia_term_corpus_ner:B-ORDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Degradation)`, `chemprot_RE:CPR:0)`, `genia_term_corpus_ner:B-inorganic)`, `chemdner_TEXT:MESH:D005466)`, `chia_ner:O)`, `medmentions_full_ner:B-T078)`, `mlee_NER:B-Growth)`, `mantra_gsc_en_emea_ner:B-PHEN)`, `chemdner_TEXT:MESH:D012545)`, `bionlp_st_2013_gro_NER:B-G1Phase)`, `chemdner_TEXT:MESH:D009841)`, `bionlp_st_2013_gro_ner:B-Chromatin)`, `bionlp_st_2011_epi_RE:Site)`, `medmentions_full_ner:B-T066)`, `genetaggold_ner:O)`, `bionlp_st_2013_cg_NER:I-Gene_expression)`, `medmentions_st21pv_ner:B-T092)`, `chemprot_RE:CPR:8)`, `bionlp_st_2013_cg_RE:Instrument)`, `nlm_gene_ner:I-Domain)`, `chemdner_TEXT:MESH:D006151)`, `bionlp_st_2011_id_ner:I-Protein)`, `mlee_NER:B-Synthesis)`, `bionlp_st_2013_gro_NER:B-CellMotility)`, `scai_chemical_ner:B-MODIFIER)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscription)`, `osiris_ner:O)`, `mlee_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T062)`, `chemdner_TEXT:MESH:D017705)`, `bionlp_st_2013_gro_NER:I-TranscriptionOfGene)`, `genia_term_corpus_ner:I-protein_complex)`, `chemprot_RE:CPR:10)`, `medmentions_full_ner:B-T102)`, `medmentions_full_ner:I-T171)`, `chia_ner:B-Reference_point)`, `medmentions_full_ner:B-T015)`, `bionlp_st_2013_gro_ner:I-RNAPolymerase)`, `chebi_nactem_abstr_ann1_ner:B-Metabolite)`, 
`bionlp_st_2013_gro_NER:I-CellDifferentiation)`, `chemdner_TEXT:MESH:D006861)`, `pubmed_qa_labeled_fold0_CLF:maybe)`, `bionlp_st_2013_gro_ner:I-Sequence)`, `mlee_NER:B-Transcription)`, `bc5cdr_ner:B-Chemical)`, `chemdner_TEXT:MESH:D000072317)`, `bionlp_st_2013_gro_NER:B-Producing)`, `genia_term_corpus_ner:B-ANDprotein_moleculeprotein_molecule)`, `bionlp_st_2011_id_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-MolecularInteraction)`, `chemdner_TEXT:MESH:D014639)`, `bionlp_st_2013_gro_NER:I-Increase)`, `mlee_NER:I-Translation)`, `medmentions_full_ner:B-T087)`, `bioscope_abstracts_ner:B-speculation)`, `ebm_pico_ner:B-Outcome_Adverse-effects)`, `mantra_gsc_en_medline_ner:B-PHYS)`, `bionlp_st_2013_gro_ner:I-Lipid)`, `bionlp_st_2011_ge_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D005278)`, `bionlp_shared_task_2009_NER:B-Phosphorylation)`, `mlee_NER:I-Gene_expression)`, `bionlp_st_2011_epi_NER:I-Deacetylation)`, `chemdner_TEXT:MESH:D002110)`, `medmentions_full_ner:I-T121)`, `bionlp_st_2011_epi_ner:I-Entity)`, `bionlp_st_2019_bb_RE:Lives_In)`, `chemdner_TEXT:MESH:D001710)`, `anat_em_ner:B-Cancer)`, `bionlp_st_2013_gro_NER:B-RNASplicing)`, `mantra_gsc_en_medline_ner:I-ANAT)`, `chemdner_TEXT:MESH:D024508)`, `chemdner_TEXT:MESH:D000537)`, `mantra_gsc_en_medline_ner:I-DISO)`, `bionlp_st_2013_gro_ner:I-Prokaryote)`, `bionlp_st_2013_gro_ner:I-Chromatin)`, `bionlp_st_2013_gro_ner:B-Nucleotide)`, `linnaeus_ner:I-species)`, `verspoor_2013_ner:I-body-part)`, `bionlp_st_2013_gro_ner:B-DNAFragment)`, `bionlp_st_2013_gro_ner:B-PositiveTranscriptionRegulator)`, `medmentions_full_ner:I-T049)`, `bionlp_st_2011_ge_ner:B-Entity)`, `medmentions_full_ner:I-T017)`, `bionlp_st_2013_gro_NER:B-TranscriptionOfGene)`, `chemdner_TEXT:MESH:D009947)`, `mlee_NER:B-Dephosphorylation)`, `bionlp_st_2013_gro_NER:B-GeneSilencing)`, `pdr_RE:None)`, `scai_chemical_ner:I-TRIVIALVAR)`, `bionlp_st_2011_epi_NER:O)`, `bionlp_st_2013_cg_ner:I-Cell)`, `sciq_SEQ:None)`, `chemdner_TEXT:MESH:D019913)`, 
`mlee_RE:Participant)`, `chia_ner:I-Negation)`, `chemdner_TEXT:MESH:D014801)`, `chemdner_TEXT:MESH:D058846)`, `chemdner_TEXT:MESH:D011809)`, `bionlp_st_2011_epi_ner:O)`, `bionlp_st_2013_cg_NER:I-Metastasis)`, `chemdner_TEXT:MESH:D012643)`, `an_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:I-CatalyticActivity)`, `anat_em_ner:B-Anatomical_system)`, `mlee_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:I-ChromosomalDNA)`, `anat_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D000242)`, `chemdner_TEXT:MESH:D017641)`, `bioscope_abstracts_ner:I-negation)`, `medmentions_st21pv_ner:B-T058)`, `chemdner_TEXT:MESH:D008744)`, `bionlp_st_2013_gro_ner:B-UpstreamRegulatorySequence)`, `chemdner_TEXT:MESH:D008012)`, `medmentions_full_ner:B-T013)`, `bionlp_st_2011_epi_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D052999)`, `chemdner_TEXT:MESH:D002329)`, `ebm_pico_ner:I-Intervention_Physical)`, `bionlp_st_2013_pc_ner:B-Complex)`, `medmentions_st21pv_ner:I-T005)`, `chemdner_TEXT:MESH:D064704)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomainTF)`, `bionlp_st_2013_pc_ner:I-Cellular_component)`, `genia_term_corpus_ner:B-ANDDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_gro_ner:B-Chromosome)`, `chemdner_TEXT:MESH:D007546)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfGeneExpression)`, `medmentions_full_ner:I-T010)`, `pdr_NER:B-Treatment_of_disease)`, `medmentions_full_ner:B-T081)`, `bionlp_st_2011_epi_NER:B-Demethylation)`, `chemdner_TEXT:MESH:D013261)`, `bionlp_st_2013_gro_ner:I-RibosomalRNA)`, `verspoor_2013_ner:O)`, `bionlp_st_2013_gro_NER:B-DevelopmentalProcess)`, `chemdner_TEXT:MESH:D009270)`, `medmentions_full_ner:I-T130)`, `bionlp_st_2013_cg_ner:B-Organism)`, `medmentions_full_ner:B-T014)`, `chemdner_TEXT:MESH:D003374)`, `chemdner_TEXT:MESH:D011078)`, `cellfinder_ner:B-GeneProtein)`, `mayosrs_sts:6)`, `chemdner_TEXT:MESH:D005576)`, `bionlp_st_2013_ge_RE:Cause)`, `an_em_RE:None)`, `sciq_SEQ:answer)`, `bionlp_st_2013_cg_NER:B-Dissociation)`, `mlee_RE:frag)`, 
`bionlp_st_2013_pc_COREF:coref)`, `chemdner_TEXT:MESH:D008469)`, `ncbi_disease_ner:O)`, `bionlp_st_2011_epi_ner:I-Protein)`, `chemdner_TEXT:MESH:D011140)`, `chemdner_TEXT:MESH:D020001)`, `bionlp_st_2013_gro_ner:I-ThreeDimensionalMolecularStructure)`, `bionlp_st_2013_cg_ner:B-Cancer)`, `genia_term_corpus_ner:B-BUT_NOTother_nameother_name)`, `chemdner_TEXT:MESH:D006862)`, `medmentions_full_ner:B-T104)`, `bionlp_st_2011_epi_RE:Theme)`, `cellfinder_ner:B-Anatomy)`, `chemdner_TEXT:MESH:D010545)`, `biorelex_ner:B-RNA-family)`, `pico_extraction_ner:I-outcome)`, `mantra_gsc_en_patents_ner:I-PHYS)`, `bionlp_st_2013_pc_NER:I-Transcription)`, `bionlp_shared_task_2009_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Vitamin)`, `bionlp_shared_task_2009_RE:CSite)`, `bionlp_st_2011_ge_ner:I-Protein)`, `mlee_COREF:coref)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelix)`, `bioinfer_ner:I-Gene)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivatorActivity)`, `chemdner_TEXT:MESH:D054439)`, `chemdner_TEXT:MESH:D011621)`, `ddi_corpus_ner:I-DRUG_N)`, `chemdner_TEXT:MESH:D019308)`, `bionlp_st_2013_gro_ner:I-Locus)`, `bionlp_shared_task_2009_RE:ToLoc)`, `bionlp_st_2013_cg_NER:B-Development)`, `bionlp_st_2013_gro_NER:I-CellularDevelopmentalProcess)`, `bionlp_st_2013_gro_ner:B-Eukaryote)`, `bionlp_st_2013_ge_NER:B-Negative_regulation)`, `seth_corpus_ner:I-SNP)`, `hprd50_ner:B-protein)`, `bionlp_st_2013_gro_NER:B-BindingOfProtein)`, `mlee_NER:I-Negative_regulation)`, `bionlp_st_2011_ge_NER:B-Protein_catabolism)`, `bionlp_st_2013_pc_ner:B-Cellular_component)`, `bionlp_st_2011_id_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013831)`, `biorelex_COREF:None)`, `chemdner_TEXT:MESH:D005609)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactor)`, `mlee_NER:B-Regulation)`, `chemdner_TEXT:MESH:D059808)`, `bionlp_st_2013_gro_ner:I-bHLHTF)`, `chemdner_TEXT:MESH:D010121)`, `chemdner_TEXT:MESH:D017608)`, `chemdner_TEXT:MESH:D007455)`, `mlee_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorComplex)`, 
`biorelex_ner:B-disease)`, `bionlp_st_2013_cg_NER:B-Cell_differentiation)`, `medmentions_st21pv_ner:I-T092)`, `chemdner_TEXT:MESH:D007477)`, `medmentions_full_ner:B-T168)`, `pcr_ner:I-Chemical)`, `chemdner_TEXT:MESH:D009636)`, `chemdner_TEXT:MESH:D008051)`, `bionlp_shared_task_2009_NER:I-Gene_expression)`, `chemprot_ner:I-GENE-N)`, `biorelex_ner:B-reagent)`, `chemdner_TEXT:MESH:D020123)`, `nlmchem_ner:O)`, `ebm_pico_ner:I-Outcome_Mental)`, `chemdner_TEXT:MESH:D004040)`, `chemdner_TEXT:MESH:D000450)`, `chebi_nactem_fullpaper_ner:O)`, `biorelex_ner:B-protein-isoform)`, `chemdner_TEXT:MESH:D001564)`, `medmentions_full_ner:I-T095)`, `mlee_NER:I-Remodeling)`, `bionlp_st_2013_cg_RE:None)`, `biorelex_ner:O)`, `seth_corpus_RE:AssociatedTo)`, `bioscope_abstracts_ner:B-negation)`, `chebi_nactem_fullpaper_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressorActivity)`, `bionlp_st_2013_cg_NER:B-Transcription)`, `bionlp_st_2011_ge_ner:B-Protein)`, `bionlp_st_2013_ge_ner:B-Protein)`, `bionlp_st_2013_gro_ner:I-Tissue)`, `chemdner_TEXT:MESH:D044005)`, `genia_term_corpus_ner:I-protein_substructure)`, `bionlp_st_2013_gro_ner:I-TranslationFactor)`, `minimayosrs_sts:5)`, `chemdner_TEXT:MESH:D012834)`, `ncbi_disease_ner:I-Modifier)`, `mlee_NER:B-Death)`, `medmentions_full_ner:B-T196)`, `bio_sim_verb_sts:4)`, `bionlp_st_2013_gro_NER:B-CellHomeostasis)`, `chemdner_TEXT:MESH:D006001)`, `bionlp_st_2013_gro_RE:encodes)`, `biorelex_ner:B-fusion-protein)`, `mlee_COREF:None)`, `chemdner_TEXT:MESH:D001623)`, `chemdner_TEXT:MESH:D000812)`, `medmentions_full_ner:B-T046)`, `bionlp_shared_task_2009_NER:O)`, `chemdner_TEXT:MESH:D000735)`, `gnormplus_ner:O)`, `chemdner_TEXT:MESH:D014635)`, `bionlp_st_2013_gro_NER:B-Mitosis)`, `chemdner_TEXT:MESH:D003847)`, `chemdner_TEXT:MESH:D002809)`, `medmentions_full_ner:I-T116)`, `chemdner_TEXT:MESH:D060406)`, `chemprot_ner:B-CHEMICAL)`, `chemdner_TEXT:MESH:D016642)`, `bionlp_st_2013_cg_NER:B-Phosphorylation)`, `an_em_ner:B-Organ)`, 
`chemdner_TEXT:MESH:D013431)`, `bionlp_shared_task_2009_RE:None)`, `medmentions_full_ner:B-T041)`, `mlee_ner:I-Tissue)`, `chemdner_TEXT:MESH:D023303)`, `ebm_pico_ner:I-Participant_Condition)`, `bionlp_st_2013_gro_ner:I-TATAbox)`, `bionlp_st_2013_gro_ner:I-bZIP)`, `bionlp_st_2011_epi_RE:Sidechain)`, `bionlp_st_2013_gro_ner:B-LivingEntity)`, `mantra_gsc_en_medline_ner:B-CHEM)`, `chemdner_TEXT:MESH:D007659)`, `medmentions_full_ner:I-T085)`, `bionlp_st_2013_cg_ner:I-Organism_substance)`, `medmentions_full_ner:B-T067)`, `chemdner_TEXT:MESH:D057846)`, `bionlp_st_2013_gro_NER:I-SignalingPathway)`, `bc5cdr_ner:I-Chemical)`, `nlm_gene_ner:I-STARGENE)`, `medmentions_full_ner:B-T090)`, `medmentions_full_ner:I-T037)`, `medmentions_full_ner:B-T037)`, `minimayosrs_sts:6)`, `medmentions_full_ner:I-T020)`, `chebi_nactem_fullpaper_ner:B-Species)`, `mirna_ner:O)`, `bionlp_st_2011_id_RE:Participant)`, `bionlp_st_2013_ge_NER:B-Binding)`, `ddi_corpus_ner:B-DRUG)`, `medmentions_full_ner:I-T078)`, `chemdner_TEXT:MESH:D012965)`, `bionlp_st_2013_cg_ner:I-Organ)`, `bionlp_st_2011_id_NER:B-Binding)`, `chemdner_TEXT:MESH:D006571)`, `mayosrs_sts:4)`, `chemdner_TEXT:MESH:D026422)`, `genia_term_corpus_ner:I-RNA_NA)`, `bionlp_st_2011_epi_RE:None)`, `chemdner_TEXT:MESH:D012265)`, `medmentions_full_ner:B-T195)`, `chemdner_TEXT:MESH:D014443)`, `bionlp_st_2013_gro_ner:I-OrganicChemical)`, `ebm_pico_ner:B-Participant_Age)`, `chemdner_TEXT:MESH:D009584)`, `chemdner_TEXT:MESH:D010862)`, `verspoor_2013_ner:B-Concepts_Ideas)`, `bionlp_st_2013_gro_NER:B-ActivationOfProcess)`, `chemdner_TEXT:MESH:D010118)`, `biorelex_COREF:coref)`, `bionlp_st_2013_gro_ner:I-Enzyme)`, `chemdner_TEXT:MESH:D012530)`, `chemdner_TEXT:MESH:D002351)`, `biorelex_ner:B-gene)`, `chemdner_TEXT:MESH:D013213)`, `medmentions_full_ner:B-T103)`, `chemdner_TEXT:MESH:D010091)`, `ebm_pico_ner:B-Participant_Sex)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndDNA)`, `bionlp_st_2013_gro_ner:B-Phenotype)`, `chemdner_TEXT:MESH:D019791)`, 
`chemdner_TEXT:MESH:D014280)`, `chemdner_TEXT:MESH:D011094)`, `chia_RE:None)`, `biorelex_RE:None)`, `chemdner_TEXT:MESH:D005230)`, `verspoor_2013_ner:B-cohort-patient)`, `chemdner_TEXT:MESH:D013645)`, `bionlp_st_2013_gro_ner:B-SecondMessenger)`, `mlee_ner:B-Cellular_component)`, `bionlp_shared_task_2009_NER:I-Phosphorylation)`, `mlee_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D017275)`, `chemdner_TEXT:MESH:D007053)`, `bionlp_st_2013_ge_RE:Site)`, `genia_term_corpus_ner:O)`, `chemprot_RE:CPR:6)`, `chemdner_TEXT:MESH:D006859)`, `genia_term_corpus_ner:I-other_name)`, `medmentions_full_ner:I-T042)`, `pdr_ner:O)`, `medmentions_full_ner:I-T057)`, `bionlp_st_2013_pc_RE:Product)`, `verspoor_2013_ner:B-size)`, `bionlp_st_2013_pc_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T017)`, `chia_ner:B-Temporal)`, `chemdner_TEXT:MESH:D003404)`, `bionlp_st_2013_gro_RE:None)`, `bionlp_shared_task_2009_NER:B-Gene_expression)`, `mqp_sts:3)`, `bionlp_st_2013_gro_ner:B-Chemical)`, `chemdner_TEXT:MESH:D013754)`, `mantra_gsc_en_medline_ner:B-GEOG)`, `mirna_ner:B-Specific_miRNAs)`, `chemdner_TEXT:MESH:D012492)`, `medmentions_full_ner:B-T190)`, `bionlp_st_2013_cg_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:B-RNA)`, `chemdner_TEXT:MESH:D011743)`, `chemdner_TEXT:MESH:D010795)`, `bionlp_st_2013_gro_NER:I-PositiveRegulation)`, `chemdner_TEXT:MESH:D002241)`, `medmentions_full_ner:B-T038)`, `bionlp_st_2013_gro_RE:hasAgent)`, `mlee_ner:B-Organism)`, `medmentions_full_ner:I-T168)`, `bioscope_abstracts_ner:O)`, `chemdner_TEXT:MESH:D002599)`, `bionlp_st_2013_pc_ner:I-Simple_chemical)`, `medmentions_full_ner:I-T066)`, `chemdner_TEXT:MESH:D019695)`, `bionlp_st_2013_ge_NER:I-Transcription)`, `mantra_gsc_en_emea_ner:B-DISO)`, `bionlp_st_2013_gro_NER:B-CellDeath)`, `medmentions_st21pv_ner:I-T031)`, `chemdner_TEXT:MESH:D004317)`, `bionlp_st_2013_gro_ner:B-TATAbox)`, `chemdner_TEXT:MESH:D052203)`, `bionlp_st_2013_gro_NER:B-CellFateDetermination)`, `medmentions_st21pv_ner:I-T022)`, 
`bionlp_st_2013_ge_NER:B-Protein_catabolism)`, `bionlp_st_2011_epi_NER:I-Catalysis)`, `verspoor_2013_ner:I-cohort-patient)`, `chemdner_TEXT:MESH:D010100)`, `an_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D045162)`, `chia_RE:Has_qualifier)`, `verspoor_2013_RE:has)`, `chemdner_TEXT:MESH:D021382)`, `bionlp_st_2013_ge_NER:B-Acetylation)`, `medmentions_full_ner:I-T079)`, `bionlp_st_2013_gro_NER:B-Maintenance)`, `biorelex_ner:I-protein-domain)`, `chebi_nactem_abstr_ann1_ner:I-Chemical)`, `bioscope_papers_ner:O)`, `chia_RE:Has_scope)`, `bc5cdr_ner:B-Disease)`, `mlee_ner:I-Cellular_component)`, `medmentions_full_ner:I-T195)`, `spl_adr_200db_train_ner:B-AdverseReaction)`, `bionlp_st_2013_gro_ner:I-Promoter)`, `medmentions_full_ner:B-T040)`, `chemdner_TEXT:MESH:D005960)`, `chemdner_TEXT:MESH:D004164)`, `chemdner_TEXT:MESH:D015032)`, `chemdner_TEXT:MESH:D014255)`, `ebm_pico_ner:B-Outcome_Pain)`, `bionlp_st_2013_gro_ner:I-UpstreamRegulatorySequence)`, `bionlp_st_2013_pc_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:I-Regulation)`, `chemdner_TEXT:MESH:D001151)`, `medmentions_full_ner:I-T077)`, `chemdner_TEXT:MESH:D000081)`, `bionlp_st_2013_gro_NER:B-Stabilization)`, `mayosrs_sts:1)`, `biorelex_ner:B-mutation)`, `chemdner_TEXT:MESH:D000241)`, `chemdner_TEXT:MESH:D007930)`, `bionlp_st_2013_gro_NER:B-MetabolicPathway)`, `chemdner_TEXT:MESH:D013629)`, `chemdner_TEXT:MESH:D016202)`, `tmvar_v1_ner:I-DNAMutation)`, `chemdner_TEXT:MESH:D012502)`, `chemdner_TEXT:MESH:D044945)`, `bionlp_st_2013_cg_ner:I-Cellular_component)`, `mlee_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D002338)`, `mayosrs_sts:5)`, `bionlp_st_2013_gro_ner:B-Intron)`, `genia_term_corpus_ner:I-DNA_domain_or_region)`, `anat_em_ner:I-Immaterial_anatomical_entity)`, `bionlp_st_2013_gro_ner:B-MutatedProtein)`, `ebm_pico_ner:I-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D005047)`, 
`chia_ner:B-Mood)`, `medmentions_st21pv_ner:O)`, `cellfinder_ner:I-Species)`, `bionlp_st_2013_gro_ner:I-InorganicChemical)`, `bionlp_st_2011_id_ner:B-Entity)`, `bionlp_st_2013_cg_NER:I-Catabolism)`, `an_em_ner:I-Cellular_component)`, `medmentions_full_ner:B-T021)`, `bionlp_st_2013_gro_NER:B-Heterodimerization)`, `chemdner_TEXT:MESH:D008315)`, `medmentions_st21pv_ner:I-T170)`, `chemdner_TEXT:MESH:D050112)`, `chia_RE:Subsumes)`, `medmentions_full_ner:I-T099)`, `bionlp_st_2013_gro_ner:I-Protein)`, `chemdner_TEXT:MESH:D047071)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorActivity)`, `mlee_ner:B-Organism_subdivision)`, `chemdner_TEXT:MESH:D016559)`, `medmentions_full_ner:B-T129)`, `genia_term_corpus_ner:I-protein_molecule)`, `mlee_ner:B-Drug_or_compound)`, `bionlp_st_2013_gro_NER:B-Silencing)`, `bionlp_st_2013_gro_ner:I-MolecularStructure)`, `genia_term_corpus_ner:B-nucleotide)`, `chemdner_TEXT:MESH:D003042)`, `mantra_gsc_en_emea_ner:B-ANAT)`, `chemdner_TEXT:MESH:D006690)`, `genia_term_corpus_ner:I-ANDcell_linecell_line)`, `chemdner_TEXT:MESH:D005473)`, `mantra_gsc_en_medline_ner:I-PHYS)`, `bionlp_st_2013_cg_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-BetaScaffoldDomain_WithMinorGrooveContacts)`, `chemdner_TEXT:MESH:D001549)`, `chia_ner:B-Measurement)`, `bionlp_st_2011_id_ner:B-Regulon-operon)`, `bionlp_st_2013_cg_NER:B-Acetylation)`, `pdr_ner:B-Plant)`, `mlee_NER:B-Development)`, `linnaeus_filtered_ner:B-species)`, `bionlp_st_2013_pc_RE:AtLoc)`, `medmentions_full_ner:I-T192)`, `bionlp_st_2013_gro_ner:B-BindingSiteOfProtein)`, `bionlp_st_2013_ge_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_ner:I-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D009647)`, `bionlp_st_2013_gro_ner:I-Ligand)`, `bionlp_st_2011_id_ner:O)`, `bionlp_st_2013_gro_NER:I-RNASplicing)`, `bionlp_st_2013_gro_ner:I-ComplexOfProteinAndRNA)`, `bionlp_st_2011_id_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D007501)`, `ehr_rel_sts:5)`, `bionlp_st_2013_gro_ner:B-TranscriptionRegulator)`, 
`medmentions_full_ner:B-T089)`, `bionlp_st_2011_epi_NER:I-DNA_demethylation)`, `mirna_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-TranscriptionRegulator)`, `bionlp_st_2013_gro_NER:B-ProteinBiosynthesis)`, `scai_chemical_ner:B-ABBREVIATION)`, `bionlp_st_2013_gro_ner:I-Virus)`, `bionlp_st_2011_ge_NER:O)`, `medmentions_full_ner:B-T203)`, `bionlp_st_2013_cg_NER:I-Mutation)`, `bionlp_st_2013_gro_ner:B-ThreeDimensionalMolecularStructure)`, `genetaggold_ner:I-NEWGENE)`, `chemdner_TEXT:MESH:D010705)`, `chia_ner:I-Mood)`, `medmentions_full_ner:I-T068)`, `minimayosrs_sts:4)`, `medmentions_full_ner:I-T097)`, `bionlp_st_2013_gro_ner:I-BetaScaffoldDomain_WithMinorGrooveContacts)`, `mantra_gsc_en_emea_ner:I-PHYS)`, `medmentions_full_ner:I-T104)`, `bio_sim_verb_sts:5)`, `chebi_nactem_abstr_ann1_ner:B-Biological_Activity)`, `bionlp_st_2013_gro_NER:B-IntraCellularProcess)`, `mantra_gsc_en_emea_ner:I-PHEN)`, `mlee_ner:B-Cell)`, `chemdner_TEXT:MESH:D045784)`, `bionlp_st_2013_gro_ner:I-Vitamin)`, `chemdner_TEXT:MESH:D010416)`, `bionlp_st_2013_gro_ner:B-FusionGene)`, `bionlp_st_2013_gro_ner:I-FusionProtein)`, `mlee_NER:B-Remodeling)`, `minimayosrs_sts:8)`, `bionlp_st_2013_gro_ner:B-Enhancer)`, `mantra_gsc_en_emea_ner:O)`, `bionlp_st_2013_gro_ner:B-OpenReadingFrame)`, `bionlp_st_2013_pc_COREF:None)`, `medmentions_full_ner:I-T123)`, `bionlp_st_2013_gro_NER:I-RegulatoryProcess)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfGeneExpression)`, `nlm_gene_ner:B-Domain)`, `bionlp_st_2013_pc_NER:B-Methylation)`, `medmentions_full_ner:B-T057)`, `chemdner_TEXT:MESH:D010226)`, `bionlp_st_2013_gro_ner:B-GeneProduct)`, `ebm_pico_ner:I-Outcome_Other)`, `chemdner_TEXT:MESH:D005223)`, `pdr_RE:Theme)`, `bionlp_shared_task_2009_NER:B-Protein_catabolism)`, `chemdner_TEXT:MESH:D019344)`, `gnormplus_ner:I-FamilyName)`, `verspoor_2013_ner:B-gender)`, `bionlp_st_2013_gro_NER:B-TranscriptionInitiation)`, `spl_adr_200db_train_ner:B-Severity)`, `medmentions_st21pv_ner:B-T097)`, 
`anat_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_NER:I-RNAMetabolism)`, `bioinfer_ner:I-Protein_complex)`, `anat_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:B-ProteinDomain)`, `bionlp_st_2013_gro_ner:I-PrimaryStructure)`, `genia_term_corpus_ner:I-other_artificial_source)`, `chemdner_TEXT:MESH:D010098)`, `bionlp_st_2013_gro_ner:I-Enhancer)`, `bionlp_st_2013_gro_ner:I-PositiveTranscriptionRegulator)`, `chemdner_TEXT:MESH:D004051)`, `chemdner_TEXT:MESH:D013853)`, `chebi_nactem_fullpaper_ner:B-Metabolite)`, `diann_iber_eval_en_ner:B-Disability)`, `biorelex_ner:B-peptide)`, `medmentions_full_ner:B-T048)`, `bionlp_st_2013_gro_ner:I-Function)`, `genia_term_corpus_ner:I-DNA_NA)`, `mlee_ner:I-Anatomical_system)`, `bioinfer_ner:B-Individual_protein)`, `verspoor_2013_ner:I-Physiology)`, `genia_term_corpus_ner:I-RNA_molecule)`, `chemdner_TEXT:MESH:D000255)`, `minimayosrs_sts:7)`, `mlee_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-ResponseProcess)`, `mantra_gsc_en_medline_ner:I-LIVB)`, `chemdner_TEXT:MESH:D010649)`, `seth_corpus_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-Attenuator)`, `chemdner_TEXT:MESH:D015363)`, `bionlp_st_2013_pc_NER:B-Inactivation)`, `medmentions_full_ner:I-T191)`, `mlee_ner:I-Organ)`, `chemdner_TEXT:MESH:D011765)`, `bionlp_shared_task_2009_NER:B-Binding)`, `an_em_ner:B-Cellular_component)`, `genia_term_corpus_ner:I-RNA_substructure)`, `medmentions_full_ner:B-T051)`, `anat_em_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_RE:hasPatient3)`, `chemdner_TEXT:MESH:D013634)`, `chemdner_TEXT:MESH:D014414)`, `chia_RE:Has_index)`, `ddi_corpus_ner:B-GROUP)`, `bionlp_st_2013_gro_ner:B-MutantProtein)`, `bionlp_st_2013_ge_NER:I-Negative_regulation)`, `biorelex_ner:I-amino-acid)`, `chemdner_TEXT:MESH:D053279)`, `chemprot_RE:CPR:2)`, `bionlp_st_2013_gro_ner:B-bHLHTF)`, `bionlp_st_2013_cg_NER:I-Breakdown)`, `scai_chemical_ner:I-ABBREVIATION)`, `pdr_NER:B-Cause_of_disease)`, `chemdner_TEXT:MESH:D002219)`, `medmentions_full_ner:B-T044)`, 
`mirna_ner:B-Non-Specific_miRNAs)`, `chemdner_TEXT:MESH:D020748)`, `bionlp_shared_task_2009_RE:Theme)`, `chemdner_TEXT:MESH:D001647)`, `bionlp_st_2011_ge_NER:I-Regulation)`, `bionlp_st_2013_pc_ner:B-Gene_or_gene_product)`, `biorelex_ner:I-protein)`, `mantra_gsc_en_medline_ner:B-PROC)`, `medmentions_full_ner:I-T081)`, `medmentions_st21pv_ner:B-T022)`, `chia_ner:B-Multiplier)`, `bionlp_st_2013_gro_NER:B-GeneMutation)`, `chemdner_TEXT:MESH:D002232)`, `chemdner_TEXT:MESH:D010456)`, `biosses_sts:7)`, `medmentions_full_ner:B-T071)`, `chemdner_TEXT:MESH:D008628)`, `biorelex_ner:I-protein-complex)`, `chemdner_TEXT:MESH:D007328)`, `bionlp_st_2013_pc_NER:I-Activation)`, `bionlp_st_2013_cg_NER:B-Metabolism)`, `scai_chemical_ner:I-PARTIUPAC)`, `verspoor_2013_ner:B-age)`, `medmentions_full_ner:B-T122)`, `medmentions_full_ner:I-T050)`, `genia_term_corpus_ner:B-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:B-SPhase)`, `chemdner_TEXT:MESH:D012500)`, `mlee_NER:B-Metabolism)`, `bionlp_st_2011_id_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D002794)`, `bionlp_st_2013_gro_NER:B-ProteinTransport)`, `chemdner_TEXT:MESH:D006028)`, `bionlp_st_2013_gro_RE:hasPatient2)`, `chemdner_TEXT:MESH:D009822)`, `bionlp_st_2013_cg_ner:I-Cancer)`, `bionlp_shared_task_2009_ner:I-Entity)`, `pcr_ner:B-Herb)`, `pubmed_qa_labeled_fold0_CLF:yes)`, `bionlp_st_2013_gro_NER:I-NegativeRegulation)`, `bionlp_st_2013_cg_NER:B-Dephosphorylation)`, `anat_em_ner:B-Multi-tissue_structure)`, `chemdner_TEXT:MESH:D008274)`, `medmentions_full_ner:B-T025)`, `chemprot_RE:CPR:9)`, `bionlp_st_2013_pc_RE:Participant)`, `bionlp_st_2013_pc_ner:B-Simple_chemical)`, `genia_term_corpus_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-bZIP)`, `bionlp_st_2013_gro_ner:I-Eukaryote)`, `bionlp_st_2013_pc_ner:I-Complex)`, `hprd50_ner:I-protein)`, `medmentions_full_ner:B-T020)`, `bionlp_st_2013_gro_ner:B-Agonist)`, `medmentions_full_ner:B-T030)`, `chemdner_TEXT:MESH:D009536)`, `medmentions_full_ner:B-T169)`, 
`genia_term_corpus_ner:I-nucleotide)`, `bionlp_st_2013_gro_NER:I-ProteinCatabolism)`, `bc5cdr_ner:O)`, `chemdner_TEXT:MESH:D003078)`, `medmentions_full_ner:I-T040)`, `chemdner_TEXT:MESH:D005963)`, `bionlp_st_2013_gro_ner:B-ExpressionProfiling)`, `mantra_gsc_en_emea_ner:I-DEVI)`, `mlee_NER:B-Cell_division)`, `ebm_pico_ner:B-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D008790)`, `mantra_gsc_en_emea_ner:I-ANAT)`, `mantra_gsc_en_medline_ner:B-ANAT)`, `chemdner_TEXT:MESH:D003545)`, `bionlp_st_2013_gro_NER:I-IntraCellularTransport)`, `bionlp_st_2013_gro_NER:I-CellDivision)`, `chemdner_TEXT:MESH:D013438)`, `bionlp_st_2011_id_NER:I-Negative_regulation)`, `bionlp_st_2013_gro_NER:I-DevelopmentalProcess)`, `mlee_ner:B-Protein_domain_or_region)`, `chemdner_TEXT:MESH:D014978)`, `bionlp_st_2011_id_NER:O)`, `bionlp_st_2013_gro_ner:I-ReporterGeneConstruction)`, `medmentions_full_ner:I-T025)`, `bionlp_st_2019_bb_RE:Exhibits)`, `ddi_corpus_ner:I-GROUP)`, `chemdner_TEXT:MESH:D011241)`, `chemdner_TEXT:MESH:D010446)`, `bionlp_st_2013_gro_ner:I-ExperimentalMethod)`, `anat_em_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000470)`, `bionlp_st_2013_pc_NER:I-Inactivation)`, `bionlp_st_2013_gro_ner:I-Agonist)`, `medmentions_full_ner:B-T024)`, `mlee_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Deglycosylation)`, `bionlp_st_2013_cg_NER:B-Cell_death)`, `chemdner_TEXT:MESH:D000266)`, `chemdner_TEXT:MESH:D019833)`, `genia_term_corpus_ner:I-RNA_family_or_group)`, `biosses_sts:8)`, `lll_RE:genic_interaction)`, `bionlp_st_2013_gro_ner:B-OrganicChemical)`, `chemdner_TEXT:MESH:D013267)`, `bionlp_st_2013_gro_ner:I-TranscriptionCofactor)`, `biorelex_ner:B-protein-region)`, `chemdner_TEXT:MESH:D001565)`, `genia_term_corpus_ner:B-cell_line)`, `bionlp_st_2013_gro_NER:B-Cleavage)`, `ddi_corpus_RE:EFFECT)`, `bionlp_st_2013_cg_NER:B-Planned_process)`, `bionlp_st_2013_cg_ner:I-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D007660)`, `medmentions_full_ner:I-T090)`, 
`bionlp_st_2013_gro_ner:I-CpGIsland)`, `bionlp_st_2013_gro_ner:B-AminoAcid)`, `chemdner_TEXT:MESH:D001095)`, `mlee_NER:I-Death)`, `bionlp_st_2013_cg_ner:I-Anatomical_system)`, `bionlp_st_2013_gro_NER:B-Decrease)`, `bionlp_st_2013_pc_NER:B-Hydroxylation)`, `chemdner_TEXT:None)`, `bio_sim_verb_sts:3)`, `biorelex_ner:B-protein)`, `bionlp_st_2013_gro_ner:I-BasicDomain)`, `bionlp_st_2011_ge_ner:I-Entity)`, `bionlp_st_2013_gro_ner:B-PhysicalContinuant)`, `chemprot_RE:CPR:4)`, `chemdner_TEXT:MESH:D003345)`, `chemdner_TEXT:MESH:D010080)`, `mantra_gsc_en_patents_ner:O)`, `bionlp_st_2013_gro_ner:B-AntisenseRNA)`, `bionlp_st_2013_gro_ner:B-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D010768)`, `chebi_nactem_fullpaper_ner:I-Protein)`, `genia_term_corpus_ner:I-multi_cell)`, `bionlp_st_2013_gro_ner:I-Gene)`, `medmentions_full_ner:B-T042)`, `chemdner_TEXT:MESH:D006034)`, `biorelex_ner:I-brand)`, `chebi_nactem_abstr_ann1_ner:I-Species)`, `chemdner_TEXT:MESH:D012236)`, `bionlp_st_2013_gro_ner:I-GeneProduct)`, `chemdner_TEXT:MESH:D005665)`, `chemdner_TEXT:MESH:D008715)`, `medmentions_st21pv_ner:I-T103)`, `ddi_corpus_RE:None)`, `medmentions_st21pv_ner:I-T091)`, `chemdner_TEXT:MESH:D019158)`, `chemdner_TEXT:MESH:D001280)`, `chemdner_TEXT:MESH:D009249)`, `medmentions_full_ner:I-T067)`, `medmentions_full_ner:B-T005)`, `bionlp_st_2013_cg_NER:I-Remodeling)`, `chemdner_TEXT:MESH:D000166)`, `osiris_ner:B-variant)`, `spl_adr_200db_train_ner:I-DrugClass)`, `mirna_ner:I-Species)`, `medmentions_st21pv_ner:I-T033)`, `ebm_pico_ner:I-Participant_Age)`, `medmentions_full_ner:B-T095)`, `bionlp_st_2013_gro_NER:B-RNAMetabolism)`, `chemdner_TEXT:MESH:D005231)`, `medmentions_full_ner:B-T062)`, `bionlp_st_2011_ge_NER:I-Gene_expression)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactor)`, `genia_term_corpus_ner:B-protein_domain_or_region)`, `mantra_gsc_en_emea_ner:B-PROC)`, `mlee_NER:I-Pathway)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToProteinBindingSiteOfProtein)`, `bionlp_st_2011_id_COREF:coref)`, 
`biosses_sts:6)`, `biorelex_ner:I-organism)`, `chia_ner:B-Value)`, `verspoor_2013_ner:B-body-part)`, `chemdner_TEXT:MESH:D004974)`, `chia_RE:Has_mood)`, `medmentions_st21pv_ner:B-T074)`, `chemdner_TEXT:MESH:D000535)`, `verspoor_2013_ner:I-Disorder)`, `bionlp_st_2013_gro_NER:B-BindingToMolecularEntity)`, `bionlp_st_2013_gro_ner:I-ReporterGene)`, `mayosrs_sts:8)`, `bionlp_st_2013_cg_ner:I-DNA_domain_or_region)`, `bionlp_st_2013_gro_NER:I-Pathway)`, `medmentions_st21pv_ner:I-T168)`, `bionlp_st_2013_gro_NER:B-NegativeRegulation)`, `medmentions_full_ner:B-T123)`, `bionlp_st_2013_pc_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-FormationOfProteinDNAComplex)`, `chemdner_TEXT:MESH:D000577)`, `mlee_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D003630)`, `bionlp_st_2013_gro_ner:B-Transcript)`, `bionlp_st_2013_cg_NER:I-Transcription)`, `anat_em_ner:B-Organ)`, `anat_em_ner:I-Organism_substance)`, `spl_adr_200db_train_ner:B-DrugClass)`, `bionlp_st_2013_gro_ner:I-ProteinSubunit)`, `biorelex_ner:B-protein-domain)`, `chemdner_TEXT:MESH:D006051)`, `bionlp_st_2011_id_NER:B-Process)`, `bionlp_st_2013_pc_NER:B-Ubiquitination)`, `bionlp_st_2013_pc_NER:B-Transcription)`, `chemdner_TEXT:MESH:D006838)`, `bionlp_st_2013_gro_RE:hasPatient5)`, `bionlp_st_2013_ge_NER:B-Localization)`, `chemdner_TEXT:MESH:D011759)`, `chemdner_TEXT:MESH:D053243)`, `biorelex_ner:I-mutation)`, `mantra_gsc_en_emea_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Transport)`, `bionlp_st_2011_id_RE:Site)`, `chemdner_TEXT:MESH:D015474)`, `bionlp_st_2013_gro_NER:B-Dimerization)`, `bionlp_st_2013_cg_NER:I-Localization)`, `medmentions_full_ner:I-T032)`, `chemdner_TEXT:MESH:D018036)`, `medmentions_full_ner:I-T167)`, `chemprot_RE:CPR:5)`, `minimayosrs_sts:2)`, `biorelex_ner:B-protein-DNA-complex)`, `cellfinder_ner:I-CellComponent)`, `nlm_gene_ner:B-Other)`, `medmentions_full_ner:I-T019)`, `chebi_nactem_abstr_ann1_ner:B-Spectral_Data)`, `bionlp_st_2013_cg_ner:I-Multi-tissue_structure)`, `medmentions_full_ner:B-T010)`, 
`mantra_gsc_en_medline_ner:I-GEOG)`, `chemprot_ner:I-GENE-Y)`, `mirna_ner:I-Diseases)`, `an_em_ner:O)`, `bionlp_st_2013_cg_NER:B-Remodeling)`, `medmentions_st21pv_ner:I-T058)`, `scicite_TEXT:background)`, `bionlp_st_2013_cg_NER:B-Mutation)`, `genia_term_corpus_ner:B-mono_cell)`, `bionlp_st_2013_gro_ner:B-DNA)`, `medmentions_full_ner:I-T114)`, `bionlp_st_2011_id_RE:Theme)`, `genetaggold_ner:B-NEWGENE)`, `mlee_ner:I-Organism_subdivision)`, `bionlp_shared_task_2009_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:B-Microorganism)`, `chemdner_TEXT:MESH:D006108)`, `biorelex_ner:B-amino-acid)`, `bioinfer_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-Chemical)`, `mantra_gsc_en_patents_ner:I-DEVI)`, `mantra_gsc_en_medline_ner:O)`, `bionlp_st_2013_pc_NER:I-Regulation)`, `medmentions_full_ner:B-T043)`, `scicite_TEXT:result)`, `bionlp_st_2013_ge_NER:I-Binding)`, `chemdner_TEXT:MESH:D011441)`, `genia_term_corpus_ner:I-protein_domain_or_region)`, `bionlp_st_2011_epi_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Nucleosome)`, `chemdner_TEXT:MESH:D011223)`, `chebi_nactem_abstr_ann1_ner:B-Protein)`, `bionlp_st_2013_gro_RE:hasFunction)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorActivity)`, `biorelex_ner:B-protein-family)`, `bionlp_st_2013_cg_ner:B-Gene_or_gene_product)`, `tmvar_v1_ner:B-SNP)`, `bionlp_st_2013_gro_ner:B-ExperimentalMethod)`, `bionlp_st_2013_gro_ner:B-ReporterGeneConstruction)`, `bionlp_st_2011_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D004041)`, `chemdner_TEXT:MESH:D000631)`, `chebi_nactem_fullpaper_ner:I-Species)`, `medmentions_full_ner:B-T170)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelix)`, `bionlp_st_2013_cg_ner:B-Organism_subdivision)`, `genia_term_corpus_ner:I-DNA_molecule)`, `bionlp_st_2013_cg_NER:I-Glycolysis)`, `an_em_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_NER:B-TranscriptionTermination)`, `bionlp_st_2013_gro_NER:B-CellAging)`, `bionlp_st_2013_cg_ner:B-Protein_domain_or_region)`, `anat_em_ner:B-Organism_substance)`, 
`medmentions_full_ner:B-T053)`, `mlee_ner:B-Multi-tissue_structure)`, `biosses_sts:4)`, `bioscope_abstracts_ner:I-speculation)`, `chemdner_TEXT:MESH:D053644)`, `bionlp_st_2013_cg_NER:I-Translation)`, `tmvar_v1_ner:B-DNAMutation)`, `genia_term_corpus_ner:B-RNA_substructure)`, `an_em_ner:B-Anatomical_system)`, `bionlp_st_2013_gro_ner:B-Conformation)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T069)`, `chemdner_TEXT:MESH:D006820)`, `chemdner_TEXT:MESH:D015725)`, `chemdner_TEXT:MESH:D010281)`, `mlee_NER:B-Pathway)`, `bionlp_st_2011_id_NER:I-Regulation)`, `bionlp_st_2013_gro_NER:I-GeneExpression)`, `medmentions_full_ner:I-T073)`, `biosses_sts:2)`, `medmentions_full_ner:I-T043)`, `chemdner_TEXT:MESH:D001152)`, `bionlp_st_2013_gro_ner:I-DNAMolecule)`, `chemdner_TEXT:MESH:D015636)`, `chemdner_TEXT:MESH:D000666)`, `chemprot_RE:None)`, `bionlp_st_2013_gro_ner:B-Sequence)`, `chemdner_TEXT:MESH:D009151)`, `chia_ner:B-Observation)`, `an_em_COREF:coref)`, `medmentions_full_ner:B-T120)`, `bionlp_st_2013_gro_ner:B-Tissue)`, `bionlp_st_2013_gro_ner:B-MolecularEntity)`, `bionlp_st_2013_pc_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D044242)`, `bionlp_st_2013_gro_ner:B-FusionProtein)`, `biorelex_ner:B-cell)`, `bionlp_st_2013_gro_NER:B-Disease)`, `bionlp_st_2011_id_RE:None)`, `biorelex_ner:B-protein-motif)`, `bionlp_st_2013_pc_NER:I-Localization)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_ner:B-Locus)`, `genia_term_corpus_ner:B-other_organic_compound)`, `seth_corpus_ner:B-SNP)`, `pcr_ner:O)`, `genia_term_corpus_ner:I-virus)`, `bionlp_st_2013_gro_ner:I-Peptide)`, `chebi_nactem_abstr_ann1_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:B-RNAMolecule)`, `bionlp_st_2013_gro_ner:B-SequenceHomologyAnalysis)`, `chemdner_TEXT:MESH:D005054)`, `bionlp_st_2013_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-CellularProcess)`, `bionlp_st_2013_ge_RE:Site2)`, `verspoor_2013_ner:B-Phenomena)`, 
`chia_ner:I-Temporal)`, `bionlp_st_2013_gro_NER:I-Localization)`, `bionlp_st_2013_cg_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D009020)`, `bionlp_st_2013_cg_RE:FromLoc)`, `mlee_ner:B-Organism_substance)`, `genia_term_corpus_ner:I-tissue)`, `medmentions_st21pv_ner:I-T082)`, `chemdner_TEXT:MESH:D054358)`, `medmentions_full_ner:I-T052)`, `chemdner_TEXT:MESH:D005459)`, `chemdner_TEXT:MESH:D047188)`, `medmentions_full_ner:I-T031)`, `chemdner_TEXT:MESH:D013890)`, `chemdner_TEXT:MESH:D004573)`, `genia_term_corpus_ner:B-peptide)`, `an_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-MessengerRNA)`, `medmentions_full_ner:B-T171)`, `bionlp_st_2013_gro_NER:B-Affecting)`, `genia_term_corpus_ner:I-body_part)`, `bionlp_st_2013_gro_ner:B-Prokaryote)`, `chemdner_TEXT:MESH:D013844)`, `medmentions_full_ner:I-T061)`, `bionlp_st_2013_pc_NER:B-Negative_regulation)`, `bionlp_st_2013_gro_ner:I-EukaryoticCell)`, `pdr_ner:I-Plant)`, `chemdner_TEXT:MESH:D024341)`, `medmentions_full_ner:I-T092)`, `chemdner_TEXT:MESH:D020319)`, `bionlp_st_2013_cg_NER:B-Cell_transformation)`, `bionlp_st_2013_gro_NER:B-BindingOfTranscriptionFactorToDNA)`, `an_em_ner:I-Anatomical_system)`, `bionlp_st_2011_epi_NER:B-Hydroxylation)`, `bionlp_st_2013_gro_ner:I-Exon)`, `cellfinder_ner:B-Species)`, `bionlp_st_2013_gro_NER:B-Pathway)`, `bionlp_st_2013_ge_NER:B-Protein_modification)`, `bionlp_st_2013_gro_ner:I-FusionGene)`, `bionlp_st_2011_rel_ner:B-Entity)`, `bionlp_st_2011_id_RE:CSite)`, `bionlp_st_2013_ge_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-BindingAssay)`, `bionlp_st_2013_gro_NER:B-CellDivision)`, `bionlp_st_2019_bb_ner:I-Microorganism)`, `medmentions_full_ner:I-T059)`, `chemdner_TEXT:MESH:D011108)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-GeneRegion)`, `bionlp_st_2013_cg_COREF:None)`, `chemdner_TEXT:MESH:D010261)`, `mlee_NER:B-Binding)`, `chemprot_ner:I-CHEMICAL)`, `bionlp_st_2011_id_RE:ToLoc)`, `biorelex_ner:I-organelle)`, 
`chemdner_TEXT:MESH:D004318)`, `genia_term_corpus_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-RNAPolymerase)`, `bionlp_st_2013_gro_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:B-RegulationOfGeneExpression)`, `bionlp_st_2013_gro_ner:B-Peptide)`, `bionlp_shared_task_2009_NER:B-Transcription)`, `biorelex_ner:B-tissue)`, `pico_extraction_ner:B-participant)`, `chia_ner:I-Visit)`, `chemdner_TEXT:MESH:D011807)`, `chemdner_TEXT:MESH:D014501)`, `bionlp_st_2013_gro_NER:I-IntraCellularProcess)`, `ehr_rel_sts:7)`, `pico_extraction_ner:I-intervention)`, `chemdner_TEXT:MESH:D001599)`, `bionlp_st_2013_gro_ner:I-RegulatoryDNARegion)`, `medmentions_st21pv_ner:I-T037)`, `chemdner_TEXT:MESH:D055768)`, `bionlp_st_2013_gro_ner:B-ChromosomalDNA)`, `chemdner_TEXT:MESH:D008550)`, `bionlp_st_2013_pc_RE:Site)`, `medmentions_full_ner:I-T087)`, `chemdner_TEXT:MESH:D001583)`, `bionlp_st_2011_epi_NER:B-Dehydroxylation)`, `ehr_rel_sts:3)`, `bionlp_st_2013_gro_ner:I-MutantProtein)`, `chemdner_TEXT:MESH:D011804)`, `medmentions_full_ner:B-T091)`, `bionlp_st_2013_cg_RE:CSite)`, `linnaeus_ner:O)`, `medmentions_st21pv_ner:B-T201)`, `verspoor_2013_ner:B-Disorder)`, `bionlp_st_2013_cg_NER:I-Death)`, `bioinfer_ner:I-Individual_protein)`, `medmentions_full_ner:B-T191)`, `verspoor_2013_ner:B-ethnicity)`, `chemdner_TEXT:MESH:D002083)`, `genia_term_corpus_ner:B-carbohydrate)`, `genia_term_corpus_ner:B-DNA_molecule)`, `medmentions_full_ner:B-T069)`, `pdr_NER:I-Treatment_of_disease)`, `mlee_ner:B-Anatomical_system)`, `chebi_nactem_fullpaper_ner:B-Spectral_Data)`, `chemdner_TEXT:MESH:D005419)`, `bionlp_st_2013_gro_ner:I-Nucleotide)`, `medmentions_full_ner:B-T194)`, `chemdner_TEXT:MESH:D005947)`, `chemdner_TEXT:MESH:D008627)`, `bionlp_st_2013_gro_NER:B-ExperimentalIntervention)`, `chemdner_TEXT:MESH:D011073)`, `chia_RE:Has_negation)`, `verspoor_2013_ner:I-mutation)`, `chemdner_TEXT:MESH:D004224)`, `chemdner_TEXT:MESH:D005663)`, `medmentions_full_ner:I-T094)`, `chemdner_TEXT:MESH:D006877)`, 
`ebm_pico_ner:B-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressor)`, `biorelex_ner:I-cell)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToDNA)`, `verspoor_2013_RE:None)`, `bionlp_st_2013_gro_NER:B-ProteinModification)`, `chemdner_TEXT:MESH:D047090)`, `medmentions_full_ner:I-T204)`, `chemdner_TEXT:MESH:D006843)`, `biorelex_ner:I-protein-family)`, `chemdner_TEXT:MESH:D012694)`, `bionlp_st_2013_gro_ner:B-TranslationFactor)`, `scai_chemical_ner:B-)`, `bionlp_st_2013_gro_ner:B-Exon)`, `medmentions_full_ner:I-T083)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivatorActivity)`, `medmentions_full_ner:I-T101)`, `medmentions_full_ner:B-T034)`, `bionlp_st_2013_gro_ner:I-Histone)`, `ddi_corpus_RE:MECHANISM)`, `mantra_gsc_en_emea_ner:I-PROC)`, `genia_term_corpus_ner:I-peptide)`, `bionlp_st_2013_cg_NER:B-Cell_proliferation)`, `chemdner_TEXT:MESH:D004140)`, `medmentions_full_ner:B-T083)`, `diann_iber_eval_en_ner:I-Disability)`, `bionlp_st_2013_gro_NER:B-PosttranslationalModification)`, `biorelex_ner:I-fusion-protein)`, `chemdner_TEXT:MESH:D020910)`, `chemdner_TEXT:MESH:D014747)`, `bionlp_st_2013_ge_NER:B-Gene_expression)`, `biorelex_ner:I-tissue)`, `mantra_gsc_en_patents_ner:B-LIVB)`, `medmentions_full_ner:O)`, `medmentions_full_ner:B-T077)`, `bionlp_st_2013_gro_ner:I-Operon)`, `chemdner_TEXT:MESH:D002392)`, `chemdner_TEXT:MESH:D014498)`, `chemdner_TEXT:MESH:D002368)`, `chemdner_TEXT:MESH:D018817)`, `bionlp_st_2013_ge_NER:I-Regulation)`, `genia_term_corpus_ner:B-atom)`, `chemdner_TEXT:MESH:D011092)`, `chemdner_TEXT:MESH:D015283)`, `chemdner_TEXT:MESH:D018698)`, `chemdner_TEXT:MESH:D009569)`, `muchmore_en_ner:I-umlsterm)`, `bionlp_st_2013_cg_NER:B-Death)`, `nlm_gene_ner:I-Other)`, `medmentions_full_ner:B-T109)`, `osiris_ner:I-variant)`, `ehr_rel_sts:6)`, `chemdner_TEXT:MESH:D001120)`, `mlee_ner:I-Protein_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Dissociation)`, `bionlp_st_2013_cg_NER:B-Metastasis)`, `chemdner_TEXT:MESH:D014204)`, `chemdner_TEXT:MESH:D005857)`, 
`medmentions_full_ner:I-T030)`, `chemdner_TEXT:MESH:D019256)`, `bionlp_st_2013_gro_ner:B-Polymerase)`, `chia_ner:B-Negation)`, `bionlp_st_2013_gro_NER:B-CellularMetabolicProcess)`, `bionlp_st_2013_gro_NER:B-CellDifferentiation)`, `biorelex_ner:I-protein-motif)`, `medmentions_full_ner:I-T093)`, `chemdner_TEXT:MESH:D019820)`, `anat_em_ner:B-Pathological_formation)`, `bionlp_shared_task_2009_NER:B-Localization)`, `genia_term_corpus_ner:B-RNA_domain_or_region)`, `chemdner_TEXT:MESH:D014668)`, `bionlp_st_2013_pc_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D019207)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfDNA)`, `medmentions_full_ner:B-T059)`, `bionlp_st_2013_gro_ner:B-Ligand)`, `bio_sim_verb_sts:6)`, `biorelex_ner:B-experimental-construct)`, `bionlp_st_2013_gro_ner:I-DNA)`, `pdr_NER:O)`, `chemdner_TEXT:MESH:D008670)`, `bionlp_st_2011_ge_RE:Cause)`, `chemdner_TEXT:MESH:D015232)`, `bionlp_st_2013_pc_NER:O)`, `bionlp_st_2013_gro_NER:B-FormationOfProteinDNAComplex)`, `medmentions_full_ner:B-T121)`, `bionlp_shared_task_2009_NER:B-Regulation)`, `chemdner_TEXT:MESH:D009534)`, `chemdner_TEXT:MESH:D014451)`, `bionlp_st_2011_id_RE:AtLoc)`, `chemdner_TEXT:MESH:D011799)`, `medmentions_st21pv_ner:B-T204)`, `genia_term_corpus_ner:I-protein_subunit)`, `biorelex_ner:I-assay)`, `chemdner_TEXT:MESH:D005680)`, `an_em_ner:I-Organism_substance)`, `chemdner_TEXT:MESH:D010368)`, `chemdner_TEXT:MESH:D000872)`, `bionlp_st_2011_id_NER:I-Gene_expression)`, `bionlp_st_2013_cg_NER:B-Regulation)`, `mlee_ner:I-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D001393)`, `medmentions_full_ner:I-T038)`, `chemdner_TEXT:MESH:D047311)`, `chemdner_TEXT:MESH:D011453)`, `chemdner_TEXT:MESH:D020106)`, `chemdner_TEXT:MESH:D019257)`, `bionlp_st_2013_gro_ner:B-NuclearReceptor)`, `chemdner_TEXT:MESH:D002117)`, `genia_term_corpus_ner:B-lipid)`, `bionlp_st_2013_gro_ner:B-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D011205)`, `chemdner_TEXT:MESH:D002686)`, 
`bionlp_st_2013_gro_NER:B-Translation)`, `ebm_pico_ner:I-Intervention_Psychological)`, `mlee_ner:I-Drug_or_compound)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D000688)`, `bionlp_st_2011_ge_RE:None)`, `bionlp_st_2013_gro_ner:B-ProteinSubunit)`, `genia_term_corpus_ner:I-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:I-Heterodimerization)`, `pico_extraction_ner:B-intervention)`, `bionlp_st_2013_cg_ner:I-Organism)`, `bionlp_st_2013_gro_ner:I-ProteinDomain)`, `bionlp_st_2013_gro_NER:I-BindingToProtein)`, `scai_chemical_ner:I-)`, `biorelex_ner:B-experiment-tag)`, `ebm_pico_ner:B-Intervention_Physical)`, `bionlp_st_2013_cg_RE:ToLoc)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionFactorComplex)`, `linnaeus_ner:B-species)`, `medmentions_full_ner:I-T062)`, `chemdner_TEXT:MESH:D014640)`, `mlee_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008701)`, `mlee_NER:O)`, `chemdner_TEXT:MESH:D014302)`, `genia_term_corpus_ner:B-RNA_family_or_group)`, `medmentions_full_ner:I-T091)`, `medmentions_full_ner:B-T022)`, `medmentions_full_ner:B-T074)`, `bionlp_st_2013_gro_NER:B-ProteinCatabolism)`, `bionlp_st_2013_gro_RE:hasPatient4)`, `chemdner_TEXT:MESH:D011388)`, `bionlp_st_2013_ge_NER:I-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-CellAdhesion)`, `anat_em_ner:I-Organ)`, `medmentions_full_ner:B-T045)`, `chemdner_TEXT:MESH:D008727)`, `chebi_nactem_abstr_ann1_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-RNAPolymeraseII)`, `nlm_gene_ner:B-STARGENE)`, `mantra_gsc_en_emea_ner:B-OBJC)`, `bionlp_st_2013_gro_ner:B-DNABindingDomainOfProtein)`, `chemdner_TEXT:MESH:D010636)`, `chemdner_TEXT:MESH:D004061)`, `mlee_NER:I-Binding)`, `medmentions_full_ner:B-T075)`, `medmentions_full_ner:B-UnknownType)`, `chemdner_TEXT:MESH:D019081)`, `bionlp_st_2013_gro_NER:I-Binding)`, `medmentions_full_ner:I-T005)`, `chemdner_TEXT:MESH:D009821)` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biomuppet_en_5.2.0_3.0_1699292355718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biomuppet_en_5.2.0_3.0_1699292355718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biomuppet","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biomuppet","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.biomuppet.by_leonweber").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biomuppet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|420.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/leonweber/biomuppet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_biobert_512_en.md new file mode 100644 index 000000000000..362bc912fb2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13_modified_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13_modified_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13_modified_biobert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_biobert_512_en_5.2.0_3.0_1699274732095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_biobert_512_en_5.2.0_3.0_1699274732095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13_modified_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13_modified_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13_modified_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13-Modified-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_bluebert_512_en.md new file mode 100644 index 000000000000..4ee3fd4f518e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13_modified_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13_modified_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13_modified_bluebert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_bluebert_512_en_5.2.0_3.0_1699276004787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_bluebert_512_en_5.2.0_3.0_1699276004787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13_modified_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13_modified_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13_modified_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13-Modified-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_pubmedbert_384_en.md new file mode 100644 index 000000000000..02d7391407da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13_modified_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13_modified_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13_modified_pubmedbert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_pubmedbert_384_en_5.2.0_3.0_1699275324723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_pubmedbert_384_en_5.2.0_3.0_1699275324723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13_modified_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13_modified_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13_modified_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13-Modified-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_384_en.md new file mode 100644 index 000000000000..f5e452733cc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13_modified_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13_modified_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13_modified_scibert_384` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_scibert_384_en_5.2.0_3.0_1699276265112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_scibert_384_en_5.2.0_3.0_1699276265112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
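+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13_modified_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13_modified_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>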
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13_modified_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13-Modified-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_512_en.md new file mode 100644 index 000000000000..84721ba646c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13_modified_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13_modified_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13_modified_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13_modified_scibert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_scibert_512_en_5.2.0_3.0_1699276463597.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13_modified_scibert_512_en_5.2.0_3.0_1699276463597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
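+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13_modified_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13_modified_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>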
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13_modified_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13-Modified-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_biobert_512_en.md new file mode 100644 index 000000000000..a3d34898cffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_chem_original_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_chem_original_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_chem_original_biobert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_biobert_512_en_5.2.0_3.0_1699274844380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_biobert_512_en_5.2.0_3.0_1699274844380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
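+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_chem_original_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_chem_original_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>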
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_chem_original_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BIONLP13CG-CHEM-Chem-Original-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_384_en.md new file mode 100644 index 000000000000..0d170c64dc2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_chem_original_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_chem_original_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_chem_original_bluebert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_bluebert_384_en_5.2.0_3.0_1699275032863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_bluebert_384_en_5.2.0_3.0_1699275032863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
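+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_chem_original_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_chem_original_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>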
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_chem_original_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BIONLP13CG-CHEM-Chem-Original-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_512_en.md new file mode 100644 index 000000000000..4853f5d40084 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_chem_original_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_chem_original_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_chem_original_bluebert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_bluebert_512_en_5.2.0_3.0_1699275031577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_bluebert_512_en_5.2.0_3.0_1699275031577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
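+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_chem_original_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_chem_original_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>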
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_chem_original_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BIONLP13CG-CHEM-Chem-Original-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_384_en.md new file mode 100644 index 000000000000..39eb23b14cb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_chem_original_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_chem_original_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_chem_original_scibert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_scibert_384_en_5.2.0_3.0_1699274232298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_scibert_384_en_5.2.0_3.0_1699274232298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
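+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_chem_original_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_chem_original_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>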
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_chem_original_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BIONLP13CG-CHEM-Chem-Original-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_512_en.md new file mode 100644 index 000000000000..3e5c40433417 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_chem_original_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_chem_original_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_chem_original_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_chem_original_scibert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_scibert_512_en_5.2.0_3.0_1699274439743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_chem_original_scibert_512_en_5.2.0_3.0_1699274439743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
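+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_chem_original_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_chem_original_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>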
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_chem_original_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BIONLP13CG-CHEM-Chem-Original-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_biobert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_biobert_en.md new file mode 100644 index 000000000000..3c0e149bb95d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_biobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_imbalanced_biobert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_imbalanced_biobert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_imbalanced_biobert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalanced_biobert_en_5.2.0_3.0_1699274649073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalanced_biobert_en_5.2.0_3.0_1699274649073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
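+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_imbalanced_biobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_imbalanced_biobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>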
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_imbalanced_biobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem_Imbalanced-biobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased_en.md new file mode 100644 index 000000000000..266b4329bad8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274783237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased_en_5.2.0_3.0_1699274783237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
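+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>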
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_imbalanced_scibert_scivocab_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem_Imbalanced-scibert_scivocab_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalancedpubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalancedpubmedbert_en.md new file mode 100644 index 000000000000..82fab6c08e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_imbalancedpubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_imbalancedpubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_imbalancedpubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_imbalancedpubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalancedpubmedbert_en_5.2.0_3.0_1699274624854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_imbalancedpubmedbert_en_5.2.0_3.0_1699274624854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
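+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_imbalancedpubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_imbalancedpubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>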
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_imbalancedpubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem_ImbalancedPubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_384_en.md new file mode 100644 index 000000000000..dbd6a32f88d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_biobert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_biobert_384_en_5.2.0_3.0_1699275213898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_biobert_384_en_5.2.0_3.0_1699275213898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
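+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>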
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_large_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_large_en.md new file mode 100644 index 000000000000..1e30d144c31d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_biobert_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_biobert_large BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_biobert_large +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_biobert_large` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_biobert_large_en_5.2.0_3.0_1699275590721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_biobert_large_en_5.2.0_3.0_1699275590721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_biobert_large","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_biobert_large", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_biobert_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified_biobert-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_2_en.md new file mode 100644 index 000000000000..4b8421bc0e0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_bioformers_2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_bioformers_2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_bioformers_2` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_bioformers_2_en_5.2.0_3.0_1699274003972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_bioformers_2_en_5.2.0_3.0_1699274003972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_bioformers_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_bioformers_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_bioformers_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-Bioformers_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_en.md new file mode 100644 index 000000000000..f85cb100f93c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_bioformers_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_bioformers BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_bioformers +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_bioformers` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_bioformers_en_5.2.0_3.0_1699274794459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_bioformers_en_5.2.0_3.0_1699274794459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_bioformers","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_bioformers", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_bioformers| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-Bioformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest_en.md new file mode 100644 index 000000000000..7d5a5511d739 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest_en_5.2.0_3.0_1699274376316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest_en_5.2.0_3.0_1699274376316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedabstract_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-pubmedabstract_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_384_en.md new file mode 100644 index 000000000000..cc321eeea4b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedbert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699275219094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_384_en_5.2.0_3.0_1699275219094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..fd2b3fa9935c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699275405080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699275405080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3_en.md new file mode 100644 index 000000000000..1bbd243dcf19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3_en_5.2.0_3.0_1699274957348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3_en_5.2.0_3.0_1699274957348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedbert_abstract_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-PubMedBert-abstract-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_en.md new file mode 100644 index 000000000000..c60a2f583f26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_en_5.2.0_3.0_1699274953196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_en_5.2.0_3.0_1699274953196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3_en.md new file mode 100644 index 000000000000..6d3f69bf357d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3_en_5.2.0_3.0_1699274187305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3_en_5.2.0_3.0_1699274187305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_pubmedbert_full_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified-PubMedBert-full-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_scibert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_scibert_en.md new file mode 100644 index 000000000000..1f4a4e40c483 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_modified_scibert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_modified_scibert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_modified_scibert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_modified_scibert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_scibert_en_5.2.0_3.0_1699275139781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_modified_scibert_en_5.2.0_3.0_1699275139781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_modified_scibert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_modified_scibert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_modified_scibert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Modified_SciBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_biobert_384_en.md new file mode 100644 index 000000000000..70133ca7f5d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_original_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_original_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_original_biobert_384` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_original_biobert_384_en_5.2.0_3.0_1699275144020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_original_biobert_384_en_5.2.0_3.0_1699275144020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_original_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_original_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_original_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Original-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..4ab5ca7bf71e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_chem_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_chem_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_chem_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_chem_original_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_original_pubmedbert_512_en_5.2.0_3.0_1699275329371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_chem_original_pubmedbert_512_en_5.2.0_3.0_1699275329371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_chem_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_chem_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_chem_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Chem-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_biobert_v1.1_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_biobert_v1.1_latest_en.md new file mode 100644 index 000000000000..610608675098 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_biobert_v1.1_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_modified_biobert_v1.1_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_modified_biobert_v1.1_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_modified_biobert_v1.1_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_biobert_v1.1_latest_en_5.2.0_3.0_1699274550015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_biobert_v1.1_latest_en_5.2.0_3.0_1699274550015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_modified_biobert_v1.1_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_modified_biobert_v1.1_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_modified_biobert_v1.1_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Modified-biobert-v1.1_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..98bb00899c5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699275791072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699275791072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_modified_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Modified-bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_pubmedabstract_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_pubmedabstract_latest_en.md new file mode 100644 index 000000000000..90fc37c08594 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_pubmedabstract_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_modified_pubmedabstract_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_modified_pubmedabstract_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_modified_pubmedabstract_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_pubmedabstract_latest_en_5.2.0_3.0_1699275511797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_pubmedabstract_latest_en_5.2.0_3.0_1699275511797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_modified_pubmedabstract_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_modified_pubmedabstract_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_modified_pubmedabstract_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Modified-pubmedabstract_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_scibert_uncased_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_scibert_uncased_latest_en.md new file mode 100644 index 000000000000..2f1ae54999e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_modified_scibert_uncased_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_modified_scibert_uncased_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_modified_scibert_uncased_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_modified_scibert_uncased_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_scibert_uncased_latest_en_5.2.0_3.0_1699275701881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_modified_scibert_uncased_latest_en_5.2.0_3.0_1699275701881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_modified_scibert_uncased_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_modified_scibert_uncased_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_modified_scibert_uncased_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Modified-scibert-uncased_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md new file mode 100644 index 000000000000..203efb5c6497 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699275408948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest_en_5.2.0_3.0_1699275408948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_original_bluebert_pubmed_uncased_l_12_h_768_a_12_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Original-bluebert_pubmed_uncased_L-12_H-768_A-12_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_pubmedbert_abstract_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_pubmedbert_abstract_latest_en.md new file mode 100644 index 000000000000..978055d5b1d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_pubmedbert_abstract_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_original_pubmedbert_abstract_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_original_pubmedbert_abstract_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_original_pubmedbert_abstract_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_pubmedbert_abstract_latest_en_5.2.0_3.0_1699275892957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_pubmedbert_abstract_latest_en_5.2.0_3.0_1699275892957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_original_pubmedbert_abstract_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_original_pubmedbert_abstract_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_original_pubmedbert_abstract_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Original-PubmedBert-abstract-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_scibert_latest_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_scibert_latest_en.md new file mode 100644 index 000000000000..5de8bc4af881 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bionlp13cg_original_scibert_latest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_bionlp13cg_original_scibert_latest BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_bionlp13cg_original_scibert_latest +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_bionlp13cg_original_scibert_latest` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_scibert_latest_en_5.2.0_3.0_1699276075015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bionlp13cg_original_scibert_latest_en_5.2.0_3.0_1699276075015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bionlp13cg_original_scibert_latest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_bionlp13cg_original_scibert_latest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bionlp13cg_original_scibert_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioNLP13CG-Original-scibert_latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..8e0803db3c16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_cd_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_cd_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_cd_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_cd_modified_pubmedbert_512_en_5.2.0_3.0_1699276662292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_cd_modified_pubmedbert_512_en_5.2.0_3.0_1699276662292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_cd_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_cd_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_cd_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-CD-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..420612084afb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_cd_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_cd_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_cd_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_cd_original_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_cd_original_pubmedbert_512_en_5.2.0_3.0_1699276214446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_cd_original_pubmedbert_512_en_5.2.0_3.0_1699276214446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_cd_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_cd_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_cd_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-CD-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_10_en.md new file mode 100644 index 000000000000..0607844d31b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_128_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_128_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_128_10` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_10_en_5.2.0_3.0_1699275506907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_10_en_5.2.0_3.0_1699275506907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_128_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_128_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_128_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-128-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_20_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_20_en.md new file mode 100644 index 000000000000..e7f0e7f75360 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_128_20 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_128_20 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_128_20` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_20_en_5.2.0_3.0_1699275592998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_20_en_5.2.0_3.0_1699275592998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_128_20","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_128_20", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_128_20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-128-20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_32_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_32_en.md new file mode 100644 index 000000000000..5b7bb2fb58b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_128_32 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_128_32 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_128_32` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_32_en_5.2.0_3.0_1699276875786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_32_en_5.2.0_3.0_1699276875786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_128_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_128_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_128_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-128-32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_5_en.md new file mode 100644 index 000000000000..1bfc05db693b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_128_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_128_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_128_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_128_5` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_5_en_5.2.0_3.0_1699274929567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_128_5_en_5.2.0_3.0_1699274929567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_128_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_128_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_128_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-128-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_13_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_13_en.md new file mode 100644 index 000000000000..9435b9eb2684 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_256_13 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_256_13 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_256_13` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_13_en_5.2.0_3.0_1699275703659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_13_en_5.2.0_3.0_1699275703659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_256_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_256_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_256_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-256-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_40_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_40_en.md new file mode 100644 index 000000000000..1b37fe999f79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_40_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_256_40 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_256_40 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_256_40` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_40_en_5.2.0_3.0_1699277094888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_40_en_5.2.0_3.0_1699277094888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_256_40","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_256_40", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_256_40| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-256-40 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_5_en.md new file mode 100644 index 000000000000..e46274380884 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_256_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_256_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_256_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_256_5` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_5_en_5.2.0_3.0_1699276425934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_256_5_en_5.2.0_3.0_1699276425934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_256_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_256_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_256_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-256-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_10_en.md new file mode 100644 index 000000000000..15a4ac4aefc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_320_8_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_320_8_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_320_8_10` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_320_8_10_en_5.2.0_3.0_1699277294407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_320_8_10_en_5.2.0_3.0_1699277294407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_320_8_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_320_8_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_320_8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-320-8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_en.md new file mode 100644 index 000000000000..2c7c16c44287 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_320_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_320_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_320_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_320_8` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_320_8_en_5.2.0_3.0_1699276633251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_320_8_en_5.2.0_3.0_1699276633251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_320_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_320_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_320_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-320-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_5_en.md new file mode 100644 index 000000000000..a80ca0dd76eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_384_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_384_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_384_5` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_5_en_5.2.0_3.0_1699275791323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_5_en_5.2.0_3.0_1699275791323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_384_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_384_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_384_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-384-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_10_en.md new file mode 100644 index 000000000000..fbde2f1d8e79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_384_8_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_384_8_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_384_8_10` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_8_10_en_5.2.0_3.0_1699276859764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_8_10_en_5.2.0_3.0_1699276859764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_384_8_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_384_8_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_384_8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-384-8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_en.md new file mode 100644 index 000000000000..1300e988cf6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_384_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_384_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_384_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_384_8` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_8_en_5.2.0_3.0_1699275893103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_384_8_en_5.2.0_3.0_1699275893103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_384_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_384_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_384_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-384-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_30_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_30_en.md new file mode 100644 index 000000000000..bc0622011f6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_512_5_30 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_512_5_30 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_512_5_30` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_5_30_en_5.2.0_3.0_1699277468873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_5_30_en_5.2.0_3.0_1699277468873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_512_5_30","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_512_5_30", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_512_5_30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-512-5-30 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_en.md new file mode 100644 index 000000000000..10afe078590c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_512_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_512_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_512_5` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_5_en_5.2.0_3.0_1699276225421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_5_en_5.2.0_3.0_1699276225421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_512_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_512_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_512_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-512-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..4eaa73b6403b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699276014517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_modified_pubmedbert_512_en_5.2.0_3.0_1699276014517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_10_en.md new file mode 100644 index 000000000000..ed2990a40297 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_128_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_128_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_128_10` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_10_en_5.2.0_3.0_1699277118803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_10_en_5.2.0_3.0_1699277118803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_128_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_128_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_128_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-128-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_20_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_20_en.md new file mode 100644 index 000000000000..8a8a4b47b4b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_128_20 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_128_20 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_128_20` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_20_en_5.2.0_3.0_1699276077937.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_20_en_5.2.0_3.0_1699276077937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_128_20","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_128_20", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_128_20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-128-20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_32_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_32_en.md new file mode 100644 index 000000000000..228c2378d5f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_128_32 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_128_32 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_128_32` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_32_en_5.2.0_3.0_1699277653046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_32_en_5.2.0_3.0_1699277653046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_128_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_128_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_128_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-128-32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_5_en.md new file mode 100644 index 000000000000..3ca36210d295 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_128_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_128_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_128_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_128_5` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_5_en_5.2.0_3.0_1699277323078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_128_5_en_5.2.0_3.0_1699277323078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_128_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_128_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_128_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-128-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_13_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_13_en.md new file mode 100644 index 000000000000..67ffb6245311 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_256_13 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_256_13 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_256_13` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_13_en_5.2.0_3.0_1699275107330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_13_en_5.2.0_3.0_1699275107330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_256_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_256_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_256_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-256-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_40_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_40_en.md new file mode 100644 index 000000000000..bcfe21773aba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_40_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_256_40 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_256_40 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_256_40` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_40_en_5.2.0_3.0_1699276278419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_40_en_5.2.0_3.0_1699276278419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_256_40","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_256_40", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_256_40| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-256-40 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_5_en.md new file mode 100644 index 000000000000..62a697b9334a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_256_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_256_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_256_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_256_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_5_en_5.2.0_3.0_1699277529723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_256_5_en_5.2.0_3.0_1699277529723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_256_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_256_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_256_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-256-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_320_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_320_8_en.md new file mode 100644 index 000000000000..604210639e58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_320_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_320_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_320_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_320_8` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_320_8_en_5.2.0_3.0_1699277844292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_320_8_en_5.2.0_3.0_1699277844292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_320_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_320_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_320_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-320-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_5_en.md new file mode 100644 index 000000000000..14ac4d1296eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_384_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_384_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_384_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_384_5_en_5.2.0_3.0_1699278035812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_384_5_en_5.2.0_3.0_1699278035812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_384_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_384_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_384_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-384-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_8_en.md new file mode 100644 index 000000000000..f6768f522af6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_384_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_384_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_384_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_384_8` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_384_8_en_5.2.0_3.0_1699275286558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_384_8_en_5.2.0_3.0_1699275286558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_384_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_384_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_384_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-384-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_30_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_30_en.md new file mode 100644 index 000000000000..66e7116f1cd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_512_5_30 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_512_5_30 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_512_5_30` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_5_30_en_5.2.0_3.0_1699277910897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_5_30_en_5.2.0_3.0_1699277910897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_512_5_30","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_512_5_30", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_512_5_30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-512-5-30 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_en.md new file mode 100644 index 000000000000..d2beada4fd91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_512_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_512_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_512_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_5_en_5.2.0_3.0_1699275472902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_5_en_5.2.0_3.0_1699275472902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_512_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_512_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_512_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-512-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..8eb12712c0b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_chem_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_chem_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_chem_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_chem_original_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_en_5.2.0_3.0_1699277718595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_chem_original_pubmedbert_512_en_5.2.0_3.0_1699277718595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_chem_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_chem_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_chem_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Chem-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_128_32_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_128_32_en.md new file mode 100644 index 000000000000..42891d4d7167 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_128_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_128_32 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_128_32 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_128_32` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_128_32_en_5.2.0_3.0_1699278090285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_128_32_en_5.2.0_3.0_1699278090285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_128_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_128_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_128_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-128-32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_13_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_13_en.md new file mode 100644 index 000000000000..9d3e9d0fe83d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_256_13 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_256_13 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_256_13` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_256_13_en_5.2.0_3.0_1699275638621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_256_13_en_5.2.0_3.0_1699275638621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_256_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_256_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_256_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-256-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_5_en.md new file mode 100644 index 000000000000..30b90c9962ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_256_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_256_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_256_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_256_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_256_5_en_5.2.0_3.0_1699276417238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_256_5_en_5.2.0_3.0_1699276417238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_256_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_256_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_256_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-256-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_10_en.md new file mode 100644 index 000000000000..b0a8781c8681 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_320_8_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_320_8_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_320_8_10` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_320_8_10_en_5.2.0_3.0_1699278243282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_320_8_10_en_5.2.0_3.0_1699278243282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_320_8_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_320_8_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_320_8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-320-8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_en.md new file mode 100644 index 000000000000..5a3019821a13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_320_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_320_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_320_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_320_8` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_320_8_en_5.2.0_3.0_1699276454804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_320_8_en_5.2.0_3.0_1699276454804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_320_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_320_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_320_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-320-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_5_en.md new file mode 100644 index 000000000000..49bfb8654332 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_384_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_384_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_384_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_5_en_5.2.0_3.0_1699278284185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_5_en_5.2.0_3.0_1699278284185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_384_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_384_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_384_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-384-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_10_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_10_en.md new file mode 100644 index 000000000000..e98fc477c817 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_384_8_10 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_384_8_10 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_384_8_10` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_8_10_en_5.2.0_3.0_1699278511994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_8_10_en_5.2.0_3.0_1699278511994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_384_8_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_384_8_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_384_8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-384-8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_en.md new file mode 100644 index 000000000000..70d5dc8391f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_384_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_384_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_384_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_384_8` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_8_en_5.2.0_3.0_1699276643424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_384_8_en_5.2.0_3.0_1699276643424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_384_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_384_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_384_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-384-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_5_en.md new file mode 100644 index 000000000000..0a436116f640 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_512_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_512_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_512_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_512_5_en_5.2.0_3.0_1699275794523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_512_5_en_5.2.0_3.0_1699275794523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_512_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_512_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_512_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-512-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..3cee19c6e586 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_512_en_5.2.0_3.0_1699278711478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_modified_pubmedbert_512_en_5.2.0_3.0_1699278711478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_128_32_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_128_32_en.md new file mode 100644 index 000000000000..997f25f2488e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_128_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_original_pubmedbert_128_32 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_original_pubmedbert_128_32 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_128_32` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_128_32_en_5.2.0_3.0_1699278492204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_128_32_en_5.2.0_3.0_1699278492204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_128_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_original_pubmedbert_128_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_original_pubmedbert_128_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-128-32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_13_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_13_en.md new file mode 100644 index 000000000000..81193a24201c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_original_pubmedbert_256_13 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_original_pubmedbert_256_13 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_256_13` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_256_13_en_5.2.0_3.0_1699276868812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_256_13_en_5.2.0_3.0_1699276868812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_256_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_original_pubmedbert_256_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_original_pubmedbert_256_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-256-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_5_en.md new file mode 100644 index 000000000000..1a940b322231 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_256_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_original_pubmedbert_256_5 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_original_pubmedbert_256_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_256_5` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_256_5_en_5.2.0_3.0_1699278924480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_256_5_en_5.2.0_3.0_1699278924480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_256_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_original_pubmedbert_256_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_biored_dis_original_pubmedbert_256_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-256-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_320_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_320_8_en.md new file mode 100644 index 000000000000..20c0b9a912eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_320_8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_biored_dis_original_pubmedbert_320_8 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_biored_dis_original_pubmedbert_320_8 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_320_8` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_320_8_en_5.2.0_3.0_1699276014578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_320_8_en_5.2.0_3.0_1699276014578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_320_8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_biored_dis_original_pubmedbert_320_8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biored_dis_original_pubmedbert_320_8|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-320-8
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_5_en.md
new file mode 100644
index 000000000000..efb9a2f8f904
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_5_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_biored_dis_original_pubmedbert_384_5 BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_biored_dis_original_pubmedbert_384_5
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_384_5` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_384_5_en_5.2.0_3.0_1699276221284.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_384_5_en_5.2.0_3.0_1699276221284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_384_5","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_ner_biored_dis_original_pubmedbert_384_5", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biored_dis_original_pubmedbert_384_5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-384-5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_8_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_8_en.md
new file mode 100644
index 000000000000..1d06d90d1c6b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_384_8_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_biored_dis_original_pubmedbert_384_8 BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_biored_dis_original_pubmedbert_384_8
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_384_8` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_384_8_en_5.2.0_3.0_1699276637633.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_384_8_en_5.2.0_3.0_1699276637633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_384_8","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_ner_biored_dis_original_pubmedbert_384_8", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biored_dis_original_pubmedbert_384_8|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-384-8
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_5_en.md
new file mode 100644
index 000000000000..1927a111501f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_5_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_biored_dis_original_pubmedbert_512_5 BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_biored_dis_original_pubmedbert_512_5
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_512_5` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_512_5_en_5.2.0_3.0_1699278719798.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_512_5_en_5.2.0_3.0_1699278719798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_512_5","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_ner_biored_dis_original_pubmedbert_512_5", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biored_dis_original_pubmedbert_512_5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-512-5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_en.md
new file mode 100644
index 000000000000..ccb73448f5f2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_biored_dis_original_pubmedbert_512_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_biored_dis_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_biored_dis_original_pubmedbert_512
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_biored_dis_original_pubmedbert_512` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_512_en_5.2.0_3.0_1699279126066.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_biored_dis_original_pubmedbert_512_en_5.2.0_3.0_1699279126066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_biored_dis_original_pubmedbert_512","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_ner_biored_dis_original_pubmedbert_512", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_biored_dis_original_pubmedbert_512|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/BioRed-Dis-Original-PubMedBERT-512
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_body_site_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_body_site_en.md
new file mode 100644
index 000000000000..8f5dad3c26d0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_body_site_en.md
@@ -0,0 +1,115 @@
+---
+layout: model
+title: English BertForTokenClassification Cased model (from Maaly)
+author: John Snow Labs
+name: bert_ner_body_site
+date: 2023-11-06
+tags: [bert, ner, open_source, en, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `body-site` is an English model originally trained by `Maaly`.
+
+## Predicted Entities
+
+`anatomy`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_body_site_en_5.2.0_3.0_1699292606882.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_body_site_en_5.2.0_3.0_1699292606882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\
+    .setInputCols(["document"])\
+    .setOutputCol("sentence")
+
+tokenizer = Tokenizer() \
+    .setInputCols("sentence") \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_body_site","en") \
+    .setInputCols(["sentence", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
+  .setInputCols(Array("document"))
+  .setOutputCol("sentence")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("sentence"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_body_site","en")
+  .setInputCols(Array("sentence", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.ner.bert.body_site.by_maaly").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_body_site|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|128|
+
+## References
+
+References
+
+- https://huggingface.co/Maaly/body-site
+- https://gitlab.com/maaly7/emerald_metagenomics_annotations
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_brjezierski_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_brjezierski_bert_finetuned_ner_en.md
new file mode 100644
index 000000000000..878da66c8664
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_brjezierski_bert_finetuned_ner_en.md
@@ -0,0 +1,115 @@
+---
+layout: model
+title: English BertForTokenClassification Cased model (from brjezierski)
+author: John Snow Labs
+name: bert_ner_brjezierski_bert_finetuned_ner
+date: 2023-11-06
+tags: [bert, ner, open_source, en, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `brjezierski`.
+
+## Predicted Entities
+
+`ORG`, `LOC`, `MISC`, `PER`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_brjezierski_bert_finetuned_ner_en_5.2.0_3.0_1699292901234.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_brjezierski_bert_finetuned_ner_en_5.2.0_3.0_1699292901234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\
+    .setInputCols(["document"])\
+    .setOutputCol("sentence")
+
+tokenizer = Tokenizer() \
+    .setInputCols("sentence") \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_brjezierski_bert_finetuned_ner","en") \
+    .setInputCols(["sentence", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
+  .setInputCols(Array("document"))
+  .setOutputCol("sentence")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("sentence"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_brjezierski_bert_finetuned_ner","en")
+  .setInputCols(Array("sentence", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.ner.bert.conll.finetuned.by_brjezierski").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_brjezierski_bert_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+|Case sensitive:|true|
+|Max sentence length:|128|
+
+## References
+
+References
+
+- https://huggingface.co/brjezierski/bert-finetuned-ner
+- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buehlpa_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buehlpa_bert_finetuned_ner_en.md
new file mode 100644
index 000000000000..17fe5474470c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buehlpa_bert_finetuned_ner_en.md
@@ -0,0 +1,115 @@
+---
+layout: model
+title: English BertForTokenClassification Cased model (from buehlpa)
+author: John Snow Labs
+name: bert_ner_buehlpa_bert_finetuned_ner
+date: 2023-11-06
+tags: [bert, ner, open_source, en, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `buehlpa`.
+
+## Predicted Entities
+
+`ORG`, `LOC`, `MISC`, `PER`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_buehlpa_bert_finetuned_ner_en_5.2.0_3.0_1699291833509.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_buehlpa_bert_finetuned_ner_en_5.2.0_3.0_1699291833509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\
+    .setInputCols(["document"])\
+    .setOutputCol("sentence")
+
+tokenizer = Tokenizer() \
+    .setInputCols("sentence") \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_buehlpa_bert_finetuned_ner","en") \
+    .setInputCols(["sentence", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
+  .setInputCols(Array("document"))
+  .setOutputCol("sentence")
+
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("sentence"))
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_buehlpa_bert_finetuned_ner","en")
+  .setInputCols(Array("sentence", "token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.ner.bert.conll.finetuned.by_buehlpa").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_buehlpa_bert_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+|Case sensitive:|true|
+|Max sentence length:|128|
+
+## References
+
+References
+
+- https://huggingface.co/buehlpa/bert-finetuned-ner
+- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bunsen_base_best_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bunsen_base_best_en.md
new file mode 100644
index 000000000000..181356b9084f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_bunsen_base_best_en.md
@@ -0,0 +1,114 @@
+---
+layout: model
+title: English BertForTokenClassification Base Cased model (from leonweber)
+author: John Snow Labs
+name: bert_ner_bunsen_base_best
+date: 2023-11-06
+tags: [bert, ner, open_source, en, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bunsen_base_best` is an English model originally trained by `leonweber`.
+ +## Predicted Entities + +`medmentions_full_ner:B-T085)`, `bionlp_st_2013_gro_ner:B-Ribosome)`, `chemdner_TEXT:MESH:D013830)`, `anat_em_ner:O)`, `cellfinder_ner:I-GeneProtein)`, `ncbi_disease_ner:B-CompositeMention)`, `bionlp_st_2013_gro_ner:B-Virus)`, `medmentions_full_ner:I-T129)`, `scai_disease_ner:B-DISEASE)`, `biorelex_ner:B-chemical)`, `chemdner_TEXT:MESH:D011166)`, `medmentions_st21pv_ner:I-T204)`, `chemdner_TEXT:MESH:D008345)`, `bionlp_st_2013_gro_NER:B-RegulationOfFunction)`, `mlee_ner:I-Cell)`, `bionlp_st_2013_gro_NER:I-RNABiosynthesis)`, `biorelex_ner:I-RNA-family)`, `bionlp_st_2013_gro_NER:B-ResponseToChemicalStimulus)`, `bionlp_st_2011_epi_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D003035)`, `chemdner_TEXT:MESH:D013440)`, `chemdner_TEXT:MESH:D037341)`, `chemdner_TEXT:MESH:D009532)`, `chemdner_TEXT:MESH:D019216)`, `chemdner_TEXT:MESH:D036701)`, `chemdner_TEXT:MESH:D011107)`, `bionlp_st_2013_cg_NER:B-Translation)`, `genia_term_corpus_ner:B-cell_component)`, `medmentions_full_ner:I-T065)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfDNA)`, `anat_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D000225)`, `genia_term_corpus_ner:I-ORDNA_domain_or_regionDNA_domain_or_region)`, `medmentions_full_ner:I-T015)`, `chemdner_TEXT:MESH:D008239)`, `bionlp_st_2013_cg_NER:I-Binding)`, `bionlp_st_2013_cg_NER:B-Amino_acid_catabolism)`, `cellfinder_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:I-MetabolicPathway)`, `bionlp_st_2013_gro_ner:B-ProteinIdentification)`, `bionlp_st_2011_ge_ner:O)`, `bionlp_st_2011_id_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelixTF)`, `mirna_ner:B-Relation_Trigger)`, `bionlp_st_2011_ge_NER:B-Regulation)`, `bionlp_st_2013_cg_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008055)`, `chemdner_TEXT:MESH:D009944)`, `verspoor_2013_ner:I-gene)`, `bionlp_st_2013_ge_ner:O)`, `meddocan_ner:B-SEXO_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D003907)`, `mlee_ner:I-Developing_anatomical_structure)`, 
`chemdner_TEXT:MESH:D010569)`, `mlee_NER:I-Growth)`, `meddocan_ner:B-NUMERO_TELEFONO)`, `chemdner_TEXT:MESH:D036145)`, `medmentions_full_ner:I-T196)`, `ehr_rel_sts:1)`, `bionlp_st_2013_gro_NER:B-CellularComponentOrganizationAndBiogenesis)`, `chemdner_TEXT:MESH:D009285)`, `bionlp_st_2013_gro_NER:B-ProteinMetabolism)`, `chemdner_TEXT:MESH:D016718)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:I-T074)`, `chemdner_TEXT:MESH:D000432)`, `bionlp_st_2013_gro_NER:I-CellFateDetermination)`, `chia_ner:I-Reference_point)`, `bionlp_st_2013_gro_ner:B-Histone)`, `lll_RE:None)`, `scai_disease_ner:B-ADVERSE)`, `medmentions_full_ner:B-T130)`, `bionlp_st_2013_gro_NER:I-CellCyclePhaseTransition)`, `chemdner_TEXT:MESH:D000480)`, `chemdner_TEXT:MESH:D001556)`, `bionlp_st_2013_gro_ner:B-Nucleus)`, `bionlp_st_2013_gro_ner:B-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D007854)`, `chemdner_TEXT:MESH:D009499)`, `genia_term_corpus_ner:B-polynucleotide)`, `bionlp_st_2013_gro_NER:I-Transcription)`, `chemdner_TEXT:MESH:D007213)`, `bionlp_st_2013_ge_NER:B-Regulation)`, `bionlp_st_2011_epi_NER:B-DNA_methylation)`, `medmentions_st21pv_ner:B-T031)`, `bionlp_st_2013_ge_NER:I-Gene_expression)`, `chemdner_TEXT:MESH:D007651)`, `bionlp_st_2013_gro_NER:B-OrganismalProcess)`, `bionlp_st_2011_epi_COREF:None)`, `medmentions_st21pv_ner:I-T062)`, `chemdner_TEXT:MESH:D002047)`, `chemdner_TEXT:MESH:D012822)`, `mantra_gsc_en_patents_ner:B-DEVI)`, `medmentions_full_ner:I-T071)`, `chemdner_TEXT:MESH:D013739)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfGeneExpression)`, `genia_term_corpus_ner:B-other_name)`, `medmentions_full_ner:B-T018)`, `chemdner_TEXT:MESH:D015242)`, `bionlp_st_2013_cg_NER:O)`, `chemdner_TEXT:MESH:D019469)`, `ncbi_disease_ner:B-DiseaseClass)`, `ebm_pico_ner:B-Intervention_Surgical)`, `chemdner_TEXT:MESH:D011422)`, `chemdner_TEXT:MESH:D002112)`, `chemdner_TEXT:MESH:D005682)`, `anat_em_ner:B-Immaterial_anatomical_entity)`, 
`bionlp_st_2011_epi_ner:B-Entity)`, `medmentions_full_ner:I-T169)`, `mlee_ner:B-Immaterial_anatomical_entity)`, `verspoor_2013_ner:B-Physiology)`, `cellfinder_ner:I-CellType)`, `chemdner_TEXT:MESH:D011122)`, `chemdner_TEXT:MESH:D010622)`, `chemdner_TEXT:MESH:D017378)`, `bionlp_st_2011_ge_RE:Theme)`, `chemdner_TEXT:MESH:D000431)`, `medmentions_full_ner:I-T102)`, `medmentions_full_ner:B-T097)`, `chemdner_TEXT:MESH:D007529)`, `chemdner_TEXT:MESH:D045265)`, `chemdner_TEXT:MESH:D005971)`, `an_em_ner:I-Multi-tissue_structure)`, `genia_term_corpus_ner:I-ANDDNA_family_or_groupDNA_family_or_group)`, `medmentions_full_ner:I-T080)`, `chemdner_TEXT:MESH:D002207)`, `chia_ner:I-Qualifier)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionByTranscriptionRepressor)`, `an_em_ner:I-Immaterial_anatomical_entity)`, `biosses_sts:5)`, `chemdner_TEXT:MESH:D000079963)`, `chemdner_TEXT:MESH:D013196)`, `ehr_rel_sts:2)`, `chemdner_TEXT:MESH:D006152)`, `bionlp_st_2013_gro_NER:B-RegulationOfProcess)`, `mlee_NER:I-Development)`, `medmentions_full_ner:B-T197)`, `bionlp_st_2013_gro_ner:B-NucleicAcid)`, `medmentions_st21pv_ner:I-T017)`, `medmentions_full_ner:I-T046)`, `medmentions_full_ner:B-T204)`, `bionlp_st_2013_gro_NER:B-CellularDevelopmentalProcess)`, `bionlp_st_2013_cg_ner:B-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D014212)`, `bionlp_st_2013_cg_NER:B-Protein_processing)`, `chemdner_TEXT:MESH:D008926)`, `chia_ner:B-Visit)`, `bionlp_st_2011_ge_NER:B-Negative_regulation)`, `mantra_gsc_en_medline_ner:I-OBJC)`, `bionlp_st_2013_gro_ner:I-RNAMolecule)`, `chemdner_TEXT:MESH:D014812)`, `linnaeus_filtered_ner:I-species)`, `chebi_nactem_fullpaper_ner:B-Chemical)`, `bionlp_st_2011_ge_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:B-MutantGene)`, `chemdner_TEXT:MESH:D014859)`, `bionlp_st_2019_bb_ner:B-Phenotype)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfDNA)`, `diann_iber_eval_en_ner:I-Neg)`, `ddi_corpus_ner:B-DRUG_N)`, 
`meddocan_ner:B-ID_TITULACION_PERSONAL_SANITARIO)`, `bionlp_st_2013_cg_ner:B-Organ)`, `chemdner_TEXT:MESH:D009320)`, `bionlp_st_2013_cg_ner:I-Organism_subdivision)`, `bionlp_st_2013_cg_ner:B-Cellular_component)`, `chemdner_TEXT:MESH:D003188)`, `chemdner_TEXT:MESH:D001241)`, `chemdner_TEXT:MESH:D004811)`, `bioinfer_ner:I-GeneproteinRNA)`, `chemdner_TEXT:MESH:D002248)`, `bionlp_shared_task_2009_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D000143)`, `chemdner_TEXT:MESH:D007099)`, `nlm_gene_ner:O)`, `chemdner_TEXT:MESH:D005485)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorBindingSiteOfDNA)`, `bionlp_st_2013_gro_ner:B-PhysicalContact)`, `medmentions_full_ner:B-T167)`, `medmentions_st21pv_ner:B-T091)`, `seth_corpus_ner:I-Gene)`, `bionlp_st_2011_ge_COREF:coref)`, `bionlp_st_2011_ge_NER:B-Gene_expression)`, `medmentions_full_ner:B-T031)`, `genia_relation_corpus_RE:None)`, `genia_term_corpus_ner:I-ANDDNA_domain_or_regionDNA_domain_or_region)`, `chemdner_TEXT:MESH:D014970)`, `bionlp_st_2013_gro_NER:B-Mutation)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivator)`, `chemdner_TEXT:MESH:D002217)`, `chemdner_TEXT:MESH:D003367)`, `medmentions_full_ner:I-UnknownType)`, `chemdner_TEXT:MESH:D002998)`, `bionlp_st_2013_gro_ner:I-Phenotype)`, `genia_term_corpus_ner:B-ANDDNA_family_or_groupDNA_family_or_group)`, `hprd50_RE:PPI)`, `chemdner_TEXT:MESH:D002118)`, `scai_chemical_ner:B-IUPAC)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfProtein)`, `verspoor_2013_ner:B-mutation)`, `chemdner_TEXT:MESH:D011719)`, `chemdner_TEXT:MESH:D013729)`, `bionlp_shared_task_2009_ner:O)`, `chemdner_TEXT:MESH:D005840)`, `chemdner_TEXT:MESH:D009287)`, `medmentions_full_ner:B-T029)`, `chemdner_TEXT:MESH:D037742)`, `medmentions_full_ner:I-T200)`, `chemdner_TEXT:MESH:D012503)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndRNA)`, `mirna_ner:I-Non-Specific_miRNAs)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfProtein)`, `bionlp_st_2013_pc_NER:B-Deacetylation)`, 
`meddocan_ner:B-NOMBRE_PERSONAL_SANITARIO)`, `chemprot_RE:CPR:7)`, `chia_ner:I-Value)`, `medmentions_full_ner:I-T048)`, `chemprot_ner:B-GENE-Y)`, `bionlp_st_2013_cg_NER:B-Reproduction)`, `pharmaconer_ner:B-UNCLEAR)`, `bionlp_st_2011_id_ner:I-Regulon-operon)`, `ebm_pico_ner:I-Outcome_Adverse-effects)`, `bioinfer_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-bZIPTF)`, `mirna_ner:I-GenesProteins)`, `biorelex_ner:I-process)`, `chemdner_TEXT:MESH:D001555)`, `genia_term_corpus_ner:B-DNA_domain_or_region)`, `cellfinder_ner:O)`, `bionlp_st_2013_gro_ner:I-MutatedProtein)`, `bionlp_st_2013_gro_NER:I-CellularComponentOrganizationAndBiogenesis)`, `spl_adr_200db_train_ner:O)`, `medmentions_full_ner:I-T026)`, `chemdner_TEXT:MESH:D013619)`, `bionlp_st_2013_gro_NER:I-BindingToRNA)`, `biorelex_ner:I-drug)`, `bionlp_st_2013_pc_NER:B-Translation)`, `mantra_gsc_en_emea_ner:B-LIVB)`, `mantra_gsc_en_patents_ner:B-PROC)`, `bionlp_st_2013_pc_NER:B-Binding)`, `bionlp_st_2013_gro_NER:B-ModificationOfMolecularEntity)`, `bionlp_st_2013_cg_NER:I-Cell_transformation)`, `scai_chemical_ner:B-TRIVIALVAR)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_NER:I-TranscriptionInitiation)`, `chemdner_TEXT:MESH:D010907)`, `bionlp_st_2013_gro_ner:B-InorganicChemical)`, `bionlp_st_2013_pc_RE:None)`, `chemdner_TEXT:MESH:D002922)`, `chemdner_TEXT:MESH:D010743)`, `bionlp_st_2019_bb_ner:O)`, `medmentions_full_ner:I-T001)`, `chemdner_TEXT:MESH:D001381)`, `bionlp_shared_task_2009_ner:I-Protein)`, `bionlp_st_2013_gro_ner:B-Spliceosome)`, `bionlp_st_2013_gro_ner:I-HMGTF)`, `minimayosrs_sts:3)`, `ddi_corpus_RE:ADVISE)`, `mlee_NER:B-Dissociation)`, `bionlp_st_2013_gro_ner:I-Holoenzyme)`, `chemdner_TEXT:MESH:D001552)`, `bionlp_st_2013_gro_ner:B-bHLH)`, `chemdner_TEXT:MESH:D000109)`, `chemdner_TEXT:MESH:D013449)`, `bionlp_st_2013_gro_ner:I-GeneRegion)`, `medmentions_full_ner:B-T019)`, `scai_chemical_ner:B-TRIVIAL)`, `mlee_ner:B-Gene_or_gene_product)`, `biosses_sts:3)`, 
`bionlp_st_2013_cg_NER:I-Pathway)`, `bionlp_st_2011_id_ner:I-Organism)`, `bionlp_st_2013_gro_ner:B-tRNA)`, `chemdner_TEXT:MESH:D013109)`, `mlee_ner:I-Immaterial_anatomical_entity)`, `medmentions_full_ner:B-T065)`, `ebm_pico_ner:I-Participant_Sample-size)`, `genia_term_corpus_ner:I-protein_family_or_group)`, `chemdner_TEXT:MESH:D002444)`, `chemdner_TEXT:MESH:D063388)`, `mlee_NER:B-Translation)`, `chemdner_TEXT:MESH:D007052)`, `bionlp_st_2013_gro_ner:B-Gene)`, `chia_ner:B-Scope)`, `bionlp_st_2013_ge_NER:I-Positive_regulation)`, `chemdner_TEXT:MESH:D007785)`, `medmentions_st21pv_ner:I-T097)`, `iepa_RE:None)`, `medmentions_full_ner:B-T001)`, `medmentions_full_ner:I-T194)`, `chemdner_TEXT:MESH:D047309)`, `bionlp_st_2013_gro_ner:B-Substrate)`, `chemdner_TEXT:MESH:D002186)`, `ebm_pico_ner:B-Outcome_Other)`, `bionlp_st_2013_gro_NER:I-OrganismalProcess)`, `bionlp_st_2013_gro_ner:B-Ion)`, `bionlp_st_2013_gro_NER:I-ProteinBiosynthesis)`, `chia_ner:B-Drug)`, `bionlp_st_2013_gro_ner:I-MolecularEntity)`, `cadec_ner:I-Symptom)`, `anat_em_ner:B-Cellular_component)`, `bionlp_st_2013_cg_ner:B-Multi-tissue_structure)`, `medmentions_full_ner:I-T122)`, `an_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D011564)`, `bionlp_st_2013_gro_NER:B-Splicing)`, `bionlp_st_2013_cg_NER:I-Metabolism)`, `bionlp_st_2013_pc_NER:B-Activation)`, `bionlp_st_2013_gro_ner:I-BindingSiteOfProtein)`, `bionlp_st_2011_id_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:I-Ribosome)`, `nlmchem_ner:I-Chemical)`, `mirna_ner:I-Specific_miRNAs)`, `medmentions_full_ner:I-T012)`, `bionlp_st_2013_gro_NER:B-IntraCellularTransport)`, `bionlp_st_2011_id_NER:I-Transcription)`, `mantra_gsc_en_patents_ner:I-ANAT)`, `an_em_ner:B-Immaterial_anatomical_entity)`, `scai_chemical_ner:I-IUPAC)`, `distemist_ner:B-ENFERMEDAD)`, `bionlp_st_2011_epi_NER:B-Deubiquitination)`, `chemdner_TEXT:MESH:D007295)`, `meddocan_ner:I-NOMBRE_SUJETO_ASISTENCIA)`, `bionlp_st_2011_ge_NER:B-Binding)`, `bionlp_st_2013_pc_NER:B-Localization)`, `chia_ner:B-Procedure)`, 
`medmentions_full_ner:I-T109)`, `chemdner_TEXT:MESH:D002791)`, `mantra_gsc_en_medline_ner:I-CHEM)`, `chebi_nactem_fullpaper_ner:B-Biological_Activity)`, `ncbi_disease_ner:B-SpecificDisease)`, `medmentions_full_ner:B-T063)`, `chemdner_TEXT:MESH:D016595)`, `bionlp_st_2011_id_NER:B-Transcription)`, `bionlp_st_2013_gro_ner:B-DNAMolecule)`, `mlee_NER:B-Protein_processing)`, `biorelex_ner:B-protein-complex)`, `anat_em_ner:I-Cancer)`, `bionlp_st_2013_cg_RE:AtLoc)`, `medmentions_full_ner:I-T072)`, `bio_sim_verb_sts:2)`, `seth_corpus_ner:O)`, `medmentions_full_ner:B-T070)`, `biorelex_ner:I-experiment-tag)`, `chemdner_TEXT:MESH:D020126)`, `biorelex_ner:I-protein-RNA-complex)`, `bionlp_st_2013_pc_NER:I-Phosphorylation)`, `medmentions_st21pv_ner:I-T201)`, `genia_term_corpus_ner:B-protein_complex)`, `medmentions_full_ner:I-T125)`, `bionlp_st_2013_ge_ner:I-Entity)`, `chemdner_TEXT:MESH:D054659)`, `bionlp_st_2013_pc_RE:ToLoc)`, `medmentions_full_ner:B-T099)`, `bionlp_st_2013_gro_NER:B-Binding)`, `medmentions_full_ner:B-T114)`, `spl_adr_200db_train_ner:B-Factor)`, `bionlp_st_2013_gro_ner:B-HMG)`, `bionlp_st_2013_gro_ner:B-Operon)`, `bionlp_st_2013_ge_NER:I-Protein_catabolism)`, `ebm_pico_ner:I-Outcome_Pain)`, `bionlp_st_2013_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D000880)`, `ebm_pico_ner:I-Outcome_Physical)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D006160)`, `gnormplus_ner:B-DomainMotif)`, `medmentions_full_ner:I-T016)`, `pharmaconer_ner:O)`, `pdr_ner:I-Disease)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfProtein)`, `chemdner_TEXT:MESH:D002264)`, `genia_term_corpus_ner:I-protein_NA)`, `bionlp_shared_task_2009_NER:I-Negative_regulation)`, `medmentions_full_ner:I-T011)`, `bionlp_st_2013_gro_NER:I-CellularMetabolicProcess)`, `mqp_sts:1)`, `an_em_ner:I-Pathological_formation)`, `bionlp_st_2011_epi_NER:B-Deacetylation)`, `bionlp_st_2013_pc_RE:Theme)`, `medmentions_full_ner:I-T103)`, `bionlp_st_2011_epi_NER:B-Methylation)`, 
`ebm_pico_ner:B-Intervention_Psychological)`, `bionlp_st_2013_gro_ner:B-Stress)`, `genia_term_corpus_ner:B-multi_cell)`, `bionlp_st_2013_cg_NER:B-Positive_regulation)`, `anat_em_ner:I-Cellular_component)`, `spl_adr_200db_train_ner:I-Negation)`, `chemdner_TEXT:MESH:D000605)`, `bionlp_st_2013_gro_ner:B-RegulatoryDNARegion)`, `bionlp_st_2013_gro_ner:I-HomeoboxTF)`, `bionlp_st_2013_gro_NER:I-GeneSilencing)`, `ddi_corpus_ner:I-DRUG)`, `bionlp_st_2013_cg_NER:I-Growth)`, `mantra_gsc_en_medline_ner:B-OBJC)`, `mayosrs_sts:3)`, `bionlp_st_2013_gro_NER:B-RNAProcessing)`, `cellfinder_ner:B-CellType)`, `medmentions_full_ner:B-T007)`, `chemprot_ner:B-GENE-N)`, `biorelex_ner:B-brand)`, `ebm_pico_ner:B-Outcome_Mental)`, `bionlp_st_2013_gro_NER:B-RegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-EukaryoticCell)`, `genia_term_corpus_ner:I-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:I-T184)`, `bionlp_st_2013_gro_NER:B-RegulatoryProcess)`, `bionlp_st_2011_id_NER:B-Negative_regulation)`, `bionlp_st_2013_cg_NER:I-Development)`, `cellfinder_ner:I-Anatomy)`, `chia_ner:B-Condition)`, `chemdner_TEXT:MESH:D003065)`, `medmentions_full_ner:B-T012)`, `bionlp_st_2011_id_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorComplex)`, `bionlp_st_2013_cg_NER:I-Carcinogenesis)`, `medmentions_full_ner:B-T064)`, `medmentions_full_ner:B-T026)`, `nlmchem_ner:B-Chemical)`, `genia_term_corpus_ner:I-RNA_domain_or_region)`, `ebm_pico_ner:I-Intervention_Educational)`, `genia_term_corpus_ner:B-ANDcell_linecell_line)`, `distemist_ner:I-ENFERMEDAD)`, `genia_term_corpus_ner:B-protein_substructure)`, `bionlp_st_2013_gro_NER:I-ProteinTransport)`, `bionlp_st_2013_cg_NER:B-DNA_demethylation)`, `medmentions_full_ner:I-T058)`, `biorelex_ner:B-parameter)`, `chemdner_TEXT:MESH:D013006)`, `mirna_ner:I-Relation_Trigger)`, `bionlp_st_2013_gro_ner:B-PrimaryStructure)`, `bionlp_st_2013_gro_NER:I-Phosphorylation)`, `chemdner_TEXT:MESH:D003911)`, `pico_extraction_ner:I-participant)`, 
`chemdner_TEXT:MESH:D010938)`, `chia_ner:B-Person)`, `an_em_ner:B-Tissue)`, `medmentions_st21pv_ner:B-T170)`, `chemdner_TEXT:MESH:D013936)`, `chemdner_TEXT:MESH:D001080)`, `mlee_RE:None)`, `chemdner_TEXT:MESH:D013669)`, `chemdner_TEXT:MESH:D009943)`, `spl_adr_200db_train_ner:I-Factor)`, `chemdner_TEXT:MESH:D044004)`, `ebm_pico_ner:I-Participant_Sex)`, `chemdner_TEXT:MESH:D000409)`, `bionlp_st_2013_cg_NER:B-Cell_division)`, `medmentions_st21pv_ner:B-T033)`, `pcr_ner:I-Herb)`, `chemdner_TEXT:MESH:D020112)`, `bionlp_st_2013_pc_NER:B-Gene_expression)`, `bionlp_st_2011_rel_ner:O)`, `chemdner_TEXT:MESH:D008610)`, `bionlp_st_2013_gro_NER:B-BindingOfDNABindingDomainOfProteinToDNA)`, `bionlp_st_2013_gro_ner:I-Cell)`, `medmentions_full_ner:I-T055)`, `bionlp_st_2013_pc_NER:I-Negative_regulation)`, `chia_RE:Has_value)`, `tmvar_v1_ner:I-SNP)`, `biorelex_ner:I-experimental-construct)`, `genia_term_corpus_ner:B-)`, `chemdner_TEXT:MESH:D053978)`, `bionlp_st_2013_gro_ner:I-Stress)`, `mlee_ner:B-Pathological_formation)`, `bionlp_st_2013_cg_ner:O)`, `chemdner_TEXT:MESH:D007631)`, `chemdner_TEXT:MESH:D011084)`, `medmentions_full_ner:B-T080)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-TranscriptionCorepressor)`, `ehr_rel_sts:4)`, `mlee_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D003474)`, `medmentions_full_ner:B-T098)`, `scicite_TEXT:method)`, `medmentions_full_ner:B-T100)`, `chemdner_TEXT:MESH:D011849)`, `medmentions_full_ner:I-T039)`, `anat_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:I-Nucleus)`, `mlee_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:I-NuclearReceptor)`, `bionlp_st_2013_ge_RE:None)`, `chemdner_TEXT:MESH:D019483)`, `bionlp_st_2013_cg_ner:B-Cell)`, `bionlp_st_2013_gro_ner:B-Holoenzyme)`, `bionlp_st_2011_epi_NER:I-Methylation)`, `bionlp_shared_task_2009_ner:B-Protein)`, `medmentions_st21pv_ner:I-T038)`, `bionlp_st_2013_gro_ner:I-DNARegion)`, `bionlp_st_2013_gro_NER:I-CellCyclePhase)`, 
`bionlp_st_2013_gro_ner:I-tRNA)`, `mlee_ner:I-Multi-tissue_structure)`, `chemprot_ner:O)`, `medmentions_full_ner:B-T094)`, `bionlp_st_2013_gro_RE:fromSpecies)`, `bionlp_st_2013_gro_NER:O)`, `bionlp_st_2013_gro_NER:B-Acetylation)`, `bioinfer_ner:I-Protein_family_or_group)`, `medmentions_st21pv_ner:I-T098)`, `pdr_ner:B-Disease)`, `chemdner_ner:I-Chemical)`, `bionlp_st_2013_cg_NER:B-Negative_regulation)`, `chebi_nactem_fullpaper_ner:B-Chemical_Structure)`, `bionlp_st_2011_ge_NER:I-Negative_regulation)`, `sciq_CLF:no)`, `diann_iber_eval_en_ner:O)`, `bionlp_shared_task_2009_NER:I-Binding)`, `mlee_NER:I-Cell_proliferation)`, `chebi_nactem_fullpaper_ner:B-Protein)`, `bionlp_st_2013_gro_NER:B-Phosphorylation)`, `bionlp_st_2011_epi_COREF:coref)`, `medmentions_full_ner:B-T200)`, `bionlp_st_2013_cg_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000082)`, `chemdner_TEXT:MESH:D037201)`, `bionlp_st_2013_gro_ner:B-ComplexMolecularEntity)`, `bionlp_st_2011_ge_RE:ToLoc)`, `diann_iber_eval_en_ner:B-Neg)`, `bionlp_st_2013_gro_ner:B-RibosomalRNA)`, `bionlp_shared_task_2009_NER:I-Protein_catabolism)`, `chemdner_TEXT:MESH:D016912)`, `medmentions_full_ner:B-T017)`, `bionlp_st_2013_gro_ner:B-CpGIsland)`, `mlee_ner:I-Organism_substance)`, `medmentions_full_ner:I-T075)`, `bionlp_st_2013_gro_ner:I-SecondMessenger)`, `bioinfer_ner:B-Protein_family_or_group)`, `bionlp_st_2013_cg_NER:I-Negative_regulation)`, `mantra_gsc_en_emea_ner:B-CHEM)`, `genia_term_corpus_ner:B-DNA_NA)`, `chemdner_TEXT:MESH:D057888)`, `chemdner_TEXT:MESH:D006495)`, `chemdner_TEXT:MESH:D006575)`, `geokhoj_v1_TEXT:0)`, `bionlp_st_2013_gro_RE:locatedIn)`, `genia_term_corpus_ner:B-virus)`, `bionlp_st_2013_gro_ner:B-RuntLikeDomain)`, `medmentions_full_ner:B-T131)`, `bionlp_st_2013_gro_ner:I-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D015525)`, `genia_term_corpus_ner:I-mono_cell)`, `chemdner_TEXT:MESH:D007840)`, `medmentions_full_ner:I-T098)`, `meddocan_ner:I-ID_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D009930)`, 
`genia_term_corpus_ner:I-polynucleotide)`, `biorelex_ner:I-protein-region)`, `bionlp_st_2011_id_NER:I-Process)`, `bionlp_st_2013_gro_NER:I-CellularProcess)`, `medmentions_full_ner:B-T023)`, `chemdner_TEXT:MESH:D008942)`, `medmentions_full_ner:I-T070)`, `biorelex_ner:B-organelle)`, `bionlp_st_2013_gro_NER:I-Decrease)`, `verspoor_2013_ner:I-size)`, `chemdner_TEXT:MESH:D002945)`, `ebm_pico_ner:B-Intervention_Other)`, `bionlp_st_2013_cg_ner:I-Simple_chemical)`, `chemdner_TEXT:MESH:D008751)`, `chia_RE:AND)`, `medmentions_full_ner:I-T028)`, `ebm_pico_ner:I-Intervention_Other)`, `chemdner_TEXT:MESH:D005472)`, `chemdner_TEXT:MESH:D005070)`, `gnormplus_ner:B-Gene)`, `medmentions_full_ner:I-T190)`, `mlee_NER:B-Breakdown)`, `bioinfer_ner:B-GeneproteinRNA)`, `bioinfer_ner:B-Gene)`, `chemdner_TEXT:MESH:D006835)`, `chemdner_TEXT:MESH:D004298)`, `chemdner_TEXT:MESH:D002951)`, `chia_ner:I-Device)`, `bionlp_st_2013_pc_NER:B-Conversion)`, `bionlp_shared_task_2009_NER:I-Transcription)`, `mlee_NER:B-DNA_methylation)`, `pubmed_qa_labeled_fold0_CLF:no)`, `minimayosrs_sts:1)`, `chemdner_TEXT:MESH:D002166)`, `chemdner_TEXT:MESH:D005934)`, `bionlp_st_2013_gro_NER:B-CatabolicPathway)`, `tmvar_v1_ner:I-ProteinMutation)`, `verspoor_2013_ner:I-Phenomena)`, `medmentions_full_ner:B-T011)`, `chemdner_TEXT:MESH:D001218)`, `medmentions_full_ner:B-T185)`, `mantra_gsc_en_patents_ner:I-PROC)`, `medmentions_full_ner:I-T120)`, `chia_ner:I-Procedure)`, `genia_term_corpus_ner:I-ANDcell_typecell_type)`, `bionlp_st_2011_id_ner:I-Entity)`, `pcr_ner:B-Chemical)`, `bionlp_st_2013_gro_NER:B-PositiveRegulation)`, `bionlp_st_2011_epi_ner:B-Protein)`, `medmentions_full_ner:B-T055)`, `spl_adr_200db_train_ner:I-Severity)`, `bionlp_st_2013_gro_ner:I-Ion)`, `bionlp_st_2011_id_RE:Cause)`, `bc5cdr_ner:I-Disease)`, `bionlp_st_2013_gro_ner:I-bHLH)`, `chemdner_TEXT:MESH:D001058)`, `bionlp_st_2013_gro_ner:I-AminoAcid)`, `bionlp_st_2011_epi_NER:B-Phosphorylation)`, `medmentions_full_ner:B-T086)`, 
`chemdner_TEXT:MESH:D004441)`, `medmentions_st21pv_ner:I-T007)`, `biorelex_ner:B-drug)`, `mantra_gsc_en_patents_ner:I-DISO)`, `medmentions_full_ner:I-T197)`, `meddocan_ner:I-FAMILIARES_SUJETO_ASISTENCIA)`, `bionlp_st_2011_ge_RE:AtLoc)`, `bionlp_st_2013_gro_NER:B-MolecularProcess)`, `bionlp_st_2011_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionInitiationComplex)`, `bionlp_st_2011_ge_NER:I-Binding)`, `mirna_ner:B-GenesProteins)`, `mirna_ner:B-Diseases)`, `mantra_gsc_en_emea_ner:I-DISO)`, `anat_em_ner:I-Multi-tissue_structure)`, `bioinfer_ner:O)`, `chemdner_TEXT:MESH:D017673)`, `bionlp_st_2013_gro_NER:B-Methylation)`, `genia_term_corpus_ner:I-AND_NOTcell_typecell_type)`, `bionlp_st_2013_cg_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:B-Carcinogenesis)`, `chemdner_TEXT:MESH:D009543)`, `gnormplus_ner:I-Gene)`, `bionlp_st_2013_cg_RE:Participant)`, `chemdner_TEXT:MESH:D019804)`, `seth_corpus_RE:Equals)`, `medmentions_full_ner:I-T082)`, `hprd50_ner:O)`, `bionlp_st_2013_gro_ner:B-OxidativeStress)`, `chemdner_TEXT:MESH:D014227)`, `bio_sim_verb_sts:7)`, `bionlp_st_2011_ge_NER:I-Protein_catabolism)`, `bionlp_st_2011_ge_NER:B-Localization)`, `chemdner_TEXT:MESH:D001224)`, `chemdner_TEXT:MESH:D009842)`, `bionlp_st_2013_cg_ner:B-Amino_acid)`, `bionlp_st_2013_gro_NER:B-CellCyclePhase)`, `chemdner_TEXT:MESH:D002245)`, `bionlp_st_2013_ge_NER:I-Ubiquitination)`, `bionlp_st_2013_cg_NER:I-Cell_death)`, `pico_extraction_ner:O)`, `chemdner_TEXT:MESH:D000596)`, `chemdner_TEXT:MESH:D000638)`, `an_em_ner:B-Developing_anatomical_structure)`, `bionlp_st_2019_bb_ner:I-Phenotype)`, `bionlp_st_2013_gro_NER:I-CellDeath)`, `mantra_gsc_en_patents_ner:B-PHYS)`, `chemdner_TEXT:MESH:D009705)`, `genia_term_corpus_ner:B-protein_molecule)`, `mantra_gsc_en_medline_ner:B-PHEN)`, `bionlp_st_2013_gro_NER:I-PosttranslationalModification)`, `ddi_corpus_ner:B-BRAND)`, `mantra_gsc_en_medline_ner:B-DEVI)`, `mlee_NER:I-Planned_process)`, `tmvar_v1_ner:O)`, 
`bionlp_st_2011_ge_NER:I-Phosphorylation)`, `genia_term_corpus_ner:I-ANDprotein_substructureprotein_substructure)`, `medmentions_st21pv_ner:B-T007)`, `bionlp_st_2013_cg_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-NucleicAcid)`, `medmentions_full_ner:I-T044)`, `chia_ner:I-Person)`, `chemdner_TEXT:MESH:D016572)`, `scai_disease_ner:O)`, `bionlp_st_2013_gro_ner:B-TranscriptionCofactor)`, `chemdner_TEXT:MESH:D002762)`, `chemdner_TEXT:MESH:D011685)`, `chemdner_TEXT:MESH:D005031)`, `scai_disease_ner:I-ADVERSE)`, `biorelex_ner:I-protein-isoform)`, `bionlp_shared_task_2009_COREF:None)`, `meddocan_ner:B-EDAD_SUJETO_ASISTENCIA)`, `genia_term_corpus_ner:I-lipid)`, `biorelex_ner:B-RNA)`, `chemdner_TEXT:MESH:D018020)`, `scai_chemical_ner:B-FAMILY)`, `meddocan_ner:B-ID_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D017382)`, `chemdner_TEXT:MESH:D006027)`, `chemdner_TEXT:MESH:D018942)`, `medmentions_full_ner:I-T024)`, `chemdner_TEXT:MESH:D008050)`, `bionlp_st_2013_cg_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D019342)`, `chemdner_TEXT:MESH:D008774)`, `bionlp_st_2011_ge_RE:CSite)`, `bionlp_st_2013_gro_ner:B-HMGTF)`, `chemdner_ner:B-Chemical)`, `bioscope_papers_ner:B-negation)`, `biorelex_RE:bind)`, `bioinfer_ner:B-Protein_complex)`, `bionlp_st_2011_epi_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_NER:I-RegulationOfTranscription)`, `chemdner_TEXT:MESH:D011134)`, `bionlp_st_2011_rel_ner:I-Entity)`, `mantra_gsc_en_medline_ner:I-PROC)`, `ncbi_disease_ner:I-DiseaseClass)`, `chemdner_TEXT:MESH:D014315)`, `bionlp_st_2013_gro_ner:I-Chromosome)`, `chemdner_TEXT:MESH:D000639)`, `chemdner_TEXT:MESH:D005740)`, `bionlp_st_2013_gro_ner:I-MolecularFunction)`, `verspoor_2013_ner:B-gene)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomainTF)`, `bionlp_st_2013_gro_ner:B-DNARegion)`, `meddocan_ner:I-NUMERO_FAX)`, `ebm_pico_ner:B-Intervention_Educational)`, `medmentions_st21pv_ner:B-T005)`, `medmentions_full_ner:I-T022)`, `gnormplus_ner:B-FamilyName)`, 
`bionlp_st_2011_epi_RE:Contextgene)`, `bionlp_st_2013_pc_NER:B-Demethylation)`, `chia_ner:I-Observation)`, `medmentions_full_ner:I-T089)`, `bionlp_st_2013_gro_ner:I-ComplexMolecularEntity)`, `bionlp_st_2013_gro_ner:B-Lipid)`, `biorelex_ner:I-gene)`, `chemdner_TEXT:MESH:D003300)`, `chemdner_TEXT:MESH:D008903)`, `verspoor_2013_RE:relatedTo)`, `bionlp_st_2011_epi_NER:I-DNA_methylation)`, `genia_term_corpus_ner:I-cell_component)`, `bionlp_st_2011_ge_COREF:None)`, `ebm_pico_ner:B-Participant_Sample-size)`, `chemdner_TEXT:MESH:D043823)`, `chemdner_TEXT:MESH:D004958)`, `bionlp_st_2013_gro_ner:I-RNA)`, `chemdner_TEXT:MESH:D006150)`, `bionlp_st_2013_gro_ner:B-MolecularStructure)`, `meddocan_ner:B-OTROS_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D007457)`, `bionlp_st_2013_gro_ner:I-OxidativeStress)`, `scai_chemical_ner:B-PARTIUPAC)`, `mlee_NER:I-Blood_vessel_development)`, `bionlp_shared_task_2009_ner:B-Entity)`, `bionlp_st_2013_ge_RE:CSite)`, `medmentions_full_ner:B-T058)`, `chemdner_TEXT:MESH:D000628)`, `ebm_pico_ner:I-Intervention_Surgical)`, `an_em_ner:I-Organ)`, `bionlp_st_2013_gro_NER:B-Increase)`, `iepa_RE:PPI)`, `mlee_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D014284)`, `chemdner_TEXT:MESH:D014260)`, `bionlp_st_2011_epi_NER:I-Glycosylation)`, `bionlp_st_2013_gro_NER:B-BindingToProtein)`, `bionlp_st_2013_gro_NER:B-BindingToRNA)`, `medmentions_full_ner:I-T047)`, `bionlp_st_2013_gro_NER:B-Localization)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfGeneExpression)`, `medmentions_full_ner:I-T051)`, `bionlp_st_2011_id_COREF:None)`, `chemdner_TEXT:MESH:D011744)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToDNA)`, `bionlp_st_2013_gro_ner:B-CatalyticActivity)`, `chebi_nactem_abstr_ann1_ner:I-Biological_Activity)`, `cadec_ner:B-Symptom)`, `bio_sim_verb_sts:1)`, `chemdner_TEXT:MESH:D012402)`, `bionlp_st_2013_gro_ner:B-bZIPTF)`, `chemdner_TEXT:MESH:D003913)`, `bionlp_shared_task_2009_RE:Site)`, `bionlp_st_2013_gro_ner:I-AntisenseRNA)`, 
`bionlp_st_2013_gro_NER:B-ProteinTargeting)`, `bionlp_st_2013_gro_NER:B-GeneExpression)`, `bionlp_st_2013_cg_NER:I-Blood_vessel_development)`, `mantra_gsc_en_patents_ner:I-CHEM)`, `mayosrs_sts:2)`, `chemdner_TEXT:MESH:D001645)`, `bionlp_st_2011_ge_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Acetylation)`, `medmentions_full_ner:B-T002)`, `verspoor_2013_ner:I-Concepts_Ideas)`, `hprd50_RE:None)`, `ddi_corpus_ner:O)`, `chemdner_TEXT:MESH:D014131)`, `ebm_pico_ner:B-Outcome_Physical)`, `medmentions_st21pv_ner:B-T103)`, `chemdner_TEXT:MESH:D016650)`, `mlee_NER:B-Cell_proliferation)`, `bionlp_st_2013_gro_ner:I-TranscriptionCoactivator)`, `chebi_nactem_fullpaper_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013256)`, `biorelex_ner:I-protein-DNA-complex)`, `chemdner_TEXT:MESH:D008767)`, `bioinfer_RE:None)`, `nlm_gene_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-ReporterGene)`, `biosses_sts:1)`, `chemdner_TEXT:MESH:D000493)`, `chemdner_TEXT:MESH:D011374)`, `cadec_ner:I-Drug)`, `ebm_pico_ner:B-Intervention_Control)`, `bionlp_st_2013_pc_NER:I-Pathway)`, `chemprot_RE:CPR:3)`, `bionlp_st_2013_cg_ner:I-Amino_acid)`, `chemdner_TEXT:MESH:D005557)`, `bionlp_st_2011_ge_RE:Site)`, `bionlp_st_2013_pc_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-Elongation)`, `bionlp_st_2011_ge_NER:I-Localization)`, `spl_adr_200db_train_ner:B-Negation)`, `chemdner_TEXT:MESH:D010455)`, `nlm_gene_ner:B-GENERIF)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D017953)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscription)`, `osiris_ner:B-gene)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressor)`, `medmentions_full_ner:I-T131)`, `genia_term_corpus_ner:B-protein_family_or_group)`, `genia_term_corpus_ner:B-cell_type)`, `chemdner_TEXT:MESH:D013759)`, `chemdner_TEXT:MESH:D002247)`, `meddocan_ner:I-NOMBRE_PERSONAL_SANITARIO)`, `scai_chemical_ner:I-FAMILY)`, `chemdner_TEXT:MESH:D006020)`, `biorelex_ner:B-DNA)`, `chebi_nactem_abstr_ann1_ner:I-Spectral_Data)`, 
`mantra_gsc_en_medline_ner:B-DISO)`, `pharmaconer_ner:B-NORMALIZABLES)`, `chemdner_TEXT:MESH:D019829)`, `ncbi_disease_ner:I-CompositeMention)`, `chemdner_TEXT:MESH:D013876)`, `chebi_nactem_fullpaper_ner:I-Spectral_Data)`, `biorelex_ner:I-DNA)`, `chemdner_TEXT:MESH:D005492)`, `chemdner_TEXT:MESH:D011810)`, `chemdner_TEXT:MESH:D008563)`, `chemdner_TEXT:MESH:D015735)`, `bionlp_st_2019_bb_ner:B-Microorganism)`, `ddi_corpus_RE:INT)`, `medmentions_st21pv_ner:B-T038)`, `bionlp_st_2013_gro_NER:B-CellCyclePhaseTransition)`, `cellfinder_ner:B-CellLine)`, `pdr_RE:Cause)`, `meddocan_ner:B-PAIS)`, `chemdner_TEXT:MESH:D011433)`, `chemdner_TEXT:MESH:D011720)`, `chemdner_TEXT:MESH:D020156)`, `ebm_pico_ner:O)`, `mlee_ner:B-Organ)`, `chemdner_TEXT:MESH:D012721)`, `chebi_nactem_fullpaper_ner:I-Biological_Activity)`, `bionlp_st_2013_cg_COREF:coref)`, `chemdner_TEXT:MESH:D006918)`, `medmentions_full_ner:B-T092)`, `genia_term_corpus_ner:B-protein_NA)`, `bionlp_st_2013_ge_ner:B-Entity)`, `an_em_ner:B-Multi-tissue_structure)`, `chia_ner:I-Measurement)`, `chia_RE:Has_temporal)`, `bionlp_st_2011_id_NER:B-Protein_catabolism)`, `bionlp_st_2013_gro_NER:B-CellAdhesion)`, `bionlp_st_2013_gro_ner:B-DNABindingSite)`, `biorelex_ner:B-organism)`, `scai_disease_ner:I-DISEASE)`, `bionlp_st_2013_gro_ner:I-DNABindingSite)`, `chemdner_TEXT:MESH:D016607)`, `chemdner_TEXT:MESH:D030421)`, `bionlp_st_2013_pc_NER:I-Binding)`, `medmentions_full_ner:I-T029)`, `chemdner_TEXT:MESH:D001569)`, `genia_term_corpus_ner:B-ANDcell_typecell_type)`, `scai_chemical_ner:B-SUM)`, `chemdner_TEXT:MESH:D007656)`, `medmentions_full_ner:B-T082)`, `chemdner_TEXT:MESH:D009525)`, `medmentions_full_ner:B-T079)`, `bionlp_st_2013_cg_NER:B-Synthesis)`, `biorelex_ner:B-process)`, `bionlp_st_2013_ge_RE:Theme)`, `chemdner_TEXT:MESH:D012825)`, `chemdner_TEXT:MESH:D005462)`, `bionlp_st_2013_cg_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-CellCycle)`, `cellfinder_ner:I-CellLine)`, `bionlp_st_2013_gro_ner:I-DNABindingDomainOfProtein)`, 
`medmentions_st21pv_ner:B-T168)`, `genia_term_corpus_ner:B-body_part)`, `genia_term_corpus_ner:B-ANDprotein_family_or_groupprotein_family_or_group)`, `mlee_ner:B-Tissue)`, `meddocan_ner:B-ID_ASEGURAMIENTO)`, `mlee_NER:I-Localization)`, `medmentions_full_ner:B-T125)`, `meddocan_ner:I-CENTRO_SALUD)`, `bionlp_st_2013_cg_NER:B-Infection)`, `chebi_nactem_abstr_ann1_ner:I-Protein)`, `chemdner_TEXT:MESH:D009570)`, `medmentions_full_ner:I-T045)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivator)`, `verspoor_2013_ner:B-disease)`, `medmentions_full_ner:I-T056)`, `medmentions_full_ner:B-T050)`, `bionlp_st_2013_gro_ner:B-MolecularFunction)`, `medmentions_full_ner:B-T060)`, `bionlp_st_2013_gro_ner:B-Cell)`, `medmentions_full_ner:I-T060)`, `bionlp_st_2013_pc_NER:I-Gene_expression)`, `genia_term_corpus_ner:B-RNA_NA)`, `bionlp_st_2013_gro_ner:I-MessengerRNA)`, `medmentions_full_ner:I-T086)`, `an_em_RE:Part-of)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_gro_NER:I-Splicing)`, `bioinfer_RE:PPI)`, `bioscope_papers_ner:I-speculation)`, `bionlp_st_2013_gro_ner:B-HomeoBox)`, `medmentions_full_ner:B-T004)`, `chia_ner:I-Drug)`, `bionlp_st_2013_gro_ner:B-FusionOfGeneWithReporterGene)`, `genia_term_corpus_ner:I-cell_line)`, `chebi_nactem_abstr_ann1_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-ExpressionProfiling)`, `chemdner_TEXT:MESH:D004390)`, `medmentions_full_ner:B-T016)`, `bionlp_st_2013_cg_NER:B-Growth)`, `medmentions_full_ner:I-T170)`, `medmentions_full_ner:B-T093)`, `genia_term_corpus_ner:I-inorganic)`, `mlee_NER:B-Planned_process)`, `bionlp_st_2013_gro_RE:hasPart)`, `bionlp_st_2013_gro_ner:B-BasicDomain)`, `chemdner_TEXT:MESH:D050091)`, `medmentions_st21pv_ner:B-T037)`, `chemdner_TEXT:MESH:D011522)`, `bionlp_st_2013_ge_NER:B-Deacetylation)`, `chemdner_TEXT:MESH:D004008)`, `chemdner_TEXT:MESH:D013972)`, `bionlp_st_2013_gro_NER:B-SignalingPathway)`, `bionlp_st_2013_gro_ner:B-Promoter)`, `chemdner_TEXT:MESH:D012701)`, `an_em_COREF:None)`, 
`bionlp_st_2019_bb_RE:None)`, `mlee_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-Translation)`, `chemdner_TEXT:MESH:D013453)`, `genia_term_corpus_ner:I-ANDprotein_moleculeprotein_molecule)`, `chemdner_TEXT:MESH:D002746)`, `chebi_nactem_abstr_ann1_ner:O)`, `bionlp_st_2013_pc_ner:O)`, `mayosrs_sts:7)`, `bionlp_st_2013_cg_NER:B-Pathway)`, `verspoor_2013_ner:I-age)`, `biorelex_ner:I-peptide)`, `medmentions_full_ner:I-T096)`, `chebi_nactem_fullpaper_ner:I-Chemical_Structure)`, `chemdner_TEXT:MESH:D007211)`, `medmentions_full_ner:I-T018)`, `medmentions_full_ner:B-T201)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:B-T054)`, `ebm_pico_ner:I-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D010672)`, `chemdner_TEXT:MESH:D004492)`, `chemdner_TEXT:MESH:D008094)`, `chemdner_TEXT:MESH:D002227)`, `chemdner_TEXT:MESH:D009553)`, `bionlp_st_2013_gro_NER:I-ResponseProcess)`, `chemdner_TEXT:MESH:D006046)`, `ebm_pico_ner:B-Participant_Condition)`, `nlm_gene_ner:I-Gene)`, `bionlp_st_2019_bb_ner:I-Habitat)`, `bionlp_shared_task_2009_COREF:coref)`, `chemdner_TEXT:MESH:D005640)`, `mantra_gsc_en_emea_ner:B-PHYS)`, `mantra_gsc_en_patents_ner:B-DISO)`, `bionlp_st_2013_gro_ner:B-Heterochromatin)`, `bionlp_st_2013_gro_NER:I-CellCycle)`, `bionlp_st_2013_cg_NER:I-Cell_proliferation)`, `bionlp_st_2013_cg_ner:B-Simple_chemical)`, `genia_term_corpus_ner:I-cell_type)`, `chemdner_TEXT:MESH:D003553)`, `bionlp_st_2013_ge_RE:Theme2)`, `tmvar_v1_ner:B-ProteinMutation)`, `chemdner_TEXT:MESH:D012717)`, `chemdner_TEXT:MESH:D026121)`, `chemdner_TEXT:MESH:D008687)`, `bionlp_st_2013_gro_NER:I-TranscriptionTermination)`, `medmentions_full_ner:B-T028)`, `biorelex_ner:B-assay)`, `genia_term_corpus_ner:B-tissue)`, `chemdner_TEXT:MESH:D009173)`, `bionlp_st_2013_gro_ner:B-TranscriptionCoactivator)`, `genia_term_corpus_ner:B-amino_acid_monomer)`, `mantra_gsc_en_emea_ner:B-DEVI)`, `bionlp_st_2013_gro_NER:B-Growth)`, `chemdner_TEXT:MESH:D017374)`, 
`genia_term_corpus_ner:B-other_artificial_source)`, `medmentions_full_ner:B-T072)`, `bionlp_st_2013_gro_NER:B-CellGrowth)`, `bionlp_st_2013_gro_ner:I-DoubleStrandDNA)`, `chemdner_ner:O)`, `bionlp_shared_task_2009_NER:I-Localization)`, `bionlp_st_2013_gro_NER:B-RegulationOfPathway)`, `genia_term_corpus_ner:I-amino_acid_monomer)`, `bionlp_st_2013_gro_NER:I-SPhase)`, `an_em_ner:B-Organism_substance)`, `medmentions_full_ner:B-T052)`, `meddocan_ner:B-TERRITORIO)`, `genia_term_corpus_ner:B-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:B-T096)`, `chemdner_TEXT:MESH:D056831)`, `chemdner_TEXT:MESH:D010755)`, `pdr_NER:I-Cause_of_disease)`, `mlee_NER:B-Phosphorylation)`, `medmentions_full_ner:I-T064)`, `chemdner_TEXT:MESH:D005978)`, `mantra_gsc_en_medline_ner:I-PHEN)`, `bionlp_st_2013_cg_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_NER:B-Modification)`, `bionlp_st_2013_gro_ner:B-ProteinComplex)`, `bionlp_st_2013_gro_ner:B-DoubleStrandDNA)`, `medmentions_full_ner:B-T068)`, `medmentions_full_ner:I-T034)`, `bionlp_st_2011_epi_NER:B-Catalysis)`, `biosses_sts:0)`, `bionlp_st_2013_cg_ner:B-Organism_substance)`, `chemdner_TEXT:MESH:D055549)`, `bionlp_st_2013_cg_NER:B-Glycolysis)`, `chemdner_TEXT:MESH:D001761)`, `chemdner_TEXT:MESH:D011728)`, `bionlp_st_2013_gro_ner:B-Function)`, `medmentions_full_ner:I-T033)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T053)`, `bionlp_st_2013_gro_ner:B-Protein)`, `genia_term_corpus_ner:I-ANDprotein_family_or_groupprotein_family_or_group)`, `bionlp_st_2013_gro_NER:I-CatabolicPathway)`, `biorelex_ner:I-chemical)`, `chemdner_TEXT:MESH:D013185)`, `biorelex_ner:I-RNA)`, `chemdner_TEXT:MESH:D009838)`, `medmentions_full_ner:I-T008)`, `meddocan_ner:B-INSTITUCION)`, `chemdner_TEXT:MESH:D002104)`, `bionlp_st_2013_gro_NER:B-RNABiosynthesis)`, `verspoor_2013_ner:I-ethnicity)`, `bionlp_st_2013_gro_ner:I-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D026023)`, `mlee_ner:O)`, 
`bionlp_st_2013_gro_NER:I-CellHomeostasis)`, `bionlp_st_2013_pc_NER:B-Pathway)`, `gnormplus_ner:I-DomainMotif)`, `bionlp_st_2013_gro_ner:I-OpenReadingFrame)`, `bionlp_st_2013_gro_NER:I-RegulationOfGeneExpression)`, `muchmore_en_ner:O)`, `chemdner_TEXT:MESH:D000911)`, `bionlp_st_2011_epi_NER:B-DNA_demethylation)`, `meddocan_ner:B-CENTRO_SALUD)`, `bionlp_st_2013_gro_ner:I-RuntLikeDomain)`, `chemdner_TEXT:MESH:D010748)`, `medmentions_full_ner:B-T008)`, `biorelex_ner:B-protein-RNA-complex)`, `bionlp_st_2013_cg_NER:I-Planned_process)`, `chemdner_TEXT:MESH:D014867)`, `mantra_gsc_en_patents_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Silencing)`, `chemdner_TEXT:MESH:D015306)`, `chemdner_TEXT:MESH:D001679)`, `bionlp_shared_task_2009_NER:I-Positive_regulation)`, `linnaeus_filtered_ner:O)`, `chia_RE:Has_multiplier)`, `medmentions_full_ner:B-T116)`, `bionlp_shared_task_2009_NER:B-Positive_regulation)`, `anat_em_ner:B-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D011137)`, `chemdner_TEXT:MESH:D048271)`, `chemdner_TEXT:MESH:D003975)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressorActivity)`, `bionlp_st_2011_id_ner:B-Protein)`, `bionlp_st_2013_gro_NER:I-Mutation)`, `chemdner_TEXT:MESH:D001572)`, `mantra_gsc_en_patents_ner:B-CHEM)`, `mantra_gsc_en_medline_ner:I-DEVI)`, `bionlp_st_2013_gro_ner:B-Enzyme)`, `medmentions_full_ner:B-T056)`, `meddocan_ner:I-TERRITORIO)`, `mantra_gsc_en_patents_ner:B-OBJC)`, `medmentions_full_ner:B-T073)`, `anat_em_ner:I-Tissue)`, `chemdner_TEXT:MESH:D047310)`, `chia_ner:I-Scope)`, `ncbi_disease_ner:B-Modifier)`, `medmentions_st21pv_ner:B-T082)`, `medmentions_full_ner:I-T054)`, `genia_term_corpus_ner:I-carbohydrate)`, `bionlp_st_2013_cg_RE:Theme)`, `chemdner_TEXT:MESH:D009538)`, `chemdner_TEXT:MESH:D008691)`, `genia_term_corpus_ner:B-ANDprotein_substructureprotein_substructure)`, `bionlp_st_2013_cg_ner:I-Tissue)`, `chia_ner:B-Device)`, `chemdner_TEXT:MESH:D002784)`, `medmentions_full_ner:I-T007)`, `bionlp_st_2013_gro_ner:I-DNAFragment)`, 
`spl_adr_200db_train_ner:I-AdverseReaction)`, `bionlp_st_2013_cg_NER:B-Catabolism)`, `chemdner_TEXT:MESH:D013779)`, `bionlp_st_2013_pc_NER:B-Regulation)`, `bionlp_st_2013_gro_NER:I-Disease)`, `chia_ner:I-Condition)`, `chemdner_TEXT:MESH:D012370)`, `bionlp_st_2013_ge_NER:O)`, `bionlp_st_2013_pc_NER:B-Deubiquitination)`, `bionlp_st_2013_pc_NER:I-Translation)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_cg_NER:B-DNA_methylation)`, `bioscope_papers_ner:B-speculation)`, `chemdner_TEXT:MESH:D018130)`, `bionlp_st_2013_gro_ner:B-RNAPolymeraseII)`, `medmentions_st21pv_ner:B-T098)`, `bionlp_st_2013_gro_NER:B-Elongation)`, `bionlp_st_2013_pc_RE:Cause)`, `seth_corpus_ner:B-RS)`, `bionlp_st_2013_ge_RE:ToLoc)`, `chemdner_TEXT:MESH:D000538)`, `medmentions_full_ner:B-T192)`, `medmentions_full_ner:B-T061)`, `medmentions_full_ner:B-T032)`, `bionlp_st_2013_gro_NER:B-Transport)`, `medmentions_full_ner:I-T014)`, `chemdner_TEXT:MESH:D004137)`, `medmentions_full_ner:B-T101)`, `bionlp_st_2013_gro_NER:B-Transcription)`, `bionlp_st_2013_pc_NER:B-Transport)`, `medmentions_full_ner:I-T203)`, `ebm_pico_ner:I-Intervention_Control)`, `genia_term_corpus_ner:I-atom)`, `chemdner_TEXT:MESH:D014230)`, `cadec_ner:B-Drug)`, `osiris_ner:I-gene)`, `mantra_gsc_en_patents_ner:B-ANAT)`, `ncbi_disease_ner:I-SpecificDisease)`, `bionlp_st_2013_gro_NER:I-CellGrowth)`, `chemdner_TEXT:MESH:D001205)`, `chemdner_TEXT:MESH:D016627)`, `meddocan_ner:B-FAMILIARES_SUJETO_ASISTENCIA)`, `genia_term_corpus_ner:B-protein_subunit)`, `bionlp_st_2013_gro_ner:I-CellComponent)`, `medmentions_full_ner:B-T049)`, `scai_chemical_ner:O)`, `chemdner_TEXT:MESH:D010840)`, `chemdner_TEXT:MESH:D008694)`, `mantra_gsc_en_patents_ner:B-PHEN)`, `bionlp_st_2013_cg_RE:Cause)`, `chemdner_TEXT:MESH:D012293)`, `bionlp_st_2013_gro_NER:B-Homodimerization)`, `chemdner_TEXT:MESH:D008070)`, `chia_RE:OR)`, `bionlp_st_2013_cg_ner:I-Gene_or_gene_product)`, `verspoor_2013_ner:I-disease)`, 
`muchmore_en_ner:B-umlsterm)`, `chemdner_TEXT:MESH:D011794)`, `medmentions_full_ner:I-T002)`, `chemdner_TEXT:MESH:D007649)`, `genia_term_corpus_ner:B-AND_NOTcell_typecell_type)`, `medmentions_full_ner:I-T023)`, `chemprot_RE:CPR:1)`, `chemdner_TEXT:MESH:D001786)`, `bionlp_st_2013_gro_ner:B-HomeoboxTF)`, `bionlp_st_2013_cg_ner:I-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-Attenuator)`, `bionlp_st_2019_bb_ner:B-Habitat)`, `chemdner_TEXT:MESH:D017931)`, `medmentions_full_ner:B-T047)`, `chemdner_TEXT:MESH:D006886)`, `genia_term_corpus_ner:I-)`, `medmentions_full_ner:B-T039)`, `chemdner_TEXT:MESH:D004220)`, `bionlp_st_2013_pc_RE:FromLoc)`, `nlm_gene_ner:I-GENERIF)`, `bionlp_st_2013_ge_NER:I-Protein_modification)`, `genia_term_corpus_ner:B-RNA_molecule)`, `chemdner_TEXT:MESH:D006854)`, `chemdner_TEXT:MESH:D006493)`, `chia_ner:B-Qualifier)`, `medmentions_full_ner:I-T013)`, `ehr_rel_sts:8)`, `an_em_RE:frag)`, `genia_term_corpus_ner:I-DNA_substructure)`, `chemdner_TEXT:MESH:D063065)`, `genia_term_corpus_ner:I-ANDprotein_complexprotein_complex)`, `pharmaconer_ner:I-NORMALIZABLES)`, `bionlp_st_2013_pc_NER:I-Dissociation)`, `medmentions_full_ner:I-T004)`, `bionlp_st_2013_cg_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D010069)`, `bionlp_st_2013_gro_NER:I-Homodimerization)`, `chemdner_TEXT:MESH:D006147)`, `medmentions_full_ner:I-T041)`, `distemist_ner:O)`, `bionlp_st_2011_id_NER:B-Regulation)`, `bionlp_st_2013_gro_ner:O)`, `chemdner_TEXT:MESH:D008623)`, `bionlp_st_2013_ge_ner:I-Protein)`, `scai_chemical_ner:I-TRIVIAL)`, `an_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-BindingAssay)`, `bionlp_st_2013_gro_ner:I-HMG)`, `anat_em_ner:I-Anatomical_system)`, `chemdner_TEXT:MESH:D015034)`, `mlee_NER:B-Catabolism)`, `mantra_gsc_en_medline_ner:B-LIVB)`, `meddocan_ner:B-HOSPITAL)`, `ddi_corpus_ner:I-BRAND)`, `chia_ner:I-Multiplier)`, `bionlp_st_2013_gro_ner:I-SequenceHomologyAnalysis)`, `seth_corpus_RE:None)`, `bionlp_st_2013_cg_NER:B-Binding)`, 
`bioscope_papers_ner:I-negation)`, `cadec_ner:B-Finding)`, `chemdner_TEXT:MESH:D008741)`, `chemdner_TEXT:MESH:D052998)`, `chemdner_TEXT:MESH:D005227)`, `meddocan_ner:I-ID_TITULACION_PERSONAL_SANITARIO)`, `chemdner_TEXT:MESH:D009828)`, `spl_adr_200db_train_ner:B-Animal)`, `chemdner_TEXT:MESH:D010616)`, `bionlp_st_2013_gro_ner:I-ProteinComplex)`, `pico_extraction_ner:B-outcome)`, `mlee_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D007093)`, `bionlp_st_2013_gro_NER:I-RNAProcessing)`, `biorelex_ner:I-reagent)`, `medmentions_st21pv_ner:I-T074)`, `bionlp_st_2013_gro_NER:B-BindingOfMolecularEntity)`, `chemdner_TEXT:MESH:D008911)`, `medmentions_full_ner:B-T033)`, `genia_term_corpus_ner:B-ANDprotein_complexprotein_complex)`, `medmentions_full_ner:I-T100)`, `chemdner_TEXT:MESH:D019259)`, `genia_term_corpus_ner:I-BUT_NOTother_nameother_name)`, `geokhoj_v1_TEXT:1)`, `bionlp_st_2013_cg_RE:Site)`, `medmentions_full_ner:B-T184)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelixTF)`, `bionlp_st_2013_cg_ner:I-Protein_domain_or_region)`, `genia_term_corpus_ner:I-other_organic_compound)`, `chemdner_TEXT:MESH:D010793)`, `bionlp_st_2011_id_NER:B-Phosphorylation)`, `chemdner_TEXT:MESH:D002482)`, `bionlp_st_2013_cg_NER:B-Breakdown)`, `biorelex_ner:I-disease)`, `genia_term_corpus_ner:B-DNA_substructure)`, `medmentions_full_ner:B-T127)`, `medmentions_full_ner:I-T185)`, `bionlp_shared_task_2009_RE:AtLoc)`, `medmentions_full_ner:I-T201)`, `chemdner_TEXT:MESH:D005290)`, `mlee_NER:I-Breakdown)`, `medmentions_full_ner:I-T063)`, `chemdner_TEXT:MESH:D017964)`, `an_em_ner:I-Tissue)`, `mlee_ner:I-Organism)`, `mantra_gsc_en_emea_ner:I-CHEM)`, `bionlp_st_2013_cg_ner:B-Anatomical_system)`, `genia_term_corpus_ner:B-ORDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Degradation)`, `chemprot_RE:CPR:0)`, `genia_term_corpus_ner:B-inorganic)`, `chemdner_TEXT:MESH:D005466)`, `chia_ner:O)`, `medmentions_full_ner:B-T078)`, `mlee_NER:B-Growth)`, `mantra_gsc_en_emea_ner:B-PHEN)`, 
`chemdner_TEXT:MESH:D012545)`, `bionlp_st_2013_gro_NER:B-G1Phase)`, `chemdner_TEXT:MESH:D009841)`, `bionlp_st_2013_gro_ner:B-Chromatin)`, `bionlp_st_2011_epi_RE:Site)`, `medmentions_full_ner:B-T066)`, `genetaggold_ner:O)`, `bionlp_st_2013_cg_NER:I-Gene_expression)`, `medmentions_st21pv_ner:B-T092)`, `chemprot_RE:CPR:8)`, `bionlp_st_2013_cg_RE:Instrument)`, `nlm_gene_ner:I-Domain)`, `chemdner_TEXT:MESH:D006151)`, `bionlp_st_2011_id_ner:I-Protein)`, `meddocan_ner:I-FECHAS)`, `mlee_NER:B-Synthesis)`, `bionlp_st_2013_gro_NER:B-CellMotility)`, `scai_chemical_ner:B-MODIFIER)`, `pharmaconer_ner:B-PROTEINAS)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscription)`, `osiris_ner:O)`, `mlee_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T062)`, `chemdner_TEXT:MESH:D017705)`, `bionlp_st_2013_gro_NER:I-TranscriptionOfGene)`, `genia_term_corpus_ner:I-protein_complex)`, `chemprot_RE:CPR:10)`, `medmentions_full_ner:B-T102)`, `medmentions_full_ner:I-T171)`, `chia_ner:B-Reference_point)`, `medmentions_full_ner:B-T015)`, `bionlp_st_2013_gro_ner:I-RNAPolymerase)`, `chebi_nactem_abstr_ann1_ner:B-Metabolite)`, `bionlp_st_2013_gro_NER:I-CellDifferentiation)`, `chemdner_TEXT:MESH:D006861)`, `pubmed_qa_labeled_fold0_CLF:maybe)`, `bionlp_st_2013_gro_ner:I-Sequence)`, `mlee_NER:B-Transcription)`, `bc5cdr_ner:B-Chemical)`, `chemdner_TEXT:MESH:D000072317)`, `bionlp_st_2013_gro_NER:B-Producing)`, `genia_term_corpus_ner:B-ANDprotein_moleculeprotein_molecule)`, `bionlp_st_2011_id_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-MolecularInteraction)`, `chemdner_TEXT:MESH:D014639)`, `bionlp_st_2013_gro_NER:I-Increase)`, `mlee_NER:I-Translation)`, `medmentions_full_ner:B-T087)`, `bioscope_abstracts_ner:B-speculation)`, `ebm_pico_ner:B-Outcome_Adverse-effects)`, `mantra_gsc_en_medline_ner:B-PHYS)`, `bionlp_st_2013_gro_ner:I-Lipid)`, `bionlp_st_2011_ge_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D005278)`, `bionlp_shared_task_2009_NER:B-Phosphorylation)`, `mlee_NER:I-Gene_expression)`, 
`bionlp_st_2011_epi_NER:I-Deacetylation)`, `chemdner_TEXT:MESH:D002110)`, `medmentions_full_ner:I-T121)`, `bionlp_st_2011_epi_ner:I-Entity)`, `bionlp_st_2019_bb_RE:Lives_In)`, `chemdner_TEXT:MESH:D001710)`, `anat_em_ner:B-Cancer)`, `bionlp_st_2013_gro_NER:B-RNASplicing)`, `mantra_gsc_en_medline_ner:I-ANAT)`, `chemdner_TEXT:MESH:D024508)`, `chemdner_TEXT:MESH:D000537)`, `mantra_gsc_en_medline_ner:I-DISO)`, `bionlp_st_2013_gro_ner:I-Prokaryote)`, `bionlp_st_2013_gro_ner:I-Chromatin)`, `meddocan_ner:B-NUMERO_FAX)`, `bionlp_st_2013_gro_ner:B-Nucleotide)`, `linnaeus_ner:I-species)`, `verspoor_2013_ner:I-body-part)`, `bionlp_st_2013_gro_ner:B-DNAFragment)`, `bionlp_st_2013_gro_ner:B-PositiveTranscriptionRegulator)`, `medmentions_full_ner:I-T049)`, `bionlp_st_2011_ge_ner:B-Entity)`, `medmentions_full_ner:I-T017)`, `bionlp_st_2013_gro_NER:B-TranscriptionOfGene)`, `chemdner_TEXT:MESH:D009947)`, `mlee_NER:B-Dephosphorylation)`, `bionlp_st_2013_gro_NER:B-GeneSilencing)`, `pdr_RE:None)`, `scai_chemical_ner:I-TRIVIALVAR)`, `bionlp_st_2011_epi_NER:O)`, `bionlp_st_2013_cg_ner:I-Cell)`, `sciq_SEQ:None)`, `chemdner_TEXT:MESH:D019913)`, `chia_ner:I-Negation)`, `chemdner_TEXT:MESH:D014801)`, `chemdner_TEXT:MESH:D058846)`, `chemdner_TEXT:MESH:D011809)`, `bionlp_st_2011_epi_ner:O)`, `bionlp_st_2013_cg_NER:I-Metastasis)`, `chemdner_TEXT:MESH:D012643)`, `an_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:I-CatalyticActivity)`, `anat_em_ner:B-Anatomical_system)`, `mlee_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:I-ChromosomalDNA)`, `anat_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D000242)`, `chemdner_TEXT:MESH:D017641)`, `bioscope_abstracts_ner:I-negation)`, `medmentions_st21pv_ner:B-T058)`, `chemdner_TEXT:MESH:D008744)`, `bionlp_st_2013_gro_ner:B-UpstreamRegulatorySequence)`, `chemdner_TEXT:MESH:D008012)`, `medmentions_full_ner:B-T013)`, `bionlp_st_2011_epi_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D052999)`, `chemdner_TEXT:MESH:D002329)`, `ebm_pico_ner:I-Intervention_Physical)`, 
`bionlp_st_2013_pc_ner:B-Complex)`, `medmentions_st21pv_ner:I-T005)`, `chemdner_TEXT:MESH:D064704)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomainTF)`, `bionlp_st_2013_pc_ner:I-Cellular_component)`, `genia_term_corpus_ner:B-ANDDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_gro_ner:B-Chromosome)`, `chemdner_TEXT:MESH:D007546)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfGeneExpression)`, `medmentions_full_ner:I-T010)`, `pdr_NER:B-Treatment_of_disease)`, `medmentions_full_ner:B-T081)`, `bionlp_st_2011_epi_NER:B-Demethylation)`, `chemdner_TEXT:MESH:D013261)`, `bionlp_st_2013_gro_ner:I-RibosomalRNA)`, `verspoor_2013_ner:O)`, `bionlp_st_2013_gro_NER:B-DevelopmentalProcess)`, `chemdner_TEXT:MESH:D009270)`, `medmentions_full_ner:I-T130)`, `bionlp_st_2013_cg_ner:B-Organism)`, `medmentions_full_ner:B-T014)`, `chemdner_TEXT:MESH:D003374)`, `chemdner_TEXT:MESH:D011078)`, `cellfinder_ner:B-GeneProtein)`, `mayosrs_sts:6)`, `chemdner_TEXT:MESH:D005576)`, `bionlp_st_2013_ge_RE:Cause)`, `an_em_RE:None)`, `sciq_SEQ:answer)`, `bionlp_st_2013_cg_NER:B-Dissociation)`, `mlee_RE:frag)`, `bionlp_st_2013_pc_COREF:coref)`, `meddocan_ner:B-NOMBRE_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D008469)`, `ncbi_disease_ner:O)`, `bionlp_st_2011_epi_ner:I-Protein)`, `chemdner_TEXT:MESH:D011140)`, `chemdner_TEXT:MESH:D020001)`, `bionlp_st_2013_gro_ner:I-ThreeDimensionalMolecularStructure)`, `bionlp_st_2013_cg_ner:B-Cancer)`, `genia_term_corpus_ner:B-BUT_NOTother_nameother_name)`, `chemdner_TEXT:MESH:D006862)`, `medmentions_full_ner:B-T104)`, `bionlp_st_2011_epi_RE:Theme)`, `cellfinder_ner:B-Anatomy)`, `chemdner_TEXT:MESH:D010545)`, `biorelex_ner:B-RNA-family)`, `pico_extraction_ner:I-outcome)`, `mantra_gsc_en_patents_ner:I-PHYS)`, `bionlp_st_2013_pc_NER:I-Transcription)`, `bionlp_shared_task_2009_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Vitamin)`, `bionlp_shared_task_2009_RE:CSite)`, `bionlp_st_2011_ge_ner:I-Protein)`, `mlee_COREF:coref)`, 
`bionlp_st_2013_gro_ner:I-ForkheadWingedHelix)`, `bioinfer_ner:I-Gene)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivatorActivity)`, `chemdner_TEXT:MESH:D054439)`, `chemdner_TEXT:MESH:D011621)`, `ddi_corpus_ner:I-DRUG_N)`, `chemdner_TEXT:MESH:D019308)`, `bionlp_st_2013_gro_ner:I-Locus)`, `bionlp_shared_task_2009_RE:ToLoc)`, `bionlp_st_2013_cg_NER:B-Development)`, `bionlp_st_2013_gro_NER:I-CellularDevelopmentalProcess)`, `bionlp_st_2013_gro_ner:B-Eukaryote)`, `bionlp_st_2013_ge_NER:B-Negative_regulation)`, `seth_corpus_ner:I-SNP)`, `hprd50_ner:B-protein)`, `bionlp_st_2013_gro_NER:B-BindingOfProtein)`, `mlee_NER:I-Negative_regulation)`, `bionlp_st_2011_ge_NER:B-Protein_catabolism)`, `bionlp_st_2013_pc_ner:B-Cellular_component)`, `bionlp_st_2011_id_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013831)`, `biorelex_COREF:None)`, `chemdner_TEXT:MESH:D005609)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactor)`, `mlee_NER:B-Regulation)`, `chemdner_TEXT:MESH:D059808)`, `bionlp_st_2013_gro_ner:I-bHLHTF)`, `chemdner_TEXT:MESH:D010121)`, `chemdner_TEXT:MESH:D017608)`, `chemdner_TEXT:MESH:D007455)`, `mlee_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorComplex)`, `biorelex_ner:B-disease)`, `bionlp_st_2013_cg_NER:B-Cell_differentiation)`, `medmentions_st21pv_ner:I-T092)`, `chemdner_TEXT:MESH:D007477)`, `medmentions_full_ner:B-T168)`, `pcr_ner:I-Chemical)`, `chemdner_TEXT:MESH:D009636)`, `chemdner_TEXT:MESH:D008051)`, `pharmaconer_ner:I-UNCLEAR)`, `bionlp_shared_task_2009_NER:I-Gene_expression)`, `chemprot_ner:I-GENE-N)`, `biorelex_ner:B-reagent)`, `chemdner_TEXT:MESH:D020123)`, `nlmchem_ner:O)`, `ebm_pico_ner:I-Outcome_Mental)`, `chemdner_TEXT:MESH:D004040)`, `chemdner_TEXT:MESH:D000450)`, `chebi_nactem_fullpaper_ner:O)`, `biorelex_ner:B-protein-isoform)`, `chemdner_TEXT:MESH:D001564)`, `medmentions_full_ner:I-T095)`, `mlee_NER:I-Remodeling)`, `bionlp_st_2013_cg_RE:None)`, `biorelex_ner:O)`, `seth_corpus_RE:AssociatedTo)`, 
`bioscope_abstracts_ner:B-negation)`, `chebi_nactem_fullpaper_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressorActivity)`, `bionlp_st_2013_cg_NER:B-Transcription)`, `bionlp_st_2011_ge_ner:B-Protein)`, `bionlp_st_2013_ge_ner:B-Protein)`, `bionlp_st_2013_gro_ner:I-Tissue)`, `chemdner_TEXT:MESH:D044005)`, `genia_term_corpus_ner:I-protein_substructure)`, `bionlp_st_2013_gro_ner:I-TranslationFactor)`, `minimayosrs_sts:5)`, `chemdner_TEXT:MESH:D012834)`, `ncbi_disease_ner:I-Modifier)`, `mlee_NER:B-Death)`, `medmentions_full_ner:B-T196)`, `bio_sim_verb_sts:4)`, `bionlp_st_2013_gro_NER:B-CellHomeostasis)`, `chemdner_TEXT:MESH:D006001)`, `bionlp_st_2013_gro_RE:encodes)`, `biorelex_ner:B-fusion-protein)`, `mlee_COREF:None)`, `chemdner_TEXT:MESH:D001623)`, `chemdner_TEXT:MESH:D000812)`, `medmentions_full_ner:B-T046)`, `bionlp_shared_task_2009_NER:O)`, `chemdner_TEXT:MESH:D000735)`, `gnormplus_ner:O)`, `chemdner_TEXT:MESH:D014635)`, `bionlp_st_2013_gro_NER:B-Mitosis)`, `chemdner_TEXT:MESH:D003847)`, `chemdner_TEXT:MESH:D002809)`, `medmentions_full_ner:I-T116)`, `chemdner_TEXT:MESH:D060406)`, `chemprot_ner:B-CHEMICAL)`, `chemdner_TEXT:MESH:D016642)`, `bionlp_st_2013_cg_NER:B-Phosphorylation)`, `an_em_ner:B-Organ)`, `chemdner_TEXT:MESH:D013431)`, `bionlp_shared_task_2009_RE:None)`, `medmentions_full_ner:B-T041)`, `mlee_ner:I-Tissue)`, `chemdner_TEXT:MESH:D023303)`, `ebm_pico_ner:I-Participant_Condition)`, `bionlp_st_2013_gro_ner:I-TATAbox)`, `bionlp_st_2013_gro_ner:I-bZIP)`, `bionlp_st_2011_epi_RE:Sidechain)`, `bionlp_st_2013_gro_ner:B-LivingEntity)`, `mantra_gsc_en_medline_ner:B-CHEM)`, `chemdner_TEXT:MESH:D007659)`, `medmentions_full_ner:I-T085)`, `bionlp_st_2013_cg_ner:I-Organism_substance)`, `medmentions_full_ner:B-T067)`, `chemdner_TEXT:MESH:D057846)`, `bionlp_st_2013_gro_NER:I-SignalingPathway)`, `bc5cdr_ner:I-Chemical)`, `nlm_gene_ner:I-STARGENE)`, `medmentions_full_ner:B-T090)`, `medmentions_full_ner:I-T037)`, `medmentions_full_ner:B-T037)`, 
`minimayosrs_sts:6)`, `medmentions_full_ner:I-T020)`, `chebi_nactem_fullpaper_ner:B-Species)`, `mirna_ner:O)`, `bionlp_st_2011_id_RE:Participant)`, `bionlp_st_2013_ge_NER:B-Binding)`, `ddi_corpus_ner:B-DRUG)`, `medmentions_full_ner:I-T078)`, `chemdner_TEXT:MESH:D012965)`, `bionlp_st_2013_cg_ner:I-Organ)`, `bionlp_st_2011_id_NER:B-Binding)`, `chemdner_TEXT:MESH:D006571)`, `mayosrs_sts:4)`, `chemdner_TEXT:MESH:D026422)`, `genia_term_corpus_ner:I-RNA_NA)`, `bionlp_st_2011_epi_RE:None)`, `chemdner_TEXT:MESH:D012265)`, `medmentions_full_ner:B-T195)`, `chemdner_TEXT:MESH:D014443)`, `bionlp_st_2013_gro_ner:I-OrganicChemical)`, `ebm_pico_ner:B-Participant_Age)`, `chemdner_TEXT:MESH:D009584)`, `chemdner_TEXT:MESH:D010862)`, `verspoor_2013_ner:B-Concepts_Ideas)`, `bionlp_st_2013_gro_NER:B-ActivationOfProcess)`, `chemdner_TEXT:MESH:D010118)`, `pharmaconer_ner:I-PROTEINAS)`, `biorelex_COREF:coref)`, `bionlp_st_2013_gro_ner:I-Enzyme)`, `chemdner_TEXT:MESH:D012530)`, `chemdner_TEXT:MESH:D002351)`, `biorelex_ner:B-gene)`, `chemdner_TEXT:MESH:D013213)`, `medmentions_full_ner:B-T103)`, `chemdner_TEXT:MESH:D010091)`, `ebm_pico_ner:B-Participant_Sex)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndDNA)`, `bionlp_st_2013_gro_ner:B-Phenotype)`, `chemdner_TEXT:MESH:D019791)`, `chemdner_TEXT:MESH:D014280)`, `chemdner_TEXT:MESH:D011094)`, `chia_RE:None)`, `biorelex_RE:None)`, `chemdner_TEXT:MESH:D005230)`, `verspoor_2013_ner:B-cohort-patient)`, `chemdner_TEXT:MESH:D013645)`, `bionlp_st_2013_gro_ner:B-SecondMessenger)`, `mlee_ner:B-Cellular_component)`, `bionlp_shared_task_2009_NER:I-Phosphorylation)`, `mlee_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D017275)`, `chemdner_TEXT:MESH:D007053)`, `bionlp_st_2013_ge_RE:Site)`, `genia_term_corpus_ner:O)`, `chemprot_RE:CPR:6)`, `chemdner_TEXT:MESH:D006859)`, `genia_term_corpus_ner:I-other_name)`, `medmentions_full_ner:I-T042)`, `pdr_ner:O)`, `medmentions_full_ner:I-T057)`, `bionlp_st_2013_pc_RE:Product)`, `verspoor_2013_ner:B-size)`, 
`bionlp_st_2013_pc_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T017)`, `chia_ner:B-Temporal)`, `chemdner_TEXT:MESH:D003404)`, `bionlp_st_2013_gro_RE:None)`, `bionlp_shared_task_2009_NER:B-Gene_expression)`, `mqp_sts:3)`, `bionlp_st_2013_gro_ner:B-Chemical)`, `chemdner_TEXT:MESH:D013754)`, `mantra_gsc_en_medline_ner:B-GEOG)`, `mirna_ner:B-Specific_miRNAs)`, `chemdner_TEXT:MESH:D012492)`, `medmentions_full_ner:B-T190)`, `bionlp_st_2013_cg_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:B-RNA)`, `chemdner_TEXT:MESH:D011743)`, `chemdner_TEXT:MESH:D010795)`, `bionlp_st_2013_gro_NER:I-PositiveRegulation)`, `chemdner_TEXT:MESH:D002241)`, `medmentions_full_ner:B-T038)`, `mlee_ner:B-Organism)`, `medmentions_full_ner:I-T168)`, `bioscope_abstracts_ner:O)`, `chemdner_TEXT:MESH:D002599)`, `bionlp_st_2013_pc_ner:I-Simple_chemical)`, `medmentions_full_ner:I-T066)`, `chemdner_TEXT:MESH:D019695)`, `bionlp_st_2013_ge_NER:I-Transcription)`, `pharmaconer_ner:I-NO_NORMALIZABLES)`, `mantra_gsc_en_emea_ner:B-DISO)`, `bionlp_st_2013_gro_NER:B-CellDeath)`, `medmentions_st21pv_ner:I-T031)`, `chemdner_TEXT:MESH:D004317)`, `bionlp_st_2013_gro_ner:B-TATAbox)`, `chemdner_TEXT:MESH:D052203)`, `bionlp_st_2013_gro_NER:B-CellFateDetermination)`, `medmentions_st21pv_ner:I-T022)`, `bionlp_st_2013_ge_NER:B-Protein_catabolism)`, `bionlp_st_2011_epi_NER:I-Catalysis)`, `verspoor_2013_ner:I-cohort-patient)`, `chemdner_TEXT:MESH:D010100)`, `an_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D045162)`, `chia_RE:Has_qualifier)`, `verspoor_2013_RE:has)`, `chemdner_TEXT:MESH:D021382)`, `bionlp_st_2013_ge_NER:B-Acetylation)`, `medmentions_full_ner:I-T079)`, `bionlp_st_2013_gro_NER:B-Maintenance)`, `biorelex_ner:I-protein-domain)`, `chebi_nactem_abstr_ann1_ner:I-Chemical)`, `bioscope_papers_ner:O)`, `chia_RE:Has_scope)`, `bc5cdr_ner:B-Disease)`, `mlee_ner:I-Cellular_component)`, `medmentions_full_ner:I-T195)`, `spl_adr_200db_train_ner:B-AdverseReaction)`, 
`bionlp_st_2013_gro_ner:I-Promoter)`, `medmentions_full_ner:B-T040)`, `chemdner_TEXT:MESH:D005960)`, `chemdner_TEXT:MESH:D004164)`, `chemdner_TEXT:MESH:D015032)`, `chemdner_TEXT:MESH:D014255)`, `ebm_pico_ner:B-Outcome_Pain)`, `bionlp_st_2013_gro_ner:I-UpstreamRegulatorySequence)`, `meddocan_ner:I-CALLE)`, `bionlp_st_2013_pc_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:I-Regulation)`, `chemdner_TEXT:MESH:D001151)`, `medmentions_full_ner:I-T077)`, `chemdner_TEXT:MESH:D000081)`, `bionlp_st_2013_gro_NER:B-Stabilization)`, `mayosrs_sts:1)`, `biorelex_ner:B-mutation)`, `chemdner_TEXT:MESH:D000241)`, `chemdner_TEXT:MESH:D007930)`, `bionlp_st_2013_gro_NER:B-MetabolicPathway)`, `chemdner_TEXT:MESH:D013629)`, `chemdner_TEXT:MESH:D016202)`, `tmvar_v1_ner:I-DNAMutation)`, `chemdner_TEXT:MESH:D012502)`, `chemdner_TEXT:MESH:D044945)`, `bionlp_st_2013_cg_ner:I-Cellular_component)`, `mlee_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D002338)`, `mayosrs_sts:5)`, `bionlp_st_2013_gro_ner:B-Intron)`, `genia_term_corpus_ner:I-DNA_domain_or_region)`, `anat_em_ner:I-Immaterial_anatomical_entity)`, `bionlp_st_2013_gro_ner:B-MutatedProtein)`, `ebm_pico_ner:I-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D005047)`, `chia_ner:B-Mood)`, `medmentions_st21pv_ner:O)`, `cellfinder_ner:I-Species)`, `bionlp_st_2013_gro_ner:I-InorganicChemical)`, `bionlp_st_2011_id_ner:B-Entity)`, `bionlp_st_2013_cg_NER:I-Catabolism)`, `an_em_ner:I-Cellular_component)`, `medmentions_full_ner:B-T021)`, `bionlp_st_2013_gro_NER:B-Heterodimerization)`, `chemdner_TEXT:MESH:D008315)`, `medmentions_st21pv_ner:I-T170)`, `chemdner_TEXT:MESH:D050112)`, `meddocan_ner:I-ID_ASEGURAMIENTO)`, `chia_RE:Subsumes)`, `medmentions_full_ner:I-T099)`, `bionlp_st_2013_gro_ner:I-Protein)`, `chemdner_TEXT:MESH:D047071)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorActivity)`, `mlee_ner:B-Organism_subdivision)`, 
`chemdner_TEXT:MESH:D016559)`, `medmentions_full_ner:B-T129)`, `genia_term_corpus_ner:I-protein_molecule)`, `mlee_ner:B-Drug_or_compound)`, `bionlp_st_2013_gro_NER:B-Silencing)`, `bionlp_st_2013_gro_ner:I-MolecularStructure)`, `genia_term_corpus_ner:B-nucleotide)`, `chemdner_TEXT:MESH:D003042)`, `mantra_gsc_en_emea_ner:B-ANAT)`, `meddocan_ner:I-SEXO_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D006690)`, `genia_term_corpus_ner:I-ANDcell_linecell_line)`, `meddocan_ner:I-OTROS_SUJETO_ASISTENCIA)`, `chemdner_TEXT:MESH:D005473)`, `mantra_gsc_en_medline_ner:I-PHYS)`, `bionlp_st_2013_cg_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-BetaScaffoldDomain_WithMinorGrooveContacts)`, `chemdner_TEXT:MESH:D001549)`, `chia_ner:B-Measurement)`, `bionlp_st_2011_id_ner:B-Regulon-operon)`, `bionlp_st_2013_cg_NER:B-Acetylation)`, `pdr_ner:B-Plant)`, `mlee_NER:B-Development)`, `linnaeus_filtered_ner:B-species)`, `bionlp_st_2013_pc_RE:AtLoc)`, `medmentions_full_ner:I-T192)`, `bionlp_st_2013_gro_ner:B-BindingSiteOfProtein)`, `bionlp_st_2013_ge_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_ner:I-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D009647)`, `bionlp_st_2013_gro_ner:I-Ligand)`, `bionlp_st_2011_id_ner:O)`, `bionlp_st_2013_gro_NER:I-RNASplicing)`, `bionlp_st_2013_gro_ner:I-ComplexOfProteinAndRNA)`, `bionlp_st_2011_id_NER:B-Gene_expression)`, `meddocan_ner:I-HOSPITAL)`, `chemdner_TEXT:MESH:D007501)`, `ehr_rel_sts:5)`, `bionlp_st_2013_gro_ner:B-TranscriptionRegulator)`, `medmentions_full_ner:B-T089)`, `bionlp_st_2011_epi_NER:I-DNA_demethylation)`, `mirna_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-TranscriptionRegulator)`, `bionlp_st_2013_gro_NER:B-ProteinBiosynthesis)`, `scai_chemical_ner:B-ABBREVIATION)`, `bionlp_st_2013_gro_ner:I-Virus)`, `bionlp_st_2011_ge_NER:O)`, `medmentions_full_ner:B-T203)`, `bionlp_st_2013_cg_NER:I-Mutation)`, `bionlp_st_2013_gro_ner:B-ThreeDimensionalMolecularStructure)`, `genetaggold_ner:I-NEWGENE)`, `chemdner_TEXT:MESH:D010705)`, 
`chia_ner:I-Mood)`, `medmentions_full_ner:I-T068)`, `minimayosrs_sts:4)`, `medmentions_full_ner:I-T097)`, `bionlp_st_2013_gro_ner:I-BetaScaffoldDomain_WithMinorGrooveContacts)`, `mantra_gsc_en_emea_ner:I-PHYS)`, `medmentions_full_ner:I-T104)`, `bio_sim_verb_sts:5)`, `chebi_nactem_abstr_ann1_ner:B-Biological_Activity)`, `bionlp_st_2013_gro_NER:B-IntraCellularProcess)`, `mantra_gsc_en_emea_ner:I-PHEN)`, `mlee_ner:B-Cell)`, `chemdner_TEXT:MESH:D045784)`, `bionlp_st_2013_gro_ner:I-Vitamin)`, `chemdner_TEXT:MESH:D010416)`, `bionlp_st_2013_gro_ner:B-FusionGene)`, `bionlp_st_2013_gro_ner:I-FusionProtein)`, `mlee_NER:B-Remodeling)`, `minimayosrs_sts:8)`, `bionlp_st_2013_gro_ner:B-Enhancer)`, `mantra_gsc_en_emea_ner:O)`, `bionlp_st_2013_gro_ner:B-OpenReadingFrame)`, `bionlp_st_2013_pc_COREF:None)`, `medmentions_full_ner:I-T123)`, `bionlp_st_2013_gro_NER:I-RegulatoryProcess)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfGeneExpression)`, `nlm_gene_ner:B-Domain)`, `bionlp_st_2013_pc_NER:B-Methylation)`, `medmentions_full_ner:B-T057)`, `chemdner_TEXT:MESH:D010226)`, `bionlp_st_2013_gro_ner:B-GeneProduct)`, `ebm_pico_ner:I-Outcome_Other)`, `chemdner_TEXT:MESH:D005223)`, `pdr_RE:Theme)`, `bionlp_shared_task_2009_NER:B-Protein_catabolism)`, `chemdner_TEXT:MESH:D019344)`, `gnormplus_ner:I-FamilyName)`, `verspoor_2013_ner:B-gender)`, `bionlp_st_2013_gro_NER:B-TranscriptionInitiation)`, `spl_adr_200db_train_ner:B-Severity)`, `medmentions_st21pv_ner:B-T097)`, `anat_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_NER:I-RNAMetabolism)`, `bioinfer_ner:I-Protein_complex)`, `anat_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:B-ProteinDomain)`, `bionlp_st_2013_gro_ner:I-PrimaryStructure)`, `genia_term_corpus_ner:I-other_artificial_source)`, `chemdner_TEXT:MESH:D010098)`, `bionlp_st_2013_gro_ner:I-Enhancer)`, `bionlp_st_2013_gro_ner:I-PositiveTranscriptionRegulator)`, `chemdner_TEXT:MESH:D004051)`, `chemdner_TEXT:MESH:D013853)`, `chebi_nactem_fullpaper_ner:B-Metabolite)`, 
`diann_iber_eval_en_ner:B-Disability)`, `biorelex_ner:B-peptide)`, `medmentions_full_ner:B-T048)`, `bionlp_st_2013_gro_ner:I-Function)`, `genia_term_corpus_ner:I-DNA_NA)`, `mlee_ner:I-Anatomical_system)`, `bioinfer_ner:B-Individual_protein)`, `verspoor_2013_ner:I-Physiology)`, `genia_term_corpus_ner:I-RNA_molecule)`, `chemdner_TEXT:MESH:D000255)`, `minimayosrs_sts:7)`, `mlee_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-ResponseProcess)`, `mantra_gsc_en_medline_ner:I-LIVB)`, `chemdner_TEXT:MESH:D010649)`, `seth_corpus_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-Attenuator)`, `chemdner_TEXT:MESH:D015363)`, `bionlp_st_2013_pc_NER:B-Inactivation)`, `medmentions_full_ner:I-T191)`, `mlee_ner:I-Organ)`, `chemdner_TEXT:MESH:D011765)`, `bionlp_shared_task_2009_NER:B-Binding)`, `an_em_ner:B-Cellular_component)`, `genia_term_corpus_ner:I-RNA_substructure)`, `medmentions_full_ner:B-T051)`, `anat_em_ner:I-Pathological_formation)`, `chemdner_TEXT:MESH:D013634)`, `chemdner_TEXT:MESH:D014414)`, `chia_RE:Has_index)`, `ddi_corpus_ner:B-GROUP)`, `bionlp_st_2013_gro_ner:B-MutantProtein)`, `bionlp_st_2013_ge_NER:I-Negative_regulation)`, `biorelex_ner:I-amino-acid)`, `chemdner_TEXT:MESH:D053279)`, `chemprot_RE:CPR:2)`, `bionlp_st_2013_gro_ner:B-bHLHTF)`, `bionlp_st_2013_cg_NER:I-Breakdown)`, `scai_chemical_ner:I-ABBREVIATION)`, `pdr_NER:B-Cause_of_disease)`, `chemdner_TEXT:MESH:D002219)`, `medmentions_full_ner:B-T044)`, `mirna_ner:B-Non-Specific_miRNAs)`, `chemdner_TEXT:MESH:D020748)`, `bionlp_shared_task_2009_RE:Theme)`, `chemdner_TEXT:MESH:D001647)`, `bionlp_st_2011_ge_NER:I-Regulation)`, `bionlp_st_2013_pc_ner:B-Gene_or_gene_product)`, `biorelex_ner:I-protein)`, `mantra_gsc_en_medline_ner:B-PROC)`, `medmentions_full_ner:I-T081)`, `medmentions_st21pv_ner:B-T022)`, `chia_ner:B-Multiplier)`, `bionlp_st_2013_gro_NER:B-GeneMutation)`, `chemdner_TEXT:MESH:D002232)`, `chemdner_TEXT:MESH:D010456)`, `biosses_sts:7)`, `medmentions_full_ner:B-T071)`, `chemdner_TEXT:MESH:D008628)`, 
`cadec_ner:O)`, `biorelex_ner:I-protein-complex)`, `chemdner_TEXT:MESH:D007328)`, `bionlp_st_2013_pc_NER:I-Activation)`, `bionlp_st_2013_cg_NER:B-Metabolism)`, `scai_chemical_ner:I-PARTIUPAC)`, `verspoor_2013_ner:B-age)`, `medmentions_full_ner:B-T122)`, `medmentions_full_ner:I-T050)`, `genia_term_corpus_ner:B-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:B-SPhase)`, `chemdner_TEXT:MESH:D012500)`, `mlee_NER:B-Metabolism)`, `bionlp_st_2011_id_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D002794)`, `bionlp_st_2013_gro_NER:B-ProteinTransport)`, `chemdner_TEXT:MESH:D006028)`, `chemdner_TEXT:MESH:D009822)`, `bionlp_st_2013_cg_ner:I-Cancer)`, `bionlp_shared_task_2009_ner:I-Entity)`, `pcr_ner:B-Herb)`, `pubmed_qa_labeled_fold0_CLF:yes)`, `bionlp_st_2013_gro_NER:I-NegativeRegulation)`, `bionlp_st_2013_cg_NER:B-Dephosphorylation)`, `anat_em_ner:B-Multi-tissue_structure)`, `chemdner_TEXT:MESH:D008274)`, `medmentions_full_ner:B-T025)`, `chemprot_RE:CPR:9)`, `bionlp_st_2013_pc_RE:Participant)`, `bionlp_st_2013_pc_ner:B-Simple_chemical)`, `genia_term_corpus_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-bZIP)`, `bionlp_st_2013_gro_ner:I-Eukaryote)`, `bionlp_st_2013_pc_ner:I-Complex)`, `hprd50_ner:I-protein)`, `medmentions_full_ner:B-T020)`, `bionlp_st_2013_gro_ner:B-Agonist)`, `medmentions_full_ner:B-T030)`, `chemdner_TEXT:MESH:D009536)`, `medmentions_full_ner:B-T169)`, `genia_term_corpus_ner:I-nucleotide)`, `bionlp_st_2013_gro_NER:I-ProteinCatabolism)`, `bc5cdr_ner:O)`, `chemdner_TEXT:MESH:D003078)`, `medmentions_full_ner:I-T040)`, `chemdner_TEXT:MESH:D005963)`, `bionlp_st_2013_gro_ner:B-ExpressionProfiling)`, `mantra_gsc_en_emea_ner:I-DEVI)`, `mlee_NER:B-Cell_division)`, `ebm_pico_ner:B-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D008790)`, `mantra_gsc_en_emea_ner:I-ANAT)`, `mantra_gsc_en_medline_ner:B-ANAT)`, `chemdner_TEXT:MESH:D003545)`, `bionlp_st_2013_gro_NER:I-IntraCellularTransport)`, `bionlp_st_2013_gro_NER:I-CellDivision)`, 
`chemdner_TEXT:MESH:D013438)`, `bionlp_st_2011_id_NER:I-Negative_regulation)`, `bionlp_st_2013_gro_NER:I-DevelopmentalProcess)`, `mlee_ner:B-Protein_domain_or_region)`, `chemdner_TEXT:MESH:D014978)`, `bionlp_st_2011_id_NER:O)`, `bionlp_st_2013_gro_ner:I-ReporterGeneConstruction)`, `medmentions_full_ner:I-T025)`, `bionlp_st_2019_bb_RE:Exhibits)`, `ddi_corpus_ner:I-GROUP)`, `chemdner_TEXT:MESH:D011241)`, `chemdner_TEXT:MESH:D010446)`, `bionlp_st_2013_gro_ner:I-ExperimentalMethod)`, `anat_em_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000470)`, `bionlp_st_2013_pc_NER:I-Inactivation)`, `bionlp_st_2013_gro_ner:I-Agonist)`, `medmentions_full_ner:B-T024)`, `mlee_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Deglycosylation)`, `bionlp_st_2013_cg_NER:B-Cell_death)`, `chemdner_TEXT:MESH:D000266)`, `chemdner_TEXT:MESH:D019833)`, `genia_term_corpus_ner:I-RNA_family_or_group)`, `biosses_sts:8)`, `lll_RE:genic_interaction)`, `bionlp_st_2013_gro_ner:B-OrganicChemical)`, `chemdner_TEXT:MESH:D013267)`, `bionlp_st_2013_gro_ner:I-TranscriptionCofactor)`, `biorelex_ner:B-protein-region)`, `chemdner_TEXT:MESH:D001565)`, `genia_term_corpus_ner:B-cell_line)`, `bionlp_st_2013_gro_NER:B-Cleavage)`, `ddi_corpus_RE:EFFECT)`, `bionlp_st_2013_cg_NER:B-Planned_process)`, `bionlp_st_2013_cg_ner:I-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D007660)`, `medmentions_full_ner:I-T090)`, `bionlp_st_2013_gro_ner:I-CpGIsland)`, `bionlp_st_2013_gro_ner:B-AminoAcid)`, `chemdner_TEXT:MESH:D001095)`, `mlee_NER:I-Death)`, `meddocan_ner:I-EDAD_SUJETO_ASISTENCIA)`, `bionlp_st_2013_cg_ner:I-Anatomical_system)`, `bionlp_st_2013_gro_NER:B-Decrease)`, `bionlp_st_2013_pc_NER:B-Hydroxylation)`, `chemdner_TEXT:None)`, `bio_sim_verb_sts:3)`, `biorelex_ner:B-protein)`, `bionlp_st_2013_gro_ner:I-BasicDomain)`, `bionlp_st_2011_ge_ner:I-Entity)`, `bionlp_st_2013_gro_ner:B-PhysicalContinuant)`, `chemprot_RE:CPR:4)`, `chemdner_TEXT:MESH:D003345)`, `chemdner_TEXT:MESH:D010080)`, `mantra_gsc_en_patents_ner:O)`, 
`bionlp_st_2013_gro_ner:B-AntisenseRNA)`, `bionlp_st_2013_gro_ner:B-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D010768)`, `chebi_nactem_fullpaper_ner:I-Protein)`, `genia_term_corpus_ner:I-multi_cell)`, `bionlp_st_2013_gro_ner:I-Gene)`, `medmentions_full_ner:B-T042)`, `chemdner_TEXT:MESH:D006034)`, `biorelex_ner:I-brand)`, `chebi_nactem_abstr_ann1_ner:I-Species)`, `chemdner_TEXT:MESH:D012236)`, `bionlp_st_2013_gro_ner:I-GeneProduct)`, `chemdner_TEXT:MESH:D005665)`, `chemdner_TEXT:MESH:D008715)`, `medmentions_st21pv_ner:I-T103)`, `ddi_corpus_RE:None)`, `medmentions_st21pv_ner:I-T091)`, `chemdner_TEXT:MESH:D019158)`, `chemdner_TEXT:MESH:D001280)`, `chemdner_TEXT:MESH:D009249)`, `medmentions_full_ner:I-T067)`, `medmentions_full_ner:B-T005)`, `meddocan_ner:O)`, `bionlp_st_2013_cg_NER:I-Remodeling)`, `meddocan_ner:B-ID_EMPLEO_PERSONAL_SANITARIO)`, `chemdner_TEXT:MESH:D000166)`, `osiris_ner:B-variant)`, `spl_adr_200db_train_ner:I-DrugClass)`, `mirna_ner:I-Species)`, `medmentions_st21pv_ner:I-T033)`, `ebm_pico_ner:I-Participant_Age)`, `medmentions_full_ner:B-T095)`, `bionlp_st_2013_gro_NER:B-RNAMetabolism)`, `chemdner_TEXT:MESH:D005231)`, `medmentions_full_ner:B-T062)`, `bionlp_st_2011_ge_NER:I-Gene_expression)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactor)`, `genia_term_corpus_ner:B-protein_domain_or_region)`, `mantra_gsc_en_emea_ner:B-PROC)`, `mlee_NER:I-Pathway)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToProteinBindingSiteOfProtein)`, `bionlp_st_2011_id_COREF:coref)`, `biosses_sts:6)`, `biorelex_ner:I-organism)`, `chia_ner:B-Value)`, `verspoor_2013_ner:B-body-part)`, `chemdner_TEXT:MESH:D004974)`, `chia_RE:Has_mood)`, `medmentions_st21pv_ner:B-T074)`, `chemdner_TEXT:MESH:D000535)`, `verspoor_2013_ner:I-Disorder)`, `bionlp_st_2013_gro_NER:B-BindingToMolecularEntity)`, `bionlp_st_2013_gro_ner:I-ReporterGene)`, `mayosrs_sts:8)`, `bionlp_st_2013_cg_ner:I-DNA_domain_or_region)`, `bionlp_st_2013_gro_NER:I-Pathway)`, `medmentions_st21pv_ner:I-T168)`, 
`bionlp_st_2013_gro_NER:B-NegativeRegulation)`, `medmentions_full_ner:B-T123)`, `bionlp_st_2013_pc_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-FormationOfProteinDNAComplex)`, `chemdner_TEXT:MESH:D000577)`, `mlee_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D003630)`, `bionlp_st_2013_gro_ner:B-Transcript)`, `bionlp_st_2013_cg_NER:I-Transcription)`, `anat_em_ner:B-Organ)`, `anat_em_ner:I-Organism_substance)`, `spl_adr_200db_train_ner:B-DrugClass)`, `bionlp_st_2013_gro_ner:I-ProteinSubunit)`, `biorelex_ner:B-protein-domain)`, `chemdner_TEXT:MESH:D006051)`, `bionlp_st_2011_id_NER:B-Process)`, `bionlp_st_2013_pc_NER:B-Ubiquitination)`, `bionlp_st_2013_pc_NER:B-Transcription)`, `chemdner_TEXT:MESH:D006838)`, `cadec_ner:I-Disease)`, `bionlp_st_2013_ge_NER:B-Localization)`, `pharmaconer_ner:B-NO_NORMALIZABLES)`, `chemdner_TEXT:MESH:D011759)`, `chemdner_TEXT:MESH:D053243)`, `biorelex_ner:I-mutation)`, `mantra_gsc_en_emea_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Transport)`, `bionlp_st_2011_id_RE:Site)`, `chemdner_TEXT:MESH:D015474)`, `bionlp_st_2013_gro_NER:B-Dimerization)`, `bionlp_st_2013_cg_NER:I-Localization)`, `medmentions_full_ner:I-T032)`, `chemdner_TEXT:MESH:D018036)`, `meddocan_ner:B-FECHAS)`, `medmentions_full_ner:I-T167)`, `chemprot_RE:CPR:5)`, `minimayosrs_sts:2)`, `biorelex_ner:B-protein-DNA-complex)`, `cellfinder_ner:I-CellComponent)`, `nlm_gene_ner:B-Other)`, `medmentions_full_ner:I-T019)`, `chebi_nactem_abstr_ann1_ner:B-Spectral_Data)`, `bionlp_st_2013_cg_ner:I-Multi-tissue_structure)`, `medmentions_full_ner:B-T010)`, `mantra_gsc_en_medline_ner:I-GEOG)`, `chemprot_ner:I-GENE-Y)`, `mirna_ner:I-Diseases)`, `an_em_ner:O)`, `bionlp_st_2013_cg_NER:B-Remodeling)`, `medmentions_st21pv_ner:I-T058)`, `scicite_TEXT:background)`, `bionlp_st_2013_cg_NER:B-Mutation)`, `genia_term_corpus_ner:B-mono_cell)`, `bionlp_st_2013_gro_ner:B-DNA)`, `medmentions_full_ner:I-T114)`, `bionlp_st_2011_id_RE:Theme)`, `genetaggold_ner:B-NEWGENE)`, 
`mlee_ner:I-Organism_subdivision)`, `sciq_CLF:yes)`, `bionlp_shared_task_2009_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:B-Microorganism)`, `chemdner_TEXT:MESH:D006108)`, `biorelex_ner:B-amino-acid)`, `bioinfer_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-Chemical)`, `mantra_gsc_en_patents_ner:I-DEVI)`, `mantra_gsc_en_medline_ner:O)`, `bionlp_st_2013_pc_NER:I-Regulation)`, `medmentions_full_ner:B-T043)`, `scicite_TEXT:result)`, `bionlp_st_2013_ge_NER:I-Binding)`, `meddocan_ner:I-INSTITUCION)`, `chemdner_TEXT:MESH:D011441)`, `genia_term_corpus_ner:I-protein_domain_or_region)`, `bionlp_st_2011_epi_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Nucleosome)`, `chemdner_TEXT:MESH:D011223)`, `chebi_nactem_abstr_ann1_ner:B-Protein)`, `bionlp_st_2013_gro_RE:hasFunction)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorActivity)`, `biorelex_ner:B-protein-family)`, `bionlp_st_2013_cg_ner:B-Gene_or_gene_product)`, `tmvar_v1_ner:B-SNP)`, `bionlp_st_2013_gro_ner:B-ExperimentalMethod)`, `bionlp_st_2013_gro_ner:B-ReporterGeneConstruction)`, `bionlp_st_2011_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D004041)`, `chemdner_TEXT:MESH:D000631)`, `meddocan_ner:I-ID_EMPLEO_PERSONAL_SANITARIO)`, `chebi_nactem_fullpaper_ner:I-Species)`, `medmentions_full_ner:B-T170)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelix)`, `bionlp_st_2013_cg_ner:B-Organism_subdivision)`, `genia_term_corpus_ner:I-DNA_molecule)`, `bionlp_st_2013_cg_NER:I-Glycolysis)`, `an_em_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_NER:B-TranscriptionTermination)`, `bionlp_st_2013_gro_NER:B-CellAging)`, `bionlp_st_2013_cg_ner:B-Protein_domain_or_region)`, `anat_em_ner:B-Organism_substance)`, `medmentions_full_ner:B-T053)`, `mlee_ner:B-Multi-tissue_structure)`, `biosses_sts:4)`, `bioscope_abstracts_ner:I-speculation)`, `chemdner_TEXT:MESH:D053644)`, `bionlp_st_2013_cg_NER:I-Translation)`, `tmvar_v1_ner:B-DNAMutation)`, `genia_term_corpus_ner:B-RNA_substructure)`, `an_em_ner:B-Anatomical_system)`, 
`bionlp_st_2013_gro_ner:B-Conformation)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T069)`, `chemdner_TEXT:MESH:D006820)`, `chemdner_TEXT:MESH:D015725)`, `chemdner_TEXT:MESH:D010281)`, `mlee_NER:B-Pathway)`, `bionlp_st_2011_id_NER:I-Regulation)`, `bionlp_st_2013_gro_NER:I-GeneExpression)`, `medmentions_full_ner:I-T073)`, `biosses_sts:2)`, `medmentions_full_ner:I-T043)`, `chemdner_TEXT:MESH:D001152)`, `bionlp_st_2013_gro_ner:I-DNAMolecule)`, `chemdner_TEXT:MESH:D015636)`, `chemdner_TEXT:MESH:D000666)`, `chemprot_RE:None)`, `bionlp_st_2013_gro_ner:B-Sequence)`, `chemdner_TEXT:MESH:D009151)`, `chia_ner:B-Observation)`, `an_em_COREF:coref)`, `medmentions_full_ner:B-T120)`, `bionlp_st_2013_gro_ner:B-Tissue)`, `bionlp_st_2013_gro_ner:B-MolecularEntity)`, `bionlp_st_2013_pc_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D044242)`, `bionlp_st_2013_gro_ner:B-FusionProtein)`, `biorelex_ner:B-cell)`, `bionlp_st_2013_gro_NER:B-Disease)`, `bionlp_st_2011_id_RE:None)`, `biorelex_ner:B-protein-motif)`, `bionlp_st_2013_pc_NER:I-Localization)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_ner:B-Locus)`, `genia_term_corpus_ner:B-other_organic_compound)`, `seth_corpus_ner:B-SNP)`, `pcr_ner:O)`, `genia_term_corpus_ner:I-virus)`, `bionlp_st_2013_gro_ner:I-Peptide)`, `chebi_nactem_abstr_ann1_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:B-RNAMolecule)`, `bionlp_st_2013_gro_ner:B-SequenceHomologyAnalysis)`, `chemdner_TEXT:MESH:D005054)`, `bionlp_st_2013_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-CellularProcess)`, `bionlp_st_2013_ge_RE:Site2)`, `verspoor_2013_ner:B-Phenomena)`, `chia_ner:I-Temporal)`, `bionlp_st_2013_gro_NER:I-Localization)`, `bionlp_st_2013_cg_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D009020)`, `bionlp_st_2013_cg_RE:FromLoc)`, `mlee_ner:B-Organism_substance)`, `genia_term_corpus_ner:I-tissue)`, `medmentions_st21pv_ner:I-T082)`, `chemdner_TEXT:MESH:D054358)`, 
`medmentions_full_ner:I-T052)`, `chemdner_TEXT:MESH:D005459)`, `chemdner_TEXT:MESH:D047188)`, `medmentions_full_ner:I-T031)`, `chemdner_TEXT:MESH:D013890)`, `chemdner_TEXT:MESH:D004573)`, `genia_term_corpus_ner:B-peptide)`, `an_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-MessengerRNA)`, `medmentions_full_ner:B-T171)`, `bionlp_st_2013_gro_NER:B-Affecting)`, `genia_term_corpus_ner:I-body_part)`, `bionlp_st_2013_gro_ner:B-Prokaryote)`, `chemdner_TEXT:MESH:D013844)`, `medmentions_full_ner:I-T061)`, `bionlp_st_2013_pc_NER:B-Negative_regulation)`, `bionlp_st_2013_gro_ner:I-EukaryoticCell)`, `pdr_ner:I-Plant)`, `cadec_ner:I-ADR)`, `chemdner_TEXT:MESH:D024341)`, `medmentions_full_ner:I-T092)`, `chemdner_TEXT:MESH:D020319)`, `bionlp_st_2013_cg_NER:B-Cell_transformation)`, `bionlp_st_2013_gro_NER:B-BindingOfTranscriptionFactorToDNA)`, `an_em_ner:I-Anatomical_system)`, `bionlp_st_2011_epi_NER:B-Hydroxylation)`, `bionlp_st_2013_gro_ner:I-Exon)`, `cellfinder_ner:B-Species)`, `bionlp_st_2013_gro_NER:B-Pathway)`, `bionlp_st_2013_ge_NER:B-Protein_modification)`, `bionlp_st_2013_gro_ner:I-FusionGene)`, `bionlp_st_2011_rel_ner:B-Entity)`, `bionlp_st_2011_id_RE:CSite)`, `bionlp_st_2013_ge_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-BindingAssay)`, `bionlp_st_2013_gro_NER:B-CellDivision)`, `bionlp_st_2019_bb_ner:I-Microorganism)`, `medmentions_full_ner:I-T059)`, `chemdner_TEXT:MESH:D011108)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-GeneRegion)`, `bionlp_st_2013_cg_COREF:None)`, `chemdner_TEXT:MESH:D010261)`, `mlee_NER:B-Binding)`, `chemprot_ner:I-CHEMICAL)`, `bionlp_st_2011_id_RE:ToLoc)`, `biorelex_ner:I-organelle)`, `chemdner_TEXT:MESH:D004318)`, `genia_term_corpus_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-RNAPolymerase)`, `bionlp_st_2013_gro_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:B-RegulationOfGeneExpression)`, `bionlp_st_2013_gro_ner:B-Peptide)`, 
`bionlp_shared_task_2009_NER:B-Transcription)`, `biorelex_ner:B-tissue)`, `pico_extraction_ner:B-participant)`, `chia_ner:I-Visit)`, `chemdner_TEXT:MESH:D011807)`, `chemdner_TEXT:MESH:D014501)`, `bionlp_st_2013_gro_NER:I-IntraCellularProcess)`, `ehr_rel_sts:7)`, `pico_extraction_ner:I-intervention)`, `chemdner_TEXT:MESH:D001599)`, `bionlp_st_2013_gro_ner:I-RegulatoryDNARegion)`, `medmentions_st21pv_ner:I-T037)`, `chemdner_TEXT:MESH:D055768)`, `bionlp_st_2013_gro_ner:B-ChromosomalDNA)`, `chemdner_TEXT:MESH:D008550)`, `bionlp_st_2013_pc_RE:Site)`, `cadec_ner:B-ADR)`, `medmentions_full_ner:I-T087)`, `chemdner_TEXT:MESH:D001583)`, `bionlp_st_2011_epi_NER:B-Dehydroxylation)`, `ehr_rel_sts:3)`, `bionlp_st_2013_gro_ner:I-MutantProtein)`, `chemdner_TEXT:MESH:D011804)`, `medmentions_full_ner:B-T091)`, `bionlp_st_2013_cg_RE:CSite)`, `linnaeus_ner:O)`, `medmentions_st21pv_ner:B-T201)`, `verspoor_2013_ner:B-Disorder)`, `bionlp_st_2013_cg_NER:I-Death)`, `bioinfer_ner:I-Individual_protein)`, `medmentions_full_ner:B-T191)`, `verspoor_2013_ner:B-ethnicity)`, `chemdner_TEXT:MESH:D002083)`, `genia_term_corpus_ner:B-carbohydrate)`, `genia_term_corpus_ner:B-DNA_molecule)`, `medmentions_full_ner:B-T069)`, `pdr_NER:I-Treatment_of_disease)`, `mlee_ner:B-Anatomical_system)`, `chebi_nactem_fullpaper_ner:B-Spectral_Data)`, `cadec_ner:B-Disease)`, `chemdner_TEXT:MESH:D005419)`, `bionlp_st_2013_gro_ner:I-Nucleotide)`, `medmentions_full_ner:B-T194)`, `chemdner_TEXT:MESH:D005947)`, `chemdner_TEXT:MESH:D008627)`, `bionlp_st_2013_gro_NER:B-ExperimentalIntervention)`, `chemdner_TEXT:MESH:D011073)`, `chia_RE:Has_negation)`, `verspoor_2013_ner:I-mutation)`, `chemdner_TEXT:MESH:D004224)`, `chemdner_TEXT:MESH:D005663)`, `medmentions_full_ner:I-T094)`, `chemdner_TEXT:MESH:D006877)`, `ebm_pico_ner:B-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressor)`, `biorelex_ner:I-cell)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToDNA)`, `verspoor_2013_RE:None)`, 
`bionlp_st_2013_gro_NER:B-ProteinModification)`, `chemdner_TEXT:MESH:D047090)`, `medmentions_full_ner:I-T204)`, `chemdner_TEXT:MESH:D006843)`, `biorelex_ner:I-protein-family)`, `chemdner_TEXT:MESH:D012694)`, `bionlp_st_2013_gro_ner:B-TranslationFactor)`, `scai_chemical_ner:B-)`, `bionlp_st_2013_gro_ner:B-Exon)`, `medmentions_full_ner:I-T083)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivatorActivity)`, `meddocan_ner:I-NUMERO_TELEFONO)`, `medmentions_full_ner:I-T101)`, `medmentions_full_ner:B-T034)`, `bionlp_st_2013_gro_ner:I-Histone)`, `ddi_corpus_RE:MECHANISM)`, `mantra_gsc_en_emea_ner:I-PROC)`, `genia_term_corpus_ner:I-peptide)`, `bionlp_st_2013_cg_NER:B-Cell_proliferation)`, `meddocan_ner:I-PAIS)`, `chemdner_TEXT:MESH:D004140)`, `medmentions_full_ner:B-T083)`, `diann_iber_eval_en_ner:I-Disability)`, `bionlp_st_2013_gro_NER:B-PosttranslationalModification)`, `biorelex_ner:I-fusion-protein)`, `chemdner_TEXT:MESH:D020910)`, `chemdner_TEXT:MESH:D014747)`, `bionlp_st_2013_ge_NER:B-Gene_expression)`, `biorelex_ner:I-tissue)`, `mantra_gsc_en_patents_ner:B-LIVB)`, `medmentions_full_ner:O)`, `medmentions_full_ner:B-T077)`, `bionlp_st_2013_gro_ner:I-Operon)`, `chemdner_TEXT:MESH:D002392)`, `chemdner_TEXT:MESH:D014498)`, `chemdner_TEXT:MESH:D002368)`, `chemdner_TEXT:MESH:D018817)`, `bionlp_st_2013_ge_NER:I-Regulation)`, `genia_term_corpus_ner:B-atom)`, `chemdner_TEXT:MESH:D011092)`, `chemdner_TEXT:MESH:D015283)`, `chemdner_TEXT:MESH:D018698)`, `cadec_ner:I-Finding)`, `chemdner_TEXT:MESH:D009569)`, `muchmore_en_ner:I-umlsterm)`, `bionlp_st_2013_cg_NER:B-Death)`, `nlm_gene_ner:I-Other)`, `medmentions_full_ner:B-T109)`, `osiris_ner:I-variant)`, `ehr_rel_sts:6)`, `chemdner_TEXT:MESH:D001120)`, `mlee_ner:I-Protein_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Dissociation)`, `bionlp_st_2013_cg_NER:B-Metastasis)`, `chemdner_TEXT:MESH:D014204)`, `chemdner_TEXT:MESH:D005857)`, `medmentions_full_ner:I-T030)`, `chemdner_TEXT:MESH:D019256)`, `bionlp_st_2013_gro_ner:B-Polymerase)`, 
`chia_ner:B-Negation)`, `bionlp_st_2013_gro_NER:B-CellularMetabolicProcess)`, `bionlp_st_2013_gro_NER:B-CellDifferentiation)`, `biorelex_ner:I-protein-motif)`, `medmentions_full_ner:I-T093)`, `chemdner_TEXT:MESH:D019820)`, `anat_em_ner:B-Pathological_formation)`, `meddocan_ner:I-PROFESION)`, `bionlp_shared_task_2009_NER:B-Localization)`, `genia_term_corpus_ner:B-RNA_domain_or_region)`, `chemdner_TEXT:MESH:D014668)`, `bionlp_st_2013_pc_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D019207)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfDNA)`, `medmentions_full_ner:B-T059)`, `bionlp_st_2013_gro_ner:B-Ligand)`, `bio_sim_verb_sts:6)`, `biorelex_ner:B-experimental-construct)`, `bionlp_st_2013_gro_ner:I-DNA)`, `pdr_NER:O)`, `chemdner_TEXT:MESH:D008670)`, `bionlp_st_2011_ge_RE:Cause)`, `meddocan_ner:B-CALLE)`, `chemdner_TEXT:MESH:D015232)`, `bionlp_st_2013_pc_NER:O)`, `bionlp_st_2013_gro_NER:B-FormationOfProteinDNAComplex)`, `medmentions_full_ner:B-T121)`, `bionlp_shared_task_2009_NER:B-Regulation)`, `chemdner_TEXT:MESH:D009534)`, `chemdner_TEXT:MESH:D014451)`, `bionlp_st_2011_id_RE:AtLoc)`, `chemdner_TEXT:MESH:D011799)`, `medmentions_st21pv_ner:B-T204)`, `genia_term_corpus_ner:I-protein_subunit)`, `biorelex_ner:I-assay)`, `chemdner_TEXT:MESH:D005680)`, `an_em_ner:I-Organism_substance)`, `chemdner_TEXT:MESH:D010368)`, `chemdner_TEXT:MESH:D000872)`, `bionlp_st_2011_id_NER:I-Gene_expression)`, `bionlp_st_2013_cg_NER:B-Regulation)`, `mlee_ner:I-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D001393)`, `medmentions_full_ner:I-T038)`, `chemdner_TEXT:MESH:D047311)`, `chemdner_TEXT:MESH:D011453)`, `chemdner_TEXT:MESH:D020106)`, `chemdner_TEXT:MESH:D019257)`, `bionlp_st_2013_gro_ner:B-NuclearReceptor)`, `chemdner_TEXT:MESH:D002117)`, `genia_term_corpus_ner:B-lipid)`, `bionlp_st_2013_gro_ner:B-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D011205)`, `chemdner_TEXT:MESH:D002686)`, `bionlp_st_2013_gro_NER:B-Translation)`, 
`ebm_pico_ner:I-Intervention_Psychological)`, `mlee_ner:I-Drug_or_compound)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D000688)`, `bionlp_st_2011_ge_RE:None)`, `bionlp_st_2013_gro_ner:B-ProteinSubunit)`, `genia_term_corpus_ner:I-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:I-Heterodimerization)`, `pico_extraction_ner:B-intervention)`, `bionlp_st_2013_cg_ner:I-Organism)`, `bionlp_st_2013_gro_ner:I-ProteinDomain)`, `bionlp_st_2013_gro_NER:I-BindingToProtein)`, `scai_chemical_ner:I-)`, `biorelex_ner:B-experiment-tag)`, `ebm_pico_ner:B-Intervention_Physical)`, `bionlp_st_2013_cg_RE:ToLoc)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionFactorComplex)`, `linnaeus_ner:B-species)`, `medmentions_full_ner:I-T062)`, `chemdner_TEXT:MESH:D014640)`, `mlee_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008701)`, `mlee_NER:O)`, `chemdner_TEXT:MESH:D014302)`, `genia_term_corpus_ner:B-RNA_family_or_group)`, `medmentions_full_ner:I-T091)`, `medmentions_full_ner:B-T022)`, `medmentions_full_ner:B-T074)`, `bionlp_st_2013_gro_NER:B-ProteinCatabolism)`, `chemdner_TEXT:MESH:D011388)`, `bionlp_st_2013_ge_NER:I-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-CellAdhesion)`, `anat_em_ner:I-Organ)`, `medmentions_full_ner:B-T045)`, `chemdner_TEXT:MESH:D008727)`, `chebi_nactem_abstr_ann1_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-RNAPolymeraseII)`, `nlm_gene_ner:B-STARGENE)`, `mantra_gsc_en_emea_ner:B-OBJC)`, `meddocan_ner:B-PROFESION)`, `bionlp_st_2013_gro_ner:B-DNABindingDomainOfProtein)`, `chemdner_TEXT:MESH:D010636)`, `chemdner_TEXT:MESH:D004061)`, `mlee_NER:I-Binding)`, `medmentions_full_ner:B-T075)`, `medmentions_full_ner:B-UnknownType)`, `chemdner_TEXT:MESH:D019081)`, `bionlp_st_2013_gro_NER:I-Binding)`, `medmentions_full_ner:I-T005)`, `chemdner_TEXT:MESH:D009821)` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_bunsen_base_best_en_5.2.0_3.0_1699290578555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_bunsen_base_best_en_5.2.0_3.0_1699290578555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bunsen_base_best","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_bunsen_base_best","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.by_leonweber").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_bunsen_base_best| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|420.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/leonweber/bunsen_base_best \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buntan_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buntan_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..85665eeb3f54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_buntan_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_buntan_bert_finetuned_ner BertForTokenClassification from Buntan +author: John Snow Labs +name: bert_ner_buntan_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_buntan_bert_finetuned_ner` is an English model originally trained by Buntan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_buntan_bert_finetuned_ner_en_5.2.0_3.0_1699276868703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_buntan_bert_finetuned_ner_en_5.2.0_3.0_1699276868703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_buntan_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_buntan_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_buntan_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Buntan/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_butchland_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_butchland_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..945ca52090f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_butchland_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from butchland) +author: John Snow Labs +name: bert_ner_butchland_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `butchland`. + +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_butchland_bert_finetuned_ner_en_5.2.0_3.0_1699290573795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_butchland_bert_finetuned_ner_en_5.2.0_3.0_1699290573795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_butchland_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_butchland_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_butchland").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_butchland_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/butchland/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_carblacac_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_carblacac_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..1edc7b398005 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_carblacac_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from carblacac) +author: John Snow Labs +name: bert_ner_carblacac_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `carblacac`.
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_carblacac_bert_finetuned_ner_en_5.2.0_3.0_1699289829935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_carblacac_bert_finetuned_ner_en_5.2.0_3.0_1699289829935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_carblacac_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_carblacac_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_carblacac").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_carblacac_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/carblacac/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ce_bert_finetuned_ner_ce.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ce_bert_finetuned_ner_ce.md new file mode 100644 index 000000000000..9cd45be80567 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ce_bert_finetuned_ner_ce.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chechen bert_ner_ce_bert_finetuned_ner BertForTokenClassification from Ce +author: John Snow Labs +name: bert_ner_ce_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ce, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ce +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ce_bert_finetuned_ner` is a Chechen model originally trained by Ce. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ce_bert_finetuned_ner_ce_5.2.0_3.0_1699278079714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ce_bert_finetuned_ner_ce_5.2.0_3.0_1699278079714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ce_bert_finetuned_ner","ce") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ce_bert_finetuned_ner", "ce") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ce_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ce| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Ce/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_chandrasutrisnotjhong_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_chandrasutrisnotjhong_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..caf6a70acd73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_chandrasutrisnotjhong_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from chandrasutrisnotjhong) +author: John Snow Labs +name: bert_ner_chandrasutrisnotjhong_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `chandrasutrisnotjhong`.
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_chandrasutrisnotjhong_bert_finetuned_ner_en_5.2.0_3.0_1699290113067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_chandrasutrisnotjhong_bert_finetuned_ner_en_5.2.0_3.0_1699290113067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_chandrasutrisnotjhong_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_chandrasutrisnotjhong_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_chandrasutrisnotjhong").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_chandrasutrisnotjhong_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/chandrasutrisnotjhong/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_lid_lince_hi.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_lid_lince_hi.md new file mode 100644 index 000000000000..7b08fd6b2896 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_lid_lince_hi.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Hindi Named Entity Recognition (from sagorsarker) +author: John Snow Labs +name: bert_ner_codeswitch_hineng_lid_lince +date: 2023-11-06 +tags: [bert, ner, token_classification, hi, open_source, onnx] +task: Named Entity Recognition +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `codeswitch-hineng-lid-lince` is a Hindi model originally trained by `sagorsarker`.
+ +## Predicted Entities + +`mixed`, `hin`, `other`, `unk`, `en`, `ambiguous`, `ne`, `fw` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_hineng_lid_lince_hi_5.2.0_3.0_1699290564166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_hineng_lid_lince_hi_5.2.0_3.0_1699290564166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_hineng_lid_lince","hi") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["मुझे स्पार्क एनएलपी बहुत पसंद है"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_hineng_lid_lince","hi") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("मुझे स्पार्क एनएलपी बहुत पसंद है").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
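Once the pipeline has run, the token-level predictions usually need to be lined up with the tokens they belong to. The following is a minimal sketch, assuming the per-row arrays have already been collected from the `token.result` and `ner.result` columns into plain Python lists. The helper name `pair_tokens_with_labels` and the sample values are hypothetical illustrations, not actual model output; only the label names (`hin`, `en`) come from the Predicted Entities list above.

```python
from collections import Counter
from typing import List, Tuple


def pair_tokens_with_labels(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Pair each token with its predicted label (hypothetical helper).

    Raises ValueError instead of silently truncating when the two
    lists are misaligned.
    """
    if len(tokens) != len(labels):
        raise ValueError(
            f"token/label length mismatch: {len(tokens)} != {len(labels)}"
        )
    return list(zip(tokens, labels))


# Made-up example values for illustration only, NOT real model output.
tokens = ["मुझे", "Spark", "NLP", "पसंद", "है"]
labels = ["hin", "en", "en", "hin", "hin"]

pairs = pair_tokens_with_labels(tokens, labels)

# Count how many tokens were assigned to each language label.
counts = Counter(label for _, label in pairs)

for token, label in pairs:
    print(f"{token}\t{label}")
```

Failing loudly on a length mismatch is a deliberate choice here: a silent `zip` would hide an alignment problem between the tokenizer output and the classifier output.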
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_codeswitch_hineng_lid_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|hi| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagorsarker/codeswitch-hineng-lid-lince +- https://ritual.uh.edu/lince/home +- https://github.com/sagorbrur/codeswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_ner_lince_hi.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_ner_lince_hi.md new file mode 100644 index 000000000000..91676e0beb87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_hineng_ner_lince_hi.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Hindi Named Entity Recognition (from sagorsarker) +author: John Snow Labs +name: bert_ner_codeswitch_hineng_ner_lince +date: 2023-11-06 +tags: [bert, ner, token_classification, hi, open_source, onnx] +task: Named Entity Recognition +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `codeswitch-hineng-ner-lince` is a Hindi model originally trained by `sagorsarker`.
+ +## Predicted Entities + +`PERSON`, `ORGANISATION`, `PLACE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_hineng_ner_lince_hi_5.2.0_3.0_1699293356568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_hineng_ner_lince_hi_5.2.0_3.0_1699293356568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_hineng_ner_lince","hi") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["मुझे स्पार्क एनएलपी बहुत पसंद है"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_hineng_ner_lince","hi") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("मुझे स्पार्क एनएलपी बहुत पसंद है").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
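Token-level NER output is often easier to consume once consecutive tokens of the same entity are merged into spans. The sketch below shows one way to do that in plain Python. It assumes the labels follow the common IOB scheme (`B-PERSON`, `I-PERSON`, `O`), which this model card does not confirm; any label without a `B-` or `I-` prefix is treated as outside an entity. The function name `group_entities` and the sample sentence are hypothetical, and the labels shown are made up rather than produced by the model. Spark NLP also provides a `NerConverter` annotator that can do this kind of grouping inside the pipeline.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB-tagged tokens into (text, entity_type) spans (hypothetical helper)."""
    entities: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any, and reset the state.
        nonlocal current_type, current_tokens
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))
        current_type, current_tokens = None, []

    for token, label in zip(tokens, labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts (an orphan I- tag is treated like B-).
            flush()
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I":
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O" or an unprefixed label: close any open entity.
            flush()
    flush()
    return entities


# Made-up tokens and labels for illustration only, NOT real model output.
tokens = ["Sachin", "Tendulkar", "lives", "in", "Mumbai"]
labels = ["B-PERSON", "I-PERSON", "O", "O", "B-PLACE"]

entities = group_entities(tokens, labels)
print(entities)
```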
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_codeswitch_hineng_ner_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|hi| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagorsarker/codeswitch-hineng-ner-lince +- https://ritual.uh.edu/lince/home +- https://github.com/sagorbrur/codeswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_nepeng_lid_lince_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_nepeng_lid_lince_en.md new file mode 100644 index 000000000000..1c7d391ccef7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_nepeng_lid_lince_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English Named Entity Recognition (from sagorsarker) +author: John Snow Labs +name: bert_ner_codeswitch_nepeng_lid_lince +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `codeswitch-nepeng-lid-lince` is an English model originally trained by `sagorsarker`.
+ +## Predicted Entities + +`mixed`, `other`, `en`, `ambiguous`, `ne`, `nep` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_nepeng_lid_lince_en_5.2.0_3.0_1699290953334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_nepeng_lid_lince_en_5.2.0_3.0_1699290953334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_nepeng_lid_lince","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_nepeng_lid_lince","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.codeswitch_nepeng_lid_lince.by_sagorsarker").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_codeswitch_nepeng_lid_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagorsarker/codeswitch-nepeng-lid-lince +- https://ritual.uh.edu/lince/home +- https://github.com/sagorbrur/codeswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_lid_lince_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_lid_lince_en.md new file mode 100644 index 000000000000..7f7333c5a345 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_lid_lince_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English Named Entity Recognition (from sagorsarker) +author: John Snow Labs +name: bert_ner_codeswitch_spaeng_lid_lince +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `codeswitch-spaeng-lid-lince` is an English model originally trained by `sagorsarker`.
+ +## Predicted Entities + +`mixed`, `other`, `unk`, `en`, `ambiguous`, `spa`, `ne`, `fw` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_spaeng_lid_lince_en_5.2.0_3.0_1699291315591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_spaeng_lid_lince_en_5.2.0_3.0_1699291315591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_spaeng_lid_lince","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_spaeng_lid_lince","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.codeswitch_spaeng_lid_lince.by_sagorsarker").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_codeswitch_spaeng_lid_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagorsarker/codeswitch-spaeng-lid-lince +- https://ritual.uh.edu/lince/home +- https://github.com/sagorbrur/codeswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_ner_lince_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_ner_lince_en.md new file mode 100644 index 000000000000..96356ddb2d18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_codeswitch_spaeng_ner_lince_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English Named Entity Recognition (from sagorsarker) +author: John Snow Labs +name: bert_ner_codeswitch_spaeng_ner_lince +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `codeswitch-spaeng-ner-lince` is an English model originally trained by `sagorsarker`.
+ +## Predicted Entities + +`LOC`, `TIME`, `PER`, `PROD`, `TITLE`, `OTHER`, `GROUP`, `ORG`, `EVENT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_spaeng_ner_lince_en_5.2.0_3.0_1699292369140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_codeswitch_spaeng_ner_lince_en_5.2.0_3.0_1699292369140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_spaeng_ner_lince","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_codeswitch_spaeng_ner_lince","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.codeswitch_spaeng_ner_lince.by_sagorsarker").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_codeswitch_spaeng_ner_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagorsarker/codeswitch-spaeng-ner-lince +- https://ritual.uh.edu/lince/home +- https://github.com/sagorbrur/codeswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_conll12v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_conll12v2_en.md new file mode 100644 index 000000000000..36b770276e25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_conll12v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_conll12v2 BertForTokenClassification from ramybaly +author: John Snow Labs +name: bert_ner_conll12v2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_conll12v2` is an English model originally trained by ramybaly. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_conll12v2_en_5.2.0_3.0_1699280431648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_conll12v2_en_5.2.0_3.0_1699280431648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_conll12v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_conll12v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_conll12v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|625.6 MB| + +## References + +https://huggingface.co/ramybaly/CoNLL12V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_core_term_ner_v1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_core_term_ner_v1_en.md new file mode 100644 index 000000000000..bfed20f55daf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_core_term_ner_v1_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from leemeng) +author: John Snow Labs +name: bert_ner_core_term_ner_v1 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `core-term-ner-v1` is an English model originally trained by `leemeng`. + +## Predicted Entities + +`CORE`, `E-CORE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_core_term_ner_v1_en_5.2.0_3.0_1699293639702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_core_term_ner_v1_en_5.2.0_3.0_1699293639702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_core_term_ner_v1","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_core_term_ner_v1","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_leemeng").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_core_term_ner_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/leemeng/core-term-ner-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_imbalanced_scibert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_imbalanced_scibert_en.md new file mode 100644 index 000000000000..feabaca066f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_imbalanced_scibert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_imbalanced_scibert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_imbalanced_scibert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_imbalanced_scibert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_imbalanced_scibert_en_5.2.0_3.0_1699279571759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_imbalanced_scibert_en_5.2.0_3.0_1699279571759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_imbalanced_scibert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_imbalanced_scibert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_imbalanced_scibert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem_Imbalanced-SciBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biobert_v1.1_en.md new file mode 100644 index 000000000000..ec532d639471 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_modified_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_modified_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_modified_biobert_v1.1` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_biobert_v1.1_en_5.2.0_3.0_1699278934330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_biobert_v1.1_en_5.2.0_3.0_1699278934330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_modified_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_modified_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_modified_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem-Modified-biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md new file mode 100644 index 000000000000..fdb84094d38d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699277118912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract_en_5.2.0_3.0_1699277118912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_modified_biomednlp_pubmedbert_base_uncased_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem-Modified-BiomedNLP-PubMedBERT-base-uncased-abstract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_pubmedbert_en.md new file mode 100644 index 000000000000..d89bf1211fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_pubmedbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_modified_pubmedbert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_modified_pubmedbert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_modified_pubmedbert` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_pubmedbert_en_5.2.0_3.0_1699279778044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_pubmedbert_en_5.2.0_3.0_1699279778044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_modified_pubmedbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_modified_pubmedbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_modified_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem-Modified_PubMedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_scibert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_scibert_en.md new file mode 100644 index 000000000000..9b44a16ae581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_modified_scibert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_modified_scibert BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_modified_scibert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_modified_scibert` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_scibert_en_5.2.0_3.0_1699277322588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_modified_scibert_en_5.2.0_3.0_1699277322588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_modified_scibert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_modified_scibert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_modified_scibert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem-Modified_SciBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_original_biobert_v1.1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_original_biobert_v1.1_en.md new file mode 100644 index 000000000000..20f2f0ec1da0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_chem_original_biobert_v1.1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_chem_original_biobert_v1.1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_chem_original_biobert_v1.1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_chem_original_biobert_v1.1` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_original_biobert_v1.1_en_5.2.0_3.0_1699277075527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_chem_original_biobert_v1.1_en_5.2.0_3.0_1699277075527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_chem_original_biobert_v1.1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_chem_original_biobert_v1.1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_chem_original_biobert_v1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Chem_Original-biobert-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_384_en.md new file mode 100644 index 000000000000..3d18a633f45a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_modified_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_modified_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_modified_bluebert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_bluebert_384_en_5.2.0_3.0_1699277260749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_bluebert_384_en_5.2.0_3.0_1699277260749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_modified_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_modified_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_modified_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Modified-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_512_en.md new file mode 100644 index 000000000000..73e4a5d3f254 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_modified_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_modified_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_modified_bluebert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_bluebert_512_en_5.2.0_3.0_1699276419432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_bluebert_512_en_5.2.0_3.0_1699276419432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_modified_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_modified_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_modified_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Modified-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_pubmedbert_512_en.md new file mode 100644 index 000000000000..fff659323110 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_modified_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_modified_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_modified_pubmedbert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_pubmedbert_512_en_5.2.0_3.0_1699279132658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_pubmedbert_512_en_5.2.0_3.0_1699279132658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_modified_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_modified_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_modified_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Modified-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_384_en.md new file mode 100644 index 000000000000..b4a3b14fa0be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_modified_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_modified_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_modified_scibert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_scibert_384_en_5.2.0_3.0_1699277419528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_scibert_384_en_5.2.0_3.0_1699277419528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_modified_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_modified_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_modified_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Modified-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_512_en.md new file mode 100644 index 000000000000..4466a756a47b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_modified_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_modified_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_modified_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_modified_scibert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_scibert_512_en_5.2.0_3.0_1699277525779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_modified_scibert_512_en_5.2.0_3.0_1699277525779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_modified_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_modified_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_modified_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Modified-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_384_en.md new file mode 100644 index 000000000000..2a0259f07f06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_biobert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_biobert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_biobert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_biobert_384_en_5.2.0_3.0_1699277600302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_biobert_384_en_5.2.0_3.0_1699277600302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_biobert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_biobert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_biobert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-BioBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_512_en.md new file mode 100644 index 000000000000..be64d4b21c68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_biobert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_biobert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_biobert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_biobert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_biobert_512_en_5.2.0_3.0_1699279990618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_biobert_512_en_5.2.0_3.0_1699279990618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_biobert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_biobert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_biobert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-BioBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_384_en.md new file mode 100644 index 000000000000..ebb634e152a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_bluebert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_bluebert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_bluebert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_bluebert_384_en_5.2.0_3.0_1699279342315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_bluebert_384_en_5.2.0_3.0_1699279342315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_bluebert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_bluebert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_bluebert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-BlueBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_512_en.md new file mode 100644 index 000000000000..cd8a91e81851 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_bluebert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_bluebert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_bluebert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_bluebert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_bluebert_512_en_5.2.0_3.0_1699276628507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_bluebert_512_en_5.2.0_3.0_1699276628507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_bluebert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_bluebert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_bluebert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-BlueBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_384_en.md new file mode 100644 index 000000000000..ae76ba111e3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_pubmedbert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_pubmedbert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_pubmedbert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_pubmedbert_384_en_5.2.0_3.0_1699277790730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_pubmedbert_384_en_5.2.0_3.0_1699277790730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_pubmedbert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_pubmedbert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_pubmedbert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-PubMedBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_512_en.md new file mode 100644 index 000000000000..038d15c47ff6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_pubmedbert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_pubmedbert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_pubmedbert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_pubmedbert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_pubmedbert_512_en_5.2.0_3.0_1699277708883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_pubmedbert_512_en_5.2.0_3.0_1699277708883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_pubmedbert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_pubmedbert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_pubmedbert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-PubMedBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_384_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_384_en.md new file mode 100644 index 000000000000..72e1ccd21824 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_scibert_384 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_scibert_384 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_scibert_384` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_scibert_384_en_5.2.0_3.0_1699277891716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_scibert_384_en_5.2.0_3.0_1699277891716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_scibert_384","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_scibert_384", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_scibert_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-SciBERT-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_512_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_512_en.md new file mode 100644 index 000000000000..597b0decaa7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_craft_original_scibert_512_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_craft_original_scibert_512 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_craft_original_scibert_512 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_craft_original_scibert_512` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_scibert_512_en_5.2.0_3.0_1699280166206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_craft_original_scibert_512_en_5.2.0_3.0_1699280166206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_craft_original_scibert_512","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_craft_original_scibert_512", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_craft_original_scibert_512| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/CRAFT-Original-SciBERT-512 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dani_91_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dani_91_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..3a2ac1d4ab6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dani_91_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_dani_91_bert_finetuned_ner BertForTokenClassification from Dani-91 +author: John Snow Labs +name: bert_ner_dani_91_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_dani_91_bert_finetuned_ner` is an English model originally trained by Dani-91. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dani_91_bert_finetuned_ner_en_5.2.0_3.0_1699279550805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dani_91_bert_finetuned_ner_en_5.2.0_3.0_1699279550805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dani_91_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_dani_91_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dani_91_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Dani-91/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_danish_bert_ner_da.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_danish_bert_ner_da.md new file mode 100644 index 000000000000..b03e2e1083cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_danish_bert_ner_da.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Danish bert_ner_danish_bert_ner BertForTokenClassification from DaNLP +author: John Snow Labs +name: bert_ner_danish_bert_ner +date: 2023-11-06 +tags: [bert, da, open_source, token_classification, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_danish_bert_ner` is a Danish model originally trained by DaNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_danish_bert_ner_da_5.2.0_3.0_1699292560480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_danish_bert_ner_da_5.2.0_3.0_1699292560480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_danish_bert_ner","da") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_danish_bert_ner", "da") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_danish_bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|412.3 MB| + +## References + +https://huggingface.co/DaNLP/da-bert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_datauma_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_datauma_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ab7b2ed229ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_datauma_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from datauma) +author: John Snow Labs +name: bert_ner_datauma_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `datauma`. + +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_datauma_bert_finetuned_ner_en_5.2.0_3.0_1699293996752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_datauma_bert_finetuned_ner_en_5.2.0_3.0_1699293996752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_datauma_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_datauma_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_datauma").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_datauma_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/datauma/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_davemse_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_davemse_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..30ca05afe48f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_davemse_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_davemse_bert_finetuned_ner BertForTokenClassification from DaveMSE +author: John Snow Labs +name: bert_ner_davemse_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_davemse_bert_finetuned_ner` is an English model originally trained by DaveMSE.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_davemse_bert_finetuned_ner_en_5.2.0_3.0_1699277973617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_davemse_bert_finetuned_ner_en_5.2.0_3.0_1699277973617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_davemse_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_davemse_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_davemse_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/DaveMSE/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbert_ner_en.md new file mode 100644 index 000000000000..3dd532109088 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbert_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from deeq) +author: John Snow Labs +name: bert_ner_dbert_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert-ner` is an English model originally trained by `deeq`.
+ +## Predicted Entities + +`FLD-B`, `CVL-I`, `PLT-B`, `AFW-B`, `AFW-I`, `ORG-B`, `ORG-I`, `EVT-B`, `ANM-B`, `PER-I`, `NUM-B`, `MAT-I`, `PLT-I`, `PER-B`, `TIM-B`, `FLD-I`, `CVL-B`, `DAT-B`, `LOC-B`, `TRM-B`, `EVT-I`, `LOC-I`, `NUM-I`, `DAT-I`, `MAT-B`, `ANM-I`, `TRM-I`, `TIM-I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dbert_ner_en_5.2.0_3.0_1699292826245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dbert_ner_en_5.2.0_3.0_1699292826245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dbert_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dbert_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_deeq").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|421.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/deeq/dbert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english_en.md new file mode 100644 index 000000000000..2d72049e1338 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Large Cased model (from dbmdz) +author: John Snow Labs +name: bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-conll03-english` is an English model originally trained by `dbmdz`.
+ +## Predicted Entities + +`PER`, `LOC`, `MISC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699291043018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699291043018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased_large_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dbmdz_bert_large_cased_finetuned_conll03_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deformer_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deformer_en.md new file mode 100644 index 000000000000..363d16fa8bd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deformer_en.md @@ -0,0 +1,119 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Lauler) +author: John Snow Labs +name: bert_ner_deformer +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deformer` is an English model originally trained by `Lauler`. + +## Predicted Entities + +`DE`, `ord`, `DEM` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_deformer_en_5.2.0_3.0_1699293114150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_deformer_en_5.2.0_3.0_1699293114150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_deformer","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_deformer","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_lauler").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_deformer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lauler/deformer +- https://opus.nlpl.eu/download.php?f=wikimedia/v20210402/mono/sv.txt.gz +- https://opus.nlpl.eu/download.php?f=JRC-Acquis/mono/JRC-Acquis.raw.sv.gz +- https://opus.nlpl.eu/ +- https://opus.nlpl.eu/download.php?f=Europarl/v8/mono/sv.txt.gz +- https://www4.isof.se/cgi-bin/srfl/visasvar.py?sok=dem%20som&svar=79718&log_id=705355 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deid_bert_i2b2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deid_bert_i2b2_en.md new file mode 100644 index 000000000000..4e633768767a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deid_bert_i2b2_en.md @@ -0,0 +1,121 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from obi) +author: John Snow Labs +name: bert_ner_deid_bert_i2b2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deid_bert_i2b2` is an English model originally trained by `obi`. 
+ +## Predicted Entities + +`L-HOSP`, `L-DATE`, `L-AGE`, `HOSP`, `DATE`, `PATIENT`, `U-DATE`, `PHONE`, `U-HOSP`, `ID`, `U-LOC`, `U-OTHERPHI`, `U-ID`, `U-PATIENT`, `U-EMAIL`, `U-PHONE`, `LOC`, `L-EMAIL`, `U-PATORG`, `L-PHONE`, `EMAIL`, `AGE`, `L-PATIENT`, `L-OTHERPHI`, `L-LOC`, `U-STAFF`, `L-PATORG`, `L-STAFF`, `PATORG`, `U-AGE`, `L-ID`, `OTHERPHI`, `STAFF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_deid_bert_i2b2_en_5.2.0_3.0_1699291149565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_deid_bert_i2b2_en_5.2.0_3.0_1699291149565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_deid_bert_i2b2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_deid_bert_i2b2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_obi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_deid_bert_i2b2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/obi/deid_bert_i2b2 +- https://github.com/obi-ml-public/ehr_deidentification/tree/master/steps/forward_pass +- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/ +- https://arxiv.org/pdf/1904.03323.pdf +- https://github.com/obi-ml-public/ehr_deidentification/tree/master/steps/train +- https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md +- https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html +- https://github.com/obi-ml-public/ehr_deidentification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deval_bert_base_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deval_bert_base_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..fd489b11f424 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_deval_bert_base_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_deval_bert_base_ner_finetuned_ner BertForTokenClassification from deval +author: John Snow Labs +name: bert_ner_deval_bert_base_ner_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_ner_deval_bert_base_ner_finetuned_ner` is an English model originally trained by deval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_deval_bert_base_ner_finetuned_ner_en_5.2.0_3.0_1699291236473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_deval_bert_base_ner_finetuned_ner_en_5.2.0_3.0_1699291236473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_deval_bert_base_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_deval_bert_base_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_deval_bert_base_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/deval/bert-base-NER-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_distilbert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_distilbert_finetuned_ner_en.md new file mode 100644 index 000000000000..7e983f81ecd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_distilbert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from EhsanYB) +author: John Snow Labs +name: bert_ner_distilbert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-finetuned-ner` is an English model originally trained by `EhsanYB`. + +## Predicted Entities + +`PER`, `ORG`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_distilbert_finetuned_ner_en_5.2.0_3.0_1699293383077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_distilbert_finetuned_ner_en_5.2.0_3.0_1699293383077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_distilbert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_distilbert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_distilbert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/EhsanYB/distilbert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_docusco_bert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_docusco_bert_en.md new file mode 100644 index 000000000000..a925bf92e095 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_docusco_bert_en.md @@ -0,0 +1,122 @@ +--- +layout: model +title: English Named Entity Recognition (from browndw) +author: John Snow Labs +name: bert_ner_docusco_bert +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `docusco-bert` is an English model originally trained by `browndw`. 
+ +## Predicted Entities + +`Interactive`, `AcademicTerms`, `InformationChange`, `MetadiscourseCohesive`, `FirstPerson`, `InformationPlace`, `Updates`, `InformationChangeneritive`, `Reasoning`, `PublicTerms`, `Citation`, `Future`, `CitationHedged`, `InformationExnerition`, `Contingent`, `Strategic`, `PAD`, `CitationAuthority`, `Facilitate`, `Positive`, `ConfidenceHigh`, `InformationStates`, `AcademicWritingMoves`, `Uncertainty`, `SyntacticComplexity`, `Responsibility`, `Character`, `Narrative`, `MetadiscourseInteractive`, `InformationTopics`, `ConfidenceLow`, `ConfidenceHedged`, `ForceStressed`, `Negative`, `InformationChangeNegative`, `Description`, `Inquiry`, `InformationReportVerbs` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_docusco_bert_en_5.2.0_3.0_1699291798166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_docusco_bert_en_5.2.0_3.0_1699291798166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_docusco_bert","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_docusco_bert","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_browndw").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_docusco_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/browndw/docusco-bert +- https://www.english-corpora.org/coca/ +- https://www.cmu.edu/dietrich/english/research-and-publications/docuscope.html +- https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=docuscope&btnG= +- https://graphics.cs.wisc.edu/WP/vep/2017/02/14/guest-post-data-mining-king-lear/ +- https://journals.sagepub.com/doi/full/10.1177/2055207619844865 +- https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging) +- https://www.english-corpora.org/coca/ +- https://arxiv.org/pdf/1810.04805 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dpuccine_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dpuccine_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..7f1a7c5349e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dpuccine_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from dpuccine) +author: John Snow Labs +name: bert_ner_dpuccine_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `dpuccine`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dpuccine_bert_finetuned_ner_en_5.2.0_3.0_1699294296104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dpuccine_bert_finetuned_ner_en_5.2.0_3.0_1699294296104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dpuccine_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dpuccine_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_dpuccine").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dpuccine_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dpuccine/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dsghrg_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dsghrg_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..c9f7a6221735 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dsghrg_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from dsghrg) +author: John Snow Labs +name: bert_ner_dsghrg_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `dsghrg`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dsghrg_bert_finetuned_ner_en_5.2.0_3.0_1699293628203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dsghrg_bert_finetuned_ner_en_5.2.0_3.0_1699293628203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dsghrg_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dsghrg_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_dsghrg").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dsghrg_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dsghrg/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..8dde70f48428 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from dshvadskiy) +author: John Snow Labs +name: bert_ner_dshvadskiy_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `dshvadskiy`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dshvadskiy_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699291502933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dshvadskiy_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699291502933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dshvadskiy_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dshvadskiy_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_dshvadskiy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dshvadskiy_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dshvadskiy/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..879e27d5a1dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_dshvadskiy_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from dshvadskiy) +author: John Snow Labs +name: bert_ner_dshvadskiy_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `dshvadskiy`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_dshvadskiy_bert_finetuned_ner_en_5.2.0_3.0_1699291414329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_dshvadskiy_bert_finetuned_ner_en_5.2.0_3.0_1699291414329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dshvadskiy_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_dshvadskiy_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_dshvadskiy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_dshvadskiy_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dshvadskiy/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2002 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ehelpbertpt_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ehelpbertpt_en.md new file mode 100644 index 000000000000..2f0e7bfe090a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ehelpbertpt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ehelpbertpt BertForTokenClassification from pucpr +author: John Snow Labs +name: bert_ner_ehelpbertpt +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ehelpbertpt` is an English model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ehelpbertpt_en_5.2.0_3.0_1699292038956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ehelpbertpt_en_5.2.0_3.0_1699292038956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ehelpbertpt","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ehelpbertpt", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ehelpbertpt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.2 MB| + +## References + +https://huggingface.co/pucpr/eHelpBERTpt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_emmanuel_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_emmanuel_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ce304a684ba3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_emmanuel_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_emmanuel_bert_finetuned_ner BertForTokenClassification from Emmanuel +author: John Snow Labs +name: bert_ner_emmanuel_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_emmanuel_bert_finetuned_ner` is an English model originally trained by Emmanuel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_emmanuel_bert_finetuned_ner_en_5.2.0_3.0_1699279747008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_emmanuel_bert_finetuned_ner_en_5.2.0_3.0_1699279747008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_emmanuel_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_emmanuel_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_emmanuel_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Emmanuel/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_envoy_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_envoy_en.md new file mode 100644 index 000000000000..fc861107f137 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_envoy_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fagner) +author: John Snow Labs +name: bert_ner_envoy +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `envoy` is an English model originally trained by `fagner`. + +## Predicted Entities + +`Disease`, `Anatomy`, `Chemical` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_envoy_en_5.2.0_3.0_1699292316494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_envoy_en_5.2.0_3.0_1699292316494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_envoy","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_envoy","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_fagner").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_envoy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/fagner/envoy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_en.md new file mode 100644 index 000000000000..2223b23af8b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_epiextract4gard BertForTokenClassification from wzkariampuzha +author: John Snow Labs +name: bert_ner_epiextract4gard +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_epiextract4gard` is an English model originally trained by wzkariampuzha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_en_5.2.0_3.0_1699278256014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_en_5.2.0_3.0_1699278256014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_epiextract4gard","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_epiextract4gard", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_epiextract4gard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/wzkariampuzha/EpiExtract4GARD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v1_en.md new file mode 100644 index 000000000000..55b4711c4ce6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_epiextract4gard_v1 BertForTokenClassification from ncats +author: John Snow Labs +name: bert_ner_epiextract4gard_v1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_epiextract4gard_v1` is an English model originally trained by ncats. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_v1_en_5.2.0_3.0_1699278238688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_v1_en_5.2.0_3.0_1699278238688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_epiextract4gard_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_epiextract4gard_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_epiextract4gard_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ncats/EpiExtract4GARD-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v2_en.md new file mode 100644 index 000000000000..5b5a9a1631ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_epiextract4gard_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_epiextract4gard_v2 BertForTokenClassification from ncats +author: John Snow Labs +name: bert_ner_epiextract4gard_v2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_epiextract4gard_v2` is an English model originally trained by ncats. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_v2_en_5.2.0_3.0_1699278614978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_epiextract4gard_v2_en_5.2.0_3.0_1699278614978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_epiextract4gard_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_epiextract4gard_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_epiextract4gard_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ncats/EpiExtract4GARD-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_estbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_estbert_ner_en.md new file mode 100644 index 000000000000..dd674ad15a47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_estbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_estbert_ner BertForTokenClassification from tartuNLP +author: John Snow Labs +name: bert_ner_estbert_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_estbert_ner` is an English model originally trained by tartuNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_estbert_ner_en_5.2.0_3.0_1699278798442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_estbert_ner_en_5.2.0_3.0_1699278798442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_estbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_estbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_estbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|463.5 MB| + +## References + +https://huggingface.co/tartuNLP/EstBERT_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_fancyerii_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_fancyerii_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..6351c22d2d08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_fancyerii_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fancyerii) +author: John Snow Labs +name: bert_ner_fancyerii_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `fancyerii`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_fancyerii_bert_finetuned_ner_en_5.2.0_3.0_1699294144516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_fancyerii_bert_finetuned_ner_en_5.2.0_3.0_1699294144516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_fancyerii_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_fancyerii_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_fancyerii").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_fancyerii_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/fancyerii/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far50brbert_base_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far50brbert_base_en.md new file mode 100644 index 000000000000..301042698ab5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far50brbert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_far50brbert_base BertForTokenClassification from giggio +author: John Snow Labs +name: bert_ner_far50brbert_base +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_far50brbert_base` is an English model originally trained by giggio. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_far50brbert_base_en_5.2.0_3.0_1699279937363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_far50brbert_base_en_5.2.0_3.0_1699279937363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_far50brbert_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_far50brbert_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_far50brbert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/giggio/Far50BrBERT-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far75brbert_base_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far75brbert_base_en.md new file mode 100644 index 000000000000..e3ddef788b6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_far75brbert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_far75brbert_base BertForTokenClassification from giggio +author: John Snow Labs +name: bert_ner_far75brbert_base +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_far75brbert_base` is an English model originally trained by giggio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_far75brbert_base_en_5.2.0_3.0_1699278977851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_far75brbert_base_en_5.2.0_3.0_1699278977851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_far75brbert_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_far75brbert_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_far75brbert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/giggio/Far75BrBERT-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_farbrbert_base_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_farbrbert_base_en.md new file mode 100644 index 000000000000..01c7da3a4c52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_farbrbert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_farbrbert_base BertForTokenClassification from giggio +author: John Snow Labs +name: bert_ner_farbrbert_base +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_farbrbert_base` is an English model originally trained by giggio. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_farbrbert_base_en_5.2.0_3.0_1699280127667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_farbrbert_base_en_5.2.0_3.0_1699280127667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_farbrbert_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_farbrbert_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_farbrbert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/giggio/FarBrBERT-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_foo_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_foo_en.md new file mode 100644 index 000000000000..858eff9e8ce3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_foo_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from leonweber) +author: John Snow Labs +name: bert_ner_foo +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `foo` is an English model originally trained by `leonweber`. 
+ +## Predicted Entities + +`medmentions_full_ner:B-T085)`, `bionlp_st_2013_gro_ner:B-Ribosome)`, `chemdner_TEXT:MESH:D013830)`, `anat_em_ner:O)`, `cellfinder_ner:I-GeneProtein)`, `ncbi_disease_ner:B-CompositeMention)`, `bionlp_st_2013_gro_ner:B-Virus)`, `medmentions_full_ner:I-T129)`, `scai_disease_ner:B-DISEASE)`, `biorelex_ner:B-chemical)`, `chemdner_TEXT:MESH:D011166)`, `medmentions_st21pv_ner:I-T204)`, `chemdner_TEXT:MESH:D008345)`, `bionlp_st_2013_gro_NER:B-RegulationOfFunction)`, `mlee_ner:I-Cell)`, `bionlp_st_2013_gro_NER:I-RNABiosynthesis)`, `biorelex_ner:I-RNA-family)`, `bionlp_st_2013_gro_NER:B-ResponseToChemicalStimulus)`, `bionlp_st_2011_epi_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D003035)`, `chemdner_TEXT:MESH:D013440)`, `chemdner_TEXT:MESH:D037341)`, `chemdner_TEXT:MESH:D009532)`, `chemdner_TEXT:MESH:D019216)`, `chemdner_TEXT:MESH:D036701)`, `chemdner_TEXT:MESH:D011107)`, `bionlp_st_2013_cg_NER:B-Translation)`, `genia_term_corpus_ner:B-cell_component)`, `medmentions_full_ner:I-T065)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfDNA)`, `anat_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D000225)`, `genia_term_corpus_ner:I-ORDNA_domain_or_regionDNA_domain_or_region)`, `medmentions_full_ner:I-T015)`, `chemdner_TEXT:MESH:D008239)`, `bionlp_st_2013_cg_NER:I-Binding)`, `bionlp_st_2013_cg_NER:B-Amino_acid_catabolism)`, `cellfinder_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:I-MetabolicPathway)`, `bionlp_st_2013_gro_ner:B-ProteinIdentification)`, `bionlp_st_2011_ge_ner:O)`, `bionlp_st_2011_id_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelixTF)`, `mirna_ner:B-Relation_Trigger)`, `bionlp_st_2011_ge_NER:B-Regulation)`, `bionlp_st_2013_cg_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008055)`, `chemdner_TEXT:MESH:D009944)`, `verspoor_2013_ner:I-gene)`, `bionlp_st_2013_ge_ner:O)`, `chemdner_TEXT:MESH:D003907)`, `mlee_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D010569)`, `mlee_NER:I-Growth)`, 
`chemdner_TEXT:MESH:D036145)`, `medmentions_full_ner:I-T196)`, `ehr_rel_sts:1)`, `bionlp_st_2013_gro_NER:B-CellularComponentOrganizationAndBiogenesis)`, `chemdner_TEXT:MESH:D009285)`, `bionlp_st_2013_gro_NER:B-ProteinMetabolism)`, `chemdner_TEXT:MESH:D016718)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:I-T074)`, `chemdner_TEXT:MESH:D000432)`, `bionlp_st_2013_gro_NER:I-CellFateDetermination)`, `chia_ner:I-Reference_point)`, `bionlp_st_2013_gro_ner:B-Histone)`, `lll_RE:None)`, `scai_disease_ner:B-ADVERSE)`, `medmentions_full_ner:B-T130)`, `bionlp_st_2013_gro_NER:I-CellCyclePhaseTransition)`, `chemdner_TEXT:MESH:D000480)`, `chemdner_TEXT:MESH:D001556)`, `bionlp_st_2013_gro_ner:B-Nucleus)`, `bionlp_st_2013_gro_ner:B-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D007854)`, `chemdner_TEXT:MESH:D009499)`, `genia_term_corpus_ner:B-polynucleotide)`, `bionlp_st_2013_gro_NER:I-Transcription)`, `chemdner_TEXT:MESH:D007213)`, `bionlp_st_2013_ge_NER:B-Regulation)`, `bionlp_st_2011_epi_NER:B-DNA_methylation)`, `medmentions_st21pv_ner:B-T031)`, `bionlp_st_2013_ge_NER:I-Gene_expression)`, `chemdner_TEXT:MESH:D007651)`, `bionlp_st_2013_gro_NER:B-OrganismalProcess)`, `bionlp_st_2011_epi_COREF:None)`, `medmentions_st21pv_ner:I-T062)`, `chemdner_TEXT:MESH:D002047)`, `chemdner_TEXT:MESH:D012822)`, `mantra_gsc_en_patents_ner:B-DEVI)`, `medmentions_full_ner:I-T071)`, `chemdner_TEXT:MESH:D013739)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfGeneExpression)`, `genia_term_corpus_ner:B-other_name)`, `medmentions_full_ner:B-T018)`, `chemdner_TEXT:MESH:D015242)`, `bionlp_st_2013_cg_NER:O)`, `chemdner_TEXT:MESH:D019469)`, `ncbi_disease_ner:B-DiseaseClass)`, `ebm_pico_ner:B-Intervention_Surgical)`, `chemdner_TEXT:MESH:D011422)`, `chemdner_TEXT:MESH:D002112)`, `chemdner_TEXT:MESH:D005682)`, `anat_em_ner:B-Immaterial_anatomical_entity)`, `bionlp_st_2011_epi_ner:B-Entity)`, `medmentions_full_ner:I-T169)`, `mlee_ner:B-Immaterial_anatomical_entity)`, 
`verspoor_2013_ner:B-Physiology)`, `cellfinder_ner:I-CellType)`, `chemdner_TEXT:MESH:D011122)`, `chemdner_TEXT:MESH:D010622)`, `chemdner_TEXT:MESH:D017378)`, `bionlp_st_2011_ge_RE:Theme)`, `chemdner_TEXT:MESH:D000431)`, `medmentions_full_ner:I-T102)`, `medmentions_full_ner:B-T097)`, `chemdner_TEXT:MESH:D007529)`, `chemdner_TEXT:MESH:D045265)`, `chemdner_TEXT:MESH:D005971)`, `an_em_ner:I-Multi-tissue_structure)`, `genia_term_corpus_ner:I-ANDDNA_family_or_groupDNA_family_or_group)`, `medmentions_full_ner:I-T080)`, `chemdner_TEXT:MESH:D002207)`, `chia_ner:I-Qualifier)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionByTranscriptionRepressor)`, `an_em_ner:I-Immaterial_anatomical_entity)`, `biosses_sts:5)`, `chemdner_TEXT:MESH:D000079963)`, `chemdner_TEXT:MESH:D013196)`, `ehr_rel_sts:2)`, `chemdner_TEXT:MESH:D006152)`, `bionlp_st_2013_gro_NER:B-RegulationOfProcess)`, `mlee_NER:I-Development)`, `medmentions_full_ner:B-T197)`, `bionlp_st_2013_gro_ner:B-NucleicAcid)`, `medmentions_st21pv_ner:I-T017)`, `medmentions_full_ner:I-T046)`, `medmentions_full_ner:B-T204)`, `bionlp_st_2013_gro_NER:B-CellularDevelopmentalProcess)`, `bionlp_st_2013_cg_ner:B-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D014212)`, `bionlp_st_2013_cg_NER:B-Protein_processing)`, `chemdner_TEXT:MESH:D008926)`, `chia_ner:B-Visit)`, `bionlp_st_2011_ge_NER:B-Negative_regulation)`, `mantra_gsc_en_medline_ner:I-OBJC)`, `mlee_RE:FromLoc)`, `bionlp_st_2013_gro_ner:I-RNAMolecule)`, `chemdner_TEXT:MESH:D014812)`, `linnaeus_filtered_ner:I-species)`, `chebi_nactem_fullpaper_ner:B-Chemical)`, `bionlp_st_2011_ge_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:B-MutantGene)`, `chemdner_TEXT:MESH:D014859)`, `bionlp_st_2019_bb_ner:B-Phenotype)`, `bionlp_st_2013_gro_NER:I-BindingOfTFToTFBindingSiteOfDNA)`, `diann_iber_eval_en_ner:I-Neg)`, `ddi_corpus_ner:B-DRUG_N)`, `bionlp_st_2013_cg_ner:B-Organ)`, `chemdner_TEXT:MESH:D009320)`, `bionlp_st_2013_cg_ner:I-Organism_subdivision)`, 
`bionlp_st_2013_cg_ner:B-Cellular_component)`, `chemdner_TEXT:MESH:D003188)`, `chemdner_TEXT:MESH:D001241)`, `chemdner_TEXT:MESH:D004811)`, `bioinfer_ner:I-GeneproteinRNA)`, `chemdner_TEXT:MESH:D002248)`, `bionlp_shared_task_2009_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D000143)`, `chemdner_TEXT:MESH:D007099)`, `nlm_gene_ner:O)`, `chemdner_TEXT:MESH:D005485)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorBindingSiteOfDNA)`, `bionlp_st_2013_gro_ner:B-PhysicalContact)`, `medmentions_full_ner:B-T167)`, `medmentions_st21pv_ner:B-T091)`, `seth_corpus_ner:I-Gene)`, `bionlp_st_2011_ge_COREF:coref)`, `bionlp_st_2011_ge_NER:B-Gene_expression)`, `medmentions_full_ner:B-T031)`, `genia_relation_corpus_RE:None)`, `genia_term_corpus_ner:I-ANDDNA_domain_or_regionDNA_domain_or_region)`, `chemdner_TEXT:MESH:D014970)`, `bionlp_st_2013_gro_NER:B-Mutation)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivator)`, `chemdner_TEXT:MESH:D002217)`, `chemdner_TEXT:MESH:D003367)`, `medmentions_full_ner:I-UnknownType)`, `chemdner_TEXT:MESH:D002998)`, `bionlp_st_2013_gro_ner:I-Phenotype)`, `genia_term_corpus_ner:B-ANDDNA_family_or_groupDNA_family_or_group)`, `hprd50_RE:PPI)`, `chemdner_TEXT:MESH:D002118)`, `scai_chemical_ner:B-IUPAC)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfProtein)`, `verspoor_2013_ner:B-mutation)`, `chemdner_TEXT:MESH:D011719)`, `chemdner_TEXT:MESH:D013729)`, `bionlp_shared_task_2009_ner:O)`, `chemdner_TEXT:MESH:D005840)`, `chemdner_TEXT:MESH:D009287)`, `medmentions_full_ner:B-T029)`, `chemdner_TEXT:MESH:D037742)`, `medmentions_full_ner:I-T200)`, `chemdner_TEXT:MESH:D012503)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndRNA)`, `mirna_ner:I-Non-Specific_miRNAs)`, `bionlp_st_2013_gro_ner:B-ProteinBindingSiteOfProtein)`, `bionlp_st_2013_pc_NER:B-Deacetylation)`, `chemprot_RE:CPR:7)`, `chia_ner:I-Value)`, `medmentions_full_ner:I-T048)`, `chemprot_ner:B-GENE-Y)`, `bionlp_st_2013_cg_NER:B-Reproduction)`, `bionlp_st_2011_id_ner:I-Regulon-operon)`, 
`ebm_pico_ner:I-Outcome_Adverse-effects)`, `bioinfer_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-bZIPTF)`, `mirna_ner:I-GenesProteins)`, `biorelex_ner:I-process)`, `chemdner_TEXT:MESH:D001555)`, `genia_term_corpus_ner:B-DNA_domain_or_region)`, `cellfinder_ner:O)`, `bionlp_st_2013_gro_ner:I-MutatedProtein)`, `bionlp_st_2013_gro_NER:I-CellularComponentOrganizationAndBiogenesis)`, `spl_adr_200db_train_ner:O)`, `medmentions_full_ner:I-T026)`, `chemdner_TEXT:MESH:D013619)`, `bionlp_st_2013_gro_NER:I-BindingToRNA)`, `biorelex_ner:I-drug)`, `bionlp_st_2013_pc_NER:B-Translation)`, `mantra_gsc_en_emea_ner:B-LIVB)`, `mantra_gsc_en_patents_ner:B-PROC)`, `bionlp_st_2013_pc_NER:B-Binding)`, `bionlp_st_2013_gro_NER:B-ModificationOfMolecularEntity)`, `bionlp_st_2013_cg_NER:I-Cell_transformation)`, `scai_chemical_ner:B-TRIVIALVAR)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_NER:I-TranscriptionInitiation)`, `chemdner_TEXT:MESH:D010907)`, `bionlp_st_2013_gro_ner:B-InorganicChemical)`, `bionlp_st_2013_pc_RE:None)`, `chemdner_TEXT:MESH:D002922)`, `chemdner_TEXT:MESH:D010743)`, `bionlp_st_2019_bb_ner:O)`, `medmentions_full_ner:I-T001)`, `chemdner_TEXT:MESH:D001381)`, `bionlp_shared_task_2009_ner:I-Protein)`, `bionlp_st_2013_gro_ner:B-Spliceosome)`, `bionlp_st_2013_gro_ner:I-HMGTF)`, `minimayosrs_sts:3)`, `ddi_corpus_RE:ADVISE)`, `mlee_NER:B-Dissociation)`, `bionlp_st_2013_gro_ner:I-Holoenzyme)`, `chemdner_TEXT:MESH:D001552)`, `bionlp_st_2013_gro_ner:B-bHLH)`, `chemdner_TEXT:MESH:D000109)`, `chemdner_TEXT:MESH:D013449)`, `bionlp_st_2013_gro_ner:I-GeneRegion)`, `medmentions_full_ner:B-T019)`, `scai_chemical_ner:B-TRIVIAL)`, `mlee_ner:B-Gene_or_gene_product)`, `biosses_sts:3)`, `bionlp_st_2013_cg_NER:I-Pathway)`, `bionlp_st_2011_id_ner:I-Organism)`, `bionlp_st_2013_gro_ner:B-tRNA)`, `chemdner_TEXT:MESH:D013109)`, `mlee_ner:I-Immaterial_anatomical_entity)`, `medmentions_full_ner:B-T065)`, `ebm_pico_ner:I-Participant_Sample-size)`, `mlee_RE:AtLoc)`, 
`genia_term_corpus_ner:I-protein_family_or_group)`, `chemdner_TEXT:MESH:D002444)`, `chemdner_TEXT:MESH:D063388)`, `mlee_NER:B-Translation)`, `chemdner_TEXT:MESH:D007052)`, `bionlp_st_2013_gro_ner:B-Gene)`, `chia_ner:B-Scope)`, `bionlp_st_2013_ge_NER:I-Positive_regulation)`, `chemdner_TEXT:MESH:D007785)`, `medmentions_st21pv_ner:I-T097)`, `iepa_RE:None)`, `medmentions_full_ner:B-T001)`, `medmentions_full_ner:I-T194)`, `chemdner_TEXT:MESH:D047309)`, `bionlp_st_2013_gro_ner:B-Substrate)`, `chemdner_TEXT:MESH:D002186)`, `ebm_pico_ner:B-Outcome_Other)`, `bionlp_st_2013_gro_NER:I-OrganismalProcess)`, `bionlp_st_2013_gro_ner:B-Ion)`, `bionlp_st_2013_gro_NER:I-ProteinBiosynthesis)`, `chia_ner:B-Drug)`, `bionlp_st_2013_gro_ner:I-MolecularEntity)`, `anat_em_ner:B-Cellular_component)`, `bionlp_st_2013_cg_ner:B-Multi-tissue_structure)`, `medmentions_full_ner:I-T122)`, `an_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D011564)`, `bionlp_st_2013_gro_NER:B-Splicing)`, `bionlp_st_2013_cg_NER:I-Metabolism)`, `bionlp_st_2013_pc_NER:B-Activation)`, `bionlp_st_2013_gro_ner:I-BindingSiteOfProtein)`, `bionlp_st_2011_id_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:I-Ribosome)`, `nlmchem_ner:I-Chemical)`, `mirna_ner:I-Specific_miRNAs)`, `medmentions_full_ner:I-T012)`, `bionlp_st_2013_gro_NER:B-IntraCellularTransport)`, `mlee_RE:Instrument)`, `bionlp_st_2011_id_NER:I-Transcription)`, `mantra_gsc_en_patents_ner:I-ANAT)`, `an_em_ner:B-Immaterial_anatomical_entity)`, `scai_chemical_ner:I-IUPAC)`, `bionlp_st_2011_epi_NER:B-Deubiquitination)`, `chemdner_TEXT:MESH:D007295)`, `bionlp_st_2011_ge_NER:B-Binding)`, `bionlp_st_2013_pc_NER:B-Localization)`, `chia_ner:B-Procedure)`, `medmentions_full_ner:I-T109)`, `chemdner_TEXT:MESH:D002791)`, `mantra_gsc_en_medline_ner:I-CHEM)`, `chebi_nactem_fullpaper_ner:B-Biological_Activity)`, `ncbi_disease_ner:B-SpecificDisease)`, `medmentions_full_ner:B-T063)`, `chemdner_TEXT:MESH:D016595)`, `bionlp_st_2011_id_NER:B-Transcription)`, `bionlp_st_2013_gro_ner:B-DNAMolecule)`, 
`mlee_NER:B-Protein_processing)`, `biorelex_ner:B-protein-complex)`, `anat_em_ner:I-Cancer)`, `bionlp_st_2013_cg_RE:AtLoc)`, `medmentions_full_ner:I-T072)`, `bio_sim_verb_sts:2)`, `seth_corpus_ner:O)`, `medmentions_full_ner:B-T070)`, `biorelex_ner:I-experiment-tag)`, `chemdner_TEXT:MESH:D020126)`, `biorelex_ner:I-protein-RNA-complex)`, `bionlp_st_2013_pc_NER:I-Phosphorylation)`, `medmentions_st21pv_ner:I-T201)`, `genia_term_corpus_ner:B-protein_complex)`, `medmentions_full_ner:I-T125)`, `bionlp_st_2013_ge_ner:I-Entity)`, `chemdner_TEXT:MESH:D054659)`, `bionlp_st_2013_pc_RE:ToLoc)`, `medmentions_full_ner:B-T099)`, `bionlp_st_2013_gro_NER:B-Binding)`, `medmentions_full_ner:B-T114)`, `spl_adr_200db_train_ner:B-Factor)`, `mlee_RE:CSite)`, `bionlp_st_2013_gro_ner:B-HMG)`, `bionlp_st_2013_gro_ner:B-Operon)`, `bionlp_st_2013_ge_NER:I-Protein_catabolism)`, `ebm_pico_ner:I-Outcome_Pain)`, `bionlp_st_2013_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D000880)`, `ebm_pico_ner:I-Outcome_Physical)`, `bionlp_st_2013_gro_ner:I-ProteinBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D006160)`, `gnormplus_ner:B-DomainMotif)`, `medmentions_full_ner:I-T016)`, `pdr_ner:I-Disease)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfProtein)`, `chemdner_TEXT:MESH:D002264)`, `genia_term_corpus_ner:I-protein_NA)`, `bionlp_shared_task_2009_NER:I-Negative_regulation)`, `medmentions_full_ner:I-T011)`, `bionlp_st_2013_gro_NER:I-CellularMetabolicProcess)`, `mqp_sts:1)`, `an_em_ner:I-Pathological_formation)`, `bionlp_st_2011_epi_NER:B-Deacetylation)`, `bionlp_st_2013_pc_RE:Theme)`, `medmentions_full_ner:I-T103)`, `bionlp_st_2011_epi_NER:B-Methylation)`, `ebm_pico_ner:B-Intervention_Psychological)`, `bionlp_st_2013_gro_ner:B-Stress)`, `genia_term_corpus_ner:B-multi_cell)`, `bionlp_st_2013_cg_NER:B-Positive_regulation)`, `anat_em_ner:I-Cellular_component)`, `spl_adr_200db_train_ner:I-Negation)`, `chemdner_TEXT:MESH:D000605)`, `mlee_RE:Cause)`, `bionlp_st_2013_gro_ner:B-RegulatoryDNARegion)`, 
`bionlp_st_2013_gro_ner:I-HomeoboxTF)`, `bionlp_st_2013_gro_NER:I-GeneSilencing)`, `ddi_corpus_ner:I-DRUG)`, `bionlp_st_2013_cg_NER:I-Growth)`, `mantra_gsc_en_medline_ner:B-OBJC)`, `mayosrs_sts:3)`, `bionlp_st_2013_gro_NER:B-RNAProcessing)`, `cellfinder_ner:B-CellType)`, `medmentions_full_ner:B-T007)`, `chemprot_ner:B-GENE-N)`, `biorelex_ner:B-brand)`, `ebm_pico_ner:B-Outcome_Mental)`, `bionlp_st_2013_gro_NER:B-RegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-EukaryoticCell)`, `genia_term_corpus_ner:I-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:I-T184)`, `bionlp_st_2013_gro_NER:B-RegulatoryProcess)`, `bionlp_st_2011_id_NER:B-Negative_regulation)`, `bionlp_st_2013_cg_NER:I-Development)`, `cellfinder_ner:I-Anatomy)`, `chia_ner:B-Condition)`, `chemdner_TEXT:MESH:D003065)`, `medmentions_full_ner:B-T012)`, `bionlp_st_2011_id_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorComplex)`, `bionlp_st_2013_cg_NER:I-Carcinogenesis)`, `medmentions_full_ner:B-T064)`, `medmentions_full_ner:B-T026)`, `nlmchem_ner:B-Chemical)`, `genia_term_corpus_ner:I-RNA_domain_or_region)`, `ebm_pico_ner:I-Intervention_Educational)`, `genia_term_corpus_ner:B-ANDcell_linecell_line)`, `genia_term_corpus_ner:B-protein_substructure)`, `bionlp_st_2013_gro_NER:I-ProteinTransport)`, `bionlp_st_2013_cg_NER:B-DNA_demethylation)`, `medmentions_full_ner:I-T058)`, `biorelex_ner:B-parameter)`, `chemdner_TEXT:MESH:D013006)`, `mirna_ner:I-Relation_Trigger)`, `bionlp_st_2013_gro_ner:B-PrimaryStructure)`, `bionlp_st_2013_gro_NER:I-Phosphorylation)`, `chemdner_TEXT:MESH:D003911)`, `pico_extraction_ner:I-participant)`, `chemdner_TEXT:MESH:D010938)`, `chia_ner:B-Person)`, `an_em_ner:B-Tissue)`, `medmentions_st21pv_ner:B-T170)`, `chemdner_TEXT:MESH:D013936)`, `chemdner_TEXT:MESH:D001080)`, `mlee_RE:None)`, `chemdner_TEXT:MESH:D013669)`, `chemdner_TEXT:MESH:D009943)`, `spl_adr_200db_train_ner:I-Factor)`, `chemdner_TEXT:MESH:D044004)`, `ebm_pico_ner:I-Participant_Sex)`, 
`chemdner_TEXT:MESH:D000409)`, `bionlp_st_2013_cg_NER:B-Cell_division)`, `medmentions_st21pv_ner:B-T033)`, `pcr_ner:I-Herb)`, `chemdner_TEXT:MESH:D020112)`, `bionlp_st_2013_pc_NER:B-Gene_expression)`, `bionlp_st_2011_rel_ner:O)`, `chemdner_TEXT:MESH:D008610)`, `bionlp_st_2013_gro_NER:B-BindingOfDNABindingDomainOfProteinToDNA)`, `bionlp_st_2013_gro_ner:I-Cell)`, `medmentions_full_ner:I-T055)`, `bionlp_st_2013_pc_NER:I-Negative_regulation)`, `chia_RE:Has_value)`, `tmvar_v1_ner:I-SNP)`, `biorelex_ner:I-experimental-construct)`, `genia_term_corpus_ner:B-)`, `chemdner_TEXT:MESH:D053978)`, `bionlp_st_2013_gro_ner:I-Stress)`, `mlee_ner:B-Pathological_formation)`, `bionlp_st_2013_cg_ner:O)`, `chemdner_TEXT:MESH:D007631)`, `chemdner_TEXT:MESH:D011084)`, `medmentions_full_ner:B-T080)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-TranscriptionCorepressor)`, `ehr_rel_sts:4)`, `mlee_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D003474)`, `medmentions_full_ner:B-T098)`, `scicite_TEXT:method)`, `medmentions_full_ner:B-T100)`, `chemdner_TEXT:MESH:D011849)`, `medmentions_full_ner:I-T039)`, `anat_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:I-Nucleus)`, `mlee_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:I-NuclearReceptor)`, `bionlp_st_2013_ge_RE:None)`, `chemdner_TEXT:MESH:D019483)`, `bionlp_st_2013_cg_ner:B-Cell)`, `bionlp_st_2013_gro_ner:B-Holoenzyme)`, `bionlp_st_2011_epi_NER:I-Methylation)`, `bionlp_shared_task_2009_ner:B-Protein)`, `medmentions_st21pv_ner:I-T038)`, `bionlp_st_2013_gro_ner:I-DNARegion)`, `bionlp_st_2013_gro_NER:I-CellCyclePhase)`, `bionlp_st_2013_gro_ner:I-tRNA)`, `mlee_ner:I-Multi-tissue_structure)`, `chemprot_ner:O)`, `medmentions_full_ner:B-T094)`, `bionlp_st_2013_gro_RE:fromSpecies)`, `bionlp_st_2013_gro_NER:O)`, `bionlp_st_2013_gro_NER:B-Acetylation)`, `bioinfer_ner:I-Protein_family_or_group)`, `medmentions_st21pv_ner:I-T098)`, `pdr_ner:B-Disease)`, `chemdner_ner:I-Chemical)`, 
`bionlp_st_2013_cg_NER:B-Negative_regulation)`, `chebi_nactem_fullpaper_ner:B-Chemical_Structure)`, `bionlp_st_2011_ge_NER:I-Negative_regulation)`, `diann_iber_eval_en_ner:O)`, `bionlp_shared_task_2009_NER:I-Binding)`, `mlee_NER:I-Cell_proliferation)`, `chebi_nactem_fullpaper_ner:B-Protein)`, `bionlp_st_2013_gro_NER:B-Phosphorylation)`, `bionlp_st_2011_epi_COREF:coref)`, `medmentions_full_ner:B-T200)`, `bionlp_st_2013_cg_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000082)`, `chemdner_TEXT:MESH:D037201)`, `bionlp_st_2013_gro_ner:B-ComplexMolecularEntity)`, `bionlp_st_2011_ge_RE:ToLoc)`, `diann_iber_eval_en_ner:B-Neg)`, `bionlp_st_2013_gro_ner:B-RibosomalRNA)`, `bionlp_shared_task_2009_NER:I-Protein_catabolism)`, `chemdner_TEXT:MESH:D016912)`, `medmentions_full_ner:B-T017)`, `bionlp_st_2013_gro_ner:B-CpGIsland)`, `mlee_ner:I-Organism_substance)`, `medmentions_full_ner:I-T075)`, `bionlp_st_2013_gro_ner:I-SecondMessenger)`, `bioinfer_ner:B-Protein_family_or_group)`, `bionlp_st_2013_cg_NER:I-Negative_regulation)`, `mantra_gsc_en_emea_ner:B-CHEM)`, `genia_term_corpus_ner:B-DNA_NA)`, `chemdner_TEXT:MESH:D057888)`, `chemdner_TEXT:MESH:D006495)`, `chemdner_TEXT:MESH:D006575)`, `geokhoj_v1_TEXT:0)`, `bionlp_st_2013_gro_RE:locatedIn)`, `genia_term_corpus_ner:B-virus)`, `bionlp_st_2013_gro_ner:B-RuntLikeDomain)`, `medmentions_full_ner:B-T131)`, `bionlp_st_2013_gro_ner:I-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D015525)`, `genia_term_corpus_ner:I-mono_cell)`, `chemdner_TEXT:MESH:D007840)`, `medmentions_full_ner:I-T098)`, `chemdner_TEXT:MESH:D009930)`, `genia_term_corpus_ner:I-polynucleotide)`, `biorelex_ner:I-protein-region)`, `bionlp_st_2011_id_NER:I-Process)`, `bionlp_st_2013_gro_NER:I-CellularProcess)`, `medmentions_full_ner:B-T023)`, `chemdner_TEXT:MESH:D008942)`, `medmentions_full_ner:I-T070)`, `biorelex_ner:B-organelle)`, `bionlp_st_2013_gro_NER:I-Decrease)`, `verspoor_2013_ner:I-size)`, `chemdner_TEXT:MESH:D002945)`, `ebm_pico_ner:B-Intervention_Other)`, 
`bionlp_st_2013_cg_ner:I-Simple_chemical)`, `chemdner_TEXT:MESH:D008751)`, `chia_RE:AND)`, `medmentions_full_ner:I-T028)`, `ebm_pico_ner:I-Intervention_Other)`, `chemdner_TEXT:MESH:D005472)`, `chemdner_TEXT:MESH:D005070)`, `gnormplus_ner:B-Gene)`, `medmentions_full_ner:I-T190)`, `mlee_NER:B-Breakdown)`, `bioinfer_ner:B-GeneproteinRNA)`, `bioinfer_ner:B-Gene)`, `chemdner_TEXT:MESH:D006835)`, `chemdner_TEXT:MESH:D004298)`, `chemdner_TEXT:MESH:D002951)`, `chia_ner:I-Device)`, `bionlp_st_2013_pc_NER:B-Conversion)`, `bionlp_shared_task_2009_NER:I-Transcription)`, `mlee_NER:B-DNA_methylation)`, `pubmed_qa_labeled_fold0_CLF:no)`, `minimayosrs_sts:1)`, `chemdner_TEXT:MESH:D002166)`, `chemdner_TEXT:MESH:D005934)`, `bionlp_st_2013_gro_NER:B-CatabolicPathway)`, `tmvar_v1_ner:I-ProteinMutation)`, `verspoor_2013_ner:I-Phenomena)`, `medmentions_full_ner:B-T011)`, `chemdner_TEXT:MESH:D001218)`, `medmentions_full_ner:B-T185)`, `mantra_gsc_en_patents_ner:I-PROC)`, `medmentions_full_ner:I-T120)`, `chia_ner:I-Procedure)`, `genia_term_corpus_ner:I-ANDcell_typecell_type)`, `bionlp_st_2011_id_ner:I-Entity)`, `pcr_ner:B-Chemical)`, `bionlp_st_2013_gro_NER:B-PositiveRegulation)`, `mlee_RE:Theme)`, `bionlp_st_2011_epi_ner:B-Protein)`, `medmentions_full_ner:B-T055)`, `spl_adr_200db_train_ner:I-Severity)`, `bionlp_st_2013_gro_ner:I-Ion)`, `bionlp_st_2011_id_RE:Cause)`, `bc5cdr_ner:I-Disease)`, `bionlp_st_2013_gro_ner:I-bHLH)`, `chemdner_TEXT:MESH:D001058)`, `bionlp_st_2013_gro_ner:I-AminoAcid)`, `bionlp_st_2011_epi_NER:B-Phosphorylation)`, `medmentions_full_ner:B-T086)`, `chemdner_TEXT:MESH:D004441)`, `medmentions_st21pv_ner:I-T007)`, `biorelex_ner:B-drug)`, `mantra_gsc_en_patents_ner:I-DISO)`, `medmentions_full_ner:I-T197)`, `bionlp_st_2011_ge_RE:AtLoc)`, `bionlp_st_2013_gro_NER:B-MolecularProcess)`, `bionlp_st_2011_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionInitiationComplex)`, `bionlp_st_2011_ge_NER:I-Binding)`, `mirna_ner:B-GenesProteins)`, 
`mirna_ner:B-Diseases)`, `mantra_gsc_en_emea_ner:I-DISO)`, `anat_em_ner:I-Multi-tissue_structure)`, `bioinfer_ner:O)`, `chemdner_TEXT:MESH:D017673)`, `bionlp_st_2013_gro_NER:B-Methylation)`, `genia_term_corpus_ner:I-AND_NOTcell_typecell_type)`, `bionlp_st_2013_cg_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:B-Carcinogenesis)`, `chemdner_TEXT:MESH:D009543)`, `gnormplus_ner:I-Gene)`, `bionlp_st_2013_cg_RE:Participant)`, `chemdner_TEXT:MESH:D019804)`, `seth_corpus_RE:Equals)`, `medmentions_full_ner:I-T082)`, `hprd50_ner:O)`, `bionlp_st_2013_gro_ner:B-OxidativeStress)`, `chemdner_TEXT:MESH:D014227)`, `bio_sim_verb_sts:7)`, `bionlp_st_2011_ge_NER:I-Protein_catabolism)`, `bionlp_st_2011_ge_NER:B-Localization)`, `chemdner_TEXT:MESH:D001224)`, `chemdner_TEXT:MESH:D009842)`, `bionlp_st_2013_cg_ner:B-Amino_acid)`, `bionlp_st_2013_gro_NER:B-CellCyclePhase)`, `chemdner_TEXT:MESH:D002245)`, `bionlp_st_2013_ge_NER:I-Ubiquitination)`, `bionlp_st_2013_cg_NER:I-Cell_death)`, `pico_extraction_ner:O)`, `chemdner_TEXT:MESH:D000596)`, `chemdner_TEXT:MESH:D000638)`, `an_em_ner:B-Developing_anatomical_structure)`, `bionlp_st_2019_bb_ner:I-Phenotype)`, `bionlp_st_2013_gro_NER:I-CellDeath)`, `mantra_gsc_en_patents_ner:B-PHYS)`, `chemdner_TEXT:MESH:D009705)`, `genia_term_corpus_ner:B-protein_molecule)`, `mantra_gsc_en_medline_ner:B-PHEN)`, `bionlp_st_2013_gro_NER:I-PosttranslationalModification)`, `ddi_corpus_ner:B-BRAND)`, `mantra_gsc_en_medline_ner:B-DEVI)`, `mlee_NER:I-Planned_process)`, `tmvar_v1_ner:O)`, `bionlp_st_2011_ge_NER:I-Phosphorylation)`, `genia_term_corpus_ner:I-ANDprotein_substructureprotein_substructure)`, `medmentions_st21pv_ner:B-T007)`, `bionlp_st_2013_cg_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:B-Organism)`, `bionlp_st_2013_gro_ner:I-NucleicAcid)`, `medmentions_full_ner:I-T044)`, `chia_ner:I-Person)`, `chemdner_TEXT:MESH:D016572)`, `scai_disease_ner:O)`, `bionlp_st_2013_gro_ner:B-TranscriptionCofactor)`, `chemdner_TEXT:MESH:D002762)`, 
`chemdner_TEXT:MESH:D011685)`, `chemdner_TEXT:MESH:D005031)`, `scai_disease_ner:I-ADVERSE)`, `biorelex_ner:I-protein-isoform)`, `bionlp_shared_task_2009_COREF:None)`, `genia_term_corpus_ner:I-lipid)`, `biorelex_ner:B-RNA)`, `chemdner_TEXT:MESH:D018020)`, `scai_chemical_ner:B-FAMILY)`, `chemdner_TEXT:MESH:D017382)`, `chemdner_TEXT:MESH:D006027)`, `chemdner_TEXT:MESH:D018942)`, `medmentions_full_ner:I-T024)`, `chemdner_TEXT:MESH:D008050)`, `bionlp_st_2013_cg_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D019342)`, `chemdner_TEXT:MESH:D008774)`, `bionlp_st_2011_ge_RE:CSite)`, `bionlp_st_2013_gro_ner:B-HMGTF)`, `chemdner_ner:B-Chemical)`, `bioscope_papers_ner:B-negation)`, `biorelex_RE:bind)`, `bioinfer_ner:B-Protein_complex)`, `bionlp_st_2011_epi_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_NER:I-RegulationOfTranscription)`, `chemdner_TEXT:MESH:D011134)`, `bionlp_st_2011_rel_ner:I-Entity)`, `mantra_gsc_en_medline_ner:I-PROC)`, `ncbi_disease_ner:I-DiseaseClass)`, `chemdner_TEXT:MESH:D014315)`, `bionlp_st_2013_gro_ner:I-Chromosome)`, `chemdner_TEXT:MESH:D000639)`, `chemdner_TEXT:MESH:D005740)`, `bionlp_st_2013_gro_ner:I-MolecularFunction)`, `verspoor_2013_ner:B-gene)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomainTF)`, `bionlp_st_2013_gro_ner:B-DNARegion)`, `ebm_pico_ner:B-Intervention_Educational)`, `medmentions_st21pv_ner:B-T005)`, `medmentions_full_ner:I-T022)`, `gnormplus_ner:B-FamilyName)`, `bionlp_st_2011_epi_RE:Contextgene)`, `bionlp_st_2013_pc_NER:B-Demethylation)`, `chia_ner:I-Observation)`, `medmentions_full_ner:I-T089)`, `bionlp_st_2013_gro_ner:I-ComplexMolecularEntity)`, `bionlp_st_2013_gro_ner:B-Lipid)`, `biorelex_ner:I-gene)`, `chemdner_TEXT:MESH:D003300)`, `chemdner_TEXT:MESH:D008903)`, `verspoor_2013_RE:relatedTo)`, `bionlp_st_2011_epi_NER:I-DNA_methylation)`, `genia_term_corpus_ner:I-cell_component)`, `bionlp_st_2011_ge_COREF:None)`, `ebm_pico_ner:B-Participant_Sample-size)`, `chemdner_TEXT:MESH:D043823)`, `chemdner_TEXT:MESH:D004958)`, 
`bionlp_st_2013_gro_ner:I-RNA)`, `chemdner_TEXT:MESH:D006150)`, `bionlp_st_2013_gro_ner:B-MolecularStructure)`, `chemdner_TEXT:MESH:D007457)`, `bionlp_st_2013_gro_ner:I-OxidativeStress)`, `scai_chemical_ner:B-PARTIUPAC)`, `mlee_NER:I-Blood_vessel_development)`, `bionlp_shared_task_2009_ner:B-Entity)`, `bionlp_st_2013_ge_RE:CSite)`, `medmentions_full_ner:B-T058)`, `chemdner_TEXT:MESH:D000628)`, `ebm_pico_ner:I-Intervention_Surgical)`, `an_em_ner:I-Organ)`, `bionlp_st_2013_gro_NER:B-Increase)`, `iepa_RE:PPI)`, `mlee_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D014284)`, `chemdner_TEXT:MESH:D014260)`, `bionlp_st_2011_epi_NER:I-Glycosylation)`, `bionlp_st_2013_gro_NER:B-BindingToProtein)`, `bionlp_st_2013_gro_NER:B-BindingToRNA)`, `medmentions_full_ner:I-T047)`, `bionlp_st_2013_gro_NER:B-Localization)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfGeneExpression)`, `medmentions_full_ner:I-T051)`, `bionlp_st_2011_id_COREF:None)`, `chemdner_TEXT:MESH:D011744)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToDNA)`, `bionlp_st_2013_gro_ner:B-CatalyticActivity)`, `chebi_nactem_abstr_ann1_ner:I-Biological_Activity)`, `bio_sim_verb_sts:1)`, `chemdner_TEXT:MESH:D012402)`, `bionlp_st_2013_gro_ner:B-bZIPTF)`, `chemdner_TEXT:MESH:D003913)`, `bionlp_shared_task_2009_RE:Site)`, `bionlp_st_2013_gro_ner:I-AntisenseRNA)`, `bionlp_st_2013_gro_NER:B-ProteinTargeting)`, `bionlp_st_2013_gro_NER:B-GeneExpression)`, `bionlp_st_2013_cg_NER:I-Blood_vessel_development)`, `mantra_gsc_en_patents_ner:I-CHEM)`, `mayosrs_sts:2)`, `chemdner_TEXT:MESH:D001645)`, `bionlp_st_2011_ge_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Acetylation)`, `medmentions_full_ner:B-T002)`, `verspoor_2013_ner:I-Concepts_Ideas)`, `hprd50_RE:None)`, `ddi_corpus_ner:O)`, `chemdner_TEXT:MESH:D014131)`, `ebm_pico_ner:B-Outcome_Physical)`, `medmentions_st21pv_ner:B-T103)`, `chemdner_TEXT:MESH:D016650)`, `mlee_NER:B-Cell_proliferation)`, `bionlp_st_2013_gro_ner:I-TranscriptionCoactivator)`, 
`chebi_nactem_fullpaper_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013256)`, `biorelex_ner:I-protein-DNA-complex)`, `chemdner_TEXT:MESH:D008767)`, `bioinfer_RE:None)`, `nlm_gene_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-ReporterGene)`, `biosses_sts:1)`, `chemdner_TEXT:MESH:D000493)`, `chemdner_TEXT:MESH:D011374)`, `ebm_pico_ner:B-Intervention_Control)`, `bionlp_st_2013_pc_NER:I-Pathway)`, `chemprot_RE:CPR:3)`, `bionlp_st_2013_cg_ner:I-Amino_acid)`, `chemdner_TEXT:MESH:D005557)`, `bionlp_st_2011_ge_RE:Site)`, `bionlp_st_2013_pc_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-Elongation)`, `bionlp_st_2011_ge_NER:I-Localization)`, `spl_adr_200db_train_ner:B-Negation)`, `chemdner_TEXT:MESH:D010455)`, `nlm_gene_ner:B-GENERIF)`, `mlee_RE:Site)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D017953)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscription)`, `osiris_ner:B-gene)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressor)`, `medmentions_full_ner:I-T131)`, `genia_term_corpus_ner:B-protein_family_or_group)`, `genia_term_corpus_ner:B-cell_type)`, `chemdner_TEXT:MESH:D013759)`, `chemdner_TEXT:MESH:D002247)`, `scai_chemical_ner:I-FAMILY)`, `chemdner_TEXT:MESH:D006020)`, `biorelex_ner:B-DNA)`, `chebi_nactem_abstr_ann1_ner:I-Spectral_Data)`, `mantra_gsc_en_medline_ner:B-DISO)`, `chemdner_TEXT:MESH:D019829)`, `ncbi_disease_ner:I-CompositeMention)`, `chemdner_TEXT:MESH:D013876)`, `chebi_nactem_fullpaper_ner:I-Spectral_Data)`, `biorelex_ner:I-DNA)`, `chemdner_TEXT:MESH:D005492)`, `chemdner_TEXT:MESH:D011810)`, `chemdner_TEXT:MESH:D008563)`, `chemdner_TEXT:MESH:D015735)`, `bionlp_st_2019_bb_ner:B-Microorganism)`, `ddi_corpus_RE:INT)`, `medmentions_st21pv_ner:B-T038)`, `bionlp_st_2013_gro_NER:B-CellCyclePhaseTransition)`, `cellfinder_ner:B-CellLine)`, `pdr_RE:Cause)`, `chemdner_TEXT:MESH:D011433)`, `chemdner_TEXT:MESH:D011720)`, `chemdner_TEXT:MESH:D020156)`, `ebm_pico_ner:O)`, `mlee_ner:B-Organ)`, `chemdner_TEXT:MESH:D012721)`, 
`chebi_nactem_fullpaper_ner:I-Biological_Activity)`, `bionlp_st_2013_cg_COREF:coref)`, `chemdner_TEXT:MESH:D006918)`, `medmentions_full_ner:B-T092)`, `genia_term_corpus_ner:B-protein_NA)`, `bionlp_st_2013_ge_ner:B-Entity)`, `an_em_ner:B-Multi-tissue_structure)`, `chia_ner:I-Measurement)`, `chia_RE:Has_temporal)`, `bionlp_st_2011_id_NER:B-Protein_catabolism)`, `bionlp_st_2013_gro_NER:B-CellAdhesion)`, `bionlp_st_2013_gro_ner:B-DNABindingSite)`, `biorelex_ner:B-organism)`, `scai_disease_ner:I-DISEASE)`, `bionlp_st_2013_gro_ner:I-DNABindingSite)`, `chemdner_TEXT:MESH:D016607)`, `chemdner_TEXT:MESH:D030421)`, `bionlp_st_2013_pc_NER:I-Binding)`, `medmentions_full_ner:I-T029)`, `chemdner_TEXT:MESH:D001569)`, `genia_term_corpus_ner:B-ANDcell_typecell_type)`, `scai_chemical_ner:B-SUM)`, `chemdner_TEXT:MESH:D007656)`, `medmentions_full_ner:B-T082)`, `chemdner_TEXT:MESH:D009525)`, `medmentions_full_ner:B-T079)`, `bionlp_st_2013_cg_NER:B-Synthesis)`, `biorelex_ner:B-process)`, `bionlp_st_2013_ge_RE:Theme)`, `chemdner_TEXT:MESH:D012825)`, `chemdner_TEXT:MESH:D005462)`, `bionlp_st_2013_cg_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-CellCycle)`, `cellfinder_ner:I-CellLine)`, `bionlp_st_2013_gro_ner:I-DNABindingDomainOfProtein)`, `medmentions_st21pv_ner:B-T168)`, `genia_term_corpus_ner:B-body_part)`, `genia_term_corpus_ner:B-ANDprotein_family_or_groupprotein_family_or_group)`, `mlee_ner:B-Tissue)`, `mlee_NER:I-Localization)`, `medmentions_full_ner:B-T125)`, `bionlp_st_2013_cg_NER:B-Infection)`, `chebi_nactem_abstr_ann1_ner:I-Protein)`, `chemdner_TEXT:MESH:D009570)`, `medmentions_full_ner:I-T045)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivator)`, `verspoor_2013_ner:B-disease)`, `medmentions_full_ner:I-T056)`, `medmentions_full_ner:B-T050)`, `bionlp_st_2013_gro_ner:B-MolecularFunction)`, `medmentions_full_ner:B-T060)`, `bionlp_st_2013_gro_ner:B-Cell)`, `medmentions_full_ner:I-T060)`, `bionlp_st_2013_pc_NER:I-Gene_expression)`, `genia_term_corpus_ner:B-RNA_NA)`, 
`bionlp_st_2013_gro_ner:I-MessengerRNA)`, `medmentions_full_ner:I-T086)`, `an_em_RE:Part-of)`, `bionlp_st_2013_gro_NER:B-NegativeRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_gro_NER:I-Splicing)`, `bioinfer_RE:PPI)`, `bioscope_papers_ner:I-speculation)`, `bionlp_st_2013_gro_ner:B-HomeoBox)`, `medmentions_full_ner:B-T004)`, `chia_ner:I-Drug)`, `bionlp_st_2013_gro_ner:B-FusionOfGeneWithReporterGene)`, `genia_term_corpus_ner:I-cell_line)`, `chebi_nactem_abstr_ann1_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-ExpressionProfiling)`, `chemdner_TEXT:MESH:D004390)`, `medmentions_full_ner:B-T016)`, `bionlp_st_2013_cg_NER:B-Growth)`, `medmentions_full_ner:I-T170)`, `medmentions_full_ner:B-T093)`, `genia_term_corpus_ner:I-inorganic)`, `mlee_NER:B-Planned_process)`, `bionlp_st_2013_gro_RE:hasPart)`, `bionlp_st_2013_gro_ner:B-BasicDomain)`, `chemdner_TEXT:MESH:D050091)`, `medmentions_st21pv_ner:B-T037)`, `chemdner_TEXT:MESH:D011522)`, `bionlp_st_2013_ge_NER:B-Deacetylation)`, `chemdner_TEXT:MESH:D004008)`, `chemdner_TEXT:MESH:D013972)`, `bionlp_st_2013_gro_NER:B-SignalingPathway)`, `bionlp_st_2013_gro_ner:B-Promoter)`, `chemdner_TEXT:MESH:D012701)`, `an_em_COREF:None)`, `bionlp_st_2019_bb_RE:None)`, `mlee_NER:I-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-Translation)`, `chemdner_TEXT:MESH:D013453)`, `genia_term_corpus_ner:I-ANDprotein_moleculeprotein_molecule)`, `chemdner_TEXT:MESH:D002746)`, `chebi_nactem_abstr_ann1_ner:O)`, `bionlp_st_2013_pc_ner:O)`, `mayosrs_sts:7)`, `bionlp_st_2013_cg_NER:B-Pathway)`, `verspoor_2013_ner:I-age)`, `biorelex_ner:I-peptide)`, `medmentions_full_ner:I-T096)`, `chebi_nactem_fullpaper_ner:I-Chemical_Structure)`, `chemdner_TEXT:MESH:D007211)`, `medmentions_full_ner:I-T018)`, `medmentions_full_ner:B-T201)`, `bionlp_st_2013_gro_NER:B-BindingOfTFToTFBindingSiteOfProtein)`, `medmentions_full_ner:B-T054)`, `ebm_pico_ner:I-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D010672)`, `chemdner_TEXT:MESH:D004492)`, 
`chemdner_TEXT:MESH:D008094)`, `chemdner_TEXT:MESH:D002227)`, `chemdner_TEXT:MESH:D009553)`, `bionlp_st_2013_gro_NER:I-ResponseProcess)`, `chemdner_TEXT:MESH:D006046)`, `ebm_pico_ner:B-Participant_Condition)`, `nlm_gene_ner:I-Gene)`, `bionlp_st_2019_bb_ner:I-Habitat)`, `bionlp_shared_task_2009_COREF:coref)`, `chemdner_TEXT:MESH:D005640)`, `mantra_gsc_en_emea_ner:B-PHYS)`, `mantra_gsc_en_patents_ner:B-DISO)`, `bionlp_st_2013_gro_ner:B-Heterochromatin)`, `bionlp_st_2013_gro_NER:I-CellCycle)`, `bionlp_st_2013_cg_NER:I-Cell_proliferation)`, `bionlp_st_2013_cg_ner:B-Simple_chemical)`, `genia_term_corpus_ner:I-cell_type)`, `chemdner_TEXT:MESH:D003553)`, `bionlp_st_2013_ge_RE:Theme2)`, `tmvar_v1_ner:B-ProteinMutation)`, `chemdner_TEXT:MESH:D012717)`, `chemdner_TEXT:MESH:D026121)`, `chemdner_TEXT:MESH:D008687)`, `bionlp_st_2013_gro_NER:I-TranscriptionTermination)`, `medmentions_full_ner:B-T028)`, `biorelex_ner:B-assay)`, `genia_term_corpus_ner:B-tissue)`, `chemdner_TEXT:MESH:D009173)`, `bionlp_st_2013_gro_ner:B-TranscriptionCoactivator)`, `genia_term_corpus_ner:B-amino_acid_monomer)`, `mantra_gsc_en_emea_ner:B-DEVI)`, `bionlp_st_2013_gro_NER:B-Growth)`, `chemdner_TEXT:MESH:D017374)`, `genia_term_corpus_ner:B-other_artificial_source)`, `medmentions_full_ner:B-T072)`, `bionlp_st_2013_gro_NER:B-CellGrowth)`, `bionlp_st_2013_gro_ner:I-DoubleStrandDNA)`, `chemdner_ner:O)`, `bionlp_shared_task_2009_NER:I-Localization)`, `bionlp_st_2013_gro_NER:B-RegulationOfPathway)`, `genia_term_corpus_ner:I-amino_acid_monomer)`, `bionlp_st_2013_gro_NER:I-SPhase)`, `an_em_ner:B-Organism_substance)`, `medmentions_full_ner:B-T052)`, `genia_term_corpus_ner:B-ANDprotein_subunitprotein_subunit)`, `medmentions_full_ner:B-T096)`, `chemdner_TEXT:MESH:D056831)`, `chemdner_TEXT:MESH:D010755)`, `pdr_NER:I-Cause_of_disease)`, `mlee_NER:B-Phosphorylation)`, `medmentions_full_ner:I-T064)`, `chemdner_TEXT:MESH:D005978)`, `mantra_gsc_en_medline_ner:I-PHEN)`, `bionlp_st_2013_cg_ner:B-Pathological_formation)`, 
`bionlp_st_2013_gro_NER:B-Modification)`, `bionlp_st_2013_gro_ner:B-ProteinComplex)`, `bionlp_st_2013_gro_ner:B-DoubleStrandDNA)`, `medmentions_full_ner:B-T068)`, `medmentions_full_ner:I-T034)`, `bionlp_st_2011_epi_NER:B-Catalysis)`, `biosses_sts:0)`, `bionlp_st_2013_cg_ner:B-Organism_substance)`, `chemdner_TEXT:MESH:D055549)`, `bionlp_st_2013_cg_NER:B-Glycolysis)`, `chemdner_TEXT:MESH:D001761)`, `chemdner_TEXT:MESH:D011728)`, `bionlp_st_2013_gro_ner:B-Function)`, `medmentions_full_ner:I-T033)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T053)`, `bionlp_st_2013_gro_ner:B-Protein)`, `genia_term_corpus_ner:I-ANDprotein_family_or_groupprotein_family_or_group)`, `bionlp_st_2013_gro_NER:I-CatabolicPathway)`, `biorelex_ner:I-chemical)`, `chemdner_TEXT:MESH:D013185)`, `biorelex_ner:I-RNA)`, `chemdner_TEXT:MESH:D009838)`, `medmentions_full_ner:I-T008)`, `chemdner_TEXT:MESH:D002104)`, `bionlp_st_2013_gro_NER:B-RNABiosynthesis)`, `verspoor_2013_ner:I-ethnicity)`, `bionlp_st_2013_gro_ner:I-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D026023)`, `mlee_ner:O)`, `bionlp_st_2013_gro_NER:I-CellHomeostasis)`, `bionlp_st_2013_pc_NER:B-Pathway)`, `gnormplus_ner:I-DomainMotif)`, `bionlp_st_2013_gro_ner:I-OpenReadingFrame)`, `bionlp_st_2013_gro_NER:I-RegulationOfGeneExpression)`, `muchmore_en_ner:O)`, `chemdner_TEXT:MESH:D000911)`, `bionlp_st_2011_epi_NER:B-DNA_demethylation)`, `bionlp_st_2013_gro_ner:I-RuntLikeDomain)`, `chemdner_TEXT:MESH:D010748)`, `medmentions_full_ner:B-T008)`, `biorelex_ner:B-protein-RNA-complex)`, `bionlp_st_2013_cg_NER:I-Planned_process)`, `chemdner_TEXT:MESH:D014867)`, `mantra_gsc_en_patents_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Silencing)`, `chemdner_TEXT:MESH:D015306)`, `chemdner_TEXT:MESH:D001679)`, `bionlp_shared_task_2009_NER:I-Positive_regulation)`, `linnaeus_filtered_ner:O)`, `chia_RE:Has_multiplier)`, `medmentions_full_ner:B-T116)`, `bionlp_shared_task_2009_NER:B-Positive_regulation)`, 
`anat_em_ner:B-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D011137)`, `chemdner_TEXT:MESH:D048271)`, `chemdner_TEXT:MESH:D003975)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressorActivity)`, `bionlp_st_2011_id_ner:B-Protein)`, `bionlp_st_2013_gro_NER:I-Mutation)`, `chemdner_TEXT:MESH:D001572)`, `mantra_gsc_en_patents_ner:B-CHEM)`, `mantra_gsc_en_medline_ner:I-DEVI)`, `bionlp_st_2013_gro_ner:B-Enzyme)`, `medmentions_full_ner:B-T056)`, `mantra_gsc_en_patents_ner:B-OBJC)`, `medmentions_full_ner:B-T073)`, `anat_em_ner:I-Tissue)`, `chemdner_TEXT:MESH:D047310)`, `chia_ner:I-Scope)`, `ncbi_disease_ner:B-Modifier)`, `medmentions_st21pv_ner:B-T082)`, `medmentions_full_ner:I-T054)`, `genia_term_corpus_ner:I-carbohydrate)`, `bionlp_st_2013_cg_RE:Theme)`, `chemdner_TEXT:MESH:D009538)`, `chemdner_TEXT:MESH:D008691)`, `genia_term_corpus_ner:B-ANDprotein_substructureprotein_substructure)`, `bionlp_st_2013_cg_ner:I-Tissue)`, `chia_ner:B-Device)`, `chemdner_TEXT:MESH:D002784)`, `medmentions_full_ner:I-T007)`, `bionlp_st_2013_gro_ner:I-DNAFragment)`, `mlee_RE:ToLoc)`, `spl_adr_200db_train_ner:I-AdverseReaction)`, `bionlp_st_2013_cg_NER:B-Catabolism)`, `chemdner_TEXT:MESH:D013779)`, `bionlp_st_2013_pc_NER:B-Regulation)`, `bionlp_st_2013_gro_NER:I-Disease)`, `chia_ner:I-Condition)`, `chemdner_TEXT:MESH:D012370)`, `bionlp_st_2013_ge_NER:O)`, `bionlp_st_2013_pc_NER:B-Deubiquitination)`, `bionlp_st_2013_pc_NER:I-Translation)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscriptionOfGene)`, `bionlp_st_2013_cg_NER:B-DNA_methylation)`, `bioscope_papers_ner:B-speculation)`, `chemdner_TEXT:MESH:D018130)`, `bionlp_st_2013_gro_ner:B-RNAPolymeraseII)`, `medmentions_st21pv_ner:B-T098)`, `bionlp_st_2013_gro_NER:B-Elongation)`, `bionlp_st_2013_pc_RE:Cause)`, `seth_corpus_ner:B-RS)`, `bionlp_st_2013_ge_RE:ToLoc)`, `chemdner_TEXT:MESH:D000538)`, `medmentions_full_ner:B-T192)`, `medmentions_full_ner:B-T061)`, `medmentions_full_ner:B-T032)`, `bionlp_st_2013_gro_NER:B-Transport)`, 
`medmentions_full_ner:I-T014)`, `chemdner_TEXT:MESH:D004137)`, `medmentions_full_ner:B-T101)`, `bionlp_st_2013_gro_NER:B-Transcription)`, `bionlp_st_2013_pc_NER:B-Transport)`, `medmentions_full_ner:I-T203)`, `ebm_pico_ner:I-Intervention_Control)`, `genia_term_corpus_ner:I-atom)`, `chemdner_TEXT:MESH:D014230)`, `osiris_ner:I-gene)`, `mantra_gsc_en_patents_ner:B-ANAT)`, `ncbi_disease_ner:I-SpecificDisease)`, `bionlp_st_2013_gro_NER:I-CellGrowth)`, `chemdner_TEXT:MESH:D001205)`, `chemdner_TEXT:MESH:D016627)`, `genia_term_corpus_ner:B-protein_subunit)`, `bionlp_st_2013_gro_ner:I-CellComponent)`, `medmentions_full_ner:B-T049)`, `scai_chemical_ner:O)`, `chemdner_TEXT:MESH:D010840)`, `chemdner_TEXT:MESH:D008694)`, `mantra_gsc_en_patents_ner:B-PHEN)`, `bionlp_st_2013_cg_RE:Cause)`, `chemdner_TEXT:MESH:D012293)`, `bionlp_st_2013_gro_NER:B-Homodimerization)`, `chemdner_TEXT:MESH:D008070)`, `chia_RE:OR)`, `bionlp_st_2013_cg_ner:I-Gene_or_gene_product)`, `verspoor_2013_ner:I-disease)`, `muchmore_en_ner:B-umlsterm)`, `chemdner_TEXT:MESH:D011794)`, `medmentions_full_ner:I-T002)`, `chemdner_TEXT:MESH:D007649)`, `genia_term_corpus_ner:B-AND_NOTcell_typecell_type)`, `medmentions_full_ner:I-T023)`, `chemprot_RE:CPR:1)`, `chemdner_TEXT:MESH:D001786)`, `bionlp_st_2013_gro_ner:B-HomeoboxTF)`, `bionlp_st_2013_cg_ner:I-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-Attenuator)`, `bionlp_st_2019_bb_ner:B-Habitat)`, `chemdner_TEXT:MESH:D017931)`, `medmentions_full_ner:B-T047)`, `chemdner_TEXT:MESH:D006886)`, `genia_term_corpus_ner:I-)`, `medmentions_full_ner:B-T039)`, `chemdner_TEXT:MESH:D004220)`, `bionlp_st_2013_pc_RE:FromLoc)`, `nlm_gene_ner:I-GENERIF)`, `bionlp_st_2013_ge_NER:I-Protein_modification)`, `genia_term_corpus_ner:B-RNA_molecule)`, `chemdner_TEXT:MESH:D006854)`, `chemdner_TEXT:MESH:D006493)`, `chia_ner:B-Qualifier)`, `medmentions_full_ner:I-T013)`, `ehr_rel_sts:8)`, `an_em_RE:frag)`, `genia_term_corpus_ner:I-DNA_substructure)`, `chemdner_TEXT:MESH:D063065)`, 
`genia_term_corpus_ner:I-ANDprotein_complexprotein_complex)`, `bionlp_st_2013_pc_NER:I-Dissociation)`, `medmentions_full_ner:I-T004)`, `bionlp_st_2013_cg_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D010069)`, `bionlp_st_2013_gro_NER:I-Homodimerization)`, `chemdner_TEXT:MESH:D006147)`, `medmentions_full_ner:I-T041)`, `bionlp_st_2011_id_NER:B-Regulation)`, `bionlp_st_2013_gro_ner:O)`, `chemdner_TEXT:MESH:D008623)`, `bionlp_st_2013_ge_ner:I-Protein)`, `scai_chemical_ner:I-TRIVIAL)`, `an_em_ner:B-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-BindingAssay)`, `bionlp_st_2013_gro_ner:I-HMG)`, `anat_em_ner:I-Anatomical_system)`, `chemdner_TEXT:MESH:D015034)`, `mlee_NER:B-Catabolism)`, `mantra_gsc_en_medline_ner:B-LIVB)`, `ddi_corpus_ner:I-BRAND)`, `chia_ner:I-Multiplier)`, `bionlp_st_2013_gro_ner:I-SequenceHomologyAnalysis)`, `seth_corpus_RE:None)`, `bionlp_st_2013_cg_NER:B-Binding)`, `bioscope_papers_ner:I-negation)`, `chemdner_TEXT:MESH:D008741)`, `chemdner_TEXT:MESH:D052998)`, `chemdner_TEXT:MESH:D005227)`, `chemdner_TEXT:MESH:D009828)`, `spl_adr_200db_train_ner:B-Animal)`, `chemdner_TEXT:MESH:D010616)`, `bionlp_st_2013_gro_ner:I-ProteinComplex)`, `pico_extraction_ner:B-outcome)`, `mlee_NER:B-Negative_regulation)`, `chemdner_TEXT:MESH:D007093)`, `bionlp_st_2013_gro_NER:I-RNAProcessing)`, `bionlp_st_2013_gro_RE:hasAgent2)`, `biorelex_ner:I-reagent)`, `medmentions_st21pv_ner:I-T074)`, `bionlp_st_2013_gro_NER:B-BindingOfMolecularEntity)`, `chemdner_TEXT:MESH:D008911)`, `medmentions_full_ner:B-T033)`, `genia_term_corpus_ner:B-ANDprotein_complexprotein_complex)`, `medmentions_full_ner:I-T100)`, `chemdner_TEXT:MESH:D019259)`, `genia_term_corpus_ner:I-BUT_NOTother_nameother_name)`, `geokhoj_v1_TEXT:1)`, `bionlp_st_2013_cg_RE:Site)`, `medmentions_full_ner:B-T184)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelixTF)`, `bionlp_st_2013_cg_ner:I-Protein_domain_or_region)`, `genia_term_corpus_ner:I-other_organic_compound)`, `chemdner_TEXT:MESH:D010793)`, 
`bionlp_st_2011_id_NER:B-Phosphorylation)`, `chemdner_TEXT:MESH:D002482)`, `bionlp_st_2013_cg_NER:B-Breakdown)`, `biorelex_ner:I-disease)`, `genia_term_corpus_ner:B-DNA_substructure)`, `bionlp_st_2013_gro_RE:hasPatient)`, `medmentions_full_ner:B-T127)`, `medmentions_full_ner:I-T185)`, `bionlp_shared_task_2009_RE:AtLoc)`, `medmentions_full_ner:I-T201)`, `chemdner_TEXT:MESH:D005290)`, `mlee_NER:I-Breakdown)`, `medmentions_full_ner:I-T063)`, `chemdner_TEXT:MESH:D017964)`, `an_em_ner:I-Tissue)`, `mlee_ner:I-Organism)`, `mantra_gsc_en_emea_ner:I-CHEM)`, `bionlp_st_2013_cg_ner:B-Anatomical_system)`, `genia_term_corpus_ner:B-ORDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Degradation)`, `chemprot_RE:CPR:0)`, `genia_term_corpus_ner:B-inorganic)`, `chemdner_TEXT:MESH:D005466)`, `chia_ner:O)`, `medmentions_full_ner:B-T078)`, `mlee_NER:B-Growth)`, `mantra_gsc_en_emea_ner:B-PHEN)`, `chemdner_TEXT:MESH:D012545)`, `bionlp_st_2013_gro_NER:B-G1Phase)`, `chemdner_TEXT:MESH:D009841)`, `bionlp_st_2013_gro_ner:B-Chromatin)`, `bionlp_st_2011_epi_RE:Site)`, `medmentions_full_ner:B-T066)`, `genetaggold_ner:O)`, `bionlp_st_2013_cg_NER:I-Gene_expression)`, `medmentions_st21pv_ner:B-T092)`, `chemprot_RE:CPR:8)`, `bionlp_st_2013_cg_RE:Instrument)`, `nlm_gene_ner:I-Domain)`, `chemdner_TEXT:MESH:D006151)`, `bionlp_st_2011_id_ner:I-Protein)`, `mlee_NER:B-Synthesis)`, `bionlp_st_2013_gro_NER:B-CellMotility)`, `scai_chemical_ner:B-MODIFIER)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscription)`, `osiris_ner:O)`, `mlee_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T062)`, `chemdner_TEXT:MESH:D017705)`, `bionlp_st_2013_gro_NER:I-TranscriptionOfGene)`, `genia_term_corpus_ner:I-protein_complex)`, `chemprot_RE:CPR:10)`, `medmentions_full_ner:B-T102)`, `medmentions_full_ner:I-T171)`, `chia_ner:B-Reference_point)`, `medmentions_full_ner:B-T015)`, `bionlp_st_2013_gro_ner:I-RNAPolymerase)`, `chebi_nactem_abstr_ann1_ner:B-Metabolite)`, 
`bionlp_st_2013_gro_NER:I-CellDifferentiation)`, `chemdner_TEXT:MESH:D006861)`, `pubmed_qa_labeled_fold0_CLF:maybe)`, `bionlp_st_2013_gro_ner:I-Sequence)`, `mlee_NER:B-Transcription)`, `bc5cdr_ner:B-Chemical)`, `chemdner_TEXT:MESH:D000072317)`, `bionlp_st_2013_gro_NER:B-Producing)`, `genia_term_corpus_ner:B-ANDprotein_moleculeprotein_molecule)`, `bionlp_st_2011_id_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-MolecularInteraction)`, `chemdner_TEXT:MESH:D014639)`, `bionlp_st_2013_gro_NER:I-Increase)`, `mlee_NER:I-Translation)`, `medmentions_full_ner:B-T087)`, `bioscope_abstracts_ner:B-speculation)`, `ebm_pico_ner:B-Outcome_Adverse-effects)`, `mantra_gsc_en_medline_ner:B-PHYS)`, `bionlp_st_2013_gro_ner:I-Lipid)`, `bionlp_st_2011_ge_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D005278)`, `bionlp_shared_task_2009_NER:B-Phosphorylation)`, `mlee_NER:I-Gene_expression)`, `bionlp_st_2011_epi_NER:I-Deacetylation)`, `chemdner_TEXT:MESH:D002110)`, `medmentions_full_ner:I-T121)`, `bionlp_st_2011_epi_ner:I-Entity)`, `bionlp_st_2019_bb_RE:Lives_In)`, `chemdner_TEXT:MESH:D001710)`, `anat_em_ner:B-Cancer)`, `bionlp_st_2013_gro_NER:B-RNASplicing)`, `mantra_gsc_en_medline_ner:I-ANAT)`, `chemdner_TEXT:MESH:D024508)`, `chemdner_TEXT:MESH:D000537)`, `mantra_gsc_en_medline_ner:I-DISO)`, `bionlp_st_2013_gro_ner:I-Prokaryote)`, `bionlp_st_2013_gro_ner:I-Chromatin)`, `bionlp_st_2013_gro_ner:B-Nucleotide)`, `linnaeus_ner:I-species)`, `verspoor_2013_ner:I-body-part)`, `bionlp_st_2013_gro_ner:B-DNAFragment)`, `bionlp_st_2013_gro_ner:B-PositiveTranscriptionRegulator)`, `medmentions_full_ner:I-T049)`, `bionlp_st_2011_ge_ner:B-Entity)`, `medmentions_full_ner:I-T017)`, `bionlp_st_2013_gro_NER:B-TranscriptionOfGene)`, `chemdner_TEXT:MESH:D009947)`, `mlee_NER:B-Dephosphorylation)`, `bionlp_st_2013_gro_NER:B-GeneSilencing)`, `pdr_RE:None)`, `scai_chemical_ner:I-TRIVIALVAR)`, `bionlp_st_2011_epi_NER:O)`, `bionlp_st_2013_cg_ner:I-Cell)`, `sciq_SEQ:None)`, `chemdner_TEXT:MESH:D019913)`, 
`mlee_RE:Participant)`, `chia_ner:I-Negation)`, `chemdner_TEXT:MESH:D014801)`, `chemdner_TEXT:MESH:D058846)`, `chemdner_TEXT:MESH:D011809)`, `bionlp_st_2011_epi_ner:O)`, `bionlp_st_2013_cg_NER:I-Metastasis)`, `chemdner_TEXT:MESH:D012643)`, `an_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:I-CatalyticActivity)`, `anat_em_ner:B-Anatomical_system)`, `mlee_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_ner:I-ChromosomalDNA)`, `anat_em_ner:B-Cell)`, `chemdner_TEXT:MESH:D000242)`, `chemdner_TEXT:MESH:D017641)`, `bioscope_abstracts_ner:I-negation)`, `medmentions_st21pv_ner:B-T058)`, `chemdner_TEXT:MESH:D008744)`, `bionlp_st_2013_gro_ner:B-UpstreamRegulatorySequence)`, `chemdner_TEXT:MESH:D008012)`, `medmentions_full_ner:B-T013)`, `bionlp_st_2011_epi_NER:B-Glycosylation)`, `chemdner_TEXT:MESH:D052999)`, `chemdner_TEXT:MESH:D002329)`, `ebm_pico_ner:I-Intervention_Physical)`, `bionlp_st_2013_pc_ner:B-Complex)`, `medmentions_st21pv_ner:I-T005)`, `chemdner_TEXT:MESH:D064704)`, `bionlp_st_2013_gro_ner:I-ZincCoordinatingDomainTF)`, `bionlp_st_2013_pc_ner:I-Cellular_component)`, `genia_term_corpus_ner:B-ANDDNA_domain_or_regionDNA_domain_or_region)`, `bionlp_st_2013_gro_ner:B-Chromosome)`, `chemdner_TEXT:MESH:D007546)`, `bionlp_st_2013_gro_NER:I-PositiveRegulationOfGeneExpression)`, `medmentions_full_ner:I-T010)`, `pdr_NER:B-Treatment_of_disease)`, `medmentions_full_ner:B-T081)`, `bionlp_st_2011_epi_NER:B-Demethylation)`, `chemdner_TEXT:MESH:D013261)`, `bionlp_st_2013_gro_ner:I-RibosomalRNA)`, `verspoor_2013_ner:O)`, `bionlp_st_2013_gro_NER:B-DevelopmentalProcess)`, `chemdner_TEXT:MESH:D009270)`, `medmentions_full_ner:I-T130)`, `bionlp_st_2013_cg_ner:B-Organism)`, `medmentions_full_ner:B-T014)`, `chemdner_TEXT:MESH:D003374)`, `chemdner_TEXT:MESH:D011078)`, `cellfinder_ner:B-GeneProtein)`, `mayosrs_sts:6)`, `chemdner_TEXT:MESH:D005576)`, `bionlp_st_2013_ge_RE:Cause)`, `an_em_RE:None)`, `sciq_SEQ:answer)`, `bionlp_st_2013_cg_NER:B-Dissociation)`, `mlee_RE:frag)`, 
`bionlp_st_2013_pc_COREF:coref)`, `chemdner_TEXT:MESH:D008469)`, `ncbi_disease_ner:O)`, `bionlp_st_2011_epi_ner:I-Protein)`, `chemdner_TEXT:MESH:D011140)`, `chemdner_TEXT:MESH:D020001)`, `bionlp_st_2013_gro_ner:I-ThreeDimensionalMolecularStructure)`, `bionlp_st_2013_cg_ner:B-Cancer)`, `genia_term_corpus_ner:B-BUT_NOTother_nameother_name)`, `chemdner_TEXT:MESH:D006862)`, `medmentions_full_ner:B-T104)`, `bionlp_st_2011_epi_RE:Theme)`, `cellfinder_ner:B-Anatomy)`, `chemdner_TEXT:MESH:D010545)`, `biorelex_ner:B-RNA-family)`, `pico_extraction_ner:I-outcome)`, `mantra_gsc_en_patents_ner:I-PHYS)`, `bionlp_st_2013_pc_NER:I-Transcription)`, `bionlp_shared_task_2009_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Vitamin)`, `bionlp_shared_task_2009_RE:CSite)`, `bionlp_st_2011_ge_ner:I-Protein)`, `mlee_COREF:coref)`, `bionlp_st_2013_gro_ner:I-ForkheadWingedHelix)`, `bioinfer_ner:I-Gene)`, `bionlp_st_2013_gro_ner:B-TranscriptionActivatorActivity)`, `chemdner_TEXT:MESH:D054439)`, `chemdner_TEXT:MESH:D011621)`, `ddi_corpus_ner:I-DRUG_N)`, `chemdner_TEXT:MESH:D019308)`, `bionlp_st_2013_gro_ner:I-Locus)`, `bionlp_shared_task_2009_RE:ToLoc)`, `bionlp_st_2013_cg_NER:B-Development)`, `bionlp_st_2013_gro_NER:I-CellularDevelopmentalProcess)`, `bionlp_st_2013_gro_ner:B-Eukaryote)`, `bionlp_st_2013_ge_NER:B-Negative_regulation)`, `seth_corpus_ner:I-SNP)`, `hprd50_ner:B-protein)`, `bionlp_st_2013_gro_NER:B-BindingOfProtein)`, `mlee_NER:I-Negative_regulation)`, `bionlp_st_2011_ge_NER:B-Protein_catabolism)`, `bionlp_st_2013_pc_ner:B-Cellular_component)`, `bionlp_st_2011_id_ner:I-Chemical)`, `chemdner_TEXT:MESH:D013831)`, `biorelex_COREF:None)`, `chemdner_TEXT:MESH:D005609)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactor)`, `mlee_NER:B-Regulation)`, `chemdner_TEXT:MESH:D059808)`, `bionlp_st_2013_gro_ner:I-bHLHTF)`, `chemdner_TEXT:MESH:D010121)`, `chemdner_TEXT:MESH:D017608)`, `chemdner_TEXT:MESH:D007455)`, `mlee_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorComplex)`, 
`biorelex_ner:B-disease)`, `bionlp_st_2013_cg_NER:B-Cell_differentiation)`, `medmentions_st21pv_ner:I-T092)`, `chemdner_TEXT:MESH:D007477)`, `medmentions_full_ner:B-T168)`, `pcr_ner:I-Chemical)`, `chemdner_TEXT:MESH:D009636)`, `chemdner_TEXT:MESH:D008051)`, `bionlp_shared_task_2009_NER:I-Gene_expression)`, `chemprot_ner:I-GENE-N)`, `biorelex_ner:B-reagent)`, `chemdner_TEXT:MESH:D020123)`, `nlmchem_ner:O)`, `ebm_pico_ner:I-Outcome_Mental)`, `chemdner_TEXT:MESH:D004040)`, `chemdner_TEXT:MESH:D000450)`, `chebi_nactem_fullpaper_ner:O)`, `biorelex_ner:B-protein-isoform)`, `chemdner_TEXT:MESH:D001564)`, `medmentions_full_ner:I-T095)`, `mlee_NER:I-Remodeling)`, `bionlp_st_2013_cg_RE:None)`, `biorelex_ner:O)`, `seth_corpus_RE:AssociatedTo)`, `bioscope_abstracts_ner:B-negation)`, `chebi_nactem_fullpaper_ner:I-Metabolite)`, `bionlp_st_2013_gro_ner:I-TranscriptionRepressorActivity)`, `bionlp_st_2013_cg_NER:B-Transcription)`, `bionlp_st_2011_ge_ner:B-Protein)`, `bionlp_st_2013_ge_ner:B-Protein)`, `bionlp_st_2013_gro_ner:I-Tissue)`, `chemdner_TEXT:MESH:D044005)`, `genia_term_corpus_ner:I-protein_substructure)`, `bionlp_st_2013_gro_ner:I-TranslationFactor)`, `minimayosrs_sts:5)`, `chemdner_TEXT:MESH:D012834)`, `ncbi_disease_ner:I-Modifier)`, `mlee_NER:B-Death)`, `medmentions_full_ner:B-T196)`, `bio_sim_verb_sts:4)`, `bionlp_st_2013_gro_NER:B-CellHomeostasis)`, `chemdner_TEXT:MESH:D006001)`, `bionlp_st_2013_gro_RE:encodes)`, `biorelex_ner:B-fusion-protein)`, `mlee_COREF:None)`, `chemdner_TEXT:MESH:D001623)`, `chemdner_TEXT:MESH:D000812)`, `medmentions_full_ner:B-T046)`, `bionlp_shared_task_2009_NER:O)`, `chemdner_TEXT:MESH:D000735)`, `gnormplus_ner:O)`, `chemdner_TEXT:MESH:D014635)`, `bionlp_st_2013_gro_NER:B-Mitosis)`, `chemdner_TEXT:MESH:D003847)`, `chemdner_TEXT:MESH:D002809)`, `medmentions_full_ner:I-T116)`, `chemdner_TEXT:MESH:D060406)`, `chemprot_ner:B-CHEMICAL)`, `chemdner_TEXT:MESH:D016642)`, `bionlp_st_2013_cg_NER:B-Phosphorylation)`, `an_em_ner:B-Organ)`, 
`chemdner_TEXT:MESH:D013431)`, `bionlp_shared_task_2009_RE:None)`, `medmentions_full_ner:B-T041)`, `mlee_ner:I-Tissue)`, `chemdner_TEXT:MESH:D023303)`, `ebm_pico_ner:I-Participant_Condition)`, `bionlp_st_2013_gro_ner:I-TATAbox)`, `bionlp_st_2013_gro_ner:I-bZIP)`, `bionlp_st_2011_epi_RE:Sidechain)`, `bionlp_st_2013_gro_ner:B-LivingEntity)`, `mantra_gsc_en_medline_ner:B-CHEM)`, `chemdner_TEXT:MESH:D007659)`, `medmentions_full_ner:I-T085)`, `bionlp_st_2013_cg_ner:I-Organism_substance)`, `medmentions_full_ner:B-T067)`, `chemdner_TEXT:MESH:D057846)`, `bionlp_st_2013_gro_NER:I-SignalingPathway)`, `bc5cdr_ner:I-Chemical)`, `nlm_gene_ner:I-STARGENE)`, `medmentions_full_ner:B-T090)`, `medmentions_full_ner:I-T037)`, `medmentions_full_ner:B-T037)`, `minimayosrs_sts:6)`, `medmentions_full_ner:I-T020)`, `chebi_nactem_fullpaper_ner:B-Species)`, `mirna_ner:O)`, `bionlp_st_2011_id_RE:Participant)`, `bionlp_st_2013_ge_NER:B-Binding)`, `ddi_corpus_ner:B-DRUG)`, `medmentions_full_ner:I-T078)`, `chemdner_TEXT:MESH:D012965)`, `bionlp_st_2013_cg_ner:I-Organ)`, `bionlp_st_2011_id_NER:B-Binding)`, `chemdner_TEXT:MESH:D006571)`, `mayosrs_sts:4)`, `chemdner_TEXT:MESH:D026422)`, `genia_term_corpus_ner:I-RNA_NA)`, `bionlp_st_2011_epi_RE:None)`, `chemdner_TEXT:MESH:D012265)`, `medmentions_full_ner:B-T195)`, `chemdner_TEXT:MESH:D014443)`, `bionlp_st_2013_gro_ner:I-OrganicChemical)`, `ebm_pico_ner:B-Participant_Age)`, `chemdner_TEXT:MESH:D009584)`, `chemdner_TEXT:MESH:D010862)`, `verspoor_2013_ner:B-Concepts_Ideas)`, `bionlp_st_2013_gro_NER:B-ActivationOfProcess)`, `chemdner_TEXT:MESH:D010118)`, `biorelex_COREF:coref)`, `bionlp_st_2013_gro_ner:I-Enzyme)`, `chemdner_TEXT:MESH:D012530)`, `chemdner_TEXT:MESH:D002351)`, `biorelex_ner:B-gene)`, `chemdner_TEXT:MESH:D013213)`, `medmentions_full_ner:B-T103)`, `chemdner_TEXT:MESH:D010091)`, `ebm_pico_ner:B-Participant_Sex)`, `bionlp_st_2013_gro_ner:B-ComplexOfProteinAndDNA)`, `bionlp_st_2013_gro_ner:B-Phenotype)`, `chemdner_TEXT:MESH:D019791)`, 
`chemdner_TEXT:MESH:D014280)`, `chemdner_TEXT:MESH:D011094)`, `chia_RE:None)`, `biorelex_RE:None)`, `chemdner_TEXT:MESH:D005230)`, `verspoor_2013_ner:B-cohort-patient)`, `chemdner_TEXT:MESH:D013645)`, `bionlp_st_2013_gro_ner:B-SecondMessenger)`, `mlee_ner:B-Cellular_component)`, `bionlp_shared_task_2009_NER:I-Phosphorylation)`, `mlee_ner:B-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D017275)`, `chemdner_TEXT:MESH:D007053)`, `bionlp_st_2013_ge_RE:Site)`, `genia_term_corpus_ner:O)`, `chemprot_RE:CPR:6)`, `chemdner_TEXT:MESH:D006859)`, `genia_term_corpus_ner:I-other_name)`, `medmentions_full_ner:I-T042)`, `pdr_ner:O)`, `medmentions_full_ner:I-T057)`, `bionlp_st_2013_pc_RE:Product)`, `verspoor_2013_ner:B-size)`, `bionlp_st_2013_pc_NER:B-Acetylation)`, `medmentions_st21pv_ner:B-T017)`, `chia_ner:B-Temporal)`, `chemdner_TEXT:MESH:D003404)`, `bionlp_st_2013_gro_RE:None)`, `bionlp_shared_task_2009_NER:B-Gene_expression)`, `mqp_sts:3)`, `bionlp_st_2013_gro_ner:B-Chemical)`, `chemdner_TEXT:MESH:D013754)`, `mantra_gsc_en_medline_ner:B-GEOG)`, `mirna_ner:B-Specific_miRNAs)`, `chemdner_TEXT:MESH:D012492)`, `medmentions_full_ner:B-T190)`, `bionlp_st_2013_cg_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:B-RNA)`, `chemdner_TEXT:MESH:D011743)`, `chemdner_TEXT:MESH:D010795)`, `bionlp_st_2013_gro_NER:I-PositiveRegulation)`, `chemdner_TEXT:MESH:D002241)`, `medmentions_full_ner:B-T038)`, `bionlp_st_2013_gro_RE:hasAgent)`, `mlee_ner:B-Organism)`, `medmentions_full_ner:I-T168)`, `bioscope_abstracts_ner:O)`, `chemdner_TEXT:MESH:D002599)`, `bionlp_st_2013_pc_ner:I-Simple_chemical)`, `medmentions_full_ner:I-T066)`, `chemdner_TEXT:MESH:D019695)`, `bionlp_st_2013_ge_NER:I-Transcription)`, `mantra_gsc_en_emea_ner:B-DISO)`, `bionlp_st_2013_gro_NER:B-CellDeath)`, `medmentions_st21pv_ner:I-T031)`, `chemdner_TEXT:MESH:D004317)`, `bionlp_st_2013_gro_ner:B-TATAbox)`, `chemdner_TEXT:MESH:D052203)`, `bionlp_st_2013_gro_NER:B-CellFateDetermination)`, `medmentions_st21pv_ner:I-T022)`, 
`bionlp_st_2013_ge_NER:B-Protein_catabolism)`, `bionlp_st_2011_epi_NER:I-Catalysis)`, `verspoor_2013_ner:I-cohort-patient)`, `chemdner_TEXT:MESH:D010100)`, `an_em_ner:I-Developing_anatomical_structure)`, `chemdner_TEXT:MESH:D045162)`, `chia_RE:Has_qualifier)`, `verspoor_2013_RE:has)`, `chemdner_TEXT:MESH:D021382)`, `bionlp_st_2013_ge_NER:B-Acetylation)`, `medmentions_full_ner:I-T079)`, `bionlp_st_2013_gro_NER:B-Maintenance)`, `biorelex_ner:I-protein-domain)`, `chebi_nactem_abstr_ann1_ner:I-Chemical)`, `bioscope_papers_ner:O)`, `chia_RE:Has_scope)`, `bc5cdr_ner:B-Disease)`, `mlee_ner:I-Cellular_component)`, `medmentions_full_ner:I-T195)`, `spl_adr_200db_train_ner:B-AdverseReaction)`, `bionlp_st_2013_gro_ner:I-Promoter)`, `medmentions_full_ner:B-T040)`, `chemdner_TEXT:MESH:D005960)`, `chemdner_TEXT:MESH:D004164)`, `chemdner_TEXT:MESH:D015032)`, `chemdner_TEXT:MESH:D014255)`, `ebm_pico_ner:B-Outcome_Pain)`, `bionlp_st_2013_gro_ner:I-UpstreamRegulatorySequence)`, `bionlp_st_2013_pc_NER:I-Positive_regulation)`, `bionlp_st_2013_cg_NER:I-Regulation)`, `chemdner_TEXT:MESH:D001151)`, `medmentions_full_ner:I-T077)`, `chemdner_TEXT:MESH:D000081)`, `bionlp_st_2013_gro_NER:B-Stabilization)`, `mayosrs_sts:1)`, `biorelex_ner:B-mutation)`, `chemdner_TEXT:MESH:D000241)`, `chemdner_TEXT:MESH:D007930)`, `bionlp_st_2013_gro_NER:B-MetabolicPathway)`, `chemdner_TEXT:MESH:D013629)`, `chemdner_TEXT:MESH:D016202)`, `tmvar_v1_ner:I-DNAMutation)`, `chemdner_TEXT:MESH:D012502)`, `chemdner_TEXT:MESH:D044945)`, `bionlp_st_2013_cg_ner:I-Cellular_component)`, `mlee_ner:B-Developing_anatomical_structure)`, `bionlp_st_2013_gro_ner:I-AP2EREBPRelatedDomain)`, `chemdner_TEXT:MESH:D002338)`, `mayosrs_sts:5)`, `bionlp_st_2013_gro_ner:B-Intron)`, `genia_term_corpus_ner:I-DNA_domain_or_region)`, `anat_em_ner:I-Immaterial_anatomical_entity)`, `bionlp_st_2013_gro_ner:B-MutatedProtein)`, `ebm_pico_ner:I-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-ProteinCodingRegion)`, `chemdner_TEXT:MESH:D005047)`, 
`chia_ner:B-Mood)`, `medmentions_st21pv_ner:O)`, `cellfinder_ner:I-Species)`, `bionlp_st_2013_gro_ner:I-InorganicChemical)`, `bionlp_st_2011_id_ner:B-Entity)`, `bionlp_st_2013_cg_NER:I-Catabolism)`, `an_em_ner:I-Cellular_component)`, `medmentions_full_ner:B-T021)`, `bionlp_st_2013_gro_NER:B-Heterodimerization)`, `chemdner_TEXT:MESH:D008315)`, `medmentions_st21pv_ner:I-T170)`, `chemdner_TEXT:MESH:D050112)`, `chia_RE:Subsumes)`, `medmentions_full_ner:I-T099)`, `bionlp_st_2013_gro_ner:I-Protein)`, `chemdner_TEXT:MESH:D047071)`, `bionlp_st_2013_gro_ner:B-TranscriptionFactorActivity)`, `mlee_ner:B-Organism_subdivision)`, `chemdner_TEXT:MESH:D016559)`, `medmentions_full_ner:B-T129)`, `genia_term_corpus_ner:I-protein_molecule)`, `mlee_ner:B-Drug_or_compound)`, `bionlp_st_2013_gro_NER:B-Silencing)`, `bionlp_st_2013_gro_ner:I-MolecularStructure)`, `genia_term_corpus_ner:B-nucleotide)`, `chemdner_TEXT:MESH:D003042)`, `mantra_gsc_en_emea_ner:B-ANAT)`, `chemdner_TEXT:MESH:D006690)`, `genia_term_corpus_ner:I-ANDcell_linecell_line)`, `chemdner_TEXT:MESH:D005473)`, `mantra_gsc_en_medline_ner:I-PHYS)`, `bionlp_st_2013_cg_NER:B-Blood_vessel_development)`, `bionlp_st_2013_gro_ner:B-BetaScaffoldDomain_WithMinorGrooveContacts)`, `chemdner_TEXT:MESH:D001549)`, `chia_ner:B-Measurement)`, `bionlp_st_2011_id_ner:B-Regulon-operon)`, `bionlp_st_2013_cg_NER:B-Acetylation)`, `pdr_ner:B-Plant)`, `mlee_NER:B-Development)`, `linnaeus_filtered_ner:B-species)`, `bionlp_st_2013_pc_RE:AtLoc)`, `medmentions_full_ner:I-T192)`, `bionlp_st_2013_gro_ner:B-BindingSiteOfProtein)`, `bionlp_st_2013_ge_NER:B-Ubiquitination)`, `bionlp_st_2013_gro_ner:I-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D009647)`, `bionlp_st_2013_gro_ner:I-Ligand)`, `bionlp_st_2011_id_ner:O)`, `bionlp_st_2013_gro_NER:I-RNASplicing)`, `bionlp_st_2013_gro_ner:I-ComplexOfProteinAndRNA)`, `bionlp_st_2011_id_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D007501)`, `ehr_rel_sts:5)`, `bionlp_st_2013_gro_ner:B-TranscriptionRegulator)`, 
`medmentions_full_ner:B-T089)`, `bionlp_st_2011_epi_NER:I-DNA_demethylation)`, `mirna_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-TranscriptionRegulator)`, `bionlp_st_2013_gro_NER:B-ProteinBiosynthesis)`, `scai_chemical_ner:B-ABBREVIATION)`, `bionlp_st_2013_gro_ner:I-Virus)`, `bionlp_st_2011_ge_NER:O)`, `medmentions_full_ner:B-T203)`, `bionlp_st_2013_cg_NER:I-Mutation)`, `bionlp_st_2013_gro_ner:B-ThreeDimensionalMolecularStructure)`, `genetaggold_ner:I-NEWGENE)`, `chemdner_TEXT:MESH:D010705)`, `chia_ner:I-Mood)`, `medmentions_full_ner:I-T068)`, `minimayosrs_sts:4)`, `medmentions_full_ner:I-T097)`, `bionlp_st_2013_gro_ner:I-BetaScaffoldDomain_WithMinorGrooveContacts)`, `mantra_gsc_en_emea_ner:I-PHYS)`, `medmentions_full_ner:I-T104)`, `bio_sim_verb_sts:5)`, `chebi_nactem_abstr_ann1_ner:B-Biological_Activity)`, `bionlp_st_2013_gro_NER:B-IntraCellularProcess)`, `mantra_gsc_en_emea_ner:I-PHEN)`, `mlee_ner:B-Cell)`, `chemdner_TEXT:MESH:D045784)`, `bionlp_st_2013_gro_ner:I-Vitamin)`, `chemdner_TEXT:MESH:D010416)`, `bionlp_st_2013_gro_ner:B-FusionGene)`, `bionlp_st_2013_gro_ner:I-FusionProtein)`, `mlee_NER:B-Remodeling)`, `minimayosrs_sts:8)`, `bionlp_st_2013_gro_ner:B-Enhancer)`, `mantra_gsc_en_emea_ner:O)`, `bionlp_st_2013_gro_ner:B-OpenReadingFrame)`, `bionlp_st_2013_pc_COREF:None)`, `medmentions_full_ner:I-T123)`, `bionlp_st_2013_gro_NER:I-RegulatoryProcess)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfGeneExpression)`, `nlm_gene_ner:B-Domain)`, `bionlp_st_2013_pc_NER:B-Methylation)`, `medmentions_full_ner:B-T057)`, `chemdner_TEXT:MESH:D010226)`, `bionlp_st_2013_gro_ner:B-GeneProduct)`, `ebm_pico_ner:I-Outcome_Other)`, `chemdner_TEXT:MESH:D005223)`, `pdr_RE:Theme)`, `bionlp_shared_task_2009_NER:B-Protein_catabolism)`, `chemdner_TEXT:MESH:D019344)`, `gnormplus_ner:I-FamilyName)`, `verspoor_2013_ner:B-gender)`, `bionlp_st_2013_gro_NER:B-TranscriptionInitiation)`, `spl_adr_200db_train_ner:B-Severity)`, `medmentions_st21pv_ner:B-T097)`, 
`anat_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_NER:I-RNAMetabolism)`, `bioinfer_ner:I-Protein_complex)`, `anat_em_ner:I-Cell)`, `bionlp_st_2013_gro_ner:B-ProteinDomain)`, `bionlp_st_2013_gro_ner:I-PrimaryStructure)`, `genia_term_corpus_ner:I-other_artificial_source)`, `chemdner_TEXT:MESH:D010098)`, `bionlp_st_2013_gro_ner:I-Enhancer)`, `bionlp_st_2013_gro_ner:I-PositiveTranscriptionRegulator)`, `chemdner_TEXT:MESH:D004051)`, `chemdner_TEXT:MESH:D013853)`, `chebi_nactem_fullpaper_ner:B-Metabolite)`, `diann_iber_eval_en_ner:B-Disability)`, `biorelex_ner:B-peptide)`, `medmentions_full_ner:B-T048)`, `bionlp_st_2013_gro_ner:I-Function)`, `genia_term_corpus_ner:I-DNA_NA)`, `mlee_ner:I-Anatomical_system)`, `bioinfer_ner:B-Individual_protein)`, `verspoor_2013_ner:I-Physiology)`, `genia_term_corpus_ner:I-RNA_molecule)`, `chemdner_TEXT:MESH:D000255)`, `minimayosrs_sts:7)`, `mlee_NER:B-Localization)`, `bionlp_st_2013_gro_NER:B-ResponseProcess)`, `mantra_gsc_en_medline_ner:I-LIVB)`, `chemdner_TEXT:MESH:D010649)`, `seth_corpus_ner:B-Gene)`, `bionlp_st_2013_gro_ner:B-Attenuator)`, `chemdner_TEXT:MESH:D015363)`, `bionlp_st_2013_pc_NER:B-Inactivation)`, `medmentions_full_ner:I-T191)`, `mlee_ner:I-Organ)`, `chemdner_TEXT:MESH:D011765)`, `bionlp_shared_task_2009_NER:B-Binding)`, `an_em_ner:B-Cellular_component)`, `genia_term_corpus_ner:I-RNA_substructure)`, `medmentions_full_ner:B-T051)`, `anat_em_ner:I-Pathological_formation)`, `bionlp_st_2013_gro_RE:hasPatient3)`, `chemdner_TEXT:MESH:D013634)`, `chemdner_TEXT:MESH:D014414)`, `chia_RE:Has_index)`, `ddi_corpus_ner:B-GROUP)`, `bionlp_st_2013_gro_ner:B-MutantProtein)`, `bionlp_st_2013_ge_NER:I-Negative_regulation)`, `biorelex_ner:I-amino-acid)`, `chemdner_TEXT:MESH:D053279)`, `chemprot_RE:CPR:2)`, `bionlp_st_2013_gro_ner:B-bHLHTF)`, `bionlp_st_2013_cg_NER:I-Breakdown)`, `scai_chemical_ner:I-ABBREVIATION)`, `pdr_NER:B-Cause_of_disease)`, `chemdner_TEXT:MESH:D002219)`, `medmentions_full_ner:B-T044)`, 
`mirna_ner:B-Non-Specific_miRNAs)`, `chemdner_TEXT:MESH:D020748)`, `bionlp_shared_task_2009_RE:Theme)`, `chemdner_TEXT:MESH:D001647)`, `bionlp_st_2011_ge_NER:I-Regulation)`, `bionlp_st_2013_pc_ner:B-Gene_or_gene_product)`, `biorelex_ner:I-protein)`, `mantra_gsc_en_medline_ner:B-PROC)`, `medmentions_full_ner:I-T081)`, `medmentions_st21pv_ner:B-T022)`, `chia_ner:B-Multiplier)`, `bionlp_st_2013_gro_NER:B-GeneMutation)`, `chemdner_TEXT:MESH:D002232)`, `chemdner_TEXT:MESH:D010456)`, `biosses_sts:7)`, `medmentions_full_ner:B-T071)`, `chemdner_TEXT:MESH:D008628)`, `biorelex_ner:I-protein-complex)`, `chemdner_TEXT:MESH:D007328)`, `bionlp_st_2013_pc_NER:I-Activation)`, `bionlp_st_2013_cg_NER:B-Metabolism)`, `scai_chemical_ner:I-PARTIUPAC)`, `verspoor_2013_ner:B-age)`, `medmentions_full_ner:B-T122)`, `medmentions_full_ner:I-T050)`, `genia_term_corpus_ner:B-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:B-SPhase)`, `chemdner_TEXT:MESH:D012500)`, `mlee_NER:B-Metabolism)`, `bionlp_st_2011_id_NER:B-Positive_regulation)`, `chemdner_TEXT:MESH:D002794)`, `bionlp_st_2013_gro_NER:B-ProteinTransport)`, `chemdner_TEXT:MESH:D006028)`, `bionlp_st_2013_gro_RE:hasPatient2)`, `chemdner_TEXT:MESH:D009822)`, `bionlp_st_2013_cg_ner:I-Cancer)`, `bionlp_shared_task_2009_ner:I-Entity)`, `pcr_ner:B-Herb)`, `pubmed_qa_labeled_fold0_CLF:yes)`, `bionlp_st_2013_gro_NER:I-NegativeRegulation)`, `bionlp_st_2013_cg_NER:B-Dephosphorylation)`, `anat_em_ner:B-Multi-tissue_structure)`, `chemdner_TEXT:MESH:D008274)`, `medmentions_full_ner:B-T025)`, `chemprot_RE:CPR:9)`, `bionlp_st_2013_pc_RE:Participant)`, `bionlp_st_2013_pc_ner:B-Simple_chemical)`, `genia_term_corpus_ner:B-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-bZIP)`, `bionlp_st_2013_gro_ner:I-Eukaryote)`, `bionlp_st_2013_pc_ner:I-Complex)`, `hprd50_ner:I-protein)`, `medmentions_full_ner:B-T020)`, `bionlp_st_2013_gro_ner:B-Agonist)`, `medmentions_full_ner:B-T030)`, `chemdner_TEXT:MESH:D009536)`, `medmentions_full_ner:B-T169)`, 
`genia_term_corpus_ner:I-nucleotide)`, `bionlp_st_2013_gro_NER:I-ProteinCatabolism)`, `bc5cdr_ner:O)`, `chemdner_TEXT:MESH:D003078)`, `medmentions_full_ner:I-T040)`, `chemdner_TEXT:MESH:D005963)`, `bionlp_st_2013_gro_ner:B-ExpressionProfiling)`, `mantra_gsc_en_emea_ner:I-DEVI)`, `mlee_NER:B-Cell_division)`, `ebm_pico_ner:B-Intervention_Pharmacological)`, `chemdner_TEXT:MESH:D008790)`, `mantra_gsc_en_emea_ner:I-ANAT)`, `mantra_gsc_en_medline_ner:B-ANAT)`, `chemdner_TEXT:MESH:D003545)`, `bionlp_st_2013_gro_NER:I-IntraCellularTransport)`, `bionlp_st_2013_gro_NER:I-CellDivision)`, `chemdner_TEXT:MESH:D013438)`, `bionlp_st_2011_id_NER:I-Negative_regulation)`, `bionlp_st_2013_gro_NER:I-DevelopmentalProcess)`, `mlee_ner:B-Protein_domain_or_region)`, `chemdner_TEXT:MESH:D014978)`, `bionlp_st_2011_id_NER:O)`, `bionlp_st_2013_gro_ner:I-ReporterGeneConstruction)`, `medmentions_full_ner:I-T025)`, `bionlp_st_2019_bb_RE:Exhibits)`, `ddi_corpus_ner:I-GROUP)`, `chemdner_TEXT:MESH:D011241)`, `chemdner_TEXT:MESH:D010446)`, `bionlp_st_2013_gro_ner:I-ExperimentalMethod)`, `anat_em_ner:B-Tissue)`, `chemdner_TEXT:MESH:D000470)`, `bionlp_st_2013_pc_NER:I-Inactivation)`, `bionlp_st_2013_gro_ner:I-Agonist)`, `medmentions_full_ner:B-T024)`, `mlee_NER:I-Transcription)`, `bionlp_st_2011_epi_NER:B-Deglycosylation)`, `bionlp_st_2013_cg_NER:B-Cell_death)`, `chemdner_TEXT:MESH:D000266)`, `chemdner_TEXT:MESH:D019833)`, `genia_term_corpus_ner:I-RNA_family_or_group)`, `biosses_sts:8)`, `lll_RE:genic_interaction)`, `bionlp_st_2013_gro_ner:B-OrganicChemical)`, `chemdner_TEXT:MESH:D013267)`, `bionlp_st_2013_gro_ner:I-TranscriptionCofactor)`, `biorelex_ner:B-protein-region)`, `chemdner_TEXT:MESH:D001565)`, `genia_term_corpus_ner:B-cell_line)`, `bionlp_st_2013_gro_NER:B-Cleavage)`, `ddi_corpus_RE:EFFECT)`, `bionlp_st_2013_cg_NER:B-Planned_process)`, `bionlp_st_2013_cg_ner:I-Immaterial_anatomical_entity)`, `chemdner_TEXT:MESH:D007660)`, `medmentions_full_ner:I-T090)`, 
`bionlp_st_2013_gro_ner:I-CpGIsland)`, `bionlp_st_2013_gro_ner:B-AminoAcid)`, `chemdner_TEXT:MESH:D001095)`, `mlee_NER:I-Death)`, `bionlp_st_2013_cg_ner:I-Anatomical_system)`, `bionlp_st_2013_gro_NER:B-Decrease)`, `bionlp_st_2013_pc_NER:B-Hydroxylation)`, `chemdner_TEXT:None)`, `bio_sim_verb_sts:3)`, `biorelex_ner:B-protein)`, `bionlp_st_2013_gro_ner:I-BasicDomain)`, `bionlp_st_2011_ge_ner:I-Entity)`, `bionlp_st_2013_gro_ner:B-PhysicalContinuant)`, `chemprot_RE:CPR:4)`, `chemdner_TEXT:MESH:D003345)`, `chemdner_TEXT:MESH:D010080)`, `mantra_gsc_en_patents_ner:O)`, `bionlp_st_2013_gro_ner:B-AntisenseRNA)`, `bionlp_st_2013_gro_ner:B-ProteinCodingDNARegion)`, `chemdner_TEXT:MESH:D010768)`, `chebi_nactem_fullpaper_ner:I-Protein)`, `genia_term_corpus_ner:I-multi_cell)`, `bionlp_st_2013_gro_ner:I-Gene)`, `medmentions_full_ner:B-T042)`, `chemdner_TEXT:MESH:D006034)`, `biorelex_ner:I-brand)`, `chebi_nactem_abstr_ann1_ner:I-Species)`, `chemdner_TEXT:MESH:D012236)`, `bionlp_st_2013_gro_ner:I-GeneProduct)`, `chemdner_TEXT:MESH:D005665)`, `chemdner_TEXT:MESH:D008715)`, `medmentions_st21pv_ner:I-T103)`, `ddi_corpus_RE:None)`, `medmentions_st21pv_ner:I-T091)`, `chemdner_TEXT:MESH:D019158)`, `chemdner_TEXT:MESH:D001280)`, `chemdner_TEXT:MESH:D009249)`, `medmentions_full_ner:I-T067)`, `medmentions_full_ner:B-T005)`, `bionlp_st_2013_cg_NER:I-Remodeling)`, `chemdner_TEXT:MESH:D000166)`, `osiris_ner:B-variant)`, `spl_adr_200db_train_ner:I-DrugClass)`, `mirna_ner:I-Species)`, `medmentions_st21pv_ner:I-T033)`, `ebm_pico_ner:I-Participant_Age)`, `medmentions_full_ner:B-T095)`, `bionlp_st_2013_gro_NER:B-RNAMetabolism)`, `chemdner_TEXT:MESH:D005231)`, `medmentions_full_ner:B-T062)`, `bionlp_st_2011_ge_NER:I-Gene_expression)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactor)`, `genia_term_corpus_ner:B-protein_domain_or_region)`, `mantra_gsc_en_emea_ner:B-PROC)`, `mlee_NER:I-Pathway)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToProteinBindingSiteOfProtein)`, `bionlp_st_2011_id_COREF:coref)`, 
`biosses_sts:6)`, `biorelex_ner:I-organism)`, `chia_ner:B-Value)`, `verspoor_2013_ner:B-body-part)`, `chemdner_TEXT:MESH:D004974)`, `chia_RE:Has_mood)`, `medmentions_st21pv_ner:B-T074)`, `chemdner_TEXT:MESH:D000535)`, `verspoor_2013_ner:I-Disorder)`, `bionlp_st_2013_gro_NER:B-BindingToMolecularEntity)`, `bionlp_st_2013_gro_ner:I-ReporterGene)`, `mayosrs_sts:8)`, `bionlp_st_2013_cg_ner:I-DNA_domain_or_region)`, `bionlp_st_2013_gro_NER:I-Pathway)`, `medmentions_st21pv_ner:I-T168)`, `bionlp_st_2013_gro_NER:B-NegativeRegulation)`, `medmentions_full_ner:B-T123)`, `bionlp_st_2013_pc_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_NER:I-FormationOfProteinDNAComplex)`, `chemdner_TEXT:MESH:D000577)`, `mlee_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D003630)`, `bionlp_st_2013_gro_ner:B-Transcript)`, `bionlp_st_2013_cg_NER:I-Transcription)`, `anat_em_ner:B-Organ)`, `anat_em_ner:I-Organism_substance)`, `spl_adr_200db_train_ner:B-DrugClass)`, `bionlp_st_2013_gro_ner:I-ProteinSubunit)`, `biorelex_ner:B-protein-domain)`, `chemdner_TEXT:MESH:D006051)`, `bionlp_st_2011_id_NER:B-Process)`, `bionlp_st_2013_pc_NER:B-Ubiquitination)`, `bionlp_st_2013_pc_NER:B-Transcription)`, `chemdner_TEXT:MESH:D006838)`, `bionlp_st_2013_gro_RE:hasPatient5)`, `bionlp_st_2013_ge_NER:B-Localization)`, `chemdner_TEXT:MESH:D011759)`, `chemdner_TEXT:MESH:D053243)`, `biorelex_ner:I-mutation)`, `mantra_gsc_en_emea_ner:I-LIVB)`, `bionlp_st_2013_gro_NER:I-Transport)`, `bionlp_st_2011_id_RE:Site)`, `chemdner_TEXT:MESH:D015474)`, `bionlp_st_2013_gro_NER:B-Dimerization)`, `bionlp_st_2013_cg_NER:I-Localization)`, `medmentions_full_ner:I-T032)`, `chemdner_TEXT:MESH:D018036)`, `medmentions_full_ner:I-T167)`, `chemprot_RE:CPR:5)`, `minimayosrs_sts:2)`, `biorelex_ner:B-protein-DNA-complex)`, `cellfinder_ner:I-CellComponent)`, `nlm_gene_ner:B-Other)`, `medmentions_full_ner:I-T019)`, `chebi_nactem_abstr_ann1_ner:B-Spectral_Data)`, `bionlp_st_2013_cg_ner:I-Multi-tissue_structure)`, `medmentions_full_ner:B-T010)`, 
`mantra_gsc_en_medline_ner:I-GEOG)`, `chemprot_ner:I-GENE-Y)`, `mirna_ner:I-Diseases)`, `an_em_ner:O)`, `bionlp_st_2013_cg_NER:B-Remodeling)`, `medmentions_st21pv_ner:I-T058)`, `scicite_TEXT:background)`, `bionlp_st_2013_cg_NER:B-Mutation)`, `genia_term_corpus_ner:B-mono_cell)`, `bionlp_st_2013_gro_ner:B-DNA)`, `medmentions_full_ner:I-T114)`, `bionlp_st_2011_id_RE:Theme)`, `genetaggold_ner:B-NEWGENE)`, `mlee_ner:I-Organism_subdivision)`, `bionlp_shared_task_2009_NER:I-Regulation)`, `bionlp_st_2013_gro_ner:B-Microorganism)`, `chemdner_TEXT:MESH:D006108)`, `biorelex_ner:B-amino-acid)`, `bioinfer_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:I-Chemical)`, `mantra_gsc_en_patents_ner:I-DEVI)`, `mantra_gsc_en_medline_ner:O)`, `bionlp_st_2013_pc_NER:I-Regulation)`, `medmentions_full_ner:B-T043)`, `scicite_TEXT:result)`, `bionlp_st_2013_ge_NER:I-Binding)`, `chemdner_TEXT:MESH:D011441)`, `genia_term_corpus_ner:I-protein_domain_or_region)`, `bionlp_st_2011_epi_RE:Cause)`, `bionlp_st_2013_gro_ner:B-Nucleosome)`, `chemdner_TEXT:MESH:D011223)`, `chebi_nactem_abstr_ann1_ner:B-Protein)`, `bionlp_st_2013_gro_RE:hasFunction)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorActivity)`, `biorelex_ner:B-protein-family)`, `bionlp_st_2013_cg_ner:B-Gene_or_gene_product)`, `tmvar_v1_ner:B-SNP)`, `bionlp_st_2013_gro_ner:B-ExperimentalMethod)`, `bionlp_st_2013_gro_ner:B-ReporterGeneConstruction)`, `bionlp_st_2011_ge_NER:B-Transcription)`, `chemdner_TEXT:MESH:D004041)`, `chemdner_TEXT:MESH:D000631)`, `chebi_nactem_fullpaper_ner:I-Species)`, `medmentions_full_ner:B-T170)`, `bionlp_st_2013_gro_ner:B-ForkheadWingedHelix)`, `bionlp_st_2013_cg_ner:B-Organism_subdivision)`, `genia_term_corpus_ner:I-DNA_molecule)`, `bionlp_st_2013_cg_NER:I-Glycolysis)`, `an_em_ner:B-Pathological_formation)`, `bionlp_st_2013_gro_NER:B-TranscriptionTermination)`, `bionlp_st_2013_gro_NER:B-CellAging)`, `bionlp_st_2013_cg_ner:B-Protein_domain_or_region)`, `anat_em_ner:B-Organism_substance)`, 
`medmentions_full_ner:B-T053)`, `mlee_ner:B-Multi-tissue_structure)`, `biosses_sts:4)`, `bioscope_abstracts_ner:I-speculation)`, `chemdner_TEXT:MESH:D053644)`, `bionlp_st_2013_cg_NER:I-Translation)`, `tmvar_v1_ner:B-DNAMutation)`, `genia_term_corpus_ner:B-RNA_substructure)`, `an_em_ner:B-Anatomical_system)`, `bionlp_st_2013_gro_ner:B-Conformation)`, `bionlp_st_2013_gro_NER:I-NegativeRegulationOfTranscriptionOfGene)`, `medmentions_full_ner:I-T069)`, `chemdner_TEXT:MESH:D006820)`, `chemdner_TEXT:MESH:D015725)`, `chemdner_TEXT:MESH:D010281)`, `mlee_NER:B-Pathway)`, `bionlp_st_2011_id_NER:I-Regulation)`, `bionlp_st_2013_gro_NER:I-GeneExpression)`, `medmentions_full_ner:I-T073)`, `biosses_sts:2)`, `medmentions_full_ner:I-T043)`, `chemdner_TEXT:MESH:D001152)`, `bionlp_st_2013_gro_ner:I-DNAMolecule)`, `chemdner_TEXT:MESH:D015636)`, `chemdner_TEXT:MESH:D000666)`, `chemprot_RE:None)`, `bionlp_st_2013_gro_ner:B-Sequence)`, `chemdner_TEXT:MESH:D009151)`, `chia_ner:B-Observation)`, `an_em_COREF:coref)`, `medmentions_full_ner:B-T120)`, `bionlp_st_2013_gro_ner:B-Tissue)`, `bionlp_st_2013_gro_ner:B-MolecularEntity)`, `bionlp_st_2013_pc_NER:B-Dephosphorylation)`, `chemdner_TEXT:MESH:D044242)`, `bionlp_st_2013_gro_ner:B-FusionProtein)`, `biorelex_ner:B-cell)`, `bionlp_st_2013_gro_NER:B-Disease)`, `bionlp_st_2011_id_RE:None)`, `biorelex_ner:B-protein-motif)`, `bionlp_st_2013_pc_NER:I-Localization)`, `bionlp_st_2013_gro_ner:B-ZincCoordinatingDomain)`, `bionlp_st_2013_gro_ner:B-Locus)`, `genia_term_corpus_ner:B-other_organic_compound)`, `seth_corpus_ner:B-SNP)`, `pcr_ner:O)`, `genia_term_corpus_ner:I-virus)`, `bionlp_st_2013_gro_ner:I-Peptide)`, `chebi_nactem_abstr_ann1_ner:B-Chemical)`, `bionlp_st_2013_gro_ner:B-RNAMolecule)`, `bionlp_st_2013_gro_ner:B-SequenceHomologyAnalysis)`, `chemdner_TEXT:MESH:D005054)`, `bionlp_st_2013_ge_NER:B-Phosphorylation)`, `bionlp_st_2013_gro_NER:B-CellularProcess)`, `bionlp_st_2013_ge_RE:Site2)`, `verspoor_2013_ner:B-Phenomena)`, 
`chia_ner:I-Temporal)`, `bionlp_st_2013_gro_NER:I-Localization)`, `bionlp_st_2013_cg_NER:B-Ubiquitination)`, `chemdner_TEXT:MESH:D009020)`, `bionlp_st_2013_cg_RE:FromLoc)`, `mlee_ner:B-Organism_substance)`, `genia_term_corpus_ner:I-tissue)`, `medmentions_st21pv_ner:I-T082)`, `chemdner_TEXT:MESH:D054358)`, `medmentions_full_ner:I-T052)`, `chemdner_TEXT:MESH:D005459)`, `chemdner_TEXT:MESH:D047188)`, `medmentions_full_ner:I-T031)`, `chemdner_TEXT:MESH:D013890)`, `chemdner_TEXT:MESH:D004573)`, `genia_term_corpus_ner:B-peptide)`, `an_em_ner:I-Organism_subdivision)`, `bionlp_st_2013_gro_ner:B-MessengerRNA)`, `medmentions_full_ner:B-T171)`, `bionlp_st_2013_gro_NER:B-Affecting)`, `genia_term_corpus_ner:I-body_part)`, `bionlp_st_2013_gro_ner:B-Prokaryote)`, `chemdner_TEXT:MESH:D013844)`, `medmentions_full_ner:I-T061)`, `bionlp_st_2013_pc_NER:B-Negative_regulation)`, `bionlp_st_2013_gro_ner:I-EukaryoticCell)`, `pdr_ner:I-Plant)`, `chemdner_TEXT:MESH:D024341)`, `medmentions_full_ner:I-T092)`, `chemdner_TEXT:MESH:D020319)`, `bionlp_st_2013_cg_NER:B-Cell_transformation)`, `bionlp_st_2013_gro_NER:B-BindingOfTranscriptionFactorToDNA)`, `an_em_ner:I-Anatomical_system)`, `bionlp_st_2011_epi_NER:B-Hydroxylation)`, `bionlp_st_2013_gro_ner:I-Exon)`, `cellfinder_ner:B-Species)`, `bionlp_st_2013_gro_NER:B-Pathway)`, `bionlp_st_2013_ge_NER:B-Protein_modification)`, `bionlp_st_2013_gro_ner:I-FusionGene)`, `bionlp_st_2011_rel_ner:B-Entity)`, `bionlp_st_2011_id_RE:CSite)`, `bionlp_st_2013_ge_NER:B-Positive_regulation)`, `bionlp_st_2013_gro_ner:I-BindingAssay)`, `bionlp_st_2013_gro_NER:B-CellDivision)`, `bionlp_st_2019_bb_ner:I-Microorganism)`, `medmentions_full_ner:I-T059)`, `chemdner_TEXT:MESH:D011108)`, `bionlp_st_2013_gro_NER:B-PositiveRegulationOfTranscription)`, `bionlp_st_2013_gro_ner:B-GeneRegion)`, `bionlp_st_2013_cg_COREF:None)`, `chemdner_TEXT:MESH:D010261)`, `mlee_NER:B-Binding)`, `chemprot_ner:I-CHEMICAL)`, `bionlp_st_2011_id_RE:ToLoc)`, `biorelex_ner:I-organelle)`, 
`chemdner_TEXT:MESH:D004318)`, `genia_term_corpus_ner:I-DNA_family_or_group)`, `bionlp_st_2013_gro_ner:B-RNAPolymerase)`, `bionlp_st_2013_gro_ner:B-CellComponent)`, `bionlp_st_2013_gro_NER:B-RegulationOfGeneExpression)`, `bionlp_st_2013_gro_ner:B-Peptide)`, `bionlp_shared_task_2009_NER:B-Transcription)`, `biorelex_ner:B-tissue)`, `pico_extraction_ner:B-participant)`, `chia_ner:I-Visit)`, `chemdner_TEXT:MESH:D011807)`, `chemdner_TEXT:MESH:D014501)`, `bionlp_st_2013_gro_NER:I-IntraCellularProcess)`, `ehr_rel_sts:7)`, `pico_extraction_ner:I-intervention)`, `chemdner_TEXT:MESH:D001599)`, `bionlp_st_2013_gro_ner:I-RegulatoryDNARegion)`, `medmentions_st21pv_ner:I-T037)`, `chemdner_TEXT:MESH:D055768)`, `bionlp_st_2013_gro_ner:B-ChromosomalDNA)`, `chemdner_TEXT:MESH:D008550)`, `bionlp_st_2013_pc_RE:Site)`, `medmentions_full_ner:I-T087)`, `chemdner_TEXT:MESH:D001583)`, `bionlp_st_2011_epi_NER:B-Dehydroxylation)`, `ehr_rel_sts:3)`, `bionlp_st_2013_gro_ner:I-MutantProtein)`, `chemdner_TEXT:MESH:D011804)`, `medmentions_full_ner:B-T091)`, `bionlp_st_2013_cg_RE:CSite)`, `linnaeus_ner:O)`, `medmentions_st21pv_ner:B-T201)`, `verspoor_2013_ner:B-Disorder)`, `bionlp_st_2013_cg_NER:I-Death)`, `bioinfer_ner:I-Individual_protein)`, `medmentions_full_ner:B-T191)`, `verspoor_2013_ner:B-ethnicity)`, `chemdner_TEXT:MESH:D002083)`, `genia_term_corpus_ner:B-carbohydrate)`, `genia_term_corpus_ner:B-DNA_molecule)`, `medmentions_full_ner:B-T069)`, `pdr_NER:I-Treatment_of_disease)`, `mlee_ner:B-Anatomical_system)`, `chebi_nactem_fullpaper_ner:B-Spectral_Data)`, `chemdner_TEXT:MESH:D005419)`, `bionlp_st_2013_gro_ner:I-Nucleotide)`, `medmentions_full_ner:B-T194)`, `chemdner_TEXT:MESH:D005947)`, `chemdner_TEXT:MESH:D008627)`, `bionlp_st_2013_gro_NER:B-ExperimentalIntervention)`, `chemdner_TEXT:MESH:D011073)`, `chia_RE:Has_negation)`, `verspoor_2013_ner:I-mutation)`, `chemdner_TEXT:MESH:D004224)`, `chemdner_TEXT:MESH:D005663)`, `medmentions_full_ner:I-T094)`, `chemdner_TEXT:MESH:D006877)`, 
`ebm_pico_ner:B-Outcome_Mortality)`, `bionlp_st_2013_gro_ner:B-TranscriptionRepressor)`, `biorelex_ner:I-cell)`, `bionlp_st_2013_gro_NER:I-BindingOfProteinToDNA)`, `verspoor_2013_RE:None)`, `bionlp_st_2013_gro_NER:B-ProteinModification)`, `chemdner_TEXT:MESH:D047090)`, `medmentions_full_ner:I-T204)`, `chemdner_TEXT:MESH:D006843)`, `biorelex_ner:I-protein-family)`, `chemdner_TEXT:MESH:D012694)`, `bionlp_st_2013_gro_ner:B-TranslationFactor)`, `scai_chemical_ner:B-)`, `bionlp_st_2013_gro_ner:B-Exon)`, `medmentions_full_ner:I-T083)`, `bionlp_st_2013_gro_ner:I-TranscriptionActivatorActivity)`, `medmentions_full_ner:I-T101)`, `medmentions_full_ner:B-T034)`, `bionlp_st_2013_gro_ner:I-Histone)`, `ddi_corpus_RE:MECHANISM)`, `mantra_gsc_en_emea_ner:I-PROC)`, `genia_term_corpus_ner:I-peptide)`, `bionlp_st_2013_cg_NER:B-Cell_proliferation)`, `chemdner_TEXT:MESH:D004140)`, `medmentions_full_ner:B-T083)`, `diann_iber_eval_en_ner:I-Disability)`, `bionlp_st_2013_gro_NER:B-PosttranslationalModification)`, `biorelex_ner:I-fusion-protein)`, `chemdner_TEXT:MESH:D020910)`, `chemdner_TEXT:MESH:D014747)`, `bionlp_st_2013_ge_NER:B-Gene_expression)`, `biorelex_ner:I-tissue)`, `mantra_gsc_en_patents_ner:B-LIVB)`, `medmentions_full_ner:O)`, `medmentions_full_ner:B-T077)`, `bionlp_st_2013_gro_ner:I-Operon)`, `chemdner_TEXT:MESH:D002392)`, `chemdner_TEXT:MESH:D014498)`, `chemdner_TEXT:MESH:D002368)`, `chemdner_TEXT:MESH:D018817)`, `bionlp_st_2013_ge_NER:I-Regulation)`, `genia_term_corpus_ner:B-atom)`, `chemdner_TEXT:MESH:D011092)`, `chemdner_TEXT:MESH:D015283)`, `chemdner_TEXT:MESH:D018698)`, `chemdner_TEXT:MESH:D009569)`, `muchmore_en_ner:I-umlsterm)`, `bionlp_st_2013_cg_NER:B-Death)`, `nlm_gene_ner:I-Other)`, `medmentions_full_ner:B-T109)`, `osiris_ner:I-variant)`, `ehr_rel_sts:6)`, `chemdner_TEXT:MESH:D001120)`, `mlee_ner:I-Protein_domain_or_region)`, `bionlp_st_2013_pc_NER:B-Dissociation)`, `bionlp_st_2013_cg_NER:B-Metastasis)`, `chemdner_TEXT:MESH:D014204)`, `chemdner_TEXT:MESH:D005857)`, 
`medmentions_full_ner:I-T030)`, `chemdner_TEXT:MESH:D019256)`, `bionlp_st_2013_gro_ner:B-Polymerase)`, `chia_ner:B-Negation)`, `bionlp_st_2013_gro_NER:B-CellularMetabolicProcess)`, `bionlp_st_2013_gro_NER:B-CellDifferentiation)`, `biorelex_ner:I-protein-motif)`, `medmentions_full_ner:I-T093)`, `chemdner_TEXT:MESH:D019820)`, `anat_em_ner:B-Pathological_formation)`, `bionlp_shared_task_2009_NER:B-Localization)`, `genia_term_corpus_ner:B-RNA_domain_or_region)`, `chemdner_TEXT:MESH:D014668)`, `bionlp_st_2013_pc_ner:I-Gene_or_gene_product)`, `chemdner_TEXT:MESH:D019207)`, `bionlp_st_2013_gro_NER:B-BindingOfProteinToProteinBindingSiteOfDNA)`, `medmentions_full_ner:B-T059)`, `bionlp_st_2013_gro_ner:B-Ligand)`, `bio_sim_verb_sts:6)`, `biorelex_ner:B-experimental-construct)`, `bionlp_st_2013_gro_ner:I-DNA)`, `pdr_NER:O)`, `chemdner_TEXT:MESH:D008670)`, `bionlp_st_2011_ge_RE:Cause)`, `chemdner_TEXT:MESH:D015232)`, `bionlp_st_2013_pc_NER:O)`, `bionlp_st_2013_gro_NER:B-FormationOfProteinDNAComplex)`, `medmentions_full_ner:B-T121)`, `bionlp_shared_task_2009_NER:B-Regulation)`, `chemdner_TEXT:MESH:D009534)`, `chemdner_TEXT:MESH:D014451)`, `bionlp_st_2011_id_RE:AtLoc)`, `chemdner_TEXT:MESH:D011799)`, `medmentions_st21pv_ner:B-T204)`, `genia_term_corpus_ner:I-protein_subunit)`, `biorelex_ner:I-assay)`, `chemdner_TEXT:MESH:D005680)`, `an_em_ner:I-Organism_substance)`, `chemdner_TEXT:MESH:D010368)`, `chemdner_TEXT:MESH:D000872)`, `bionlp_st_2011_id_NER:I-Gene_expression)`, `bionlp_st_2013_cg_NER:B-Regulation)`, `mlee_ner:I-DNA_domain_or_region)`, `chemdner_TEXT:MESH:D001393)`, `medmentions_full_ner:I-T038)`, `chemdner_TEXT:MESH:D047311)`, `chemdner_TEXT:MESH:D011453)`, `chemdner_TEXT:MESH:D020106)`, `chemdner_TEXT:MESH:D019257)`, `bionlp_st_2013_gro_ner:B-NuclearReceptor)`, `chemdner_TEXT:MESH:D002117)`, `genia_term_corpus_ner:B-lipid)`, `bionlp_st_2013_gro_ner:B-SmallInterferingRNA)`, `chemdner_TEXT:MESH:D011205)`, `chemdner_TEXT:MESH:D002686)`, 
`bionlp_st_2013_gro_NER:B-Translation)`, `ebm_pico_ner:I-Intervention_Psychological)`, `mlee_ner:I-Drug_or_compound)`, `bionlp_st_2013_gro_ner:I-TranscriptionFactorBindingSiteOfDNA)`, `chemdner_TEXT:MESH:D000688)`, `bionlp_st_2011_ge_RE:None)`, `bionlp_st_2013_gro_ner:B-ProteinSubunit)`, `genia_term_corpus_ner:I-ANDother_nameother_name)`, `bionlp_st_2013_gro_NER:I-Heterodimerization)`, `pico_extraction_ner:B-intervention)`, `bionlp_st_2013_cg_ner:I-Organism)`, `bionlp_st_2013_gro_ner:I-ProteinDomain)`, `bionlp_st_2013_gro_NER:I-BindingToProtein)`, `scai_chemical_ner:I-)`, `biorelex_ner:B-experiment-tag)`, `ebm_pico_ner:B-Intervention_Physical)`, `bionlp_st_2013_cg_RE:ToLoc)`, `bionlp_st_2013_gro_NER:B-FormationOfTranscriptionFactorComplex)`, `linnaeus_ner:B-species)`, `medmentions_full_ner:I-T062)`, `chemdner_TEXT:MESH:D014640)`, `mlee_NER:B-Gene_expression)`, `chemdner_TEXT:MESH:D008701)`, `mlee_NER:O)`, `chemdner_TEXT:MESH:D014302)`, `genia_term_corpus_ner:B-RNA_family_or_group)`, `medmentions_full_ner:I-T091)`, `medmentions_full_ner:B-T022)`, `medmentions_full_ner:B-T074)`, `bionlp_st_2013_gro_NER:B-ProteinCatabolism)`, `bionlp_st_2013_gro_RE:hasPatient4)`, `chemdner_TEXT:MESH:D011388)`, `bionlp_st_2013_ge_NER:I-Phosphorylation)`, `bionlp_st_2013_gro_NER:I-CellAdhesion)`, `anat_em_ner:I-Organ)`, `medmentions_full_ner:B-T045)`, `chemdner_TEXT:MESH:D008727)`, `chebi_nactem_abstr_ann1_ner:B-Species)`, `bionlp_st_2013_gro_ner:I-RNAPolymeraseII)`, `nlm_gene_ner:B-STARGENE)`, `mantra_gsc_en_emea_ner:B-OBJC)`, `bionlp_st_2013_gro_ner:B-DNABindingDomainOfProtein)`, `chemdner_TEXT:MESH:D010636)`, `chemdner_TEXT:MESH:D004061)`, `mlee_NER:I-Binding)`, `medmentions_full_ner:B-T075)`, `medmentions_full_ner:B-UnknownType)`, `chemdner_TEXT:MESH:D019081)`, `bionlp_st_2013_gro_NER:I-Binding)`, `medmentions_full_ner:I-T005)`, `chemdner_TEXT:MESH:D009821)` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_foo_en_5.2.0_3.0_1699292612679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_foo_en_5.2.0_3.0_1699292612679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_foo","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_foo","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.foo.by_leonweber").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_foo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|420.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/leonweber/foo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_german_press_bert_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_german_press_bert_de.md new file mode 100644 index 000000000000..0f3b7cc039c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_german_press_bert_de.md @@ -0,0 +1,114 @@ +--- +layout: model +title: German BertForTokenClassification Cased model (from severinsimmler) +author: John Snow Labs +name: bert_ner_german_press_bert +date: 2023-11-06 +tags: [bert, ner, open_source, de, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german-press-bert` is a German model originally trained by `severinsimmler`. + +## Predicted Entities + +`PER`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_german_press_bert_de_5.2.0_3.0_1699294652431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_german_press_bert_de_5.2.0_3.0_1699294652431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_german_press_bert","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_german_press_bert","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.bert.by_severinsimmler").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_german_press_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/severinsimmler/german-press-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ghost1_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ghost1_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..245e8ede0461 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ghost1_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ghost1_bert_finetuned_ner_accelerate BertForTokenClassification from Ghost1 +author: John Snow Labs +name: bert_ner_ghost1_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ghost1_bert_finetuned_ner_accelerate` is an English model originally trained by Ghost1. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ghost1_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699279177125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ghost1_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699279177125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ghost1_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ghost1_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ghost1_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Ghost1/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gk07_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gk07_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..9133f8912755 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gk07_wikineural_multilingual_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from gk07) +author: John Snow Labs +name: bert_ner_gk07_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural-multilingual-ner` is an English model originally trained by `gk07`. + +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_gk07_wikineural_multilingual_ner_en_5.2.0_3.0_1699291765022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_gk07_wikineural_multilingual_ner_en_5.2.0_3.0_1699291765022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_gk07_wikineural_multilingual_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_gk07_wikineural_multilingual_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikineural.multilingual.by_gk07").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_gk07_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gk07/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gro_ner_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gro_ner_2_en.md new file mode 100644 index 000000000000..58bb60520944 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_gro_ner_2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mirikwa) +author: John Snow Labs +name: bert_ner_gro_ner_2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gro-ner-2` is an English model originally trained by `mirikwa`. + +## Predicted Entities + +`METRIC`, `REGION`, `ITEM` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_gro_ner_2_en_5.2.0_3.0_1699291798204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_gro_ner_2_en_5.2.0_3.0_1699291798204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_gro_ner_2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_gro_ner_2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_mirikwa").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_gro_ner_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mirikwa/gro-ner-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hebert_ner_he.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hebert_ner_he.md new file mode 100644 index 000000000000..75c720c073ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hebert_ner_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew bert_ner_hebert_ner BertForTokenClassification from avichr +author: John Snow Labs +name: bert_ner_hebert_ner +date: 2023-11-06 +tags: [bert, he, open_source, token_classification, onnx] +task: Named Entity Recognition +language: he +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_hebert_ner` is a Hebrew model originally trained by avichr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_hebert_ner_he_5.2.0_3.0_1699294856711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_hebert_ner_he_5.2.0_3.0_1699294856711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hebert_ner","he") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_hebert_ner", "he") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_hebert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|he| +|Size:|408.1 MB| + +## References + +https://huggingface.co/avichr/heBERT_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hiner_original_muril_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hiner_original_muril_base_cased_en.md new file mode 100644 index 000000000000..ed440b8f3bd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hiner_original_muril_base_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_hiner_original_muril_base_cased BertForTokenClassification from cfilt +author: John Snow Labs +name: bert_ner_hiner_original_muril_base_cased +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_hiner_original_muril_base_cased` is an English model originally trained by cfilt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_hiner_original_muril_base_cased_en_5.2.0_3.0_1699276931660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_hiner_original_muril_base_cased_en_5.2.0_3.0_1699276931660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hiner_original_muril_base_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_hiner_original_muril_base_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_hiner_original_muril_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|890.5 MB| + +## References + +https://huggingface.co/cfilt/HiNER-original-muril-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hing_bert_lid_hi.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hing_bert_lid_hi.md new file mode 100644 index 000000000000..cbc3dad98428 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hing_bert_lid_hi.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Hindi Named Entity Recognition (from l3cube-pune) +author: John Snow Labs +name: bert_ner_hing_bert_lid +date: 2023-11-06 +tags: [bert, ner, token_classification, hi, open_source, onnx] +task: Named Entity Recognition +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `hing-bert-lid` is a Hindi model originally trained by `l3cube-pune`. + +## Predicted Entities + +`EN`, `HI` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_hing_bert_lid_hi_5.2.0_3.0_1699292024423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_hing_bert_lid_hi_5.2.0_3.0_1699292024423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hing_bert_lid","hi") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["मुझे स्पार्क एनएलपी बहुत पसंद है"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hing_bert_lid","hi") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("मुझे स्पार्क एनएलपी बहुत पसंद है").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_hing_bert_lid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|hi| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/l3cube-pune/hing-bert-lid +- https://github.com/l3cube-pune/code-mixed-nlp +- https://arxiv.org/abs/2204.08398 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner_en.md new file mode 100644 index 000000000000..5b5c43e58763 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from hossay) +author: John Snow Labs +name: bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-base-cased-v1.2-finetuned-ner` is an English model originally trained by `hossay`.
+ +## Predicted Entities + +`Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner_en_5.2.0_3.0_1699292074948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner_en_5.2.0_3.0_1699292074948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.biobert.cased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_hossay_biobert_base_cased_v1.2_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hossay/biobert-base-cased-v1.2-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=ncbi_disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_host_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_host_en.md new file mode 100644 index 000000000000..f4b12d649e7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_host_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Maaly) +author: John Snow Labs +name: bert_ner_host +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `host` is an English model originally trained by `Maaly`. + +## Predicted Entities + +`host` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_host_en_5.2.0_3.0_1699293027968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_host_en_5.2.0_3.0_1699293027968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_host","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_host","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.host.by_maaly").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_host| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Maaly/host +- https://gitlab.com/maaly7/emerald_metagenomics_annotations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..04e893efb7cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from huggingface-course) +author: John Snow Labs +name: bert_ner_huggingface_course_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `huggingface-course`.
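
Token classifiers emit one tag per token, so multi-word names usually need to be stitched back together after inference. The sketch below shows one way to do that on plain Python lists, assuming the tags follow the common `B-`/`I-` prefix convention in front of labels such as `ORG` or `LOC`; that convention, the helper name `group_entities`, and the sample sentence are assumptions for illustration and are not stated in this card.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive B-/I- tagged tokens into (entity_text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    label: Optional[str] = None
    for tok, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        # An I- tag with the same label extends the entity that is currently open.
        if prefix == "I" and current and name == label:
            current.append(tok)
            continue
        # Any other tag closes the open entity, if there is one.
        if current:
            entities.append((" ".join(current), label))
            current, label = [], None
        # B- starts a new entity; a stray I- is treated leniently as a start too.
        if prefix in ("B", "I") and name:
            current, label = [tok], name
    if current:
        entities.append((" ".join(current), label))
    return entities


# Hypothetical values, shaped like collected `token.result` and `ner.result` arrays.
tokens = ["Hugging", "Face", "is", "based", "in", "New", "York"]
tags = ["B-ORG", "I-ORG", "O", "O", "O", "B-LOC", "I-LOC"]
print(group_entities(tokens, tags))
```

Spark NLP also ships a converter annotator for this purpose; the plain-Python version is only meant to make the grouping logic visible.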
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_huggingface_course_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699292365020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_huggingface_course_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699292365020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_huggingface_course_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_huggingface_course_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_huggingface_course").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_huggingface_course_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/huggingface-course/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ba55171af48b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_huggingface_course_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from huggingface-course) +author: John Snow Labs +name: bert_ner_huggingface_course_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `huggingface-course`.
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_huggingface_course_bert_finetuned_ner_en_5.2.0_3.0_1699294557264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_huggingface_course_bert_finetuned_ner_en_5.2.0_3.0_1699294557264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_huggingface_course_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_huggingface_course_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_huggingface_course").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_huggingface_course_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/huggingface-course/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_icelandic_ner_bert_is.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_icelandic_ner_bert_is.md new file mode 100644 index 000000000000..aaf050c5316b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_icelandic_ner_bert_is.md @@ -0,0 +1,117 @@ +--- +layout: model +title: Icelandic BertForTokenClassification Cased model (from m3hrdadfi) +author: John Snow Labs +name: bert_ner_icelandic_ner_bert +date: 2023-11-06 +tags: [bert, ner, open_source, is, onnx] +task: Named Entity Recognition +language: is +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icelandic-ner-bert` is an Icelandic model originally trained by `m3hrdadfi`. 
+ +## Predicted Entities + +`Organization`, `Time`, `Location`, `Miscellaneous`, `Person`, `Money`, `Percent`, `Date` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_icelandic_ner_bert_is_5.2.0_3.0_1699294925637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_icelandic_ner_bert_is_5.2.0_3.0_1699294925637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_icelandic_ner_bert","is") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ég elska neista NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_icelandic_ner_bert","is") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ég elska neista NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("is.ner.bert").predict("""Ég elska neista NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_icelandic_ner_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|is| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/m3hrdadfi/icelandic-ner-bert +- https://github.com/m3hrdadfi/icelandic-ner/issues +- https://en.ru.is/ +- http://hdl.handle.net/20.500.12537/42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_en.md new file mode 100644 index 000000000000..bc29b5ff33e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_idrisi_lmr_hd_tb BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: bert_ner_idrisi_lmr_hd_tb +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_idrisi_lmr_hd_tb` is an English model originally trained by rsuwaileh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tb_en_5.2.0_3.0_1699280801371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tb_en_5.2.0_3.0_1699280801371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_idrisi_lmr_hd_tb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_idrisi_lmr_hd_tb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_idrisi_lmr_hd_tb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-HD-TB \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_partition_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_partition_en.md new file mode 100644 index 000000000000..cab78571399b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tb_partition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_idrisi_lmr_hd_tb_partition BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: bert_ner_idrisi_lmr_hd_tb_partition +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_idrisi_lmr_hd_tb_partition` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tb_partition_en_5.2.0_3.0_1699279482189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tb_partition_en_5.2.0_3.0_1699279482189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_idrisi_lmr_hd_tb_partition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_idrisi_lmr_hd_tb_partition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_idrisi_lmr_hd_tb_partition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-HD-TB-partition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tl_partition_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tl_partition_en.md new file mode 100644 index 000000000000..f784c9518f74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_idrisi_lmr_hd_tl_partition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_idrisi_lmr_hd_tl_partition BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: bert_ner_idrisi_lmr_hd_tl_partition +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_idrisi_lmr_hd_tl_partition` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tl_partition_en_5.2.0_3.0_1699277268999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_idrisi_lmr_hd_tl_partition_en_5.2.0_3.0_1699277268999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_idrisi_lmr_hd_tl_partition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_idrisi_lmr_hd_tl_partition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_idrisi_lmr_hd_tl_partition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-HD-TL-partition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner_en.md new file mode 100644 index 000000000000..df2ac6a69955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner BertForTokenClassification from importsmart +author: John Snow Labs +name: bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner` is an English model originally trained by importsmart. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699292534714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699292534714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_importsmart_bert_tonga_tonga_islands_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.2 MB| + +## References + +https://huggingface.co/importsmart/bert-to-distilbert-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english_en.md new file mode 100644 index 000000000000..f17330bbfcdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Large Cased model (from imvladikon) +author: John Snow Labs +name: bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-cased-finetuned-conll03-english` is an English model originally trained by `imvladikon`. 
+ +## Predicted Entities + +`PER`, `LOC`, `MISC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699293022370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699293022370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased_large_finetuned.by_imvladikon").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_imvladikon_bert_large_cased_finetuned_conll03_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/imvladikon/bert-large-cased-finetuned-conll03-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_indicner_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_indicner_xx.md new file mode 100644 index 000000000000..0fa511197fc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_indicner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_ner_indicner BertForTokenClassification from ai4bharat +author: John Snow Labs +name: bert_ner_indicner +date: 2023-11-06 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_indicner` is a Multilingual model originally trained by ai4bharat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_indicner_xx_5.2.0_3.0_1699279738572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_indicner_xx_5.2.0_3.0_1699279738572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_indicner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_indicner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_indicner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|625.3 MB| + +## References + +https://huggingface.co/ai4bharat/IndicNER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jatinshah_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jatinshah_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..cb98ac1c6980 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jatinshah_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from jatinshah) +author: John Snow Labs +name: bert_ner_jatinshah_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `jatinshah`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_jatinshah_bert_finetuned_ner_en_5.2.0_3.0_1699293312284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_jatinshah_bert_finetuned_ner_en_5.2.0_3.0_1699293312284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jatinshah_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jatinshah_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_jatinshah").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_jatinshah_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/jatinshah/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jdang_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jdang_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ca3db9b0b938 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jdang_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from jdang) +author: John Snow Labs +name: bert_ner_jdang_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `jdang`. 
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_jdang_bert_finetuned_ner_en_5.2.0_3.0_1699293584567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_jdang_bert_finetuned_ner_en_5.2.0_3.0_1699293584567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jdang_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jdang_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_jdang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_jdang_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/jdang/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jrubin01_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jrubin01_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..292f5adcac6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_jrubin01_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from jrubin01) +author: John Snow Labs +name: bert_ner_jrubin01_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `jrubin01`. 
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_jrubin01_bert_finetuned_ner_en_5.2.0_3.0_1699293922897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_jrubin01_bert_finetuned_ner_en_5.2.0_3.0_1699293922897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jrubin01_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_jrubin01_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_jrubin01").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_jrubin01_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/jrubin01/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kalex_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kalex_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..18e7ba137121 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kalex_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from kalex) +author: John Snow Labs +name: bert_ner_kalex_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `kalex`. + +## Predicted Entities + +`Disease` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kalex_bert_finetuned_ner_en_5.2.0_3.0_1699294162666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kalex_bert_finetuned_ner_en_5.2.0_3.0_1699294162666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kalex_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kalex_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_kalex").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kalex_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kalex/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kamalkraj_bert_base_cased_ner_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kamalkraj_bert_base_cased_ner_conll2003_en.md new file mode 100644 index 000000000000..8e07d4af10ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kamalkraj_bert_base_cased_ner_conll2003_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from kamalkraj) +author: John Snow Labs +name: bert_ner_kamalkraj_bert_base_cased_ner_conll2003 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-ner-conll2003` is an English model originally trained by `kamalkraj`. 
+ +## Predicted Entities + +`ORG`, `MISC`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kamalkraj_bert_base_cased_ner_conll2003_en_5.2.0_3.0_1699295382664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kamalkraj_bert_base_cased_ner_conll2003_en_5.2.0_3.0_1699295382664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kamalkraj_bert_base_cased_ner_conll2003","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kamalkraj_bert_base_cased_ner_conll2003","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kamalkraj_bert_base_cased_ner_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kamalkraj/bert-base-cased-ner-conll2003 +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner_en.md new file mode 100644 index 000000000000..42a2996f6a4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner BertForTokenClassification from kaushalkhator +author: John Snow Labs +name: bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner` is an English model originally trained by kaushalkhator. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699295523105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699295523105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kaushalkhator_bert_tonga_tonga_islands_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.3 MB| + +## References + +https://huggingface.co/kaushalkhator/bert-to-distilbert-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kb_bert_base_swedish_cased_ner_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kb_bert_base_swedish_cased_ner_sv.md new file mode 100644 index 000000000000..0edbef913906 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kb_bert_base_swedish_cased_ner_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_ner_kb_bert_base_swedish_cased_ner BertForTokenClassification from KB +author: John Snow Labs +name: bert_ner_kb_bert_base_swedish_cased_ner +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_kb_bert_base_swedish_cased_ner` is a Swedish model originally trained by KB. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kb_bert_base_swedish_cased_ner_sv_5.2.0_3.0_1699278518487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kb_bert_base_swedish_cased_ner_sv_5.2.0_3.0_1699278518487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kb_bert_base_swedish_cased_ner","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_kb_bert_base_swedish_cased_ner", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kb_bert_base_swedish_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.3 MB| + +## References + +https://huggingface.co/KB/bert-base-swedish-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_10000_9_16_more_ingredient_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_10000_9_16_more_ingredient_en.md new file mode 100644 index 000000000000..56b023b5d328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_10000_9_16_more_ingredient_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_10000_9_16_more_ingredient +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-10000-9-16_more_ingredient` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_10000_9_16_more_ingredient_en_5.2.0_3.0_1699293647025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_10000_9_16_more_ingredient_en_5.2.0_3.0_1699293647025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_10000_9_16_more_ingredient","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_10000_9_16_more_ingredient","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.ingredient.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_10000_9_16_more_ingredient| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-10000-9-16_more_ingredient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_en.md new file mode 100644 index 000000000000..a22f3291d44b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_2000_9_16 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-2000-9-16` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_9_16_en_5.2.0_3.0_1699295306624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_9_16_en_5.2.0_3.0_1699295306624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000_9_16","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000_9_16","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.keyword_tag_model_2000_9_16.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_2000_9_16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-2000-9-16 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_more_ingredient_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_more_ingredient_en.md new file mode 100644 index 000000000000..fd51b67c43d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_9_16_more_ingredient_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_2000_9_16_more_ingredient +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-2000-9-16_more_ingredient` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_9_16_more_ingredient_en_5.2.0_3.0_1699294455704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_9_16_more_ingredient_en_5.2.0_3.0_1699294455704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000_9_16_more_ingredient","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000_9_16_more_ingredient","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.ingredient.2000_9_16.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_2000_9_16_more_ingredient| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-2000-9-16_more_ingredient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_en.md new file mode 100644 index 000000000000..095debdb1db2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_2000_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_2000 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-2000` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_en_5.2.0_3.0_1699295816871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_2000_en_5.2.0_3.0_1699295816871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_2000","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.keyword_tag_model_2000.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-2000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_3000_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_3000_v2_en.md new file mode 100644 index 000000000000..3abde52b5e77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_3000_v2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_3000_v2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-3000-v2` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_3000_v2_en_5.2.0_3.0_1699294714859.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_3000_v2_en_5.2.0_3.0_1699294714859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_3000_v2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_3000_v2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.v2.3000_v2.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_3000_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-3000-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_9_16_more_ingredient_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_9_16_more_ingredient_en.md new file mode 100644 index 000000000000..1f18e3bce577 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_9_16_more_ingredient_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_4000_9_16_more_ingredient +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-4000-9-16_more_ingredient` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_4000_9_16_more_ingredient_en_5.2.0_3.0_1699293952171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_4000_9_16_more_ingredient_en_5.2.0_3.0_1699293952171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_4000_9_16_more_ingredient","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_4000_9_16_more_ingredient","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.ingredient.4000_9_16.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_4000_9_16_more_ingredient| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-4000-9-16_more_ingredient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_en.md new file mode 100644 index 000000000000..e25f93983928 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_4000_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_4000 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-4000` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_4000_en_5.2.0_3.0_1699292700679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_4000_en_5.2.0_3.0_1699292700679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_4000","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_4000","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.keyword_tag_model_4000.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_4000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-4000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_9_16_more_ingredient_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_9_16_more_ingredient_en.md new file mode 100644 index 000000000000..859c8ed26271 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_9_16_more_ingredient_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_6000_9_16_more_ingredient +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-6000-9-16_more_ingredient` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_9_16_more_ingredient_en_5.2.0_3.0_1699294967428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_9_16_more_ingredient_en_5.2.0_3.0_1699294967428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000_9_16_more_ingredient","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000_9_16_more_ingredient","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.ingredient.6000_9_16.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_6000_9_16_more_ingredient| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-6000-9-16_more_ingredient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_en.md new file mode 100644 index 000000000000..e4ff00bf96f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_6000 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-6000` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_en_5.2.0_3.0_1699294209815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_en_5.2.0_3.0_1699294209815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.keyword_tag_model_6000.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_6000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-6000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_v2_en.md new file mode 100644 index 000000000000..e42d219f31a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_6000_v2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_6000_v2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-6000-v2` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_v2_en_5.2.0_3.0_1699294516084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_6000_v2_en_5.2.0_3.0_1699294516084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000_v2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_6000_v2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.v2.6000_v2.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_6000_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-6000-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_8000_9_16_more_ingredient_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_8000_9_16_more_ingredient_en.md new file mode 100644 index 000000000000..414d8530e792 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_8000_9_16_more_ingredient_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_8000_9_16_more_ingredient +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-8000-9-16_more_ingredient` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`negingredient`, `occasion`, `mealcourse`, `cuisines`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_8000_9_16_more_ingredient_en_5.2.0_3.0_1699296079463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_8000_9_16_more_ingredient_en_5.2.0_3.0_1699296079463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_8000_9_16_more_ingredient","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_8000_9_16_more_ingredient","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.ingredient.8000_9_16.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_8000_9_16_more_ingredient| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-8000-9-16_more_ingredient \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_9000_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_9000_v2_en.md new file mode 100644 index 000000000000..85e3e2b091bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_9000_v2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model_9000_v2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model-9000-v2` is an English model originally trained by `Media1129`. 
+ +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_9000_v2_en_5.2.0_3.0_1699296381549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_9000_v2_en_5.2.0_3.0_1699296381549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_9000_v2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model_9000_v2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.v2.9000_v2.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model_9000_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model-9000-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_en.md new file mode 100644 index 000000000000..322c6b86b6e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_keyword_tag_model_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Media1129) +author: John Snow Labs +name: bert_ner_keyword_tag_model +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword-tag-model` is an English model originally trained by `Media1129`. + +## Predicted Entities + +`occasion`, `cuisines`, `mealcourse`, `ingredient` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_en_5.2.0_3.0_1699292413018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_keyword_tag_model_en_5.2.0_3.0_1699292413018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_keyword_tag_model","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_media1129").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_keyword_tag_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Media1129/keyword-tag-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_krimo11_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_krimo11_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..2ff034985e41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_krimo11_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from krimo11) +author: John Snow Labs +name: bert_ner_krimo11_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `krimo11`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_krimo11_bert_finetuned_ner_en_5.2.0_3.0_1699292976154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_krimo11_bert_finetuned_ner_en_5.2.0_3.0_1699292976154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_krimo11_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_krimo11_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_krimo11").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_krimo11_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/krimo11/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ksaluja_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ksaluja_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..2241c34c5071 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ksaluja_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ksaluja_bert_finetuned_ner BertForTokenClassification from kSaluja +author: John Snow Labs +name: bert_ner_ksaluja_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ksaluja_bert_finetuned_ner` is an English model originally trained by kSaluja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ksaluja_bert_finetuned_ner_en_5.2.0_3.0_1699293363364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ksaluja_bert_finetuned_ner_en_5.2.0_3.0_1699293363364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ksaluja_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ksaluja_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ksaluja_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/kSaluja/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurama_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurama_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..e885db189cf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurama_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from kurama) +author: John Snow Labs +name: bert_ner_kurama_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `kurama`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kurama_bert_finetuned_ner_en_5.2.0_3.0_1699295552710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kurama_bert_finetuned_ner_en_5.2.0_3.0_1699295552710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kurama_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kurama_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_kurama").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kurama_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kurama/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurianbenoy_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurianbenoy_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..0e802a1c85e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kurianbenoy_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from kurianbenoy) +author: John Snow Labs +name: bert_ner_kurianbenoy_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `kurianbenoy`.
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kurianbenoy_bert_finetuned_ner_en_5.2.0_3.0_1699295792667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kurianbenoy_bert_finetuned_ner_en_5.2.0_3.0_1699295792667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kurianbenoy_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kurianbenoy_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_kurianbenoy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kurianbenoy_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kurianbenoy/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner_en.md new file mode 100644 index 000000000000..8bd2b3547e48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner BertForTokenClassification from kushaljoseph +author: John Snow Labs +name: bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner` is an English model originally trained by kushaljoseph.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699293125116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner_en_5.2.0_3.0_1699293125116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_kushaljoseph_bert_tonga_tonga_islands_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.3 MB| + +## References + +https://huggingface.co/kushaljoseph/bert-to-distilbert-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_labse_ner_nerel_ru.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_labse_ner_nerel_ru.md new file mode 100644 index 000000000000..3a5e136135b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_labse_ner_nerel_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian bert_ner_labse_ner_nerel BertForTokenClassification from surdan +author: John Snow Labs +name: bert_ner_labse_ner_nerel +date: 2023-11-06 +tags: [bert, ru, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_labse_ner_nerel` is a Russian model originally trained by surdan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_labse_ner_nerel_ru_5.2.0_3.0_1699280332092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_labse_ner_nerel_ru_5.2.0_3.0_1699280332092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_labse_ner_nerel","ru") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_labse_ner_nerel", "ru") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_labse_ner_nerel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ru| +|Size:|480.5 MB| + +## References + +https://huggingface.co/surdan/LaBSE_ner_nerel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_laure996_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_laure996_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..5eb9f05513f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_laure996_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_laure996_bert_finetuned_ner BertForTokenClassification from Laure996 +author: John Snow Labs +name: bert_ner_laure996_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_laure996_bert_finetuned_ner` is an English model originally trained by Laure996. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_laure996_bert_finetuned_ner_en_5.2.0_3.0_1699280007843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_laure996_bert_finetuned_ner_en_5.2.0_3.0_1699280007843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_laure996_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_laure996_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_laure996_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Laure996/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_leander_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_leander_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..c4679f535150 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_leander_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from leander) +author: John Snow Labs +name: bert_ner_leander_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `leander`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_leander_bert_finetuned_ner_en_5.2.0_3.0_1699296106567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_leander_bert_finetuned_ner_en_5.2.0_3.0_1699296106567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_leander_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_leander_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_leander").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_leander_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/leander/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_beneficiary_single_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_beneficiary_single_en.md new file mode 100644 index 000000000000..33e7f4503754 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_beneficiary_single_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Anery) +author: John Snow Labs +name: bert_ner_legalbert_beneficiary_single +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalbert_beneficiary_single` is an English model originally trained by `Anery`.
+ +## Predicted Entities + +`AC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_legalbert_beneficiary_single_en_5.2.0_3.0_1699296397141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_legalbert_beneficiary_single_en_5.2.0_3.0_1699296397141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_legalbert_beneficiary_single","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_legalbert_beneficiary_single","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.legal").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_legalbert_beneficiary_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Anery/legalbert_beneficiary_single \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_clause_combined_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_clause_combined_en.md new file mode 100644 index 000000000000..88fa8df5211e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_legalbert_clause_combined_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Anery) +author: John Snow Labs +name: bert_ner_legalbert_clause_combined +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legalbert_clause_combined` is an English model originally trained by `Anery`. + +## Predicted Entities + +`AC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_legalbert_clause_combined_en_5.2.0_3.0_1699293384321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_legalbert_clause_combined_en_5.2.0_3.0_1699293384321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_legalbert_clause_combined","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_legalbert_clause_combined","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.legal.by_anery").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_legalbert_clause_combined| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|130.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Anery/legalbert_clause_combined \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_lewtun_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_lewtun_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..4c904da90c8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_lewtun_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from lewtun) +author: John Snow Labs +name: bert_ner_lewtun_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `lewtun`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_lewtun_bert_finetuned_ner_en_5.2.0_3.0_1699295206231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_lewtun_bert_finetuned_ner_en_5.2.0_3.0_1699295206231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_lewtun_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_lewtun_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_lewtun_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/lewtun/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_literary_german_bert_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_literary_german_bert_de.md new file mode 100644 index 000000000000..045464a979d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_literary_german_bert_de.md @@ -0,0 +1,120 @@ +--- +layout: model +title: German Named Entity Recognition (from severinsimmler) +author: John Snow Labs +name: bert_ner_literary_german_bert +date: 2023-11-06 +tags: [bert, ner, token_classification, de, open_source, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `literary-german-bert` is a German model originally trained by `severinsimmler`. 
+ +## Predicted Entities + +`PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_literary_german_bert_de_5.2.0_3.0_1699296715278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_literary_german_bert_de_5.2.0_3.0_1699296715278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_literary_german_bert","de") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_literary_german_bert","de") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.ner.literary.bert.by_severinsimmler").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_literary_german_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/severinsimmler/literary-german-bert +- https://figshare.com/articles/Corpus_of_German-Language_Fiction_txt_/4524680/1 +- https://gitlab2.informatik.uni-wuerzburg.de/kallimachos/DROC-Release +- https://figshare.com/articles/Corpus_of_German-Language_Fiction_txt_/4524680/1 +- https://opus.bibliothek.uni-wuerzburg.de/opus4-wuerzburg/frontdoor/deliver/index/docId/14333/file/Jannidis_Figurenerkennung_Roman.pdf +- http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2018-27.pdf +- https://opus.bibliothek.uni-wuerzburg.de/opus4-wuerzburg/frontdoor/deliver/index/docId/14333/file/Jannidis_Figurenerkennung_Roman.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ludoviciarraga_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ludoviciarraga_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..0567b87b031a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ludoviciarraga_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ludoviciarraga) +author: John Snow Labs +name: bert_ner_ludoviciarraga_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability 
and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `ludoviciarraga`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ludoviciarraga_bert_finetuned_ner_en_5.2.0_3.0_1699294875593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ludoviciarraga_bert_finetuned_ner_en_5.2.0_3.0_1699294875593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ludoviciarraga_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ludoviciarraga_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_ludoviciarraga").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ludoviciarraga_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ludoviciarraga/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_m_bert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_m_bert_ner_en.md new file mode 100644 index 000000000000..ea6352ea650a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_m_bert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_m_bert_ner BertForTokenClassification from Andrija +author: John Snow Labs +name: bert_ner_m_bert_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_m_bert_ner` is an English model originally trained by Andrija. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_m_bert_ner_en_5.2.0_3.0_1699278976314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_m_bert_ner_en_5.2.0_3.0_1699278976314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_m_bert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_m_bert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_m_bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Andrija/M-bert-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_marathi_ner_mr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_marathi_ner_mr.md new file mode 100644 index 000000000000..ff9d032e9d8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_marathi_ner_mr.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Marathi Named Entity Recognition (from l3cube-pune) +author: John Snow Labs +name: bert_ner_marathi_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, mr, open_source, onnx] +task: Named Entity Recognition +language: mr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `marathi-ner` is a Marathi model originally trained by `l3cube-pune`. + +## Predicted Entities + +`Location`, `Time`, `Organization`, `Designation`, `Person`, `Other`, `Measure`, `Date` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_marathi_ner_mr_5.2.0_3.0_1699293776206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_marathi_ner_mr_5.2.0_3.0_1699293776206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_marathi_ner","mr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["मला स्पार्क एनएलपी आवडते"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_marathi_ner","mr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("मला स्पार्क एनएलपी आवडते").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_marathi_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|mr| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/l3cube-pune/marathi-ner +- https://github.com/l3cube-pune/MarathiNLP +- https://arxiv.org/abs/2204.06029 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..3f67713d91d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_mascariddu8_bert_finetuned_ner_accelerate BertForTokenClassification from Mascariddu8 +author: John Snow Labs +name: bert_ner_mascariddu8_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mascariddu8_bert_finetuned_ner_accelerate` is an English model originally trained by Mascariddu8. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mascariddu8_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699277583393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mascariddu8_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699277583393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mascariddu8_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mascariddu8_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mascariddu8_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Mascariddu8/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ddd7995e8e4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mascariddu8_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_mascariddu8_bert_finetuned_ner BertForTokenClassification from Mascariddu8 +author: John Snow Labs +name: bert_ner_mascariddu8_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mascariddu8_bert_finetuned_ner` is an English model originally trained by Mascariddu8. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mascariddu8_bert_finetuned_ner_en_5.2.0_3.0_1699280520765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mascariddu8_bert_finetuned_ner_en_5.2.0_3.0_1699280520765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mascariddu8_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mascariddu8_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mascariddu8_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Mascariddu8/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mateocolina_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mateocolina_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..459eb6d3b2fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mateocolina_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mateocolina) +author: John Snow Labs +name: bert_ner_mateocolina_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `mateocolina`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mateocolina_bert_finetuned_ner_en_5.2.0_3.0_1699294025892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mateocolina_bert_finetuned_ner_en_5.2.0_3.0_1699294025892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mateocolina_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mateocolina_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_mateocolina").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mateocolina_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mateocolina/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mattchurgin_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mattchurgin_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..2cff29e54b21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mattchurgin_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mattchurgin) +author: John Snow Labs +name: bert_ner_mattchurgin_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `mattchurgin`. 
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mattchurgin_bert_finetuned_ner_en_5.2.0_3.0_1699295792178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mattchurgin_bert_finetuned_ner_en_5.2.0_3.0_1699295792178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mattchurgin_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mattchurgin_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_mattchurgin").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mattchurgin_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mattchurgin/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..8dde76068c98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mbateman) +author: John Snow Labs +name: bert_ner_mbateman_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `mbateman`.
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbateman_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699296100140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbateman_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699296100140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols(["sentence"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbateman_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbateman_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_mbateman").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbateman_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mbateman/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..1ba324a177a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbateman_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mbateman) +author: John Snow Labs +name: bert_ner_mbateman_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `mbateman`.
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbateman_bert_finetuned_ner_en_5.2.0_3.0_1699297078432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbateman_bert_finetuned_ner_en_5.2.0_3.0_1699297078432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \ + .setInputCols(["document"]) \ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols(["sentence"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbateman_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbateman_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_mbateman").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbateman_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mbateman/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_albanian_cased_ner_sq.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_albanian_cased_ner_sq.md new file mode 100644 index 000000000000..599357f30681 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_albanian_cased_ner_sq.md @@ -0,0 +1,115 @@ +--- +layout: model +title: Albanian BertForTokenClassification Base Cased model (from akdeniz27) +author: John Snow Labs +name: bert_ner_mbert_base_albanian_cased_ner +date: 2023-11-06 +tags: [bert, ner, open_source, sq, onnx] +task: Named Entity Recognition +language: sq +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert-base-albanian-cased-ner` is an Albanian model originally trained by `akdeniz27`.
+ +## Predicted Entities + +`PER`, `ORG`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_albanian_cased_ner_sq_5.2.0_3.0_1699296753654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_albanian_cased_ner_sq_5.2.0_3.0_1699296753654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_albanian_cased_ner","sq") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["E dua shkëndijën nlp"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_albanian_cased_ner","sq") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("E dua shkëndijën nlp").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sq.ner.bert.cased_base").predict("""E dua shkëndijën nlp""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_albanian_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sq| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/akdeniz27/mbert-base-albanian-cased-ner +- https://aclanthology.org/P17-1178.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_biomedical_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_biomedical_ner_en.md new file mode 100644 index 000000000000..53258ae77485 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_biomedical_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_mbert_base_biomedical_ner BertForTokenClassification from StivenLancheros +author: John Snow Labs +name: bert_ner_mbert_base_biomedical_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_biomedical_ner` is an English model originally trained by StivenLancheros.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_biomedical_ner_en_5.2.0_3.0_1699295535389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_biomedical_ner_en_5.2.0_3.0_1699295535389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_biomedical_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_biomedical_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_biomedical_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/StivenLancheros/mBERT-base-Biomedical-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_kinyarwanda_kin.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_kinyarwanda_kin.md new file mode 100644 index 000000000000..34159f08ea0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_kinyarwanda_kin.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Kinyarwanda bert_ner_mbert_base_uncased_kinyarwanda BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_kinyarwanda +date: 2023-11-06 +tags: [bert, kin, open_source, token_classification, onnx] +task: Named Entity Recognition +language: kin +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_kinyarwanda` is a Kinyarwanda model originally trained by arnolfokam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_kinyarwanda_kin_5.2.0_3.0_1699295120508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_kinyarwanda_kin_5.2.0_3.0_1699295120508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_kinyarwanda","kin") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_kinyarwanda", "kin") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_kinyarwanda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|kin| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-kin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_kinyarwanda_kin.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_kinyarwanda_kin.md new file mode 100644 index 000000000000..5973c9818e28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_kinyarwanda_kin.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Kinyarwanda bert_ner_mbert_base_uncased_ner_kinyarwanda BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_ner_kinyarwanda +date: 2023-11-06 +tags: [bert, kin, open_source, token_classification, onnx] +task: Named Entity Recognition +language: kin +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_ner_kinyarwanda` is a Kinyarwanda model originally trained by arnolfokam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_kinyarwanda_kin_5.2.0_3.0_1699296325986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_kinyarwanda_kin_5.2.0_3.0_1699296325986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_ner_kinyarwanda","kin") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_ner_kinyarwanda", "kin") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_ner_kinyarwanda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|kin| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-ner-kin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_nigerian_pidgin_pcm.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_nigerian_pidgin_pcm.md new file mode 100644 index 000000000000..c2c69cc99267 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_nigerian_pidgin_pcm.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Nigerian Pidgin bert_ner_mbert_base_uncased_ner_nigerian_pidgin BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_ner_nigerian_pidgin +date: 2023-11-06 +tags: [bert, pcm, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pcm +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_ner_nigerian_pidgin` is a Nigerian Pidgin model originally trained by arnolfokam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_nigerian_pidgin_pcm_5.2.0_3.0_1699297293688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_nigerian_pidgin_pcm_5.2.0_3.0_1699297293688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_ner_nigerian_pidgin","pcm") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_ner_nigerian_pidgin", "pcm") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_ner_nigerian_pidgin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pcm| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-ner-pcm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_swahili_macrolanguage_swa.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_swahili_macrolanguage_swa.md new file mode 100644 index 000000000000..e097864bb900 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_ner_swahili_macrolanguage_swa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swahili (macrolanguage) bert_ner_mbert_base_uncased_ner_swahili_macrolanguage BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_ner_swahili_macrolanguage +date: 2023-11-06 +tags: [bert, swa, open_source, token_classification, onnx] +task: Named Entity Recognition +language: swa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_ner_swahili_macrolanguage` is a Swahili (macrolanguage) model originally trained by arnolfokam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_swahili_macrolanguage_swa_5.2.0_3.0_1699295348491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_ner_swahili_macrolanguage_swa_5.2.0_3.0_1699295348491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_ner_swahili_macrolanguage","swa") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_ner_swahili_macrolanguage", "swa") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_ner_swahili_macrolanguage| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|swa| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-ner-swa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_nigerian_pidgin_pcm.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_nigerian_pidgin_pcm.md new file mode 100644 index 000000000000..f4366b7e6a56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_nigerian_pidgin_pcm.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Nigerian Pidgin bert_ner_mbert_base_uncased_nigerian_pidgin BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_nigerian_pidgin +date: 2023-11-06 +tags: [bert, pcm, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pcm +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_nigerian_pidgin` is a Nigerian Pidgin model originally trained by arnolfokam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_nigerian_pidgin_pcm_5.2.0_3.0_1699297500464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_nigerian_pidgin_pcm_5.2.0_3.0_1699297500464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_nigerian_pidgin","pcm") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_nigerian_pidgin", "pcm") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_nigerian_pidgin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pcm| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-pcm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_swahili_macrolanguage_swa.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_swahili_macrolanguage_swa.md new file mode 100644 index 000000000000..a23ebb6c0de0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mbert_base_uncased_swahili_macrolanguage_swa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swahili (macrolanguage) bert_ner_mbert_base_uncased_swahili_macrolanguage BertForTokenClassification from arnolfokam +author: John Snow Labs +name: bert_ner_mbert_base_uncased_swahili_macrolanguage +date: 2023-11-06 +tags: [bert, swa, open_source, token_classification, onnx] +task: Named Entity Recognition +language: swa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mbert_base_uncased_swahili_macrolanguage` is a Swahili (macrolanguage) model originally trained by arnolfokam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_swahili_macrolanguage_swa_5.2.0_3.0_1699297744554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mbert_base_uncased_swahili_macrolanguage_swa_5.2.0_3.0_1699297744554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mbert_base_uncased_swahili_macrolanguage","swa") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mbert_base_uncased_swahili_macrolanguage", "swa") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mbert_base_uncased_swahili_macrolanguage| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|swa| +|Size:|665.1 MB| + +## References + +https://huggingface.co/arnolfokam/mbert-base-uncased-swa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mcdzwil_bert_base_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mcdzwil_bert_base_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..909f77e03ec5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mcdzwil_bert_base_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_mcdzwil_bert_base_ner_finetuned_ner BertForTokenClassification from mcdzwil +author: John Snow Labs +name: bert_ner_mcdzwil_bert_base_ner_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mcdzwil_bert_base_ner_finetuned_ner` is an English model originally trained by mcdzwil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mcdzwil_bert_base_ner_finetuned_ner_en_5.2.0_3.0_1699296958384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mcdzwil_bert_base_ner_finetuned_ner_en_5.2.0_3.0_1699296958384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mcdzwil_bert_base_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mcdzwil_bert_base_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mcdzwil_bert_base_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mcdzwil/bert-base-NER-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..d91cd42e2d93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mdroth) +author: John Snow Labs +name: bert_ner_mdroth_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `mdroth`.
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mdroth_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699298017829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mdroth_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699298017829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mdroth_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mdroth_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_mdroth").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mdroth_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mdroth/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..5d048e18d6ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mdroth_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mdroth) +author: John Snow Labs +name: bert_ner_mdroth_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `mdroth`.
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mdroth_bert_finetuned_ner_en_5.2.0_3.0_1699296633696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mdroth_bert_finetuned_ner_en_5.2.0_3.0_1699296633696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mdroth_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mdroth_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_mdroth").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mdroth_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mdroth/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_meddocan_beto_ner_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_meddocan_beto_ner_es.md new file mode 100644 index 000000000000..91803145a4fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_meddocan_beto_ner_es.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Spanish BertForTokenClassification Cased model (from rjuez00) +author: John Snow Labs +name: bert_ner_meddocan_beto_ner +date: 2023-11-06 +tags: [bert, ner, open_source, es, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `meddocan-beto-ner` is a Spanish model originally trained by `rjuez00`. 
+ +## Predicted Entities + +`CALLE`, `NUMERO_FAX`, `FECHAS`, `CENTRO_SALUD`, `INSTITUCION`, `PROFESION`, `ID_EMPLEO_PERSONAL_SANITARIO`, `SEXO_SUJETO_ASISTENCIA`, `PAIS`, `FAMILIARES_SUJETO_ASISTENCIA`, `EDAD_SUJETO_ASISTENCIA`, `CORREO_ELECTRONICO`, `NUMERO_TELEFONO`, `HOSPITAL`, `ID_CONTACTO_ASISTENCIAL`, `ID_ASEGURAMIENTO`, `OTROS_SUJETO_ASISTENCIA`, `NOMBRE_SUJETO_ASISTENCIA`, `ID_SUJETO_ASISTENCIA`, `NOMBRE_PERSONAL_SANITARIO`, `ID_TITULACION_PERSONAL_SANITARIO`, `TERRITORIO` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_meddocan_beto_ner_es_5.2.0_3.0_1699294266340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_meddocan_beto_ner_es_5.2.0_3.0_1699294266340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_meddocan_beto_ner","es") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_meddocan_beto_ner","es") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.ner.beto_bert").predict("""Amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_meddocan_beto_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/rjuez00/meddocan-beto-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_media1129_recipe_tag_model_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_media1129_recipe_tag_model_en.md new file mode 100644 index 000000000000..23d16d9541e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_media1129_recipe_tag_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_media1129_recipe_tag_model BertForTokenClassification from Media1129 +author: John Snow Labs +name: bert_ner_media1129_recipe_tag_model +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_media1129_recipe_tag_model` is an English model originally trained by Media1129. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_media1129_recipe_tag_model_en_5.2.0_3.0_1699277758427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_media1129_recipe_tag_model_en_5.2.0_3.0_1699277758427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_media1129_recipe_tag_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_media1129_recipe_tag_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_media1129_recipe_tag_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Media1129/recipe-tag-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_michojan_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_michojan_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..dd53acb5091e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_michojan_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from michojan) +author: John Snow Labs +name: bert_ner_michojan_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `michojan`. + +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_michojan_bert_finetuned_ner_en_5.2.0_3.0_1699296913438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_michojan_bert_finetuned_ner_en_5.2.0_3.0_1699296913438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_michojan_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_michojan_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_michojan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_michojan_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/michojan/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mldev_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mldev_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..2fd333419f9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mldev_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from mldev) +author: John Snow Labs +name: bert_ner_mldev_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `mldev`.
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mldev_bert_finetuned_ner_en_5.2.0_3.0_1699295583692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mldev_bert_finetuned_ner_en_5.2.0_3.0_1699295583692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mldev_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mldev_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_mldev").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mldev_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/mldev/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_col_mod_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_col_mod_en.md new file mode 100644 index 000000000000..8ecd250b656c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_col_mod_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_col_mod BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_col_mod +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_col_mod` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_col_mod_en_5.2.0_3.0_1699277922356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_col_mod_en_5.2.0_3.0_1699277922356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_col_mod","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_col_mod", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_col_mod| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_col-mod \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_corsican_imb_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_corsican_imb_en.md new file mode 100644 index 000000000000..cd91093003a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_corsican_imb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_corsican_imb BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_corsican_imb +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_corsican_imb` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_corsican_imb_en_5.2.0_3.0_1699281413219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_corsican_imb_en_5.2.0_3.0_1699281413219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_corsican_imb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_corsican_imb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_corsican_imb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_co_imb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_1_en.md new file mode 100644 index 000000000000..4994c70d9708 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_imb_1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_imb_1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_imb_1` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_1_en_5.2.0_3.0_1699280703807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_1_en_5.2.0_3.0_1699280703807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_imb_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_imb_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_imb_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_imb_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_2_en.md new file mode 100644 index 000000000000..e52d14a6c78f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_imb_2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_imb_2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_imb_2` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_2_en_5.2.0_3.0_1699280963757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_2_en_5.2.0_3.0_1699280963757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_imb_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_imb_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_imb_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_imb_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_en.md new file mode 100644 index 000000000000..39b08c842689 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_imb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_imb BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_imb +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_imb` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_en_5.2.0_3.0_1699281585872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_imb_en_5.2.0_3.0_1699281585872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_imb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_imb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_imb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_imb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_1_en.md new file mode 100644 index 000000000000..3ce471fa82af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_org_1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_org_1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_org_1` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_1_en_5.2.0_3.0_1699281779167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_1_en_5.2.0_3.0_1699281779167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_org_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_org_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_org_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_org_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_2_en.md new file mode 100644 index 000000000000..35d9ff33b92f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_org_2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_org_2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_org_2` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_2_en_5.2.0_3.0_1699279187176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_2_en_5.2.0_3.0_1699279187176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_org_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_org_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_org_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_org_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_en.md new file mode 100644 index 000000000000..deb385f0ec50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_model_org_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_model_org BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_model_org +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_model_org` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_en_5.2.0_3.0_1699281179145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_model_org_en_5.2.0_3.0_1699281179145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_model_org","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_model_org", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_model_org| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Model_org \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_modified_bluebert_biored_chem_512_5_30_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_modified_bluebert_biored_chem_512_5_30_en.md new file mode 100644 index 000000000000..1983adab5995 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_modified_bluebert_biored_chem_512_5_30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_modified_bluebert_biored_chem_512_5_30 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_modified_bluebert_biored_chem_512_5_30 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_modified_bluebert_biored_chem_512_5_30` is an English model originally trained by ghadeermobasher. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_modified_bluebert_biored_chem_512_5_30_en_5.2.0_3.0_1699279382212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_modified_bluebert_biored_chem_512_5_30_en_5.2.0_3.0_1699279382212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_modified_bluebert_biored_chem_512_5_30","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_modified_bluebert_biored_chem_512_5_30", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_modified_bluebert_biored_chem_512_5_30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Modified-BlueBERT-BioRED-Chem-512-5-30 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mohitsingh_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mohitsingh_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..81501818b603 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_mohitsingh_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_mohitsingh_wikineural_multilingual_ner BertForTokenClassification from MohitSingh +author: John Snow Labs +name: bert_ner_mohitsingh_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_mohitsingh_wikineural_multilingual_ner` is an English model originally trained by MohitSingh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_mohitsingh_wikineural_multilingual_ner_en_5.2.0_3.0_1699280294734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_mohitsingh_wikineural_multilingual_ner_en_5.2.0_3.0_1699280294734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_mohitsingh_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_mohitsingh_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_mohitsingh_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/MohitSingh/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nbailab_base_ner_scandi_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nbailab_base_ner_scandi_xx.md new file mode 100644 index 000000000000..0419c13ba61c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nbailab_base_ner_scandi_xx.md @@ -0,0 +1,118 @@ +--- +layout: model +title: Multilingual BertForTokenClassification Base Cased model (from saattrupdan) +author: John Snow Labs +name: bert_ner_nbailab_base_ner_scandi +date: 2023-11-06 +tags: [bert, ner, open_source, da, nb, nn, "no", sv, is, fo, xx, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nbailab-base-ner-scandi` is a Multilingual model originally trained by `saattrupdan`. 
+ +## Predicted Entities + +`LOC`, `ORG`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nbailab_base_ner_scandi_xx_5.2.0_3.0_1699297224666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nbailab_base_ner_scandi_xx_5.2.0_3.0_1699297224666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nbailab_base_ner_scandi","xx") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nbailab_base_ner_scandi","xx") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.ner.bert.wikiann.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nbailab_base_ner_scandi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|666.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/saattrupdan/nbailab-base-ner-scandi +- https://aclanthology.org/P17-1178/ +- https://arxiv.org/abs/1911.12146 +- https://aclanthology.org/2020.lrec-1.565/ +- https://spraakbanken.gu.se/en/resources/suc3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ncduy_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ncduy_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..834d36167dd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ncduy_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ncduy) +author: John Snow Labs +name: bert_ner_ncduy_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `ncduy`. 
+ +## Predicted Entities + +`MISC`, `ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ncduy_bert_finetuned_ner_en_5.2.0_3.0_1699298376329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ncduy_bert_finetuned_ner_en_5.2.0_3.0_1699298376329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ncduy_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ncduy_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_ncduy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ncduy_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ncduy/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model2_en.md new file mode 100644 index 000000000000..f5ad024c4597 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nepal_bhasa_test_model2 BertForTokenClassification from kSaluja +author: John Snow Labs +name: bert_ner_nepal_bhasa_test_model2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nepal_bhasa_test_model2` is an English model originally trained by kSaluja. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nepal_bhasa_test_model2_en_5.2.0_3.0_1699296436373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nepal_bhasa_test_model2_en_5.2.0_3.0_1699296436373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nepal_bhasa_test_model2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nepal_bhasa_test_model2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nepal_bhasa_test_model2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kSaluja/new-test-model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model_en.md new file mode 100644 index 000000000000..87429cfee33e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nepal_bhasa_test_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nepal_bhasa_test_model BertForTokenClassification from kSaluja +author: John Snow Labs +name: bert_ner_nepal_bhasa_test_model +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nepal_bhasa_test_model` is an English model originally trained by kSaluja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nepal_bhasa_test_model_en_5.2.0_3.0_1699298365359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nepal_bhasa_test_model_en_5.2.0_3.0_1699298365359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nepal_bhasa_test_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nepal_bhasa_test_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nepal_bhasa_test_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kSaluja/new-test-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_2006_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_2006_en.md new file mode 100644 index 000000000000..9d138906be41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_2006_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from yihahn) +author: John Snow Labs +name: bert_ner_ner_2006 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_2006` is an English model originally trained by `yihahn`. + +## Predicted Entities + +`PHONE`, `ID`, `PATIENT`, `DATE`, `AGE`, `LOCATION`, `HOSPITAL`, `DOCTOR` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_2006_en_5.2.0_3.0_1699297522148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_2006_en_5.2.0_3.0_1699297522148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_2006","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_2006","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_yihahn").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_2006| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yihahn/ner_2006 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_bert_base_cased_portuguese_lenerbr_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_bert_base_cased_portuguese_lenerbr_pt.md new file mode 100644 index 000000000000..3960ba53685b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_bert_base_cased_portuguese_lenerbr_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_ner_ner_bert_base_cased_portuguese_lenerbr BertForTokenClassification from mateusqc +author: John Snow Labs +name: bert_ner_ner_bert_base_cased_portuguese_lenerbr +date: 2023-11-06 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_bert_base_cased_portuguese_lenerbr` is a Portuguese model originally trained by mateusqc.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_bert_base_cased_portuguese_lenerbr_pt_5.2.0_3.0_1699297714905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_bert_base_cased_portuguese_lenerbr_pt_5.2.0_3.0_1699297714905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_bert_base_cased_portuguese_lenerbr","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_bert_base_cased_portuguese_lenerbr", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_bert_base_cased_portuguese_lenerbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| + +## References + +https://huggingface.co/mateusqc/ner-bert-base-cased-pt-lenerbr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_camelbert_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_camelbert_ar.md new file mode 100644 index 000000000000..7df59b73db6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_camelbert_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_ner_ner_camelbert BertForTokenClassification from Holako +author: John Snow Labs +name: bert_ner_ner_camelbert +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_camelbert` is an Arabic model originally trained by Holako. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_camelbert_ar_5.2.0_3.0_1699278115850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_camelbert_ar_5.2.0_3.0_1699278115850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_camelbert","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_camelbert", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_camelbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.4 MB| + +## References + +https://huggingface.co/Holako/NER_CAMELBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_en.md new file mode 100644 index 000000000000..77e2a564c880 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ramybaly) +author: John Snow Labs +name: bert_ner_ner_conll2003 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_conll2003` is an English model originally trained by `ramybaly`. + +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_en_5.2.0_3.0_1699280727724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_en_5.2.0_3.0_1699280727724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_conll2003","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_conll2003","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ramybaly/ner_conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v2_en.md new file mode 100644 index 000000000000..f3601e7ccd85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ner_conll2003_v2 BertForTokenClassification from Xiaoman +author: John Snow Labs +name: bert_ner_ner_conll2003_v2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_conll2003_v2` is an English model originally trained by Xiaoman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v2_en_5.2.0_3.0_1699281093967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v2_en_5.2.0_3.0_1699281093967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_conll2003_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_conll2003_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_conll2003_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Xiaoman/NER-CoNLL2003-V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v3_en.md new file mode 100644 index 000000000000..90ff74e904c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ner_conll2003_v3 BertForTokenClassification from Xiaoman +author: John Snow Labs +name: bert_ner_ner_conll2003_v3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_conll2003_v3` is an English model originally trained by Xiaoman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v3_en_5.2.0_3.0_1699279717066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v3_en_5.2.0_3.0_1699279717066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_conll2003_v3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_conll2003_v3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_conll2003_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Xiaoman/NER-CoNLL2003-V3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v4_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v4_en.md new file mode 100644 index 000000000000..bed3d57d240e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_conll2003_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ner_conll2003_v4 BertForTokenClassification from Xiaoman +author: John Snow Labs +name: bert_ner_ner_conll2003_v4 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_conll2003_v4` is an English model originally trained by Xiaoman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v4_en_5.2.0_3.0_1699280076980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_conll2003_v4_en_5.2.0_3.0_1699280076980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_conll2003_v4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_conll2003_v4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_conll2003_v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Xiaoman/NER-CoNLL2003-V4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_english_vietnamese_italian_spanish_tinparadox_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_english_vietnamese_italian_spanish_tinparadox_xx.md new file mode 100644 index 000000000000..a642640fb507 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_english_vietnamese_italian_spanish_tinparadox_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_ner_ner_english_vietnamese_italian_spanish_tinparadox BertForTokenClassification from tinparadox +author: John Snow Labs +name: bert_ner_ner_english_vietnamese_italian_spanish_tinparadox +date: 2023-11-06 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_english_vietnamese_italian_spanish_tinparadox` is a Multilingual model originally trained by tinparadox.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_english_vietnamese_italian_spanish_tinparadox_xx_5.2.0_3.0_1699281445339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_english_vietnamese_italian_spanish_tinparadox_xx_5.2.0_3.0_1699281445339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_english_vietnamese_italian_spanish_tinparadox","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_english_vietnamese_italian_spanish_tinparadox", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_english_vietnamese_italian_spanish_tinparadox| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/tinparadox/NER-en-vi-it-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_for_female_names_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_for_female_names_en.md new file mode 100644 index 000000000000..6b3a8a29d39f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_for_female_names_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ner_for_female_names BertForTokenClassification from Xiaoman +author: John Snow Labs +name: bert_ner_ner_for_female_names +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_for_female_names` is an English model originally trained by Xiaoman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_for_female_names_en_5.2.0_3.0_1699281788116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_for_female_names_en_5.2.0_3.0_1699281788116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_for_female_names","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_for_female_names", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_for_female_names| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Xiaoman/NER-for-female-names \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_hungarian_model_2021_hu.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_hungarian_model_2021_hu.md new file mode 100644 index 000000000000..9411d8164572 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_hungarian_model_2021_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian bert_ner_ner_hungarian_model_2021 BertForTokenClassification from fdominik98 +author: John Snow Labs +name: bert_ner_ner_hungarian_model_2021 +date: 2023-11-06 +tags: [bert, hu, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_hungarian_model_2021` is a Hungarian model originally trained by fdominik98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_hungarian_model_2021_hu_5.2.0_3.0_1699298022376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_hungarian_model_2021_hu_5.2.0_3.0_1699298022376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_hungarian_model_2021","hu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_hungarian_model_2021", "hu") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_hungarian_model_2021| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hu| +|Size:|412.5 MB| + +## References + +https://huggingface.co/fdominik98/ner-hu-model-2021 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_en.md new file mode 100644 index 000000000000..b6a7996fc37f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ramybaly) +author: John Snow Labs +name: bert_ner_ner_nerd +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_nerd` is an English model originally trained by `ramybaly`. + +## Predicted Entities + +`ORG`, `EVENT`, `BUILDING`, `MISC`, `PER`, `PRODUCT`, `LOC`, `ART` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_nerd_en_5.2.0_3.0_1699298675966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_nerd_en_5.2.0_3.0_1699298675966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_nerd","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_nerd","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.nerd.by_ramybaly").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_nerd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ramybaly/ner_nerd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_fine_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_fine_en.md new file mode 100644 index 000000000000..01c6b58e5377 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_nerd_fine_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ramybaly) +author: John Snow Labs +name: bert_ner_ner_nerd_fine +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_nerd_fine` is an English model originally trained by `ramybaly`. 
+ +## Predicted Entities + +`MISC_educationaldegree`, `ORG_other`, `BUILDING_restaurant`, `MISC_law`, `LOC_mountain`, `ART_other`, `MISC_medical`, `LOC_other`, `PER_athlete`, `PRODUCT_food`, `MISC_god`, `BUILDING_theater`, `LOC_GPE`, `ORG_media/newspaper`, `PRODUCT_other`, `ORG_government/governmentagency`, `PRODUCT_airplane`, `PRODUCT_software`, `BUILDING_other`, `ART_film`, `LOC_park`, `LOC_road/railway/highway/transit`, `PER_soldier`, `PRODUCT_weapon`, `EVENT_other`, `ORG_sportsleague`, `PRODUCT_train`, `PER_other`, `PER_politician`, `EVENT_election`, `ORG_company`, `PER_director`, `BUILDING_sportsfacility`, `ART_painting`, `BUILDING_airport`, `ART_music`, `LOC_island`, `ORG_politicalparty`, `MISC_award`, `PRODUCT_ship`, `BUILDING_hospital`, `ORG_sportsteam`, `MISC_livingthing`, `MISC_astronomything`, `BUILDING_hotel`, `MISC_language`, `EVENT_attack/battle/war/militaryconflict`, `LOC_bodiesofwater`, `EVENT_sportsevent`, `ORG_religion`, `PRODUCT_car`, `BUILDING_library`, `ORG_education`, `MISC_disease`, `MISC_currency`, `PER_scholar`, `EVENT_disaster`, `PRODUCT_game`, `PER_artist/author`, `ART_writtenart`, `EVENT_protest`, `MISC_chemicalthing`, `PER_actor`, `MISC_biologything`, `ART_broadcastprogram`, `ORG_showorganization` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_nerd_fine_en_5.2.0_3.0_1699295857916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_nerd_fine_en_5.2.0_3.0_1699295857916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_nerd_fine","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_nerd_fine","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.nerd_fine.by_ramybaly").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_nerd_fine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ramybaly/ner_nerd_fine \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_news_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_news_portuguese_pt.md new file mode 100644 index 000000000000..92b94c1bc025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_news_portuguese_pt.md @@ -0,0 +1,115 @@ +--- +layout: model +title: Portuguese Named Entity Recognition (from monilouise) +author: John Snow Labs +name: bert_ner_ner_news_portuguese +date: 2023-11-06 +tags: [bert, ner, token_classification, pt, open_source, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `ner_news_portuguese` is a Portuguese model originally trained by `monilouise`. + +## Predicted Entities + +`PUB`, `PESSOA`, `LOC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_news_portuguese_pt_5.2.0_3.0_1699296116605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_news_portuguese_pt_5.2.0_3.0_1699296116605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_news_portuguese","pt") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Eu amo Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_news_portuguese","pt") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Eu amo Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("pt.ner.bert.news.").predict("""Eu amo Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_news_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/monilouise/ner_news_portuguese +- https://github.com/neuralmind-ai/portuguese-bert/blob/master/README.md \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_rubert_per_loc_org_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_rubert_per_loc_org_en.md new file mode 100644 index 000000000000..667cbc823103 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_rubert_per_loc_org_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ner_rubert_per_loc_org BertForTokenClassification from tesemnikov-av +author: John Snow Labs +name: bert_ner_ner_rubert_per_loc_org +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ner_rubert_per_loc_org` is an English model originally trained by tesemnikov-av. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_rubert_per_loc_org_en_5.2.0_3.0_1699278244013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_rubert_per_loc_org_en_5.2.0_3.0_1699278244013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_rubert_per_loc_org","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ner_rubert_per_loc_org", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_rubert_per_loc_org| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|43.8 MB| + +## References + +https://huggingface.co/tesemnikov-av/NER-RUBERT-Per-Loc-Org \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_test_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_test_en.md new file mode 100644 index 000000000000..eae3c1d517e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ner_test_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fgravelaine) +author: John Snow Labs +name: bert_ner_ner_test +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner-test` is an English model originally trained by `fgravelaine`. + +## Predicted Entities + +`MADIN`, `TAG`, `COLOR`, `LOC`, `CAT`, `COUNTRY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ner_test_en_5.2.0_3.0_1699297211148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ner_test_en_5.2.0_3.0_1699297211148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_test","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ner_test","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_fgravelaine").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ner_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/fgravelaine/ner-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..ce3c5e87a46f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_neulvo_bert_finetuned_ner_accelerate BertForTokenClassification from Neulvo +author: John Snow Labs +name: bert_ner_neulvo_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_neulvo_bert_finetuned_ner_accelerate` is an English model originally trained by Neulvo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_neulvo_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699278719184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_neulvo_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699278719184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_neulvo_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_neulvo_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_neulvo_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Neulvo/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..482f28695096 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_neulvo_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_neulvo_bert_finetuned_ner BertForTokenClassification from Neulvo +author: John Snow Labs +name: bert_ner_neulvo_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_neulvo_bert_finetuned_ner` is an English model originally trained by Neulvo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_neulvo_bert_finetuned_ner_en_5.2.0_3.0_1699282015146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_neulvo_bert_finetuned_ner_en_5.2.0_3.0_1699282015146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_neulvo_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_neulvo_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_neulvo_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Neulvo/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nielsr_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nielsr_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..35d5333b4e58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nielsr_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from nielsr) +author: John Snow Labs +name: bert_ner_nielsr_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `nielsr`. + +## Predicted Entities + +`geo`, `org`, `per`, `tim`, `gpe` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nielsr_bert_finetuned_ner_en_5.2.0_3.0_1699296703770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nielsr_bert_finetuned_ner_en_5.2.0_3.0_1699296703770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nielsr_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nielsr_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_nielsr").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nielsr_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/nielsr/bert-finetuned-ner +- https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BERT/Custom_Named_Entity_Recognition_with_BERT.ipynb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned_en.md new file mode 100644 index 000000000000..e793378e8af1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned` is an English model originally trained by ajtamayoh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned_en_5.2.0_3.0_1699280588888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned_en_5.2.0_3.0_1699280588888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nlp_cic_wfu_clinical_cases_ner_mbert_cased_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/ajtamayoh/NLP-CIC-WFU_Clinical_Cases_NER_mBERT_cased_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned_en.md new file mode 100644 index 000000000000..4c7ba6399da5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned` is an English model originally trained by ajtamayoh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned_en_5.2.0_3.0_1699278521403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned_en_5.2.0_3.0_1699278521403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nlp_cic_wfu_clinical_cases_ner_paragraph_tokenized_mbert_cased_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/ajtamayoh/NLP-CIC-WFU_Clinical_Cases_NER_Paragraph_Tokenized_mBERT_cased_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned_en.md new file mode 100644 index 000000000000..52c1b335b6b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned` is an English model originally trained by ajtamayoh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned_en_5.2.0_3.0_1699280341642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned_en_5.2.0_3.0_1699280341642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nlp_cic_wfu_clinical_cases_ner_sents_tokenized_mbert_cased_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/ajtamayoh/NLP-CIC-WFU_Clinical_Cases_NER_Sents_tokenized_mBERT_cased_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nominalization_candidate_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nominalization_candidate_classifier_en.md new file mode 100644 index 000000000000..62f1fcca94af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nominalization_candidate_classifier_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English Named Entity Recognition (from kleinay) +author: John Snow Labs +name: bert_ner_nominalization_candidate_classifier +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `nominalization-candidate-classifier` is an English model originally trained by `kleinay`.
+ +## Predicted Entities + +`False`, `True` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nominalization_candidate_classifier_en_5.2.0_3.0_1699298930263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nominalization_candidate_classifier_en_5.2.0_3.0_1699298930263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nominalization_candidate_classifier","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("pos") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nominalization_candidate_classifier","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("pos") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_kleinay").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nominalization_candidate_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kleinay/nominalization-candidate-classifier +- https://www.aclweb.org/anthology/2020.coling-main.274/ +- https://github.com/kleinay/QANom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nonzerophilip_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nonzerophilip_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..0101e6eb6b27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_nonzerophilip_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_nonzerophilip_bert_finetuned_ner BertForTokenClassification from Nonzerophilip +author: John Snow Labs +name: bert_ner_nonzerophilip_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_nonzerophilip_bert_finetuned_ner` is an English model originally trained by Nonzerophilip.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_nonzerophilip_bert_finetuned_ner_en_5.2.0_3.0_1699280794527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_nonzerophilip_bert_finetuned_ner_en_5.2.0_3.0_1699280794527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_nonzerophilip_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_nonzerophilip_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_nonzerophilip_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Nonzerophilip/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_offlangdetectionturkish_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_offlangdetectionturkish_tr.md new file mode 100644 index 000000000000..20b1d3aea631 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_offlangdetectionturkish_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish bert_ner_offlangdetectionturkish BertForTokenClassification from savasy +author: John Snow Labs +name: bert_ner_offlangdetectionturkish +date: 2023-11-06 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_offlangdetectionturkish` is a Turkish model originally trained by savasy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_offlangdetectionturkish_tr_5.2.0_3.0_1699298650328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_offlangdetectionturkish_tr_5.2.0_3.0_1699298650328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_offlangdetectionturkish","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_offlangdetectionturkish", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_offlangdetectionturkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.5 MB| + +## References + +https://huggingface.co/savasy/offLangDetectionTurkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc2gm_en.md new file mode 100644 index 000000000000..538dd5c94f69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_biobert_bc2gm BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_biobert_bc2gm +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_biobert_bc2gm` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc2gm_en_5.2.0_3.0_1699281281190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc2gm_en_5.2.0_3.0_1699281281190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_biobert_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_biobert_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_biobert_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BioBERT-BC2GM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_chemical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_chemical_en.md new file mode 100644 index 000000000000..2908067a9cb6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_chemical_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_biobert_bc5cdr_chemical BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_biobert_bc5cdr_chemical +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_biobert_bc5cdr_chemical` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc5cdr_chemical_en_5.2.0_3.0_1699278929146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc5cdr_chemical_en_5.2.0_3.0_1699278929146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_biobert_bc5cdr_chemical","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_biobert_bc5cdr_chemical", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_biobert_bc5cdr_chemical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BioBERT-BC5CDR-Chemical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_disease_en.md new file mode 100644 index 000000000000..14865a436ea6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_bc5cdr_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_biobert_bc5cdr_disease BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_biobert_bc5cdr_disease +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_biobert_bc5cdr_disease` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc5cdr_disease_en_5.2.0_3.0_1699281001755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_bc5cdr_disease_en_5.2.0_3.0_1699281001755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_biobert_bc5cdr_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_biobert_bc5cdr_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_biobert_bc5cdr_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BioBERT-BC5CDR-Disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_linnaeus_en.md new file mode 100644 index 000000000000..9e014b99e8fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_linnaeus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_biobert_linnaeus BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_biobert_linnaeus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_biobert_linnaeus` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_linnaeus_en_5.2.0_3.0_1699279134262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_linnaeus_en_5.2.0_3.0_1699279134262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_biobert_linnaeus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_biobert_linnaeus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_biobert_linnaeus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BioBERT-Linnaeus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_ncbi_en.md new file mode 100644 index 000000000000..17d4c5c6f05d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_biobert_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_biobert_ncbi BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_biobert_ncbi +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_biobert_ncbi` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_ncbi_en_5.2.0_3.0_1699279337375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_biobert_ncbi_en_5.2.0_3.0_1699279337375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_biobert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_biobert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_biobert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BioBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc2gm_en.md new file mode 100644 index 000000000000..d800436adcb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_bluebert_bc2gm BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_bluebert_bc2gm +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_bc2gm` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc2gm_en_5.2.0_3.0_1699279570424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc2gm_en_5.2.0_3.0_1699279570424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_bluebert_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_bluebert_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-BlueBERT-BC2GM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc4chemd_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc4chemd_en.md new file mode 100644 index 000000000000..ff4380227e59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc4chemd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_bluebert_bc4chemd BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_bluebert_bc4chemd +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_bc4chemd` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc4chemd_en_5.2.0_3.0_1699281985956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc4chemd_en_5.2.0_3.0_1699281985956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_bc4chemd","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_bc4chemd", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_bc4chemd|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-BC4CHEMD
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_chemical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_chemical_en.md
new file mode 100644
index 000000000000..3cfa47e065d0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_chemical_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_bluebert_bc5cdr_chemical BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_bluebert_bc5cdr_chemical
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_bc5cdr_chemical` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc5cdr_chemical_en_5.2.0_3.0_1699282192810.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc5cdr_chemical_en_5.2.0_3.0_1699282192810.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_bc5cdr_chemical","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_bc5cdr_chemical", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_bc5cdr_chemical|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-BC5CDR-Chemical
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_disease_en.md
new file mode 100644
index 000000000000..dc181747eb1b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_bc5cdr_disease_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_bluebert_bc5cdr_disease BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_bluebert_bc5cdr_disease
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_bc5cdr_disease` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc5cdr_disease_en_5.2.0_3.0_1699279772968.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_bc5cdr_disease_en_5.2.0_3.0_1699279772968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_bc5cdr_disease","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_bc5cdr_disease", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_bc5cdr_disease|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-BC5CDR-Disease
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_512_5_30_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_512_5_30_en.md
new file mode 100644
index 000000000000..87fb6585e878
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_512_5_30_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_bluebert_biored_chem_512_5_30 BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_bluebert_biored_chem_512_5_30
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_biored_chem_512_5_30` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_biored_chem_512_5_30_en_5.2.0_3.0_1699282167449.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_biored_chem_512_5_30_en_5.2.0_3.0_1699282167449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_biored_chem_512_5_30","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_biored_chem_512_5_30", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_biored_chem_512_5_30|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-BioRED-Chem-512-5-30
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_en.md
new file mode 100644
index 000000000000..80ea1eb688a9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_biored_chem_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_bluebert_biored_chem BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_bluebert_biored_chem
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_biored_chem` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_biored_chem_en_5.2.0_3.0_1699282355533.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_biored_chem_en_5.2.0_3.0_1699282355533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_biored_chem","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_biored_chem", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_biored_chem|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-BioRED-Chem
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_linnaeus_en.md
new file mode 100644
index 000000000000..83ac9b7d7d69
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_bluebert_linnaeus_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_bluebert_linnaeus BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_bluebert_linnaeus
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_bluebert_linnaeus` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_linnaeus_en_5.2.0_3.0_1699281191949.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_bluebert_linnaeus_en_5.2.0_3.0_1699281191949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_bluebert_linnaeus","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_bluebert_linnaeus", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_bluebert_linnaeus|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-BlueBERT-Linnaeus
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc2gm_en.md
new file mode 100644
index 000000000000..0463159da0a3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc2gm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_bc2gm BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_bc2gm
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_bc2gm` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc2gm_en_5.2.0_3.0_1699282350972.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc2gm_en_5.2.0_3.0_1699282350972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_bc2gm","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_pubmedbert_bc2gm", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_pubmedbert_bc2gm|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.0 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-PubMedBERT-BC2GM
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc4chemd_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc4chemd_en.md
new file mode 100644
index 000000000000..8938e8196649
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc4chemd_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_bc4chemd BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_bc4chemd
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_bc4chemd` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc4chemd_en_5.2.0_3.0_1699281398165.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc4chemd_en_5.2.0_3.0_1699281398165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_bc4chemd","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_pubmedbert_bc4chemd", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_pubmedbert_bc4chemd|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-PubMedBERT-BC4CHEMD
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_chemical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_chemical_en.md
new file mode 100644
index 000000000000..8a4ff00188e0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_chemical_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_bc5cdr_chemical BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_bc5cdr_chemical
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_bc5cdr_chemical` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc5cdr_chemical_en_5.2.0_3.0_1699282558437.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc5cdr_chemical_en_5.2.0_3.0_1699282558437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_bc5cdr_chemical","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_pubmedbert_bc5cdr_chemical", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_pubmedbert_bc5cdr_chemical|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-PubMedBERT-BC5CDR-Chemical
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_disease_en.md
new file mode 100644
index 000000000000..62d49b9fd3ea
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_bc5cdr_disease_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_bc5cdr_disease BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_bc5cdr_disease
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_bc5cdr_disease` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc5cdr_disease_en_5.2.0_3.0_1699279985989.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_bc5cdr_disease_en_5.2.0_3.0_1699279985989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_bc5cdr_disease","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_pubmedbert_bc5cdr_disease", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_pubmedbert_bc5cdr_disease|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-PubMedBERT-BC5CDR-disease
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_linnaeus_en.md
new file mode 100644
index 000000000000..ccb81438b81c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_linnaeus_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_linnaeus BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_linnaeus
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_linnaeus` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_linnaeus_en_5.2.0_3.0_1699281621940.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_linnaeus_en_5.2.0_3.0_1699281621940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes data is a Spark DataFrame with a "text" column
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_linnaeus","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes data is a Spark DataFrame with a "text" column
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_ner_original_pubmedbert_linnaeus", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_ner_original_pubmedbert_linnaeus|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/ghadeermobasher/Original-PubMedBERT-Linnaeus
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_ncbi_en.md
new file mode 100644
index 000000000000..e70dc50389dd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_pubmedbert_ncbi_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_ner_original_pubmedbert_ncbi BertForTokenClassification from ghadeermobasher
+author: John Snow Labs
+name: bert_ner_original_pubmedbert_ncbi
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_pubmedbert_ncbi` is an English model originally trained by ghadeermobasher.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_ncbi_en_5.2.0_3.0_1699281818269.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_pubmedbert_ncbi_en_5.2.0_3.0_1699281818269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
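+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_pubmedbert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_pubmedbert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +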
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_pubmedbert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-PubMedBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc2gm_en.md new file mode 100644 index 000000000000..dabe172e9676 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc2gm BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc2gm +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc2gm` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc2gm_en_5.2.0_3.0_1699280165266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc2gm_en_5.2.0_3.0_1699280165266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
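+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +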
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC2GM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_en.md new file mode 100644 index 000000000000..d5587b267c96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc4chemd BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc4chemd +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc4chemd` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc4chemd_en_5.2.0_3.0_1699282728006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc4chemd_en_5.2.0_3.0_1699282728006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
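+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc4chemd","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc4chemd", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +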
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc4chemd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC4CHEMD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_o_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_o_en.md new file mode 100644 index 000000000000..8e9603157c1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc4chemd_o_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc4chemd_o BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc4chemd_o +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc4chemd_o` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc4chemd_o_en_5.2.0_3.0_1699281473885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc4chemd_o_en_5.2.0_3.0_1699281473885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
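+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc4chemd_o","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc4chemd_o", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +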
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc4chemd_o| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC4CHEMD-O \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_en.md new file mode 100644 index 000000000000..d7e2e59f3d09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc5cdr_chemical BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc5cdr_chemical +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc5cdr_chemical` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_en_5.2.0_3.0_1699282021654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_en_5.2.0_3.0_1699282021654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
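+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc5cdr_chemical","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc5cdr_chemical", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +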
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc5cdr_chemical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC5CDR-Chemical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t1_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t1_en.md new file mode 100644 index 000000000000..ed30aff8929b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc5cdr_chemical_t1 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc5cdr_chemical_t1 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc5cdr_chemical_t1` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t1_en_5.2.0_3.0_1699280367481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t1_en_5.2.0_3.0_1699280367481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
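+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc5cdr_chemical_t1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc5cdr_chemical_t1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +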
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc5cdr_chemical_t1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC5CDR-Chemical-T1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t2_en.md new file mode 100644 index 000000000000..4ce8e8204015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc5cdr_chemical_t2 BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc5cdr_chemical_t2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc5cdr_chemical_t2` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t2_en_5.2.0_3.0_1699282572917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t2_en_5.2.0_3.0_1699282572917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
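+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc5cdr_chemical_t2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc5cdr_chemical_t2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +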
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc5cdr_chemical_t2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC5CDR-Chemical-T2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t_en.md new file mode 100644 index 000000000000..091f26dd85bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_chemical_t_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc5cdr_chemical_t BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc5cdr_chemical_t +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc5cdr_chemical_t` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t_en_5.2.0_3.0_1699281651469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_chemical_t_en_5.2.0_3.0_1699281651469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
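+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc5cdr_chemical_t","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc5cdr_chemical_t", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +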
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc5cdr_chemical_t| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC5CDR-Chemical-T \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_disease_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_disease_en.md new file mode 100644 index 000000000000..e21849551a28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_bc5cdr_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_bc5cdr_disease BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_bc5cdr_disease +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_bc5cdr_disease` is an English model originally trained by ghadeermobasher.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_disease_en_5.2.0_3.0_1699282951376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_bc5cdr_disease_en_5.2.0_3.0_1699282951376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
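+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_bc5cdr_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_bc5cdr_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +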
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_bc5cdr_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-BC5CDR-Disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_linnaeus_en.md new file mode 100644 index 000000000000..fc375800807b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_original_scibert_linnaeus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_original_scibert_linnaeus BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_original_scibert_linnaeus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_original_scibert_linnaeus` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_linnaeus_en_5.2.0_3.0_1699282219957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_original_scibert_linnaeus_en_5.2.0_3.0_1699282219957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
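+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_original_scibert_linnaeus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_original_scibert_linnaeus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +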
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_original_scibert_linnaeus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Original-SciBERT-Linnaeus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignal_scibert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignal_scibert_ncbi_en.md new file mode 100644 index 000000000000..f227aa53b18e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignal_scibert_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_orignal_scibert_ncbi BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_orignal_scibert_ncbi +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_orignal_scibert_ncbi` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_orignal_scibert_ncbi_en_5.2.0_3.0_1699282752411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_orignal_scibert_ncbi_en_5.2.0_3.0_1699282752411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
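+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_orignal_scibert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_orignal_scibert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +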
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_orignal_scibert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/Orignal-SciBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignial_bluebert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignial_bluebert_ncbi_en.md new file mode 100644 index 000000000000..c7bed364f7f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_orignial_bluebert_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_orignial_bluebert_ncbi BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_orignial_bluebert_ncbi +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_orignial_bluebert_ncbi` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_orignial_bluebert_ncbi_en_5.2.0_3.0_1699280549914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_orignial_bluebert_ncbi_en_5.2.0_3.0_1699280549914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_orignial_bluebert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_orignial_bluebert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_orignial_bluebert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/Orignial-BlueBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..0e27b2b7d8ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from peterhsu) +author: John Snow Labs +name: bert_ner_peterhsu_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `peterhsu`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_peterhsu_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699299193052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_peterhsu_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699299193052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_peterhsu_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_peterhsu_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.accelerate.by_peterhsu").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_peterhsu_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/peterhsu/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..5d598872e681 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_peterhsu_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from peterhsu) +author: John Snow Labs +name: bert_ner_peterhsu_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `peterhsu`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_peterhsu_bert_finetuned_ner_en_5.2.0_3.0_1699298921856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_peterhsu_bert_finetuned_ner_en_5.2.0_3.0_1699298921856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_peterhsu_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_peterhsu_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_peterhsu").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_peterhsu_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/peterhsu/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_phijve_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_phijve_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..1c6a9ccd496e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_phijve_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from phijve) +author: John Snow Labs +name: bert_ner_phijve_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `phijve`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_phijve_bert_finetuned_ner_en_5.2.0_3.0_1699299193636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_phijve_bert_finetuned_ner_en_5.2.0_3.0_1699299193636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_phijve_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_phijve_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_phijve").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_phijve_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/phijve/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_prot_bert_bfd_ss3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_prot_bert_bfd_ss3_en.md new file mode 100644 index 000000000000..2d49ba04c598 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_prot_bert_bfd_ss3_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Rostlab) +author: John Snow Labs +name: bert_ner_prot_bert_bfd_ss3 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prot_bert_bfd_ss3` is an English model originally trained by `Rostlab`. 
+ +## Predicted Entities + +`H`, `C`, `E` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_prot_bert_bfd_ss3_en_5.2.0_3.0_1699297853422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_prot_bert_bfd_ss3_en_5.2.0_3.0_1699297853422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_prot_bert_bfd_ss3","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_prot_bert_bfd_ss3","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_rostlab").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_prot_bert_bfd_ss3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.6 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Rostlab/prot_bert_bfd_ss3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ravindra001_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ravindra001_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..5af6f5fe5576 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ravindra001_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ravindra001_bert_finetuned_ner BertForTokenClassification from Ravindra001 +author: John Snow Labs +name: bert_ner_ravindra001_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_ravindra001_bert_finetuned_ner` is an English model originally trained by Ravindra001. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ravindra001_bert_finetuned_ner_en_5.2.0_3.0_1699282404808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ravindra001_bert_finetuned_ner_en_5.2.0_3.0_1699282404808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ravindra001_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ravindra001_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ravindra001_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Ravindra001/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_raymelius_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_raymelius_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..afd815d8c8d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_raymelius_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_raymelius_bert_finetuned_ner BertForTokenClassification from RayMelius +author: John Snow Labs +name: bert_ner_raymelius_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_raymelius_bert_finetuned_ner` is an English model originally trained by RayMelius. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_raymelius_bert_finetuned_ner_en_5.2.0_3.0_1699282963248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_raymelius_bert_finetuned_ner_en_5.2.0_3.0_1699282963248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_raymelius_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_raymelius_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_raymelius_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/RayMelius/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rdchambers_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rdchambers_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..49a02b3489f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rdchambers_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from rdchambers) +author: John Snow Labs +name: bert_ner_rdchambers_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `rdchambers`. + +## Predicted Entities + +`Filler`, `Null` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_rdchambers_bert_finetuned_ner_en_5.2.0_3.0_1699299487426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_rdchambers_bert_finetuned_ner_en_5.2.0_3.0_1699299487426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rdchambers_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rdchambers_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.rdchambers.by_rdchambers").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_rdchambers_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/rdchambers/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_roberta_base_finetuned_cluener2020_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_roberta_base_finetuned_cluener2020_chinese_zh.md new file mode 100644 index 000000000000..ab156bbaa3ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_roberta_base_finetuned_cluener2020_chinese_zh.md @@ -0,0 +1,118 @@ +--- +layout: model +title: Chinese Named Entity Recognition (from uer) +author: John Snow Labs +name: bert_ner_roberta_base_finetuned_cluener2020_chinese +date: 2023-11-06 +tags: [bert, ner, token_classification, zh, open_source, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `roberta-base-finetuned-cluener2020-chinese` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + +`position`, `company`, `address`, `movie`, `organization`, `game`, `name`, `book`, `government`, `scene` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_roberta_base_finetuned_cluener2020_chinese_zh_5.2.0_3.0_1699294710756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_roberta_base_finetuned_cluener2020_chinese_zh_5.2.0_3.0_1699294710756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_roberta_base_finetuned_cluener2020_chinese","zh") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_roberta_base_finetuned_cluener2020_chinese","zh") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.ner.bert.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_roberta_base_finetuned_cluener2020_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|380.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/uer/roberta-base-finetuned-cluener2020-chinese +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/CLUEbenchmark/CLUENER2020 +- https://github.com/dbiir/UER-py/ +- https://cloud.tencent.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_romainlhardy_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_romainlhardy_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..69a604de05d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_romainlhardy_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from romainlhardy) +author: John Snow Labs +name: bert_ner_romainlhardy_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `romainlhardy`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `MISC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_romainlhardy_bert_finetuned_ner_en_5.2.0_3.0_1699296980720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_romainlhardy_bert_finetuned_ner_en_5.2.0_3.0_1699296980720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_romainlhardy_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_romainlhardy_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_romainlhardy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_romainlhardy_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/romainlhardy/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_base_srl_seqlabeling_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_base_srl_seqlabeling_en.md new file mode 100644 index 000000000000..b88711426b87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_base_srl_seqlabeling_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Base Cased model (from Rexhaif) +author: John Snow Labs +name: bert_ner_rubert_base_srl_seqlabeling +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-base-srl-seqlabeling` is an English model originally trained by `Rexhaif`. 
+ +## Predicted Entities + +`INSTRUMENT`, `OTHER`, `CAUSATOR`, `PREDICATE`, `EXPIRIENCER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_base_srl_seqlabeling_en_5.2.0_3.0_1699298254905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_base_srl_seqlabeling_en_5.2.0_3.0_1699298254905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_base_srl_seqlabeling","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_base_srl_seqlabeling","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.base.by_rexhaif").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_rubert_base_srl_seqlabeling| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|667.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Rexhaif/rubert-base-srl-seqlabeling \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_ner_toxicity_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_ner_toxicity_en.md new file mode 100644 index 000000000000..baabb2de281c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_ner_toxicity_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from tesemnikov-av) +author: John Snow Labs +name: bert_ner_rubert_ner_toxicity +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-ner-toxicity` is an English model originally trained by `tesemnikov-av`. + +## Predicted Entities + +`TOXIC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_ner_toxicity_en_5.2.0_3.0_1699299371188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_ner_toxicity_en_5.2.0_3.0_1699299371188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_ner_toxicity","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_ner_toxicity","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.toxic.by_tesemnikov_av").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_rubert_ner_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|43.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tesemnikov-av/rubert-ner-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_tiny2_sentence_compression_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_tiny2_sentence_compression_en.md new file mode 100644 index 000000000000..3d93de08a922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_rubert_tiny2_sentence_compression_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Tiny Cased model (from cointegrated) +author: John Snow Labs +name: bert_ner_rubert_tiny2_sentence_compression +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert-tiny2-sentence-compression` is an English model originally trained by `cointegrated`. 
+ +## Predicted Entities + +`drop`, `keep` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_tiny2_sentence_compression_en_5.2.0_3.0_1699297171069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_rubert_tiny2_sentence_compression_en_5.2.0_3.0_1699297171069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_tiny2_sentence_compression","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_rubert_tiny2_sentence_compression","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_rubert_tiny2_sentence_compression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|109.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/cointegrated/rubert-tiny2-sentence-compression +- https://www.dialog-21.ru/media/5106/kuvshinovat-050.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..4d026ae89a5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from russellc) +author: John Snow Labs +name: bert_ner_russellc_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `russellc`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_russellc_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699299752124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_russellc_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699299752124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_russellc_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_russellc_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_russellc").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_russellc_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/russellc/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..77d72bb6f183 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_russellc_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from russellc) +author: John Snow Labs +name: bert_ner_russellc_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `russellc`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_russellc_bert_finetuned_ner_en_5.2.0_3.0_1699297478470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_russellc_bert_finetuned_ner_en_5.2.0_3.0_1699297478470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_russellc_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_russellc_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_russellc").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_russellc_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/russellc/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sagerpascal_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sagerpascal_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..75aeee20a707 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sagerpascal_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from sagerpascal) +author: John Snow Labs +name: bert_ner_sagerpascal_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `sagerpascal`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_sagerpascal_bert_finetuned_ner_en_5.2.0_3.0_1699298557423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_sagerpascal_bert_finetuned_ner_en_5.2.0_3.0_1699298557423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_sagerpascal_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_sagerpascal_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_sagerpascal").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_sagerpascal_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sagerpascal/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_salvatore_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_salvatore_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..f5f1b848beba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_salvatore_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_salvatore_bert_finetuned_ner BertForTokenClassification from Salvatore +author: John Snow Labs +name: bert_ner_salvatore_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_salvatore_bert_finetuned_ner` is an English model originally trained by Salvatore. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_salvatore_bert_finetuned_ner_en_5.2.0_3.0_1699282207590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_salvatore_bert_finetuned_ner_en_5.2.0_3.0_1699282207590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_salvatore_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_salvatore_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_salvatore_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Salvatore/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_ner_jnlpba_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_ner_jnlpba_en.md new file mode 100644 index 000000000000..7113d6476457 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_ner_jnlpba_en.md @@ -0,0 +1,119 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from fran-martinez) +author: John Snow Labs +name: bert_ner_scibert_scivocab_cased_ner_jnlpba +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_scivocab_cased_ner_jnlpba` is an English model originally trained by `fran-martinez`. 
+ +## Predicted Entities + +`RNA`, `cell_type`, `protein`, `cell_line`, `DNA` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_cased_ner_jnlpba_en_5.2.0_3.0_1699297794145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_cased_ner_jnlpba_en_5.2.0_3.0_1699297794145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_cased_ner_jnlpba","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_cased_ner_jnlpba","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.scibert.scibert.cased.by_fran_martinez").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_cased_ner_jnlpba| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/fran-martinez/scibert_scivocab_cased_ner_jnlpba +- https://github.com/fran-martinez/bio_ner_bert +- http://www.geniaproject.org/shared-tasks/bionlp-jnlpba-shared-task-2004 +- https://arxiv.org/pdf/1903.10676.pdf +- https://www.semanticscholar.org/ +- https://allenai.org/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_sdu21_ai_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_sdu21_ai_en.md new file mode 100644 index 000000000000..03c896205037 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_cased_sdu21_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_scibert_scivocab_cased_sdu21_ai BertForTokenClassification from napsternxg +author: John Snow Labs +name: bert_ner_scibert_scivocab_cased_sdu21_ai +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_scibert_scivocab_cased_sdu21_ai` is an English model originally trained by napsternxg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_cased_sdu21_ai_en_5.2.0_3.0_1699294901728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_cased_sdu21_ai_en_5.2.0_3.0_1699294901728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_cased_sdu21_ai","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_scibert_scivocab_cased_sdu21_ai", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_cased_sdu21_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/napsternxg/scibert_scivocab_cased_SDU21_AI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_sdu21_ai_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_sdu21_ai_en.md new file mode 100644 index 000000000000..8e753a79277c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_sdu21_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_scibert_scivocab_uncased_ft_sdu21_ai BertForTokenClassification from napsternxg +author: John Snow Labs +name: bert_ner_scibert_scivocab_uncased_ft_sdu21_ai +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_scibert_scivocab_uncased_ft_sdu21_ai` is an English model originally trained by napsternxg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_ft_sdu21_ai_en_5.2.0_3.0_1699299937740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_ft_sdu21_ai_en_5.2.0_3.0_1699299937740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
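+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_uncased_ft_sdu21_ai","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_scibert_scivocab_uncased_ft_sdu21_ai", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +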
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_uncased_ft_sdu21_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/napsternxg/scibert_scivocab_uncased_ft_SDU21_AI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai_en.md new file mode 100644 index 000000000000..71efa0119070 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai BertForTokenClassification from napsternxg +author: John Snow Labs +name: bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai` is an English model originally trained by napsternxg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai_en_5.2.0_3.0_1699300157518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai_en_5.2.0_3.0_1699300157518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
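+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +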
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_uncased_ft_tv_sdu21_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/napsternxg/scibert_scivocab_uncased_ft_tv_SDU21_AI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_sdu21_ai_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_sdu21_ai_en.md new file mode 100644 index 000000000000..ed9ac856c9f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_sdu21_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_scibert_scivocab_uncased_sdu21_ai BertForTokenClassification from napsternxg +author: John Snow Labs +name: bert_ner_scibert_scivocab_uncased_sdu21_ai +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_scibert_scivocab_uncased_sdu21_ai` is an English model originally trained by napsternxg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_sdu21_ai_en_5.2.0_3.0_1699298884772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_sdu21_ai_en_5.2.0_3.0_1699298884772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
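+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_uncased_sdu21_ai","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_scibert_scivocab_uncased_sdu21_ai", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +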
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_uncased_sdu21_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/napsternxg/scibert_scivocab_uncased_SDU21_AI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_tv_sdu21_ai_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_tv_sdu21_ai_en.md new file mode 100644 index 000000000000..786ac6be07f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_scibert_scivocab_uncased_tv_sdu21_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_scibert_scivocab_uncased_tv_sdu21_ai BertForTokenClassification from napsternxg +author: John Snow Labs +name: bert_ner_scibert_scivocab_uncased_tv_sdu21_ai +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_scibert_scivocab_uncased_tv_sdu21_ai` is an English model originally trained by napsternxg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_tv_sdu21_ai_en_5.2.0_3.0_1699299559527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_scibert_scivocab_uncased_tv_sdu21_ai_en_5.2.0_3.0_1699299559527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
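+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_scibert_scivocab_uncased_tv_sdu21_ai","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_scibert_scivocab_uncased_tv_sdu21_ai", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +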
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_scibert_scivocab_uncased_tv_sdu21_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/napsternxg/scibert_scivocab_uncased_tv_SDU21_AI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sebastians_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sebastians_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..4e74f2da7f78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sebastians_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_sebastians_bert_finetuned_ner BertForTokenClassification from SebastianS +author: John Snow Labs +name: bert_ner_sebastians_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_sebastians_bert_finetuned_ner` is an English model originally trained by SebastianS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_sebastians_bert_finetuned_ner_en_5.2.0_3.0_1699283151341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_sebastians_bert_finetuned_ner_en_5.2.0_3.0_1699283151341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
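+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_sebastians_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_sebastians_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +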
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_sebastians_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/SebastianS/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shiva12_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shiva12_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..bd22639bcb34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shiva12_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_shiva12_wikineural_multilingual_ner BertForTokenClassification from Shiva12 +author: John Snow Labs +name: bert_ner_shiva12_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_shiva12_wikineural_multilingual_ner` is an English model originally trained by Shiva12.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_shiva12_wikineural_multilingual_ner_en_5.2.0_3.0_1699283368263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_shiva12_wikineural_multilingual_ner_en_5.2.0_3.0_1699283368263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
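+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_shiva12_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_shiva12_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +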
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_shiva12_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Shiva12/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shivanand_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shivanand_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..83c8fa9bf6bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shivanand_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_shivanand_wikineural_multilingual_ner BertForTokenClassification from Shivanand +author: John Snow Labs +name: bert_ner_shivanand_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_shivanand_wikineural_multilingual_ner` is an English model originally trained by Shivanand.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_shivanand_wikineural_multilingual_ner_en_5.2.0_3.0_1699282400699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_shivanand_wikineural_multilingual_ner_en_5.2.0_3.0_1699282400699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
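+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_shivanand_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_shivanand_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +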
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_shivanand_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Shivanand/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shwetabh_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shwetabh_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..74d7603f96d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_shwetabh_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_shwetabh_wikineural_multilingual_ner BertForTokenClassification from Shwetabh +author: John Snow Labs +name: bert_ner_shwetabh_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_shwetabh_wikineural_multilingual_ner` is an English model originally trained by Shwetabh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_shwetabh_wikineural_multilingual_ner_en_5.2.0_3.0_1699280755326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_shwetabh_wikineural_multilingual_ner_en_5.2.0_3.0_1699280755326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
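+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_shwetabh_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_shwetabh_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +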
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_shwetabh_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Shwetabh/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_siegelou_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_siegelou_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..8b60d2ff1fc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_siegelou_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from siegelou) +author: John Snow Labs +name: bert_ner_siegelou_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `siegelou`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_siegelou_bert_finetuned_ner_en_5.2.0_3.0_1699299179962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_siegelou_bert_finetuned_ner_en_5.2.0_3.0_1699299179962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_siegelou_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_siegelou_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_siegelou").predict("""PUT YOUR STRING HERE""") +``` +
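The pipeline above produces one tag per token in the `ner` column. A minimal sketch of how such token-level tags could be merged into entity spans, assuming the tags follow the IOB2 convention (`B-PER`, `I-PER`, `O`) that is common for CoNLL-2003 style labels. The helper name `group_entities` and the sample tokens are hypothetical and not part of Spark NLP, which ships its own `NerConverter` annotator for this purpose.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB2-tagged tokens into (entity_text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the currently open entity, if any, and reset the state.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            entities.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            flush()
    flush()
    return entities


# Hypothetical sample input, not real model output.
sample_tokens = ["John", "Snow", "works", "in", "London"]
sample_tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(group_entities(sample_tokens, sample_tags))
```

In a Spark NLP run, the two lists would typically come from the `token.result` and `ner.result` arrays of a single row of the transformed DataFrame.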
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_siegelou_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/siegelou/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_silpa_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_silpa_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..b261af9e09aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_silpa_wikineural_multilingual_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from silpa) +author: John Snow Labs +name: bert_ner_silpa_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural-multilingual-ner` is an English model originally trained by `silpa`.
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_silpa_wikineural_multilingual_ner_en_5.2.0_3.0_1699299492292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_silpa_wikineural_multilingual_ner_en_5.2.0_3.0_1699299492292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_silpa_wikineural_multilingual_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_silpa_wikineural_multilingual_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.wikineural.multilingual.by_silpa").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_silpa_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/silpa/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_simple_transformer_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_simple_transformer_en.md new file mode 100644 index 000000000000..e62d01344c4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_simple_transformer_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from kunalr63) +author: John Snow Labs +name: bert_ner_simple_transformer +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `simple_transformer` is an English model originally trained by `kunalr63`.
+ +## Predicted Entities + +`L-CLG`, `U-LOC`, `L-SKILLS`, `U-DESIG`, `U-SKILLS`, `L-ADDRESS`, `WORK_EXP`, `U-COMPANY`, `U-PER`, `L-EMAIL`, `DESIG`, `L-PER`, `L-LOC`, `LOC`, `COMPANY`, `L-QUALI`, `L-TRAIN`, `L-COMPANY`, `SCH`, `SKILLS`, `L-DESIG`, `L-WORK_EXP`, `L-SCH`, `U-SCH`, `CLG`, `L-HOBBI`, `L-EXPERIENCE`, `TRAIN`, `CERTIFICATION`, `QUALI`, `PHONE`, `U-CLG`, `U-EXPERIENCE`, `EMAIL`, `U-PHONE`, `PER`, `U-QUALI`, `L-CERTIFICATION`, `L-PHONE`, `HOBBI`, `U-EMAIL`, `ADDRESS`, `EXPERIENCE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_simple_transformer_en_5.2.0_3.0_1699300440938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_simple_transformer_en_5.2.0_3.0_1699300440938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_simple_transformer","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_simple_transformer","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_kunalr63").predict("""PUT YOUR STRING HERE""") +``` +
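The entity list for `bert_ner_simple_transformer` contains `L-` and `U-` prefixed labels, which suggests a BILOU-style tagging scheme (Begin, Inside, Last, Unit, Outside). That reading is an assumption, not something the model card states. Under that assumption, the sketch below groups per-token tags into entity spans; `bilou_to_spans` is a hypothetical helper, not part of Spark NLP, and the sample tokens and tags are made up for illustration.

```python
from typing import List, Optional, Tuple


def bilou_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group (token, tag) pairs into (entity_text, label) spans.

    Assumes BILOU-style tags such as B-PER, I-PER, L-PER, U-LOC and O.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the span collected so far, if any, and reset the buffer.
        nonlocal current_tokens, current_label
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag == "O" or "-" not in tag:
            flush()
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":
            # Single-token entity.
            flush()
            spans.append((token, label))
        elif prefix == "B":
            flush()
            current_tokens, current_label = [token], label
        elif prefix in ("I", "L"):
            # Tolerate a missing B- tag by starting a new span here.
            if current_label != label:
                flush()
                current_label = label
            current_tokens.append(token)
            if prefix == "L":
                flush()
        else:
            flush()
    flush()
    return spans


example_tokens = ["John", "Smith", "works", "at", "Google", "in", "Berlin"]
example_tags = ["B-PER", "L-PER", "O", "O", "U-COMPANY", "O", "U-LOC"]
example_spans = bilou_to_spans(example_tokens, example_tags)
```

In a pipeline like the one above, the token and tag lists would come from the `token.result` and `ner.result` columns of the transformed DataFrame, collected row by row.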
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_simple_transformer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/kunalr63/simple_transformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_small2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_small2_en.md new file mode 100644 index 000000000000..91dd5ca5382d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_small2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Small Cased model (from Narsil) +author: John Snow Labs +name: bert_ner_small2 +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `small2` is a English model originally trained by `Narsil`. + +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_small2_en_5.2.0_3.0_1699299785612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_small2_en_5.2.0_3.0_1699299785612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_small2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_small2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.small.by_narsil").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_small2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|527.6 KB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Narsil/small2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2_en.md new file mode 100644 index 000000000000..9cf848012d07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English Named Entity Recognition (from abhibisht89) +author: John Snow Labs +name: bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2 +date: 2023-11-06 +tags: [bert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `spanbert-large-cased-finetuned-ade_corpus_v2` is an English model originally trained by `abhibisht89`. 
+ +## Predicted Entities + +`DRUG`, `ADR` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2_en_5.2.0_3.0_1699300075479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2_en_5.2.0_3.0_1699300075479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.span_bert.cased_v2_large_finetuned_adverse_drug_event").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_spanbert_large_cased_finetuned_ade_corpus_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/abhibisht89/spanbert-large-cased-finetuned-ade_corpus_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..a8648009c8c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from spasis) +author: John Snow Labs +name: bert_ner_spasis_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `spasis`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_spasis_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699300386668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_spasis_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699300386668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spasis_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spasis_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_spasis").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_spasis_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/spasis/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..41679ca52beb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_spasis_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from spasis) +author: John Snow Labs +name: bert_ner_spasis_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `spasis`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_spasis_bert_finetuned_ner_en_5.2.0_3.0_1699300428300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_spasis_bert_finetuned_ner_en_5.2.0_3.0_1699300428300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spasis_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_spasis_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_spasis").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_spasis_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/spasis/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ssmnspantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ssmnspantagger_en.md new file mode 100644 index 000000000000..eae4ad13cc44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ssmnspantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_ssmnspantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_ner_ssmnspantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_ner_ssmnspantagger` is a English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ssmnspantagger_en_5.2.0_3.0_1699282007827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ssmnspantagger_en_5.2.0_3.0_1699282007827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ssmnspantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_ssmnspantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ssmnspantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/SSMNspanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_stefan_jo_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_stefan_jo_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..8fe638767251 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_stefan_jo_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from stefan-jo) +author: John Snow Labs +name: bert_ner_stefan_jo_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is a English model originally trained by `stefan-jo`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_stefan_jo_bert_finetuned_ner_en_5.2.0_3.0_1699300785655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_stefan_jo_bert_finetuned_ner_en_5.2.0_3.0_1699300785655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_stefan_jo_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_stefan_jo_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_stefan_jo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_stefan_jo_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/stefan-jo/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_suonbo_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_suonbo_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..dd6308a13ead --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_suonbo_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from suonbo) +author: John Snow Labs +name: bert_ner_suonbo_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is a English model originally trained by `suonbo`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_suonbo_bert_finetuned_ner_en_5.2.0_3.0_1699298273027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_suonbo_bert_finetuned_ner_en_5.2.0_3.0_1699298273027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_suonbo_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_suonbo_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_suonbo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_suonbo_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/suonbo/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_ner_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_ner_sv.md new file mode 100644 index 000000000000..1c6c6b23e4b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_ner_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_ner_swedish_ner BertForTokenClassification from RecordedFuture +author: John Snow Labs +name: bert_ner_swedish_ner +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_ner_swedish_ner` is a Swedish model originally trained by RecordedFuture. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_ner_sv_5.2.0_3.0_1699283270683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_ner_sv_5.2.0_3.0_1699283270683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_swedish_ner","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_swedish_ner", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_swedish_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.3 MB| + +## References + +https://huggingface.co/RecordedFuture/Swedish-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_fear_targets_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_fear_targets_sv.md new file mode 100644 index 000000000000..936bc5c9e03a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_fear_targets_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_ner_swedish_sentiment_fear_targets BertForTokenClassification from RecordedFuture +author: John Snow Labs +name: bert_ner_swedish_sentiment_fear_targets +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_ner_swedish_sentiment_fear_targets` is a Swedish model originally trained by RecordedFuture. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_sentiment_fear_targets_sv_5.2.0_3.0_1699283476483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_sentiment_fear_targets_sv_5.2.0_3.0_1699283476483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_swedish_sentiment_fear_targets","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_swedish_sentiment_fear_targets", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_swedish_sentiment_fear_targets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| + +## References + +https://huggingface.co/RecordedFuture/Swedish-Sentiment-Fear-Targets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_violence_targets_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_violence_targets_sv.md new file mode 100644 index 000000000000..b0b234313210 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_swedish_sentiment_violence_targets_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_ner_swedish_sentiment_violence_targets BertForTokenClassification from RecordedFuture +author: John Snow Labs +name: bert_ner_swedish_sentiment_violence_targets +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_swedish_sentiment_violence_targets` is a Swedish model originally trained by RecordedFuture. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_sentiment_violence_targets_sv_5.2.0_3.0_1699283674869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_swedish_sentiment_violence_targets_sv_5.2.0_3.0_1699283674869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_swedish_sentiment_violence_targets","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_swedish_sentiment_violence_targets", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_swedish_sentiment_violence_targets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| + +## References + +https://huggingface.co/RecordedFuture/Swedish-Sentiment-Violence-Targets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sysformbatches2acs_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sysformbatches2acs_en.md new file mode 100644 index 000000000000..eca13680420f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_sysformbatches2acs_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from blckwdw61) +author: John Snow Labs +name: bert_ner_sysformbatches2acs +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sysformbatches2acs` is an English model originally trained by `blckwdw61`. + +## Predicted Entities + +`SYSTEMATIC`, `FORMULA` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_sysformbatches2acs_en_5.2.0_3.0_1699300725442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_sysformbatches2acs_en_5.2.0_3.0_1699300725442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_sysformbatches2acs","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_sysformbatches2acs","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_blckwdw61").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_sysformbatches2acs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/blckwdw61/sysformbatches2acs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_t_202_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_t_202_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..f8b410153a76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_t_202_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_t_202_bert_finetuned_ner BertForTokenClassification from T-202 +author: John Snow Labs +name: bert_ner_t_202_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_t_202_bert_finetuned_ner` is an English model originally trained by T-202. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_t_202_bert_finetuned_ner_en_5.2.0_3.0_1699283538281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_t_202_bert_finetuned_ner_en_5.2.0_3.0_1699283538281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_t_202_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_t_202_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_t_202_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/T-202/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_temporal_tagger_bert_tokenclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_temporal_tagger_bert_tokenclassifier_en.md new file mode 100644 index 000000000000..60afd0b7c6cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_temporal_tagger_bert_tokenclassifier_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_temporal_tagger_bert_tokenclassifier BertForTokenClassification from satyaalmasian +author: John Snow Labs +name: bert_ner_temporal_tagger_bert_tokenclassifier +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_temporal_tagger_bert_tokenclassifier` is an English model originally trained by satyaalmasian. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_temporal_tagger_bert_tokenclassifier_en_5.2.0_3.0_1699300992865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_temporal_tagger_bert_tokenclassifier_en_5.2.0_3.0_1699300992865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_temporal_tagger_bert_tokenclassifier","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_temporal_tagger_bert_tokenclassifier", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_temporal_tagger_bert_tokenclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/satyaalmasian/temporal_tagger_BERT_tokenclassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_testingmodel_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_testingmodel_en.md new file mode 100644 index 000000000000..40b81b76744f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_testingmodel_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from superman) +author: John Snow Labs +name: bert_ner_testingmodel +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testingmodel` is an English model originally trained by `superman`. + +## Predicted Entities + +`EPI`, `LOC`, `STAT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_testingmodel_en_5.2.0_3.0_1699295175206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_testingmodel_en_5.2.0_3.0_1699295175206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_testingmodel","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_testingmodel","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_superman").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_testingmodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/superman/testingmodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tg_relation_model_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tg_relation_model_en.md new file mode 100644 index 000000000000..90d725d40bbd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tg_relation_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_tg_relation_model BertForTokenClassification from alichte +author: John Snow Labs +name: bert_ner_tg_relation_model +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_tg_relation_model` is an English model originally trained by alichte. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tg_relation_model_en_5.2.0_3.0_1699283849338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tg_relation_model_en_5.2.0_3.0_1699283849338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tg_relation_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_tg_relation_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tg_relation_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/alichte/TG-Relation-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_bert_for_token_classification_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_bert_for_token_classification_en.md new file mode 100644 index 000000000000..4abec8a913df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_bert_for_token_classification_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Tiny Cased model (from hf-internal-testing) +author: John Snow Labs +name: bert_ner_tiny_bert_for_token_classification +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-for-token-classification` is an English model originally trained by `hf-internal-testing`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_bert_for_token_classification_en_5.2.0_3.0_1699301009832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_bert_for_token_classification_en_5.2.0_3.0_1699301009832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_bert_for_token_classification","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_bert_for_token_classification","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tiny.by_hf_internal_testing").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tiny_bert_for_token_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|527.6 KB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hf-internal-testing/tiny-bert-for-token-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english_en.md new file mode 100644 index 000000000000..86ce7b7d4d2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Tiny Cased model (from sshleifer) +author: John Snow Labs +name: bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-dbmdz-bert-large-cased-finetuned-conll03-english` is an English model originally trained by `sshleifer`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699301132855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english_en_5.2.0_3.0_1699301132855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.cased_large_tiny_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tiny_dbmdz_bert_large_cased_finetuned_conll03_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|528.1 KB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sshleifer/tiny-dbmdz-bert-large-cased-finetuned-conll03-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_distilbert_base_cased_en.md new file mode 100644 index 000000000000..15838595a1bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tiny_distilbert_base_cased_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Tiny Cased model (from sshleifer) +author: John Snow Labs +name: bert_ner_tiny_distilbert_base_cased +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-distilbert-base-cased` is an English model originally trained by `sshleifer`. 
+ +## Predicted Entities + +`ORG`, `PER`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_distilbert_base_cased_en_5.2.0_3.0_1699300582474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tiny_distilbert_base_cased_en_5.2.0_3.0_1699300582474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_distilbert_base_cased","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tiny_distilbert_base_cased","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.distilled_cased_base_tiny").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tiny_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|528.1 KB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sshleifer/tiny-distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_fincorp_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_fincorp_en.md new file mode 100644 index 000000000000..f31d9de4fc74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_fincorp_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Tiny Cased model (from satyamrajawat1994) +author: John Snow Labs +name: bert_ner_tinybert_fincorp +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-fincorp` is an English model originally trained by `satyamrajawat1994`. + +## Predicted Entities + +`Fin_Corp` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tinybert_fincorp_en_5.2.0_3.0_1699301146413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tinybert_fincorp_en_5.2.0_3.0_1699301146413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tinybert_fincorp","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tinybert_fincorp","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.tiny.by_satyamrajawat1994").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tinybert_fincorp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/satyamrajawat1994/tinybert-fincorp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_spanish_uncased_finetuned_ner_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_spanish_uncased_finetuned_ner_es.md new file mode 100644 index 000000000000..4db752355ba7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tinybert_spanish_uncased_finetuned_ner_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_ner_tinybert_spanish_uncased_finetuned_ner BertForTokenClassification from mrm8488 +author: John Snow Labs +name: bert_ner_tinybert_spanish_uncased_finetuned_ner +date: 2023-11-06 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_tinybert_spanish_uncased_finetuned_ner` is a Castilian, Spanish model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tinybert_spanish_uncased_finetuned_ner_es_5.2.0_3.0_1699283940718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tinybert_spanish_uncased_finetuned_ner_es_5.2.0_3.0_1699283940718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tinybert_spanish_uncased_finetuned_ner","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_tinybert_spanish_uncased_finetuned_ner", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tinybert_spanish_uncased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|54.3 MB| + +## References + +https://huggingface.co/mrm8488/TinyBERT-spanish-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tolgahanturker_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tolgahanturker_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..309ae31b2216 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tolgahanturker_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from tolgahanturker) +author: John Snow Labs +name: bert_ner_tolgahanturker_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `tolgahanturker`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tolgahanturker_bert_finetuned_ner_en_5.2.0_3.0_1699295510686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tolgahanturker_bert_finetuned_ner_en_5.2.0_3.0_1699295510686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tolgahanturker_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tolgahanturker_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_tolgahanturker").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tolgahanturker_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tolgahanturker/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_turkish_ner_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_turkish_ner_tr.md new file mode 100644 index 000000000000..4d782a681ccd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_turkish_ner_tr.md @@ -0,0 +1,114 @@ +--- +layout: model +title: Turkish BertForTokenClassification Cased model (from gurkan08) +author: John Snow Labs +name: bert_ner_turkish_ner +date: 2023-11-06 +tags: [bert, ner, open_source, tr, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish-ner` is a Turkish model originally trained by `gurkan08`. + +## Predicted Entities + +`ORG`, `LOC`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_turkish_ner_tr_5.2.0_3.0_1699301430110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_turkish_ner_tr_5.2.0_3.0_1699301430110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_turkish_ner","tr") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_turkish_ner","tr") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.ner.bert.by_gurkan08").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_turkish_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/gurkan08/turkish-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tushar_rishav_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tushar_rishav_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..c2ed9d3c70b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_tushar_rishav_bert_finetuned_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from tushar-rishav) +author: John Snow Labs +name: bert_ner_tushar_rishav_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `tushar-rishav`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_tushar_rishav_bert_finetuned_ner_en_5.2.0_3.0_1699301712518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_tushar_rishav_bert_finetuned_ner_en_5.2.0_3.0_1699301712518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tushar_rishav_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_tushar_rishav_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_tushar_rishav").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_tushar_rishav_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tushar-rishav/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_umlsbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_umlsbert_ner_en.md new file mode 100644 index 000000000000..d39eee548b88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_umlsbert_ner_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from RohanVB) +author: John Snow Labs +name: bert_ner_umlsbert_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `umlsbert_ner` is an English model originally trained by `RohanVB`. + +## Predicted Entities + +`test`, `problem`, `treatment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_umlsbert_ner_en_5.2.0_3.0_1699298642614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_umlsbert_ner_en_5.2.0_3.0_1699298642614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_umlsbert_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_umlsbert_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.by_rohanvb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_umlsbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/RohanVB/umlsbert_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vanmas_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vanmas_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..a441e04198b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vanmas_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_vanmas_bert_finetuned_ner BertForTokenClassification from Vanmas +author: John Snow Labs +name: bert_ner_vanmas_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_vanmas_bert_finetuned_ner` is an English model originally trained by Vanmas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_vanmas_bert_finetuned_ner_en_5.2.0_3.0_1699282583949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_vanmas_bert_finetuned_ner_en_5.2.0_3.0_1699282583949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vanmas_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_vanmas_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_vanmas_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Vanmas/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasaeta_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasaeta_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..2ad8946ae56d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasaeta_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from vikasaeta) +author: John Snow Labs +name: bert_ner_vikasaeta_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `vikasaeta`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_vikasaeta_bert_finetuned_ner_en_5.2.0_3.0_1699301417405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_vikasaeta_bert_finetuned_ner_en_5.2.0_3.0_1699301417405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vikasaeta_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vikasaeta_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_vikasaeta").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_vikasaeta_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/vikasaeta/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasmani_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasmani_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..5943778c93e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikasmani_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_vikasmani_wikineural_multilingual_ner BertForTokenClassification from VikasMani +author: John Snow Labs +name: bert_ner_vikasmani_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_vikasmani_wikineural_multilingual_ner` is an English model originally trained by VikasMani. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_vikasmani_wikineural_multilingual_ner_en_5.2.0_3.0_1699280978942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_vikasmani_wikineural_multilingual_ner_en_5.2.0_3.0_1699280978942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vikasmani_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_vikasmani_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_vikasmani_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/VikasMani/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikings03_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikings03_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..358e1cb1e06b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vikings03_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_vikings03_wikineural_multilingual_ner BertForTokenClassification from Vikings03 +author: John Snow Labs +name: bert_ner_vikings03_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_vikings03_wikineural_multilingual_ner` is an English model originally trained by Vikings03. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_vikings03_wikineural_multilingual_ner_en_5.2.0_3.0_1699281184123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_vikings03_wikineural_multilingual_ner_en_5.2.0_3.0_1699281184123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vikings03_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_vikings03_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_vikings03_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Vikings03/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vinspatel4_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vinspatel4_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..00221f89986b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_vinspatel4_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_vinspatel4_wikineural_multilingual_ner BertForTokenClassification from Vinspatel4 +author: John Snow Labs +name: bert_ner_vinspatel4_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_vinspatel4_wikineural_multilingual_ner` is an English model originally trained by Vinspatel4.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_vinspatel4_wikineural_multilingual_ner_en_5.2.0_3.0_1699283746553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_vinspatel4_wikineural_multilingual_ner_en_5.2.0_3.0_1699283746553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_vinspatel4_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_vinspatel4_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_vinspatel4_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Vinspatel4/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..98852b013723 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wende_bert_finetuned_ner_accelerate BertForTokenClassification from Wende +author: John Snow Labs +name: bert_ner_wende_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wende_bert_finetuned_ner_accelerate` is an English model originally trained by Wende.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wende_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699281831344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wende_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699281831344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wende_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wende_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wende_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Wende/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..22cd78142ac9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wende_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wende_bert_finetuned_ner BertForTokenClassification from Wende +author: John Snow Labs +name: bert_ner_wende_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wende_bert_finetuned_ner` is an English model originally trained by Wende. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wende_bert_finetuned_ner_en_5.2.0_3.0_1699284307641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wende_bert_finetuned_ner_en_5.2.0_3.0_1699284307641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wende_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wende_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wende_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Wende/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wikineural_multilingual_ner_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wikineural_multilingual_ner_nl.md new file mode 100644 index 000000000000..aae7953686a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wikineural_multilingual_ner_nl.md @@ -0,0 +1,117 @@ +--- +layout: model +title: Dutch Named Entity Recognition (from Babelscape) +author: John Snow Labs +name: bert_ner_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, ner, token_classification, nl, open_source, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, uploaded to Hugging Face, adapted and imported into Spark NLP. `wikineural-multilingual-ner` is a Dutch model originally trained by `Babelscape`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wikineural_multilingual_ner_nl_5.2.0_3.0_1699300983486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wikineural_multilingual_ner_nl_5.2.0_3.0_1699300983486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wikineural_multilingual_ner","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wikineural_multilingual_ner","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.bert.wikineural.multilingual").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Babelscape/wikineural-multilingual-ner +- https://github.com/Babelscape/wikineural +- https://aclanthology.org/2021.findings-emnlp.215/ +- https://creativecommons.org/licenses/by-nc-sa/4.0/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_winson_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_winson_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..b6b2396e2072 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_winson_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from winson) +author: John Snow Labs +name: bert_ner_winson_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `winson`.
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_winson_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699296108481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_winson_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699296108481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_winson_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_winson_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_winson").predict("""PUT YOUR STRING HERE""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_winson_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/winson/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_biobert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_biobert_ncbi_en.md new file mode 100644 index 000000000000..bc65b0e76292 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_biobert_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_biobert_ncbi BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_biobert_ncbi +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_biobert_ncbi` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_biobert_ncbi_en_5.2.0_3.0_1699281392060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_biobert_ncbi_en_5.2.0_3.0_1699281392060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_biobert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_biobert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_biobert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-BioBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_linnaeus_en.md new file mode 100644 index 000000000000..ff2ac9e0cec4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_linnaeus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_bluebert_linnaeus BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_bluebert_linnaeus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_bluebert_linnaeus` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_bluebert_linnaeus_en_5.2.0_3.0_1699282592351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_bluebert_linnaeus_en_5.2.0_3.0_1699282592351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_bluebert_linnaeus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_bluebert_linnaeus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_bluebert_linnaeus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-BlueBERT-Linnaeus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_ncbi_en.md new file mode 100644 index 000000000000..f618f767685d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_bluebert_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_bluebert_ncbi BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_bluebert_ncbi +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_bluebert_ncbi` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_bluebert_ncbi_en_5.2.0_3.0_1699282747580.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_bluebert_ncbi_en_5.2.0_3.0_1699282747580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_bluebert_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_bluebert_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_bluebert_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-BlueBERT-NCBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_bc2gm_en.md new file mode 100644 index 000000000000..899299ba2476 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_pubmedbert_bc2gm BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_pubmedbert_bc2gm +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_pubmedbert_bc2gm` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_pubmedbert_bc2gm_en_5.2.0_3.0_1699282961159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_pubmedbert_bc2gm_en_5.2.0_3.0_1699282961159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_pubmedbert_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_pubmedbert_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_pubmedbert_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-PubMedBERT-BC2GM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_linnaeus_en.md new file mode 100644 index 000000000000..67dc0b76ca7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_pubmedbert_linnaeus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_pubmedbert_linnaeus BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_pubmedbert_linnaeus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_pubmedbert_linnaeus` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_pubmedbert_linnaeus_en_5.2.0_3.0_1699281617182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_pubmedbert_linnaeus_en_5.2.0_3.0_1699281617182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_pubmedbert_linnaeus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_pubmedbert_linnaeus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_pubmedbert_linnaeus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-PubMedBERT-Linnaeus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_bc2gm_en.md new file mode 100644 index 000000000000..7c917957d558 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_scibert_bc2gm BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_scibert_bc2gm +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_scibert_bc2gm` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_scibert_bc2gm_en_5.2.0_3.0_1699283938910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_scibert_bc2gm_en_5.2.0_3.0_1699283938910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_scibert_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_scibert_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_scibert_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-SciBERT-BC2GM \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_linnaeus_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_linnaeus_en.md new file mode 100644 index 000000000000..55896db7f0e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_wlt_scibert_linnaeus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_wlt_scibert_linnaeus BertForTokenClassification from ghadeermobasher +author: John Snow Labs +name: bert_ner_wlt_scibert_linnaeus +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_wlt_scibert_linnaeus` is an English model originally trained by ghadeermobasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_scibert_linnaeus_en_5.2.0_3.0_1699284109584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_wlt_scibert_linnaeus_en_5.2.0_3.0_1699284109584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_wlt_scibert_linnaeus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_wlt_scibert_linnaeus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_wlt_scibert_linnaeus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/ghadeermobasher/WLT-SciBERT-Linnaeus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xesaad_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xesaad_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..6c56a92b105d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xesaad_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_xesaad_bert_finetuned_ner BertForTokenClassification from XeSaad +author: John Snow Labs +name: bert_ner_xesaad_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_xesaad_bert_finetuned_ner` is an English model originally trained by XeSaad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_xesaad_bert_finetuned_ner_en_5.2.0_3.0_1699283195120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_xesaad_bert_finetuned_ner_en_5.2.0_3.0_1699283195120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_xesaad_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_xesaad_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_xesaad_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/XeSaad/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..76c95bd3cecc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from xkang) +author: John Snow Labs +name: bert_ner_xkang_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner-accelerate` is an English model originally trained by `xkang`. + +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_xkang_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699302026564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_xkang_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699302026564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_xkang_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_xkang_bert_finetuned_ner_accelerate","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.finetuned.by_xkang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_xkang_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/xkang/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..d2f8d9ff7862 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_xkang_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from xkang) +author: John Snow Labs +name: bert_ner_xkang_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `xkang`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_xkang_bert_finetuned_ner_en_5.2.0_3.0_1699301262858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_xkang_bert_finetuned_ner_en_5.2.0_3.0_1699301262858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_xkang_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_xkang_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_xkang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_xkang_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/xkang/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yannis95_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yannis95_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..228ffd49d292 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yannis95_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from yannis95) +author: John Snow Labs +name: bert_ner_yannis95_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `yannis95`. 
+ +## Predicted Entities + +`LOC`, `PER`, `ORG`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_yannis95_bert_finetuned_ner_en_5.2.0_3.0_1699296385812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_yannis95_bert_finetuned_ner_en_5.2.0_3.0_1699296385812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_yannis95_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_yannis95_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_yannis95").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_yannis95_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yannis95/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ysharma_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ysharma_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ddea248bf671 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_ysharma_bert_finetuned_ner_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from ysharma) +author: John Snow Labs +name: bert_ner_ysharma_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, ner, open_source, en, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-ner` is an English model originally trained by `ysharma`. 
+ +## Predicted Entities + +`ORG`, `LOC`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_ysharma_bert_finetuned_ner_en_5.2.0_3.0_1699301535436.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_ysharma_bert_finetuned_ner_en_5.2.0_3.0_1699301535436.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ysharma_bert_finetuned_ner","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_ner_ysharma_bert_finetuned_ner","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.bert.conll.finetuned.by_ysharma").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_ysharma_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ysharma/bert-finetuned-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_accelerate_en.md new file mode 100644 index 000000000000..42c772253c46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_yv_bert_finetuned_ner_accelerate BertForTokenClassification from Yv +author: John Snow Labs +name: bert_ner_yv_bert_finetuned_ner_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_yv_bert_finetuned_ner_accelerate` is an English model originally trained by Yv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_yv_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699284471838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_yv_bert_finetuned_ner_accelerate_en_5.2.0_3.0_1699284471838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_yv_bert_finetuned_ner_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_yv_bert_finetuned_ner_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_yv_bert_finetuned_ner_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Yv/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..68be942356ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_yv_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_yv_bert_finetuned_ner BertForTokenClassification from Yv +author: John Snow Labs +name: bert_ner_yv_bert_finetuned_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_yv_bert_finetuned_ner` is an English model originally trained by Yv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_yv_bert_finetuned_ner_en_5.2.0_3.0_1699282015925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_yv_bert_finetuned_ner_en_5.2.0_3.0_1699282015925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_yv_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_yv_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_yv_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Yv/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_ner_zainab18_wikineural_multilingual_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_zainab18_wikineural_multilingual_ner_en.md new file mode 100644 index 000000000000..edda7c751eb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_ner_zainab18_wikineural_multilingual_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_zainab18_wikineural_multilingual_ner BertForTokenClassification from Zainab18 +author: John Snow Labs +name: bert_ner_zainab18_wikineural_multilingual_ner +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_zainab18_wikineural_multilingual_ner` is an English model originally trained by Zainab18. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_zainab18_wikineural_multilingual_ner_en_5.2.0_3.0_1699284632273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_zainab18_wikineural_multilingual_ner_en_5.2.0_3.0_1699284632273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_zainab18_wikineural_multilingual_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_zainab18_wikineural_multilingual_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_zainab18_wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Zainab18/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_13.05.2022.ssccvspantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_13.05.2022.ssccvspantagger_en.md new file mode 100644 index 000000000000..3d5d4453365d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_13.05.2022.ssccvspantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_13.05.2022.ssccvspantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_13.05.2022.ssccvspantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_13.05.2022.ssccvspantagger` is an English model originally trained by RJ3vans. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_13.05.2022.ssccvspantagger_en_5.2.0_3.0_1699296736661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_13.05.2022.ssccvspantagger_en_5.2.0_3.0_1699296736661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
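+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_13.05.2022.ssccvspantagger","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_13.05.2022.ssccvspantagger", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+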
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_13.05.2022.ssccvspantagger|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/RJ3vans/13.05.2022.SSCCVspanTagger
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_4l_weight_decay_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_4l_weight_decay_en.md
new file mode 100644
index 000000000000..b4822704c3ea
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_4l_weight_decay_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_sayula_popoluca_4l_weight_decay BertForTokenClassification from kktoto
+author: John Snow Labs
+name: bert_sayula_popoluca_4l_weight_decay
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_4l_weight_decay` is an English model originally trained by kktoto.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_4l_weight_decay_en_5.2.0_3.0_1699301742751.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_4l_weight_decay_en_5.2.0_3.0_1699301742751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
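+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_4l_weight_decay","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_4l_weight_decay", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+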
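The download links in these cards end in an archive name that appears to follow the pattern `<model name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. That pattern is inferred from the URLs shown here, not from a documented contract, so the sketch below is only an illustration. `parse_archive_name` is a hypothetical helper, and reading the last field as milliseconds since the Unix epoch is an assumption.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<sparknlp version>_<spark version>_<epoch millis>.zip
ARCHIVE_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_archive_name(url: str) -> dict:
    """Split the final path component of a model URL into its assumed fields."""
    filename = url.rsplit("/", 1)[-1]
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised archive name: {filename}")
    fields = match.groupdict()
    # Interpret the trailing number as milliseconds since the Unix epoch (assumption).
    fields["uploaded"] = datetime.fromtimestamp(
        int(fields["millis"]) / 1000, tz=timezone.utc
    )
    return fields


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "bert_sayula_popoluca_4l_weight_decay_en_5.2.0_3.0_1699301742751.zip"
)
print(info["name"], info["lang"], info["sparknlp"], info["spark"], info["uploaded"].date())
```

Under that reading, the timestamp in this URL falls on 2023-11-06 (UTC), which agrees with the `date:` field in the card's front matter.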
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_4l_weight_decay|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|42.7 MB|
+
+## References
+
+https://huggingface.co/kktoto/4L_weight_decay
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amhariccacopostag_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amhariccacopostag_en.md
new file mode 100644
index 000000000000..06b024535fb5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amhariccacopostag_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_sayula_popoluca_amhariccacopostag BertForTokenClassification from mitiku
+author: John Snow Labs
+name: bert_sayula_popoluca_amhariccacopostag
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_amhariccacopostag` is an English model originally trained by mitiku.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amhariccacopostag_en_5.2.0_3.0_1699298884730.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amhariccacopostag_en_5.2.0_3.0_1699298884730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
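+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_amhariccacopostag","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_amhariccacopostag", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+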
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_amhariccacopostag|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|406.8 MB|
+
+## References
+
+https://huggingface.co/mitiku/AmharicCacoPostag
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag10tags_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag10tags_en.md
new file mode 100644
index 000000000000..e74343b3bdcf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag10tags_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_sayula_popoluca_amharicwicpostag10tags BertForTokenClassification from mitiku
+author: John Snow Labs
+name: bert_sayula_popoluca_amharicwicpostag10tags
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_amharicwicpostag10tags` is an English model originally trained by mitiku.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amharicwicpostag10tags_en_5.2.0_3.0_1699299256671.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amharicwicpostag10tags_en_5.2.0_3.0_1699299256671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
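+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_amharicwicpostag10tags","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_amharicwicpostag10tags", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+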
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_amharicwicpostag10tags|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|406.7 MB|
+
+## References
+
+https://huggingface.co/mitiku/AmharicWICPostag10Tags
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag_en.md
new file mode 100644
index 000000000000..1da102d37b94
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_amharicwicpostag_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_sayula_popoluca_amharicwicpostag BertForTokenClassification from mitiku
+author: John Snow Labs
+name: bert_sayula_popoluca_amharicwicpostag
+date: 2023-11-06
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_amharicwicpostag` is an English model originally trained by mitiku.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amharicwicpostag_en_5.2.0_3.0_1699299094755.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_amharicwicpostag_en_5.2.0_3.0_1699299094755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
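+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_amharicwicpostag","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_amharicwicpostag", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+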
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_amharicwicpostag|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|406.8 MB|
+
+## References
+
+https://huggingface.co/mitiku/AmharicWICPostag
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque_pt.md
new file mode 100644
index 000000000000..7ca79df5cab6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque_pt.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Portuguese bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque BertForTokenClassification from Emanuel
+author: John Snow Labs
+name: bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque
+date: 2023-11-06
+tags: [bert, pt, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: pt
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque` is a Portuguese model originally trained by Emanuel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque_pt_5.2.0_3.0_1699300286479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque_pt_5.2.0_3.0_1699300286479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
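+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque","pt") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque", "pt")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+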
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_autonlp_sayula_popoluca_tag_bosque|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|pt|
+|Size:|406.0 MB|
+
+## References
+
+https://huggingface.co/Emanuel/autonlp-pos-tag-bosque
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_ancient_chinese_base_upos_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_ancient_chinese_base_upos_zh.md
new file mode 100644
index 000000000000..8ca19e080789
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_ancient_chinese_base_upos_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert_sayula_popoluca_bert_ancient_chinese_base_upos BertForTokenClassification from KoichiYasuoka
+author: John Snow Labs
+name: bert_sayula_popoluca_bert_ancient_chinese_base_upos
+date: 2023-11-06
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_ancient_chinese_base_upos` is a Chinese model originally trained by KoichiYasuoka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_ancient_chinese_base_upos_zh_5.2.0_3.0_1699297246594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_ancient_chinese_base_upos_zh_5.2.0_3.0_1699297246594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
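+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_ancient_chinese_base_upos","zh") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_bert_ancient_chinese_base_upos", "zh")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+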
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_bert_ancient_chinese_base_upos|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|430.7 MB|
+
+## References
+
+https://huggingface.co/KoichiYasuoka/bert-ancient-chinese-base-upos
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy_ar.md
new file mode 100644
index 000000000000..5039ee4f306a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy BertForTokenClassification from CAMeL-Lab
+author: John Snow Labs
+name: bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy
+date: 2023-11-06
+tags: [bert, ar, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy` is an Arabic model originally trained by CAMeL-Lab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy_ar_5.2.0_3.0_1699300450055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy_ar_5.2.0_3.0_1699300450055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
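+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy","ar") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy", "ar")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+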
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_egy|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|ar|
+|Size:|406.7 MB|
+
+## References
+
+https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-ca-pos-egy
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf_ar.md
new file mode 100644
index 000000000000..bd543e117835
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf BertForTokenClassification from CAMeL-Lab
+author: John Snow Labs
+name: bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf
+date: 2023-11-06
+tags: [bert, ar, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf` is an Arabic model originally trained by CAMeL-Lab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf_ar_5.2.0_3.0_1699302101136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf_ar_5.2.0_3.0_1699302101136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
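+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf","ar") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf", "ar")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+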
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_glf|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|ar|
+|Size:|406.7 MB|
+
+## References
+
+https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-ca-pos-glf
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa_ar.md
new file mode 100644
index 000000000000..053e348e1d71
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa BertForTokenClassification from CAMeL-Lab
+author: John Snow Labs
+name: bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa
+date: 2023-11-06
+tags: [bert, ar, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa` is an Arabic model originally trained by CAMeL-Lab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa_ar_5.2.0_3.0_1699302606661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa_ar_5.2.0_3.0_1699302606661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
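+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage is required.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa","ar") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa", "ar")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+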
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_catalan_sayula_popoluca_msa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|ar|
+|Size:|406.7 MB|
+
+## References
+
+https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-ca-pos-msa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy_ar.md
new file mode 100644
index 000000000000..26a95fda9d1d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy BertForTokenClassification from CAMeL-Lab
+author: John Snow Labs
+name: bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy
+date: 2023-11-06
+tags: [bert, ar, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy` is an Arabic model originally trained by CAMeL-Lab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy_ar_5.2.0_3.0_1699302329367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy_ar_5.2.0_3.0_1699302329367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_egy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.8 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-da-pos-egy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf_ar.md new file mode 100644 index 000000000000..433085215658 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf_ar_5.2.0_3.0_1699302497616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf_ar_5.2.0_3.0_1699302497616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_glf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.8 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-da-pos-glf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa_ar.md new file mode 100644 index 000000000000..d75e4e76ab38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa_ar_5.2.0_3.0_1699302654159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa_ar_5.2.0_3.0_1699302654159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_danish_sayula_popoluca_msa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.8 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-da-pos-msa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy_ar.md new file mode 100644 index 000000000000..f360a64d2942 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy_ar_5.2.0_3.0_1699297434626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy_ar_5.2.0_3.0_1699297434626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_egy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.7 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-pos-egy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf_ar.md new file mode 100644 index 000000000000..b481bc253a75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf_ar_5.2.0_3.0_1699300623702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf_ar_5.2.0_3.0_1699300623702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_glf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.7 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-pos-glf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa_ar.md new file mode 100644 index 000000000000..a0b7c1c87b64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa_ar_5.2.0_3.0_1699297635239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa_ar_5.2.0_3.0_1699297635239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_mix_sayula_popoluca_msa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.7 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-pos-msa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy_ar.md new file mode 100644 index 000000000000..ca0a1c0f6bb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy_ar_5.2.0_3.0_1699302624301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy_ar_5.2.0_3.0_1699302624301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_egy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.4 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-msa-pos-egy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf_ar.md new file mode 100644 index 000000000000..4024f5134607 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf_ar_5.2.0_3.0_1699302805972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf_ar_5.2.0_3.0_1699302805972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_glf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.4 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-msa-pos-glf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa_ar.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa_ar.md new file mode 100644 index 000000000000..311899accf08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa BertForTokenClassification from CAMeL-Lab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa +date: 2023-11-06 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa` is an Arabic model originally trained by CAMeL-Lab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa_ar_5.2.0_3.0_1699302967020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa_ar_5.2.0_3.0_1699302967020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_arabic_camelbert_msa_sayula_popoluca_msa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|406.4 MB| + +## References + +https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-msa-pos-msa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_ccg_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_ccg_en.md new file mode 100644 index 000000000000..b259c7f79f3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_ccg_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_base_cased_ccg BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_cased_ccg +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_cased_ccg` is an English model originally trained by QCRI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_cased_ccg_en_5.2.0_3.0_1699300808412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_cased_ccg_en_5.2.0_3.0_1699300808412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_cased_ccg","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_cased_ccg", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_cased_ccg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.5 MB| + +## References + +https://huggingface.co/QCRI/bert-base-cased-ccg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_sayula_popoluca_en.md new file mode 100644 index 000000000000..ae10439e9a57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_cased_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_base_cased_sayula_popoluca BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_cased_sayula_popoluca +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_cased_sayula_popoluca` is an English model originally trained by QCRI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_cased_sayula_popoluca_en_5.2.0_3.0_1699303128065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_cased_sayula_popoluca_en_5.2.0_3.0_1699303128065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_cased_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_cased_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_cased_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/QCRI/bert-base-cased-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca_nl.md new file mode 100644 index 000000000000..c77135210fa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca BertForTokenClassification from wietsedv +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca +date: 2023-11-06 +tags: [bert, nl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca` is a Dutch, Flemish model originally trained by wietsedv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca_nl_5.2.0_3.0_1699303346752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca_nl_5.2.0_3.0_1699303346752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca","nl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca", "nl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_dutch_cased_finetuned_lassysmall_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|407.3 MB| + +## References + +https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-lassysmall-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca_nl.md new file mode 100644 index 000000000000..d28daffaf766 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca BertForTokenClassification from wietsedv +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca +date: 2023-11-06 +tags: [bert, nl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca` is a Dutch, Flemish model originally trained by wietsedv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca_nl_5.2.0_3.0_1699297812004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca_nl_5.2.0_3.0_1699297812004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca","nl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca", "nl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_dutch_cased_finetuned_udlassy_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|406.7 MB| + +## References + +https://huggingface.co/wietsedv/bert-base-dutch-cased-finetuned-udlassy-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian_xx.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian_xx.md new file mode 100644 index 000000000000..51be1bb8b9b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian BertForTokenClassification from GroNLP +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian +date: 2023-11-06 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian` is a Multilingual model originally trained by GroNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian_xx_5.2.0_3.0_1699297970707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian_xx_5.2.0_3.0_1699297970707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_frisian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|349.0 MB| + +## References + +https://huggingface.co/GroNLP/bert-base-dutch-cased-upos-alpino-frisian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings_nl.md new file mode 100644 index 000000000000..c27ec8abdd21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings BertForTokenClassification from GroNLP +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings +date: 2023-11-06 +tags: [bert, nl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings` is a Dutch, Flemish model originally trained by GroNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings_nl_5.2.0_3.0_1699302820080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings_nl_5.2.0_3.0_1699302820080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings","nl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings", "nl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_gronings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|348.9 MB| + +## References + +https://huggingface.co/GroNLP/bert-base-dutch-cased-upos-alpino-gronings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_nl.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_nl.md new file mode 100644 index 000000000000..06f4a6e47c2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino BertForTokenClassification from GroNLP +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino +date: 2023-11-06 +tags: [bert, nl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino` is a Dutch, Flemish model originally trained by GroNLP. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_nl_5.2.0_3.0_1699302795836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino_nl_5.2.0_3.0_1699302795836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino","nl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino", "nl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_dutch_cased_upos_alpino| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|406.6 MB| + +## References + +https://huggingface.co/GroNLP/bert-base-dutch-cased-upos-alpino \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_german_upos_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_german_upos_de.md new file mode 100644 index 000000000000..bb1c45383b66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_german_upos_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_sayula_popoluca_bert_base_german_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_german_upos +date: 2023-11-06 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_german_upos` is a German model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_german_upos_de_5.2.0_3.0_1699300997630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_german_upos_de_5.2.0_3.0_1699300997630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_german_upos","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_german_upos", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_german_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.9 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-german-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_luw_upos_ja.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_luw_upos_ja.md new file mode 100644 index 000000000000..8d680af6cf76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_luw_upos_ja.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Japanese bert_sayula_popoluca_bert_base_japanese_luw_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_japanese_luw_upos +date: 2023-11-06 +tags: [bert, ja, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_japanese_luw_upos` is a Japanese model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_japanese_luw_upos_ja_5.2.0_3.0_1699298148873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_japanese_luw_upos_ja_5.2.0_3.0_1699298148873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_japanese_luw_upos","ja") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_japanese_luw_upos", "ja") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_japanese_luw_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ja| +|Size:|338.3 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-japanese-luw-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_upos_ja.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_upos_ja.md new file mode 100644 index 000000000000..910c351d1369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_japanese_upos_ja.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Japanese bert_sayula_popoluca_bert_base_japanese_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_japanese_upos +date: 2023-11-06 +tags: [bert, ja, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_japanese_upos` is a Japanese model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_japanese_upos_ja_5.2.0_3.0_1699303064211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_japanese_upos_ja_5.2.0_3.0_1699303064211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_japanese_upos","ja") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_japanese_upos", "ja") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_japanese_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ja| +|Size:|338.2 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-japanese-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english_en.md new file mode 100644 index 000000000000..ae9b2fc97208 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english` is an English model originally trained by QCRI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english_en_5.2.0_3.0_1699298384330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english_en_5.2.0_3.0_1699298384330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_multilingual_cased_chunking_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/QCRI/bert-base-multilingual-cased-chunking-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english_en.md new file mode 100644 index 000000000000..e31c741d2650 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english BertForTokenClassification from QCRI +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english` is an English model originally trained by QCRI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english_en_5.2.0_3.0_1699303340728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english_en_5.2.0_3.0_1699303340728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_multilingual_cased_sayula_popoluca_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.2 MB| + +## References + +https://huggingface.co/QCRI/bert-base-multilingual-cased-pos-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_russian_upos_ru.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_russian_upos_ru.md new file mode 100644 index 000000000000..5d2fa50fb543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_russian_upos_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian bert_sayula_popoluca_bert_base_russian_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_russian_upos +date: 2023-11-06 +tags: [bert, ru, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_bert_base_russian_upos` is a Russian model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_russian_upos_ru_5.2.0_3.0_1699298638136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_russian_upos_ru_5.2.0_3.0_1699298638136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_russian_upos","ru") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_russian_upos", "ru") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_russian_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ru| +|Size:|664.5 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-russian-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_slavic_cyrillic_upos_uk.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_slavic_cyrillic_upos_uk.md new file mode 100644 index 000000000000..ca566302e12a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_slavic_cyrillic_upos_uk.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Ukrainian bert_sayula_popoluca_bert_base_slavic_cyrillic_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_slavic_cyrillic_upos +date: 2023-11-06 +tags: [bert, uk, open_source, token_classification, onnx] +task: Named Entity Recognition +language: uk +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_bert_base_slavic_cyrillic_upos` is a Ukrainian model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_slavic_cyrillic_upos_uk_5.2.0_3.0_1699303611590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_slavic_cyrillic_upos_uk_5.2.0_3.0_1699303611590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_slavic_cyrillic_upos","uk") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_slavic_cyrillic_upos", "uk") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_slavic_cyrillic_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|uk| +|Size:|667.5 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-slavic-cyrillic-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca_sv.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca_sv.md new file mode 100644 index 000000000000..bb00c38e717d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca BertForTokenClassification from KBLab +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca +date: 2023-11-06 +tags: [bert, sv, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca` is a Swedish model originally trained by KBLab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca_sv_5.2.0_3.0_1699303862586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca_sv_5.2.0_3.0_1699303862586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca","sv") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca", "sv") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_swedish_cased_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.3 MB| + +## References + +https://huggingface.co/KBLab/bert-base-swedish-cased-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_thai_upos_th.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_thai_upos_th.md new file mode 100644 index 000000000000..15c0b4738f0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_base_thai_upos_th.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Thai bert_sayula_popoluca_bert_base_thai_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_base_thai_upos +date: 2023-11-06 +tags: [bert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_bert_base_thai_upos` is a Thai model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_thai_upos_th_5.2.0_3.0_1699303051514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_base_thai_upos_th_5.2.0_3.0_1699303051514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_base_thai_upos","th") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_base_thai_upos", "th") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_base_thai_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|345.3 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-thai-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_chunk_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_chunk_en.md new file mode 100644 index 000000000000..2e604f2d541c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_chunk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_english_uncased_finetuned_chunk BertForTokenClassification from vblagoje +author: John Snow Labs +name: bert_sayula_popoluca_bert_english_uncased_finetuned_chunk +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_english_uncased_finetuned_chunk` is an English model originally trained by vblagoje. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_english_uncased_finetuned_chunk_en_5.2.0_3.0_1699298884738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_english_uncased_finetuned_chunk_en_5.2.0_3.0_1699298884738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_english_uncased_finetuned_chunk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_english_uncased_finetuned_chunk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_english_uncased_finetuned_chunk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/vblagoje/bert-english-uncased-finetuned-chunk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca_en.md new file mode 100644 index 000000000000..39d268736721 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca BertForTokenClassification from vblagoje +author: John Snow Labs +name: bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca` is an English model originally trained by vblagoje. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca_en_5.2.0_3.0_1699304061872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca_en_5.2.0_3.0_1699304061872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_english_uncased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/vblagoje/bert-english-uncased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_conll2003_pos_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_conll2003_pos_en.md new file mode 100644 index 000000000000..4e5e694ce4f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_conll2003_pos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_finetuned_conll2003_pos BertForTokenClassification from Tahsin +author: John Snow Labs +name: bert_sayula_popoluca_bert_finetuned_conll2003_pos +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_finetuned_conll2003_pos` is an English model originally trained by Tahsin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_finetuned_conll2003_pos_en_5.2.0_3.0_1699301917061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_finetuned_conll2003_pos_en_5.2.0_3.0_1699301917061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_finetuned_conll2003_pos","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_finetuned_conll2003_pos", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_finetuned_conll2003_pos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/Tahsin/BERT-finetuned-conll2003-POS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_sayula_popoluca_en.md new file mode 100644 index 000000000000..f4d8320af733 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_finetuned_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_finetuned_sayula_popoluca BertForTokenClassification from Fredvv +author: John Snow Labs +name: bert_sayula_popoluca_bert_finetuned_sayula_popoluca +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_finetuned_sayula_popoluca` is an English model originally trained by Fredvv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_finetuned_sayula_popoluca_en_5.2.0_3.0_1699303573296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_finetuned_sayula_popoluca_en_5.2.0_3.0_1699303573296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_finetuned_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_finetuned_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Fredvv/bert-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca_it.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca_it.md new file mode 100644 index 000000000000..e9a4c032f2ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca BertForTokenClassification from sachaarbonel +author: John Snow Labs +name: bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca +date: 2023-11-06 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca` is an Italian model originally trained by sachaarbonel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca_it_5.2.0_3.0_1699299194965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca_it_5.2.0_3.0_1699299194965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_italian_cased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|409.8 MB| + +## References + +https://huggingface.co/sachaarbonel/bert-italian-cased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_german_upos_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_german_upos_de.md new file mode 100644 index 000000000000..51aeab9d7f5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_german_upos_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_sayula_popoluca_bert_large_german_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_large_german_upos +date: 2023-11-06 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_bert_large_german_upos` is a German model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_german_upos_de_5.2.0_3.0_1699303896635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_german_upos_de_5.2.0_3.0_1699303896635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_large_german_upos","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_large_german_upos", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_large_german_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|1.3 GB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-large-german-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_luw_upos_ja.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_luw_upos_ja.md new file mode 100644 index 000000000000..1e1a8a364344 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_luw_upos_ja.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Japanese bert_sayula_popoluca_bert_large_japanese_luw_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_large_japanese_luw_upos +date: 2023-11-06 +tags: [bert, ja, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_large_japanese_luw_upos` is a Japanese model originally trained by KoichiYasuoka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_japanese_luw_upos_ja_5.2.0_3.0_1699299518567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_japanese_luw_upos_ja_5.2.0_3.0_1699299518567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_large_japanese_luw_upos","ja") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_large_japanese_luw_upos", "ja") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_large_japanese_luw_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ja| +|Size:|1.2 GB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-large-japanese-luw-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_upos_ja.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_upos_ja.md new file mode 100644 index 000000000000..6dee7380b9cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_japanese_upos_ja.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Japanese bert_sayula_popoluca_bert_large_japanese_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_large_japanese_upos +date: 2023-11-06 +tags: [bert, ja, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_large_japanese_upos` is a Japanese model originally trained by KoichiYasuoka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_japanese_upos_ja_5.2.0_3.0_1699301302690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_japanese_upos_ja_5.2.0_3.0_1699301302690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_large_japanese_upos","ja") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_large_japanese_upos", "ja") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_large_japanese_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ja| +|Size:|1.2 GB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-large-japanese-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_slavic_cyrillic_upos_uk.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_slavic_cyrillic_upos_uk.md new file mode 100644 index 000000000000..563777d250d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_large_slavic_cyrillic_upos_uk.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Ukrainian bert_sayula_popoluca_bert_large_slavic_cyrillic_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_bert_large_slavic_cyrillic_upos +date: 2023-11-06 +tags: [bert, uk, open_source, token_classification, onnx] +task: Named Entity Recognition +language: uk +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_large_slavic_cyrillic_upos` is a Ukrainian model originally trained by KoichiYasuoka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_slavic_cyrillic_upos_uk_5.2.0_3.0_1699303518422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_large_slavic_cyrillic_upos_uk_5.2.0_3.0_1699303518422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_large_slavic_cyrillic_upos","uk") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_large_slavic_cyrillic_upos", "uk") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_large_slavic_cyrillic_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|uk| +|Size:|1.6 GB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-large-slavic-cyrillic-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_danish_alvenir_da.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_danish_alvenir_da.md new file mode 100644 index 000000000000..2c28dffc714c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_danish_alvenir_da.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Danish bert_sayula_popoluca_bert_punct_restoration_danish_alvenir BertForTokenClassification from Alvenir +author: John Snow Labs +name: bert_sayula_popoluca_bert_punct_restoration_danish_alvenir +date: 2023-11-06 +tags: [bert, da, open_source, token_classification, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_punct_restoration_danish_alvenir` is a Danish model originally trained by Alvenir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_danish_alvenir_da_5.2.0_3.0_1699299878516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_danish_alvenir_da_5.2.0_3.0_1699299878516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_punct_restoration_danish_alvenir","da") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_punct_restoration_danish_alvenir", "da") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_punct_restoration_danish_alvenir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|412.3 MB| + +## References + +https://huggingface.co/Alvenir/bert-punct-restoration-da \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_english_alvenir_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_english_alvenir_en.md new file mode 100644 index 000000000000..722f500a03fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_english_alvenir_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_punct_restoration_english_alvenir BertForTokenClassification from Alvenir +author: John Snow Labs +name: bert_sayula_popoluca_bert_punct_restoration_english_alvenir +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_punct_restoration_english_alvenir` is an English model originally trained by Alvenir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_english_alvenir_en_5.2.0_3.0_1699304261004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_english_alvenir_en_5.2.0_3.0_1699304261004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_punct_restoration_english_alvenir","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_punct_restoration_english_alvenir", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_punct_restoration_english_alvenir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Alvenir/bert-punct-restoration-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_german_alvenir_de.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_german_alvenir_de.md new file mode 100644 index 000000000000..2165d37f6053 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_punct_restoration_german_alvenir_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_sayula_popoluca_bert_punct_restoration_german_alvenir BertForTokenClassification from Alvenir +author: John Snow Labs +name: bert_sayula_popoluca_bert_punct_restoration_german_alvenir +date: 2023-11-06 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_punct_restoration_german_alvenir` is a German model originally trained by Alvenir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_german_alvenir_de_5.2.0_3.0_1699300067718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_punct_restoration_german_alvenir_de_5.2.0_3.0_1699300067718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_punct_restoration_german_alvenir","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_punct_restoration_german_alvenir", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_punct_restoration_german_alvenir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.8 MB| + +## References + +https://huggingface.co/Alvenir/bert-punct-restoration-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld_en.md new file mode 100644 index 000000000000..f64d131e3ac7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld BertForTokenClassification from proycon +author: John Snow Labs +name: bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld` is an English model originally trained by proycon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld_en_5.2.0_3.0_1699304108856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld_en_5.2.0_3.0_1699304108856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_sayula_popoluca_cased_deepfrog_nld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/proycon/bert-pos-cased-deepfrog-nld \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags_es.md new file mode 100644 index 000000000000..f632cb4bd810 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags BertForTokenClassification from mrm8488 +author: John Snow Labs +name: bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags +date: 2023-11-06 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags` is a Castilian, Spanish model originally trained by mrm8488.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags_es_5.2.0_3.0_1699304252166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags_es_5.2.0_3.0_1699304252166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_16_tags| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-pos-16-tags \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_es.md new file mode 100644 index 000000000000..aa70ab1f5dad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca BertForTokenClassification from mrm8488 +author: John Snow Labs +name: bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca +date: 2023-11-06 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca` is a Castilian, Spanish model originally trained by mrm8488.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_es_5.2.0_3.0_1699303777052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_es_5.2.0_3.0_1699303777052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.7 MB| + +## References + +https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax_es.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax_es.md new file mode 100644 index 000000000000..d224753aafb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax BertForTokenClassification from mrm8488 +author: John Snow Labs +name: bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax +date: 2023-11-06 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax` is a Castilian, Spanish model originally trained by mrm8488.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax_es_5.2.0_3.0_1699301495205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax_es_5.2.0_3.0_1699301495205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_spanish_cased_finetuned_sayula_popoluca_syntax| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-pos-syntax \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca_zh.md new file mode 100644 index 000000000000..f91a40641be6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca_zh_5.2.0_3.0_1699304358725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca_zh_5.2.0_3.0_1699304358725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bert_tiny_chinese_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|43.1 MB| + +## References + +https://huggingface.co/ckiplab/bert-tiny-chinese-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2_en.md new file mode 100644 index 000000000000..01d8f1e3d092 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2 BertForTokenClassification from Deborah +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2` is an English model originally trained by Deborah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2_en_5.2.0_3.0_1699303981929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2_en_5.2.0_3.0_1699303981929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Deborah/bertimbau-finetuned-pos-accelerate2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3_en.md new file mode 100644 index 000000000000..48989754e1f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3 BertForTokenClassification from Deborah +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3` is an English model originally trained by Deborah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3_en_5.2.0_3.0_1699304416492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3_en_5.2.0_3.0_1699304416492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Deborah/bertimbau-finetuned-pos-accelerate3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5_en.md new file mode 100644 index 000000000000..219f42c99ab4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5 BertForTokenClassification from camilag +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5` is an English model originally trained by camilag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5_en_5.2.0_3.0_1699304172382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5_en_5.2.0_3.0_1699304172382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/camilag/bertimbau-finetuned-pos-accelerate-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6_en.md new file mode 100644 index 000000000000..946ef05b474e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6 BertForTokenClassification from camilag +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6` is an English model originally trained by camilag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6_en_5.2.0_3.0_1699304601450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6_en_5.2.0_3.0_1699304601450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/camilag/bertimbau-finetuned-pos-accelerate-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7_en.md new file mode 100644 index 000000000000..51b7db2ac401 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7 BertForTokenClassification from camilag +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7` is an English model originally trained by camilag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7_en_5.2.0_3.0_1699304708381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7_en_5.2.0_3.0_1699304708381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/camilag/bertimbau-finetuned-pos-accelerate-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_en.md new file mode 100644 index 000000000000..bdc92ea86003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate BertForTokenClassification from Deborah +author: John Snow Labs +name: bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate` is an English model originally trained by Deborah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_en_5.2.0_3.0_1699304519838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate_en_5.2.0_3.0_1699304519838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_bertimbau_finetuned_sayula_popoluca_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Deborah/bertimbau-finetuned-pos-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ccvspantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ccvspantagger_en.md new file mode 100644 index 000000000000..500bf7071655 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ccvspantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_ccvspantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_ccvspantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_ccvspantagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_ccvspantagger_en_5.2.0_3.0_1699302072883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_ccvspantagger_en_5.2.0_3.0_1699302072883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_ccvspantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_ccvspantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_ccvspantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/CCVspanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_bert_wwm_ext_upos_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_bert_wwm_ext_upos_zh.md new file mode 100644 index 000000000000..709064fce54c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_bert_wwm_ext_upos_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_sayula_popoluca_chinese_bert_wwm_ext_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_chinese_bert_wwm_ext_upos +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_chinese_bert_wwm_ext_upos` is a Chinese model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_bert_wwm_ext_upos_zh_5.2.0_3.0_1699304771456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_bert_wwm_ext_upos_zh_5.2.0_3.0_1699304771456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_chinese_bert_wwm_ext_upos","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_chinese_bert_wwm_ext_upos", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_chinese_bert_wwm_ext_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.3 MB| + +## References + +https://huggingface.co/KoichiYasuoka/chinese-bert-wwm-ext-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_base_upos_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_base_upos_zh.md new file mode 100644 index 000000000000..7108f987d2c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_base_upos_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_sayula_popoluca_chinese_roberta_base_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_chinese_roberta_base_upos +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_chinese_roberta_base_upos` is a Chinese model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_roberta_base_upos_zh_5.2.0_3.0_1699305982964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_roberta_base_upos_zh_5.2.0_3.0_1699305982964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_chinese_roberta_base_upos","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_chinese_roberta_base_upos", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_chinese_roberta_base_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| + +## References + +https://huggingface.co/KoichiYasuoka/chinese-roberta-base-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_large_upos_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_large_upos_zh.md new file mode 100644 index 000000000000..6e3e82892c03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_chinese_roberta_large_upos_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_sayula_popoluca_chinese_roberta_large_upos BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_sayula_popoluca_chinese_roberta_large_upos +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_chinese_roberta_large_upos` is a Chinese model originally trained by KoichiYasuoka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_roberta_large_upos_zh_5.2.0_3.0_1699301806106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_chinese_roberta_large_upos_zh_5.2.0_3.0_1699301806106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_chinese_roberta_large_upos","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_chinese_roberta_large_upos", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_chinese_roberta_large_upos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|1.2 GB| + +## References + +https://huggingface.co/KoichiYasuoka/chinese-roberta-large-upos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_clnspantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_clnspantagger_en.md new file mode 100644 index 000000000000..c8ed41a6bccf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_clnspantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_clnspantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_clnspantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_clnspantagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_clnspantagger_en_5.2.0_3.0_1699299582578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_clnspantagger_en_5.2.0_3.0_1699299582578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_clnspantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_clnspantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_clnspantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/CLNspanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmn1spantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmn1spantagger_en.md new file mode 100644 index 000000000000..389cf7ebf5b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmn1spantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_cmn1spantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_cmn1spantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_cmn1spantagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_cmn1spantagger_en_5.2.0_3.0_1699302391956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_cmn1spantagger_en_5.2.0_3.0_1699302391956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_cmn1spantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_cmn1spantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_cmn1spantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/CMN1spanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmv1spantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmv1spantagger_en.md new file mode 100644 index 000000000000..790470970456 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_cmv1spantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_cmv1spantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_cmv1spantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_cmv1spantagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_cmv1spantagger_en_5.2.0_3.0_1699297054050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_cmv1spantagger_en_5.2.0_3.0_1699297054050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_cmv1spantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_cmv1spantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_cmv1spantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/RJ3vans/CMV1spanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince_hi.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince_hi.md new file mode 100644 index 000000000000..2516c992015d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince_hi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hindi bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince BertForTokenClassification from sagorsarker +author: John Snow Labs +name: bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince +date: 2023-11-06 +tags: [bert, hi, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince` is a Hindi model originally trained by sagorsarker. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince_hi_5.2.0_3.0_1699302088739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince_hi_5.2.0_3.0_1699302088739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince","hi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince", "hi") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_codeswitch_hineng_sayula_popoluca_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hi| +|Size:|665.1 MB| + +## References + +https://huggingface.co/sagorsarker/codeswitch-hineng-pos-lince \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince_en.md new file mode 100644 index 000000000000..83cc14082f86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince BertForTokenClassification from sagorsarker +author: John Snow Labs +name: bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince` is an English model originally trained by sagorsarker.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince_en_5.2.0_3.0_1699307513091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince_en_5.2.0_3.0_1699307513091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_codeswitch_spaeng_sayula_popoluca_lince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/sagorsarker/codeswitch-spaeng-pos-lince \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_estbert_upos_128_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_estbert_upos_128_en.md new file mode 100644 index 000000000000..ace88f84084e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_estbert_upos_128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_estbert_upos_128 BertForTokenClassification from tartuNLP +author: John Snow Labs +name: bert_sayula_popoluca_estbert_upos_128 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_estbert_upos_128` is an English model originally trained by tartuNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_estbert_upos_128_en_5.2.0_3.0_1699299768492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_estbert_upos_128_en_5.2.0_3.0_1699299768492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_estbert_upos_128","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_estbert_upos_128", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_estbert_upos_128| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|463.5 MB| + +## References + +https://huggingface.co/tartuNLP/EstBERT_UPOS_128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_mbert_grammatical_error_tagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_mbert_grammatical_error_tagger_en.md new file mode 100644 index 000000000000..59ad87bdce9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_mbert_grammatical_error_tagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_mbert_grammatical_error_tagger BertForTokenClassification from alice-hml +author: John Snow Labs +name: bert_sayula_popoluca_mbert_grammatical_error_tagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_mbert_grammatical_error_tagger` is an English model originally trained by alice-hml.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_mbert_grammatical_error_tagger_en_5.2.0_3.0_1699309288172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_mbert_grammatical_error_tagger_en_5.2.0_3.0_1699309288172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_mbert_grammatical_error_tagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_mbert_grammatical_error_tagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_mbert_grammatical_error_tagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/alice-hml/mBERT_grammatical_error_tagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca_en.md new file mode 100644 index 000000000000..8786036f1d7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca BertForTokenClassification from sepidmnorozy +author: John Snow Labs +name: bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca` is an English model originally trained by sepidmnorozy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca_en_5.2.0_3.0_1699306247944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca_en_5.2.0_3.0_1699306247944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_parsbert_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|606.5 MB| + +## References + +https://huggingface.co/sepidmnorozy/parsbert-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_signtagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_signtagger_en.md new file mode 100644 index 000000000000..d9354a830dfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_signtagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_signtagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_signtagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_signtagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_signtagger_en_5.2.0_3.0_1699302420746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_signtagger_en_5.2.0_3.0_1699302420746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_signtagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_signtagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_signtagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/SignTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ssccvspantagger_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ssccvspantagger_en.md new file mode 100644 index 000000000000..d499b8840258 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_ssccvspantagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_ssccvspantagger BertForTokenClassification from RJ3vans +author: John Snow Labs +name: bert_sayula_popoluca_ssccvspantagger +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_sayula_popoluca_ssccvspantagger` is an English model originally trained by RJ3vans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_ssccvspantagger_en_5.2.0_3.0_1699300095426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_ssccvspantagger_en_5.2.0_3.0_1699300095426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_ssccvspantagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_ssccvspantagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_ssccvspantagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/RJ3vans/SSCCVspanTagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tahitian_punctuator_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tahitian_punctuator_en.md new file mode 100644 index 000000000000..fa745c8d203c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tahitian_punctuator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tahitian_punctuator BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tahitian_punctuator +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tahitian_punctuator` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tahitian_punctuator_en_5.2.0_3.0_1699302308368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tahitian_punctuator_en_5.2.0_3.0_1699302308368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tahitian_punctuator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tahitian_punctuator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tahitian_punctuator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/kktoto/ty_punctuator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tetra_tag_english_kitaev_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tetra_tag_english_kitaev_en.md new file mode 100644 index 000000000000..25c77c048338 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tetra_tag_english_kitaev_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tetra_tag_english_kitaev BertForTokenClassification from kitaev +author: John Snow Labs +name: bert_sayula_popoluca_tetra_tag_english_kitaev +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tetra_tag_english_kitaev` is an English model originally trained by kitaev.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tetra_tag_english_kitaev_en_5.2.0_3.0_1699300423265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tetra_tag_english_kitaev_en_5.2.0_3.0_1699300423265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tetra_tag_english_kitaev","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tetra_tag_english_kitaev", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tetra_tag_english_kitaev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/kitaev/tetra-tag-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_bb_wd_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_bb_wd_en.md new file mode 100644 index 000000000000..b39d0d724fb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_bb_wd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_bb_wd BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_bb_wd +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_bb_wd` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_bb_wd_en_5.2.0_3.0_1699300563685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_bb_wd_en_5.2.0_3.0_1699300563685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_bb_wd","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_bb_wd", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_bb_wd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_bb_wd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah75_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah75_en.md new file mode 100644 index 000000000000..7cfdb2751693 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah75_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_focal_alpah75 BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_focal_alpah75 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_focal_alpah75` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_alpah75_en_5.2.0_3.0_1699304352550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_alpah75_en_5.2.0_3.0_1699304352550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_focal_alpah75","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_focal_alpah75", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_focal_alpah75| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_focal_alpah75 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah_en.md new file mode 100644 index 000000000000..416c053b4a47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_alpah_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_focal_alpah BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_focal_alpah +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_focal_alpah` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_alpah_en_5.2.0_3.0_1699310254283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_alpah_en_5.2.0_3.0_1699310254283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_focal_alpah","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_focal_alpah", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_focal_alpah| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_focal_alpah \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_ckpt_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_ckpt_en.md new file mode 100644 index 000000000000..8389c50a94e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_ckpt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_focal_ckpt BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_focal_ckpt +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_focal_ckpt` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_ckpt_en_5.2.0_3.0_1699311196571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_ckpt_en_5.2.0_3.0_1699311196571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_focal_ckpt","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_focal_ckpt", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_focal_ckpt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_focal_ckpt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v2_label_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v2_label_en.md new file mode 100644 index 000000000000..d0a11d16e166 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v2_label_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_focal_v2_label BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_focal_v2_label +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_focal_v2_label` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_v2_label_en_5.2.0_3.0_1699312620695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_v2_label_en_5.2.0_3.0_1699312620695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_focal_v2_label","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_focal_v2_label", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_focal_v2_label| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_focal_v2_label \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v3_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v3_en.md new file mode 100644 index 000000000000..0a135dd93bd2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_focal_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_focal_v3 BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_focal_v3 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_focal_v3` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_v3_en_5.2.0_3.0_1699313528503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_focal_v3_en_5.2.0_3.0_1699313528503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_focal_v3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_focal_v3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_focal_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_focal_v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_kt_punctuator_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_kt_punctuator_en.md new file mode 100644 index 000000000000..bf1e9cb9a5ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_kt_punctuator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_kt_punctuator BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_kt_punctuator +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_kt_punctuator` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_kt_punctuator_en_5.2.0_3.0_1699307175877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_kt_punctuator_en_5.2.0_3.0_1699307175877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_kt_punctuator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_kt_punctuator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_kt_punctuator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_kt_punctuator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_ktoto_punctuator_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_ktoto_punctuator_en.md new file mode 100644 index 000000000000..124137514ae9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_ktoto_punctuator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_ktoto_punctuator BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_ktoto_punctuator +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_ktoto_punctuator` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_ktoto_punctuator_en_5.2.0_3.0_1699304453526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_ktoto_punctuator_en_5.2.0_3.0_1699304453526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_ktoto_punctuator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_ktoto_punctuator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_ktoto_punctuator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_ktoto_punctuator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_lr_kazakh_kktoto_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_lr_kazakh_kktoto_en.md new file mode 100644 index 000000000000..49bc45967fe5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_lr_kazakh_kktoto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_lr_kazakh_kktoto BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_lr_kazakh_kktoto +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_lr_kazakh_kktoto` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_lr_kazakh_kktoto_en_5.2.0_3.0_1699300652184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_lr_kazakh_kktoto_en_5.2.0_3.0_1699300652184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_lr_kazakh_kktoto","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_lr_kazakh_kktoto", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_lr_kazakh_kktoto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_lr_kk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_norwegian_focal_v2_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_norwegian_focal_v2_en.md new file mode 100644 index 000000000000..36e3acb1f64c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_norwegian_focal_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_norwegian_focal_v2 BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_norwegian_focal_v2 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_norwegian_focal_v2` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_norwegian_focal_v2_en_5.2.0_3.0_1699300739842.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_norwegian_focal_v2_en_5.2.0_3.0_1699300739842.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_norwegian_focal_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_norwegian_focal_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_norwegian_focal_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_no_focal_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_toto_punctuator_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_toto_punctuator_en.md new file mode 100644 index 000000000000..f3a4edf0d05a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_tiny_toto_punctuator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_tiny_toto_punctuator BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_tiny_toto_punctuator +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_tiny_toto_punctuator` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_toto_punctuator_en_5.2.0_3.0_1699307918818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_tiny_toto_punctuator_en_5.2.0_3.0_1699307918818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_tiny_toto_punctuator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_tiny_toto_punctuator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_tiny_toto_punctuator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/tiny_toto_punctuator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert_en.md new file mode 100644 index 000000000000..73d84ad4db56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert BertForTokenClassification from mustafabaris +author: John Snow Labs +name: bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert` is an English model originally trained by mustafabaris. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert_en_5.2.0_3.0_1699304689691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert_en_5.2.0_3.0_1699304689691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_turkish_kongo_sayula_popoluca_conllu_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|689.0 MB| + +## References + +https://huggingface.co/mustafabaris/tr_kg_pos_conllu_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_wwdd_tiny_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_wwdd_tiny_en.md new file mode 100644 index 000000000000..d4c03d50d23c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_sayula_popoluca_wwdd_tiny_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_sayula_popoluca_wwdd_tiny BertForTokenClassification from kktoto +author: John Snow Labs +name: bert_sayula_popoluca_wwdd_tiny +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sayula_popoluca_wwdd_tiny` is an English model originally trained by kktoto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_wwdd_tiny_en_5.2.0_3.0_1699308782051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sayula_popoluca_wwdd_tiny_en_5.2.0_3.0_1699308782051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_sayula_popoluca_wwdd_tiny","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_sayula_popoluca_wwdd_tiny", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sayula_popoluca_wwdd_tiny| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|42.7 MB| + +## References + +https://huggingface.co/kktoto/wwdd_tiny \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_final_784824206_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_final_784824206_en.md new file mode 100644 index 000000000000..db5f336ad224 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_final_784824206_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Lucifermorningstar011) +author: John Snow Labs +name: bert_token_classifier_autotrain_final_784824206 +date: 2023-11-06 +tags: [en, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-final-784824206` is an English model originally trained by `Lucifermorningstar011`. 
+ +## Predicted Entities + +`9`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_final_784824206_en_5.2.0_3.0_1699308782243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_final_784824206_en_5.2.0_3.0_1699308782243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_final_784824206","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_final_784824206","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_autotrain_final_784824206| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lucifermorningstar011/autotrain-final-784824206 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_gro_ner_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_gro_ner_en.md new file mode 100644 index 000000000000..3c5f8dd644ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_gro_ner_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Wanjiru) +author: John Snow Labs +name: bert_token_classifier_autotrain_gro_ner +date: 2023-11-06 +tags: [en, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_gro_ner` is an English model originally trained by `Wanjiru`. 
+ +## Predicted Entities + +`METRIC`, `ITEM` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_gro_ner_en_5.2.0_3.0_1699301248192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_gro_ner_en_5.2.0_3.0_1699301248192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_gro_ner","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_gro_ner","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_autotrain_gro_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Wanjiru/autotrain_gro_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_turkmen_1181244086_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_turkmen_1181244086_en.md new file mode 100644 index 000000000000..d3f3dbae0eed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_autotrain_turkmen_1181244086_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_token_classifier_autotrain_turkmen_1181244086 BertForTokenClassification from Shenzy2 +author: John Snow Labs +name: bert_token_classifier_autotrain_turkmen_1181244086 +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_autotrain_turkmen_1181244086` is an English model originally trained by Shenzy2. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_turkmen_1181244086_en_5.2.0_3.0_1699310235474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_turkmen_1181244086_en_5.2.0_3.0_1699310235474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_turkmen_1181244086","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_autotrain_turkmen_1181244086", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_autotrain_turkmen_1181244086| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Shenzy2/autotrain-tk-1181244086 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ner_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ner_zh.md new file mode 100644 index 000000000000..3c88b3e4a617 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ner_zh.md @@ -0,0 +1,105 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_chinese_ner +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-chinese-ner` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`S-WORK_OF_ART`, `S-TIME`, `E-FAC`, `S-PERCENT`, `S-PRODUCT`, `E-LANGUAGE`, `S-NORP`, `S-QUANTITY`, `S-PERSON`, `E-DATE`, `S-LOC`, `S-CARDINAL`, `E-QUANTITY`, `S-GPE`, `S-FAC`, `MONEY`, `S-ORG`, `E-NORP`, `E-GPE`, `E-TIME`, `EVENT`, `DATE`, `CARDINAL`, `FAC`, `E-PERCENT`, `E-PERSON`, `S-ORDINAL`, `NORP`, `LOC`, `E-ORG`, `E-MONEY`, `S-LAW`, `LAW`, `E-LOC`, `S-EVENT`, `ORG`, `TIME`, `ORDINAL`, `E-WORK_OF_ART`, `LANGUAGE`, `S-MONEY`, `E-ORDINAL`, `PERCENT`, `E-EVENT`, `S-LANGUAGE`, `E-PRODUCT`, `QUANTITY`, `WORK_OF_ART`, `E-LAW`, `S-DATE`, `PRODUCT`, `E-CARDINAL`, `PERSON`, `GPE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_ner_zh_5.2.0_3.0_1699302560261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_ner_zh_5.2.0_3.0_1699302560261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_chinese_ner","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_chinese_ner","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_chinese_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-chinese-ner +- https://github.com/ckiplab/ckip-transformers +- https://muyang.pro +- https://ckip.iis.sinica.edu.tw +- https://github.com/ckiplab/ckip-transformers +- https://github.com/ckiplab/ckip-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_sayula_popoluca_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_sayula_popoluca_zh.md new file mode 100644 index 000000000000..4b443c578f3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_sayula_popoluca_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_token_classifier_base_chinese_sayula_popoluca BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_token_classifier_base_chinese_sayula_popoluca +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_token_classifier_base_chinese_sayula_popoluca` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_sayula_popoluca_zh_5.2.0_3.0_1699311615246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_sayula_popoluca_zh_5.2.0_3.0_1699311615246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_chinese_sayula_popoluca","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_chinese_sayula_popoluca", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_chinese_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| + +## References + +https://huggingface.co/ckiplab/bert-base-chinese-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ws_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ws_zh.md new file mode 100644 index 000000000000..9b7c728b8c3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_chinese_ws_zh.md @@ -0,0 +1,105 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_chinese_ws +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-chinese-ws` is a Chinese model originally trained by `ckiplab`. + +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_ws_zh_5.2.0_3.0_1699314567003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_chinese_ws_zh_5.2.0_3.0_1699314567003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_chinese_ws","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_chinese_ws","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_chinese_ws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-chinese-ws +- https://github.com/ckiplab/ckip-transformers +- https://muyang.pro +- https://ckip.iis.sinica.edu.tw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu_zh.md new file mode 100644 index 000000000000..60a99832522a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu +date: 2023-11-06 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu_zh_5.2.0_3.0_1699312965297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu_zh_5.2.0_3.0_1699312965297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols("documents") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_sayula_popoluca_zhonggu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.7 MB| + +## References + +https://huggingface.co/ckiplab/bert-base-han-chinese-pos-zhonggu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_jindai_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_jindai_zh.md new file mode 100644 index 000000000000..9ac15f78f983 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_jindai_zh.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_ws_jindai +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-han-chinese-ws-jindai` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_jindai_zh_5.2.0_3.0_1699314337007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_jindai_zh_5.2.0_3.0_1699314337007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_jindai","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_jindai","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_ws_jindai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-han-chinese-ws-jindai +- https://github.com/ckiplab/han-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_shanggu_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_shanggu_zh.md new file mode 100644 index 000000000000..32630e75e108 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_shanggu_zh.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_ws_shanggu +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-han-chinese-ws-shanggu` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_shanggu_zh_5.2.0_3.0_1699302841534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_shanggu_zh_5.2.0_3.0_1699302841534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_shanggu","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_shanggu","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_ws_shanggu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-han-chinese-ws-shanggu +- https://github.com/ckiplab/han-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_xiandai_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_xiandai_zh.md new file mode 100644 index 000000000000..701ff0e2dfa8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_xiandai_zh.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_ws_xiandai +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-han-chinese-ws-xiandai` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_xiandai_zh_5.2.0_3.0_1699303132711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_xiandai_zh_5.2.0_3.0_1699303132711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_xiandai","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_xiandai","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_ws_xiandai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-han-chinese-ws-xiandai +- https://github.com/ckiplab/han-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_zh.md new file mode 100644 index 000000000000..cc7e8c0ebc0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_han_chinese_ws_zh.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_ws +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-han-chinese-ws` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_zh_5.2.0_3.0_1699301530772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_zh_5.2.0_3.0_1699301530772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_ws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-han-chinese-ws +- https://github.com/ckiplab/han-transformers +- http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/akiwi/kiwi.sh +- http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/dkiwi/kiwi.sh +- http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/pkiwi/kiwi.sh +- http://asbc.iis.sinica.edu.tw +- https://ckip.iis.sinica.edu.tw/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner_sq.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner_sq.md new file mode 100644 index 000000000000..3e51978ec223 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner_sq.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Albanian BertForTokenClassification Base Cased model (from Kushtrim) +author: John Snow Labs +name: bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner +date: 2023-11-06 +tags: [sq, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: sq +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`bert-base-multilingual-cased-finetuned-albanian-ner` is an Albanian model originally trained by `Kushtrim`. + +## Predicted Entities + +`LOC`, `ORG`, `PER`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner_sq_5.2.0_3.0_1699303539654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner_sq_5.2.0_3.0_1699303539654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner","sq") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner","sq") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
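Token-level NER output is usually easier to consume once consecutive tags are grouped into entity spans. The following is a minimal sketch of that grouping, assuming the `ner.result` labels follow the IOB2 scheme (`B-PER`, `I-PER`, `O`, and so on), which this card does not state explicitly. The function name `group_entities` and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse IOB2 tags into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = ""
    for token, tag in zip(tokens, tags):
        # "B-PER" -> ("B", "PER"); "O" -> ("O", "")
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current_tokens and label == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
            continue
        if current_tokens:
            # Any other tag closes the open entity.
            entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], ""
        if prefix in ("B", "I") and label:
            # Start a new entity (a stray `I-` tag is treated like `B-`).
            current_tokens, current_type = [token], label
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical tokens and labels for one row (illustration only).
tokens = ["Ismail", "Kadare", "lindi", "në", "Gjirokastër"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(group_entities(tokens, tags))
```

Inside a pipeline, Spark NLP's `NerConverter` annotator performs a comparable grouping; the pure-Python version is only meant to make the tag scheme concrete.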
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_multilingual_cased_finetuned_albanian_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sq| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Kushtrim/bert-base-multilingual-cased-finetuned-albanian-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_ner_atc_english_atco2_1h_en.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_ner_atc_english_atco2_1h_en.md new file mode 100644 index 000000000000..39d3216af99a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_ner_atc_english_atco2_1h_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_token_classifier_base_ner_atc_english_atco2_1h BertForTokenClassification from Jzuluaga +author: John Snow Labs +name: bert_token_classifier_base_ner_atc_english_atco2_1h +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_ner_atc_english_atco2_1h` is an English model originally trained by Jzuluaga. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_ner_atc_english_atco2_1h_en_5.2.0_3.0_1699303771342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_ner_atc_english_atco2_1h_en_5.2.0_3.0_1699303771342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols("documents") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_ner_atc_english_atco2_1h","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_ner_atc_english_atco2_1h", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_ner_atc_english_atco2_1h| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Jzuluaga/bert-base-ner-atc-en-atco2-1h \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_turkish_cased_ner_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_turkish_cased_ner_tr.md new file mode 100644 index 000000000000..3cbf0c84bdf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_base_turkish_cased_ner_tr.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Turkish BertForTokenClassification Base Cased model (from akdeniz27) +author: John Snow Labs +name: bert_token_classifier_base_turkish_cased_ner +date: 2023-11-06 +tags: [tr, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-cased-ner` is a Turkish model originally trained by `akdeniz27`. 
+ +## Predicted Entities + +`LOC`, `ORG`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_turkish_cased_ner_tr_5.2.0_3.0_1699301799959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_turkish_cased_ner_tr_5.2.0_3.0_1699301799959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_turkish_cased_ner","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_turkish_cased_ner","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_turkish_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/akdeniz27/bert-base-turkish-cased-ner +- https://github.com/stefan-it/turkish-bert/files/4558187/nerdata.txt +- https://ieeexplore.ieee.org/document/7495744 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_sunlp_ner_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_sunlp_ner_turkish_tr.md new file mode 100644 index 000000000000..81edfca44a15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_sunlp_ner_turkish_tr.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Turkish BertForTokenClassification Cased model (from busecarik) +author: John Snow Labs +name: bert_token_classifier_berturk_sunlp_ner_turkish +date: 2023-11-06 +tags: [tr, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berturk-sunlp-ner-turkish` is a Turkish model originally trained by `busecarik`. 
+ +## Predicted Entities + +`ORGANIZATION`, `TVSHOW`, `MONEY`, `LOCATION`, `PRODUCT`, `TIME`, `PERSON` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_sunlp_ner_turkish_tr_5.2.0_3.0_1699304172705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_sunlp_ner_turkish_tr_5.2.0_3.0_1699304172705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_sunlp_ner_turkish","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_sunlp_ner_turkish","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_berturk_sunlp_ner_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|689.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/busecarik/berturk-sunlp-ner-turkish +- https://github.com/SU-NLP/SUNLP-Twitter-NER-Dataset +- http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.484.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_uncased_keyword_extractor_tr.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_uncased_keyword_extractor_tr.md new file mode 100644 index 000000000000..2adc1793966f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_berturk_uncased_keyword_extractor_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish BertForTokenClassification Uncased model (from yanekyuk) +author: John Snow Labs +name: bert_token_classifier_berturk_uncased_keyword_extractor +date: 2023-11-06 +tags: [tr, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berturk-uncased-keyword-extractor` is a Turkish model originally trained by `yanekyuk`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_uncased_keyword_extractor_tr_5.2.0_3.0_1699304433332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_uncased_keyword_extractor_tr_5.2.0_3.0_1699304433332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_uncased_keyword_extractor","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_uncased_keyword_extractor","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_berturk_uncased_keyword_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yanekyuk/berturk-uncased-keyword-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_datafun_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_datafun_zh.md new file mode 100644 index 000000000000..25f537355dca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_datafun_zh.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Chinese BertForTokenClassification Cased model (from canIjoin) +author: John Snow Labs +name: bert_token_classifier_datafun +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `datafun` is a Chinese model originally trained by `canIjoin`. 
+ +## Predicted Entities + +`movie`, `no1`, `government`, `name1`, `position`, `book1`, `address`, `address1`, `game`, `organization`, `book`, `government1`, `company1`, `game1`, `position1`, `movie1`, `scene1`, `name`, `company`, `scene`, `organization1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_datafun_zh_5.2.0_3.0_1699302097623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_datafun_zh_5.2.0_3.0_1699302097623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_datafun","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_datafun","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_datafun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|380.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/canIjoin/datafun +- https://github.com/dbiir/UER-py/wiki/Modelzoo +- https://github.com/CLUEbenchmark/CLUENER2020 +- https://github.com/dbiir/UER-py/ +- https://cloud.tencent.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_morph_128_et.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_morph_128_et.md new file mode 100644 index 000000000000..509b220a2068 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_morph_128_et.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Estonian BertForTokenClassification Cased model (from tartuNLP) +author: John Snow Labs +name: bert_token_classifier_est_morph_128 +date: 2023-11-06 +tags: [et, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: et +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `EstBERT_Morph_128` is an Estonian model originally trained by `tartuNLP`. 
+ +## Predicted Entities + +`AdpType=Prep`, `VerbForm=Part`, `Case=Ade`, `PronType=Rel`, `Polarity=Neg`, `Degree=Pos`, `VerbForm=Inf`, `PronType=Ind`, `PronType=Tot`, `Case=Par`, `Abbr=Yes`, `Case=Nom`, `Foreign=Yes`, `_`, `PronType=Dem`, `NumType=Ord`, `Hyph=Yes`, `Connegative=Yes`, `AdpType=Post`, `NumType=Card`, `Number=Sing`, `VerbForm=Conv` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_est_morph_128_et_5.2.0_3.0_1699302426581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_est_morph_128_et_5.2.0_3.0_1699302426581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_est_morph_128","et") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_est_morph_128","et") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_est_morph_128| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|et| +|Size:|465.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tartuNLP/EstBERT_Morph_128 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_ner_v2_et.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_ner_v2_et.md new file mode 100644 index 000000000000..cc563fe5247b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_est_ner_v2_et.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Estonian BertForTokenClassification Cased model (from tartuNLP) +author: John Snow Labs +name: bert_token_classifier_est_ner_v2 +date: 2023-11-06 +tags: [et, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: et +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `EstBERT_NER_v2` is an Estonian model originally trained by `tartuNLP`. 
+ +## Predicted Entities + +`TIME`, `ORG`, `MONEY`, `PER`, `GPE`, `DATE`, `PERCENT`, `TITLE`, `LOC`, `EVENT`, `PROD` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_est_ner_v2_et_5.2.0_3.0_1699304760907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_est_ner_v2_et_5.2.0_3.0_1699304760907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_est_ner_v2","et") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_est_ner_v2","et") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
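The `ner` column produced by the pipeline above holds one label per token. The following is a minimal sketch, not part of the original model card, of how such labels could be merged into entity spans in plain Python. It assumes the model emits IOB-style tags such as `B-PER` and `I-PER`; the card lists only the entity types, so the tagging scheme should be checked against real output. The helper name `group_entities` and the sample tokens are hypothetical.

```python
from typing import List, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level IOB tags into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens = []
    current_type = None

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" and label:
            # A B- tag always opens a new entity; flush any open one first.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label
        elif prefix == "I" and label and label == current_type:
            # An I- tag continues the open entity only if the type matches.
            current_tokens.append(token)
        else:
            # "O", a mismatched I- tag, or any other tag closes the open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical tokens and tags, standing in for one collected row.
print(group_entities(
    ["Kersti", "Kaljulaid", "visited", "Tartu"],
    ["B-PER", "I-PER", "O", "B-LOC"],
))
```

With Spark NLP, the two input lists would typically come from the `token.result` and `ner.result` arrays of a collected row; that wiring is left out here so the sketch stays independent of a running Spark session.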
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_est_ner_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|et| +|Size:|463.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/tartuNLP/EstBERT_NER_v2 +- https://metashare.ut.ee/repository/browse/reannotated-estonian-ner-corpus/bd43f1f614a511eca6e4fa163e9d45477d086613d2894fd5af79bf13e3f13594/ +- https://metashare.ut.ee/repository/browse/new-estonian-ner-corpus/98b6706c963c11eba6e4fa163e9d45470bcd0533b6994c93ab8b8c628516ffed/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian_hu.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian_hu.md new file mode 100644 index 000000000000..7703ce6118a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian BertForTokenClassification from NYTK +author: John Snow Labs +name: bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian +date: 2023-11-06 +tags: [bert, hu, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian` is a Hungarian model originally trained by NYTK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian_hu_5.2.0_3.0_1699308204116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian_hu_5.2.0_3.0_1699308204116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier expects +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian","hu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier expects +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian", "hu") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_named_entity_recognition_nerkor_hungarian_hungarian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hu| +|Size:|412.5 MB| + +## References + +https://huggingface.co/NYTK/named-entity-recognition-nerkor-hubert-hungarian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_navigation_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_navigation_chinese_zh.md new file mode 100644 index 000000000000..220505a26782 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_navigation_chinese_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForTokenClassification Cased model (from Kunologist) +author: John Snow Labs +name: bert_token_classifier_navigation_chinese +date: 2023-11-06 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `navigation-chinese` is a Chinese model originally trained by `Kunologist`. 
+ +## Predicted Entities + +`IQ`, `X`, `IK`, `IO`, `IB`, `IM`, `IA`, `ID`, `DO`, `IH`, `II`, `IC`, `IG`, `IJ`, `DN`, `IN`, `IP` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_navigation_chinese_zh_5.2.0_3.0_1699302683861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_navigation_chinese_zh_5.2.0_3.0_1699302683861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_navigation_chinese","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_navigation_chinese","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_navigation_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Kunologist/navigation-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_satellite_instrument_ner_pt.md b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_satellite_instrument_ner_pt.md new file mode 100644 index 000000000000..c3d434d14ebf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-bert_token_classifier_satellite_instrument_ner_pt.md @@ -0,0 +1,102 @@ +--- +layout: model +title: Portuguese BertForTokenClassification Cased model (from m-lin20) +author: John Snow Labs +name: bert_token_classifier_satellite_instrument_ner +date: 2023-11-06 +tags: [pt, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `satellite-instrument-bert-NER` is a Portuguese model originally trained by `m-lin20`. 
+ +## Predicted Entities + +`instrument`, `satellite` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_satellite_instrument_ner_pt_5.2.0_3.0_1699303598163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_satellite_instrument_ner_pt_5.2.0_3.0_1699303598163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_satellite_instrument_ner","pt") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_satellite_instrument_ner","pt") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_satellite_instrument_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/m-lin20/satellite-instrument-bert-NER +- https://github.com/THU-EarthInformationScienceLab/Satellite-Instrument-NER +- https://www.tandfonline.com/doi/full/10.1080/17538947.2022.2107098 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-jobbert_knowledge_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-06-jobbert_knowledge_extraction_en.md new file mode 100644 index 000000000000..63ea24878b9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-jobbert_knowledge_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobbert_knowledge_extraction BertForTokenClassification from jjzha +author: John Snow Labs +name: jobbert_knowledge_extraction +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_knowledge_extraction` is an English model originally trained by jjzha. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_knowledge_extraction_en_5.2.0_3.0_1699304016062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_knowledge_extraction_en_5.2.0_3.0_1699304016062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier expects +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + + +tokenClassifier = BertForTokenClassification.pretrained("jobbert_knowledge_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier expects +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("jobbert_knowledge_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_knowledge_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jjzha/jobbert_knowledge_extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-jobbert_skill_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-06-jobbert_skill_extraction_en.md new file mode 100644 index 000000000000..f5cbcfe3141c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-jobbert_skill_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobbert_skill_extraction BertForTokenClassification from jjzha +author: John Snow Labs +name: jobbert_skill_extraction +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_skill_extraction` is an English model originally trained by jjzha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_skill_extraction_en_5.2.0_3.0_1699304183375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_skill_extraction_en_5.2.0_3.0_1699304183375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
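+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier expects +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + + +tokenClassifier = BertForTokenClassification.pretrained("jobbert_skill_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier expects +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("jobbert_skill_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +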
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_skill_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jjzha/jobbert_skill_extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-negation_and_uncertainty_scope_detection_mbert_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-06-negation_and_uncertainty_scope_detection_mbert_fine_tuned_en.md new file mode 100644 index 000000000000..f977aec616f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-negation_and_uncertainty_scope_detection_mbert_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English negation_and_uncertainty_scope_detection_mbert_fine_tuned BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: negation_and_uncertainty_scope_detection_mbert_fine_tuned +date: 2023-11-06 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `negation_and_uncertainty_scope_detection_mbert_fine_tuned` is an English model originally trained by ajtamayoh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/negation_and_uncertainty_scope_detection_mbert_fine_tuned_en_5.2.0_3.0_1699309597062.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/negation_and_uncertainty_scope_detection_mbert_fine_tuned_en_5.2.0_3.0_1699309597062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The tokenizer produces the "token" column that the classifier expects +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + + +tokenClassifier = BertForTokenClassification.pretrained("negation_and_uncertainty_scope_detection_mbert_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The tokenizer produces the "token" column that the classifier expects +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("negation_and_uncertainty_scope_detection_mbert_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|negation_and_uncertainty_scope_detection_mbert_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/ajtamayoh/Negation_and_Uncertainty_Scope_Detection_mBERT_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-06-vietnamese_ner_v1_4_0a2_vi.md b/docs/_posts/ahmedlone127/2023-11-06-vietnamese_ner_v1_4_0a2_vi.md new file mode 100644 index 000000000000..0bcadcac4d37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-06-vietnamese_ner_v1_4_0a2_vi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Vietnamese vietnamese_ner_v1_4_0a2 BertForTokenClassification from undertheseanlp +author: John Snow Labs +name: vietnamese_ner_v1_4_0a2 +date: 2023-11-06 +tags: [bert, vi, open_source, token_classification, onnx] +task: Named Entity Recognition +language: vi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vietnamese_ner_v1_4_0a2` is a Vietnamese model originally trained by undertheseanlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vietnamese_ner_v1_4_0a2_vi_5.2.0_3.0_1699312310697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vietnamese_ner_v1_4_0a2_vi_5.2.0_3.0_1699312310697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("vietnamese_ner_v1_4_0a2","vi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("vietnamese_ner_v1_4_0a2", "vi") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vietnamese_ner_v1_4_0a2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|vi| +|Size:|428.8 MB| + +## References + +https://huggingface.co/undertheseanlp/vietnamese-ner-v1.4.0a2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ade_bio_clinicalbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-ade_bio_clinicalbert_ner_en.md new file mode 100644 index 000000000000..37686dd7ee64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ade_bio_clinicalbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ade_bio_clinicalbert_ner BertForTokenClassification from commanderstrife +author: John Snow Labs +name: ade_bio_clinicalbert_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ade_bio_clinicalbert_ner` is an English model originally trained by commanderstrife. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ade_bio_clinicalbert_ner_en_5.2.0_3.0_1699386038757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ade_bio_clinicalbert_ner_en_5.2.0_3.0_1699386038757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ade_bio_clinicalbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ade_bio_clinicalbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ade_bio_clinicalbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.3 MB| + +## References + +https://huggingface.co/commanderstrife/ADE-Bio_ClinicalBERT-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-arabert_arabic_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-arabert_arabic_ner_en.md new file mode 100644 index 000000000000..2e0904f5914e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-arabert_arabic_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English arabert_arabic_ner BertForTokenClassification from PRAli22 +author: John Snow Labs +name: arabert_arabic_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_arabic_ner` is an English model originally trained by PRAli22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_arabic_ner_en_5.2.0_3.0_1699396006785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_arabic_ner_en_5.2.0_3.0_1699396006785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("arabert_arabic_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("arabert_arabic_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_arabic_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.1 MB| + +## References + +https://huggingface.co/PRAli22/arabert_arabic_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-assignment2_meher_test3_en.md b/docs/_posts/ahmedlone127/2023-11-07-assignment2_meher_test3_en.md new file mode 100644 index 000000000000..0e70cd8060b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-assignment2_meher_test3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English assignment2_meher_test3 BertForTokenClassification from mpalaval +author: John Snow Labs +name: assignment2_meher_test3 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `assignment2_meher_test3` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/assignment2_meher_test3_en_5.2.0_3.0_1699383048254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/assignment2_meher_test3_en_5.2.0_3.0_1699383048254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("assignment2_meher_test3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("assignment2_meher_test3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|assignment2_meher_test3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/assignment2_meher_test3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-autotrain_medicaltokenclassification_1279048948_en.md b/docs/_posts/ahmedlone127/2023-11-07-autotrain_medicaltokenclassification_1279048948_en.md new file mode 100644 index 000000000000..49cb5cfdcb28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-autotrain_medicaltokenclassification_1279048948_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_medicaltokenclassification_1279048948 BertForTokenClassification from shreyas-singh +author: John Snow Labs +name: autotrain_medicaltokenclassification_1279048948 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_medicaltokenclassification_1279048948` is an English model originally trained by shreyas-singh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_medicaltokenclassification_1279048948_en_5.2.0_3.0_1699393608127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_medicaltokenclassification_1279048948_en_5.2.0_3.0_1699393608127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("autotrain_medicaltokenclassification_1279048948","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("autotrain_medicaltokenclassification_1279048948", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_medicaltokenclassification_1279048948| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/shreyas-singh/autotrain-MedicalTokenClassification-1279048948 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bde_abbrev_batteryonlybert_cased_base_en.md b/docs/_posts/ahmedlone127/2023-11-07-bde_abbrev_batteryonlybert_cased_base_en.md new file mode 100644 index 000000000000..ff454574f92e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bde_abbrev_batteryonlybert_cased_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bde_abbrev_batteryonlybert_cased_base BertForTokenClassification from batterydata +author: John Snow Labs +name: bde_abbrev_batteryonlybert_cased_base +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bde_abbrev_batteryonlybert_cased_base` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bde_abbrev_batteryonlybert_cased_base_en_5.2.0_3.0_1699388506500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bde_abbrev_batteryonlybert_cased_base_en_5.2.0_3.0_1699388506500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bde_abbrev_batteryonlybert_cased_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bde_abbrev_batteryonlybert_cased_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bde_abbrev_batteryonlybert_cased_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/batterydata/bde-abbrev-batteryonlybert-cased-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bde_sayula_popoluca_bert_cased_base_en.md b/docs/_posts/ahmedlone127/2023-11-07-bde_sayula_popoluca_bert_cased_base_en.md new file mode 100644 index 000000000000..d77d445f080c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bde_sayula_popoluca_bert_cased_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bde_sayula_popoluca_bert_cased_base BertForTokenClassification from batterydata +author: John Snow Labs +name: bde_sayula_popoluca_bert_cased_base +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bde_sayula_popoluca_bert_cased_base` is an English model originally trained by batterydata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bde_sayula_popoluca_bert_cased_base_en_5.2.0_3.0_1699394949156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bde_sayula_popoluca_bert_cased_base_en_5.2.0_3.0_1699394949156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bde_sayula_popoluca_bert_cased_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bde_sayula_popoluca_bert_cased_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bde_sayula_popoluca_bert_cased_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/batterydata/bde-pos-bert-cased-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bengali_language_ner_bn.md b/docs/_posts/ahmedlone127/2023-11-07-bengali_language_ner_bn.md new file mode 100644 index 000000000000..ecf724d7f061 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bengali_language_ner_bn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Bengali bengali_language_ner BertForTokenClassification from Suchandra +author: John Snow Labs +name: bengali_language_ner +date: 2023-11-07 +tags: [bert, bn, open_source, token_classification, onnx] +task: Named Entity Recognition +language: bn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bengali_language_ner` is a Bengali model originally trained by Suchandra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bengali_language_ner_bn_5.2.0_3.0_1699385474779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bengali_language_ner_bn_5.2.0_3.0_1699385474779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bengali_language_ner","bn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bengali_language_ner", "bn") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bengali_language_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|bn| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Suchandra/bengali_language_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_anatomical_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_anatomical_en.md new file mode 100644 index 000000000000..9d3ea9a30789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_anatomical_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bent_pubmedbert_ner_anatomical BertForTokenClassification from pruas +author: John Snow Labs +name: bent_pubmedbert_ner_anatomical +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_anatomical` is an English model originally trained by pruas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_anatomical_en_5.2.0_3.0_1699385335505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_anatomical_en_5.2.0_3.0_1699385335505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_anatomical","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bent_pubmedbert_ner_anatomical", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bent_pubmedbert_ner_anatomical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/pruas/BENT-PubMedBERT-NER-Anatomical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_bioprocess_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_bioprocess_en.md new file mode 100644 index 000000000000..998aa5840b03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_bioprocess_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bent_pubmedbert_ner_bioprocess BertForTokenClassification from pruas +author: John Snow Labs +name: bent_pubmedbert_ner_bioprocess +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_bioprocess` is an English model originally trained by pruas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_bioprocess_en_5.2.0_3.0_1699315652198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_bioprocess_en_5.2.0_3.0_1699315652198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_bioprocess","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bent_pubmedbert_ner_bioprocess", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bent_pubmedbert_ner_bioprocess| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/pruas/BENT-PubMedBERT-NER-Bioprocess \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_component_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_component_en.md new file mode 100644 index 000000000000..8c819dbc1f6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_component_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bent_pubmedbert_ner_cell_component BertForTokenClassification from pruas +author: John Snow Labs +name: bent_pubmedbert_ner_cell_component +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_cell_component` is an English model originally trained by pruas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_cell_component_en_5.2.0_3.0_1699385532471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_cell_component_en_5.2.0_3.0_1699385532471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_cell_component","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bent_pubmedbert_ner_cell_component", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bent_pubmedbert_ner_cell_component| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/pruas/BENT-PubMedBERT-NER-Cell-Component \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_line_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_line_en.md new file mode 100644 index 000000000000..2186b6892e41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_cell_line_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bent_pubmedbert_ner_cell_line BertForTokenClassification from pruas +author: John Snow Labs +name: bent_pubmedbert_ner_cell_line +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_cell_line` is an English model originally trained by pruas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_cell_line_en_5.2.0_3.0_1699384184050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_cell_line_en_5.2.0_3.0_1699384184050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_cell_line","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bent_pubmedbert_ner_cell_line", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_cell_line|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMedBERT-NER-Cell-Line
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_organism_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_organism_en.md
new file mode 100644
index 000000000000..fbc475de54c7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_organism_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bent_pubmedbert_ner_organism BertForTokenClassification from pruas
+author: John Snow Labs
+name: bent_pubmedbert_ner_organism
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_organism` is an English model originally trained by pruas.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_organism_en_5.2.0_3.0_1699383678035.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_organism_en_5.2.0_3.0_1699383678035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
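+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_organism", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bent_pubmedbert_ner_organism", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+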
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_organism|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMedBERT-NER-Organism
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_variant_en.md b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_variant_en.md
new file mode 100644
index 000000000000..60c20acd1135
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bent_pubmedbert_ner_variant_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bent_pubmedbert_ner_variant BertForTokenClassification from pruas
+author: John Snow Labs
+name: bent_pubmedbert_ner_variant
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bent_pubmedbert_ner_variant` is an English model originally trained by pruas.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_variant_en_5.2.0_3.0_1699316972395.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bent_pubmedbert_ner_variant_en_5.2.0_3.0_1699316972395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
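+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bent_pubmedbert_ner_variant", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bent_pubmedbert_ner_variant", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+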
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bent_pubmedbert_ner_variant|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.1 MB|
+
+## References
+
+https://huggingface.co/pruas/BENT-PubMEdBERT-NER-Variant
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert4ner_base_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert4ner_base_chinese_zh.md
new file mode 100644
index 000000000000..dd4cf208c321
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert4ner_base_chinese_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert4ner_base_chinese BertForTokenClassification from shibing624
+author: John Snow Labs
+name: bert4ner_base_chinese
+date: 2023-11-07
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert4ner_base_chinese` is a Chinese model originally trained by shibing624.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert4ner_base_chinese_zh_5.2.0_3.0_1699386449688.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert4ner_base_chinese_zh_5.2.0_3.0_1699386449688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
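+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert4ner_base_chinese", "zh") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert4ner_base_chinese", "zh")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+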
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert4ner_base_chinese|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/shibing624/bert4ner-base-chinese
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_finetuned_conll03_english_en.md
new file mode 100644
index 000000000000..37eb94f1668e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_finetuned_conll03_english_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_finetuned_conll03_english BertForTokenClassification from dbmdz
+author: John Snow Labs
+name: bert_base_cased_finetuned_conll03_english
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_conll03_english` is an English model originally trained by dbmdz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_conll03_english_en_5.2.0_3.0_1699325678056.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_conll03_english_en_5.2.0_3.0_1699325678056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
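+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_finetuned_conll03_english", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_cased_finetuned_conll03_english", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+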
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_finetuned_conll03_english|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/dbmdz/bert-base-cased-finetuned-conll03-english
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_literary_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_literary_ner_en.md
new file mode 100644
index 000000000000..cfd427dae0fe
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_cased_literary_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_cased_literary_ner BertForTokenClassification from compnet-renard
+author: John Snow Labs
+name: bert_base_cased_literary_ner
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_literary_ner` is an English model originally trained by compnet-renard.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_literary_ner_en_5.2.0_3.0_1699387100983.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_literary_ner_en_5.2.0_3.0_1699387100983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
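+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_literary_ner", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_cased_literary_ner", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+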
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_cased_literary_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/compnet-renard/bert-base-cased-literary-NER
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_danielwei0214_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_danielwei0214_zh.md
new file mode 100644
index 000000000000..19019210b5ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_danielwei0214_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert_base_chinese_finetuned_ner_danielwei0214 BertForTokenClassification from Danielwei0214
+author: John Snow Labs
+name: bert_base_chinese_finetuned_ner_danielwei0214
+date: 2023-11-07
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_ner_danielwei0214` is a Chinese model originally trained by Danielwei0214.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_danielwei0214_zh_5.2.0_3.0_1699389183060.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_danielwei0214_zh_5.2.0_3.0_1699389183060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
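+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_finetuned_ner_danielwei0214", "zh") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_chinese_finetuned_ner_danielwei0214", "zh")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+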
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_chinese_finetuned_ner_danielwei0214|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/Danielwei0214/bert-base-chinese-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_gyr66_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_gyr66_zh.md
new file mode 100644
index 000000000000..b3ce4d4f3c12
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_gyr66_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert_base_chinese_finetuned_ner_gyr66 BertForTokenClassification from gyr66
+author: John Snow Labs
+name: bert_base_chinese_finetuned_ner_gyr66
+date: 2023-11-07
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_ner_gyr66` is a Chinese model originally trained by gyr66.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_gyr66_zh_5.2.0_3.0_1699386946416.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_gyr66_zh_5.2.0_3.0_1699386946416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
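+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_finetuned_ner_gyr66", "zh") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_chinese_finetuned_ner_gyr66", "zh")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+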
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_chinese_finetuned_ner_gyr66|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/gyr66/bert-base-chinese-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_leonadase_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_leonadase_en.md
new file mode 100644
index 000000000000..c80998ad774e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_finetuned_ner_leonadase_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_chinese_finetuned_ner_leonadase BertForTokenClassification from leonadase
+author: John Snow Labs
+name: bert_base_chinese_finetuned_ner_leonadase
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_ner_leonadase` is an English model originally trained by leonadase.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_leonadase_en_5.2.0_3.0_1699389681080.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_leonadase_en_5.2.0_3.0_1699389681080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
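+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_finetuned_ner_leonadase", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_chinese_finetuned_ner_leonadase", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+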
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_chinese_finetuned_ner_leonadase|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/leonadase/bert-base-chinese-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_medical_ner_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_medical_ner_zh.md
new file mode 100644
index 000000000000..4acf5c60b0aa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_medical_ner_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert_base_chinese_medical_ner BertForTokenClassification from iioSnail
+author: John Snow Labs
+name: bert_base_chinese_medical_ner
+date: 2023-11-07
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_medical_ner` is a Chinese model originally trained by iioSnail.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_medical_ner_zh_5.2.0_3.0_1699386242094.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_medical_ner_zh_5.2.0_3.0_1699386242094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
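+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_medical_ner", "zh") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_chinese_medical_ner", "zh")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+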
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_chinese_medical_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/iioSnail/bert-base-chinese-medical-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_stock_ner_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_stock_ner_zh.md
new file mode 100644
index 000000000000..6a0cee910b09
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_chinese_stock_ner_zh.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Chinese bert_base_chinese_stock_ner BertForTokenClassification from JasonYan
+author: John Snow Labs
+name: bert_base_chinese_stock_ner
+date: 2023-11-07
+tags: [bert, zh, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_stock_ner` is a Chinese model originally trained by JasonYan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_stock_ner_zh_5.2.0_3.0_1699383470842.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_stock_ner_zh_5.2.0_3.0_1699383470842.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
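+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_stock_ner", "zh") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_chinese_stock_ner", "zh")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+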
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_base_chinese_stock_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|zh|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/JasonYan/bert-base-chinese-stock-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_finetuned_sayula_popoluca_ud_english_ewt_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_finetuned_sayula_popoluca_ud_english_ewt_en.md
new file mode 100644
index 000000000000..564e9c7c3cb4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_finetuned_sayula_popoluca_ud_english_ewt_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_base_finetuned_sayula_popoluca_ud_english_ewt BertForTokenClassification from TokenfreeEMNLPSubmission
+author: John Snow Labs
+name: bert_base_finetuned_sayula_popoluca_ud_english_ewt
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_sayula_popoluca_ud_english_ewt` is an English model originally trained by TokenfreeEMNLPSubmission.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sayula_popoluca_ud_english_ewt_en_5.2.0_3.0_1699383854788.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_sayula_popoluca_ud_english_ewt_en_5.2.0_3.0_1699383854788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
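+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # produces the "token" column the classifier reads
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_base_finetuned_sayula_popoluca_ud_english_ewt", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes DocumentAssembler, Tokenizer, BertForTokenClassification and Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // produces the "token" column the classifier reads
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_base_finetuned_sayula_popoluca_ud_english_ewt", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+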
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_sayula_popoluca_ud_english_ewt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/TokenfreeEMNLPSubmission/bert-base-finetuned-pos-ud-english-ewt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_finnish_uncased_ner_fi.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_finnish_uncased_ner_fi.md new file mode 100644 index 000000000000..97498a8b6047 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_finnish_uncased_ner_fi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Finnish bert_base_finnish_uncased_ner BertForTokenClassification from iguanodon-ai +author: John Snow Labs +name: bert_base_finnish_uncased_ner +date: 2023-11-07 +tags: [bert, fi, open_source, token_classification, onnx] +task: Named Entity Recognition +language: fi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_finnish_uncased_ner` is a Finnish model originally trained by iguanodon-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finnish_uncased_ner_fi_5.2.0_3.0_1699331821322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finnish_uncased_ner_fi_5.2.0_3.0_1699331821322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_finnish_uncased_ner","fi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_finnish_uncased_ner", "fi") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finnish_uncased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|fi| +|Size:|464.7 MB| + +## References + +https://huggingface.co/iguanodon-ai/bert-base-finnish-uncased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual_xx.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual_xx.md new file mode 100644 index 000000000000..f375089d38cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual BertForTokenClassification from DunnBC22 +author: John Snow Labs +name: bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual` is a Multilingual model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual_xx_5.2.0_3.0_1699320480543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual_xx_5.2.0_3.0_1699320480543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_fine_tuned_ner_wikineural_multilingual| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/DunnBC22/bert-base-multilingual-cased-fine_tuned-ner-WikiNeural_Multilingual \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_conll03_spanish_xx.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_conll03_spanish_xx.md new file mode 100644 index 000000000000..a8a8f26dbceb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_conll03_spanish_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_conll03_spanish BertForTokenClassification from dbmdz +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_conll03_spanish +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_conll03_spanish` is a Multilingual model originally trained by dbmdz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_conll03_spanish_xx_5.2.0_3.0_1699396605070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_conll03_spanish_xx_5.2.0_3.0_1699396605070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_conll03_spanish","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_finetuned_conll03_spanish", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_conll03_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/dbmdz/bert-base-multilingual-cased-finetuned-conll03-spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_sayula_popoluca_xx.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_sayula_popoluca_xx.md new file mode 100644 index 000000000000..955fee9e14c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_finetuned_sayula_popoluca_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_sayula_popoluca BertForTokenClassification from MayaGalvez +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_sayula_popoluca +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_finetuned_sayula_popoluca` is a Multilingual model originally trained by MayaGalvez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_sayula_popoluca_xx_5.2.0_3.0_1699324295140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_sayula_popoluca_xx_5.2.0_3.0_1699324295140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_sayula_popoluca","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_finetuned_sayula_popoluca", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/MayaGalvez/bert-base-multilingual-cased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_sayula_popoluca_english_xx.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_sayula_popoluca_english_xx.md new file mode 100644 index 000000000000..b32efb90bdc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_multilingual_cased_sayula_popoluca_english_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_sayula_popoluca_english BertForTokenClassification from gbwsolutions +author: John Snow Labs +name: bert_base_multilingual_cased_sayula_popoluca_english +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_multilingual_cased_sayula_popoluca_english` is a Multilingual model originally trained by gbwsolutions. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sayula_popoluca_english_xx_5.2.0_3.0_1699389224592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_sayula_popoluca_english_xx_5.2.0_3.0_1699389224592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_sayula_popoluca_english","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_sayula_popoluca_english", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_sayula_popoluca_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.2 MB| + +## References + +https://huggingface.co/gbwsolutions/bert-base-multilingual-cased-pos-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_named_entity_extractor_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_named_entity_extractor_en.md new file mode 100644 index 000000000000..0d7f99d291bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_named_entity_extractor_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_named_entity_extractor BertForTokenClassification from Azma-AI +author: John Snow Labs +name: bert_base_named_entity_extractor +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_named_entity_extractor` is an English model originally trained by Azma-AI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_named_entity_extractor_en_5.2.0_3.0_1699384799757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_named_entity_extractor_en_5.2.0_3.0_1699384799757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_named_entity_extractor","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_named_entity_extractor", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_named_entity_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Azma-AI/bert-base-named-entity-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_058_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_058_en.md new file mode 100644 index 000000000000..6ae751705349 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_058_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_ner_058 BertForTokenClassification from NguyenVanHieu1605 +author: John Snow Labs +name: bert_base_ner_058 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_ner_058` is an English model originally trained by NguyenVanHieu1605. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_ner_058_en_5.2.0_3.0_1699385797170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_ner_058_en_5.2.0_3.0_1699385797170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_ner_058","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_ner_058", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_ner_058| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/NguyenVanHieu1605/bert-base-ner-058 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_reptile_5_datasets_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_reptile_5_datasets_en.md new file mode 100644 index 000000000000..08ce5a910f52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_ner_reptile_5_datasets_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_ner_reptile_5_datasets BertForTokenClassification from ai-forever +author: John Snow Labs +name: bert_base_ner_reptile_5_datasets +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_ner_reptile_5_datasets` is an English model originally trained by ai-forever. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_ner_reptile_5_datasets_en_5.2.0_3.0_1699401088372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_ner_reptile_5_datasets_en_5.2.0_3.0_1699401088372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_ner_reptile_5_datasets","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_ner_reptile_5_datasets", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_ner_reptile_5_datasets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/ai-forever/bert-base-NER-reptile-5-datasets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_portuguese_ner_enamex_pt.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_portuguese_ner_enamex_pt.md new file mode 100644 index 000000000000..9569d5d45303 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_portuguese_ner_enamex_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_base_portuguese_ner_enamex BertForTokenClassification from marcosgg +author: John Snow Labs +name: bert_base_portuguese_ner_enamex +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_portuguese_ner_enamex` is a Portuguese model originally trained by marcosgg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_ner_enamex_pt_5.2.0_3.0_1699388924890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_ner_enamex_pt_5.2.0_3.0_1699388924890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_portuguese_ner_enamex","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_portuguese_ner_enamex", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_ner_enamex| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|405.9 MB| + +## References + +https://huggingface.co/marcosgg/bert-base-pt-ner-enamex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_romanian_ner_ro.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_romanian_ner_ro.md new file mode 100644 index 000000000000..0fe870d7f606 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_romanian_ner_ro.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Moldavian, Moldovan, Romanian bert_base_romanian_ner BertForTokenClassification from dumitrescustefan +author: John Snow Labs +name: bert_base_romanian_ner +date: 2023-11-07 +tags: [bert, ro, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ro +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_base_romanian_ner` is a Moldavian, Moldovan, Romanian model originally trained by dumitrescustefan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_romanian_ner_ro_5.2.0_3.0_1699386817889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_romanian_ner_ro_5.2.0_3.0_1699386817889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_romanian_ner","ro") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_romanian_ner", "ro") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_romanian_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ro| +|Size:|464.1 MB| + +## References + +https://huggingface.co/dumitrescustefan/bert-base-romanian-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_spanish_wwm_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_spanish_wwm_cased_finetuned_ner_en.md new file mode 100644 index 000000000000..09c073a659aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_spanish_wwm_cased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_ner BertForTokenClassification from dccuchile +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_ner` is an English model originally trained by dccuchile. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_ner_en_5.2.0_3.0_1699384970511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_ner_en_5.2.0_3.0_1699384970511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_spanish_wwm_cased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_spanish_wwm_cased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_city_country_ner_ml6team_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_city_country_ner_ml6team_en.md new file mode 100644 index 000000000000..3340f3750199 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_city_country_ner_ml6team_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_city_country_ner_ml6team BertForTokenClassification from ml6team +author: John Snow Labs +name: bert_base_uncased_city_country_ner_ml6team +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_city_country_ner_ml6team` is an English model originally trained by ml6team. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_city_country_ner_ml6team_en_5.2.0_3.0_1699383583804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_city_country_ner_ml6team_en_5.2.0_3.0_1699383583804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_city_country_ner_ml6team","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_city_country_ner_ml6team", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_city_country_ner_ml6team| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ml6team/bert-base-uncased-city-country-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_conll2003_hfeng_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_conll2003_hfeng_en.md new file mode 100644 index 000000000000..0e625a4165d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_conll2003_hfeng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_conll2003_hfeng BertForTokenClassification from hfeng +author: John Snow Labs +name: bert_base_uncased_conll2003_hfeng +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_conll2003_hfeng` is an English model originally trained by hfeng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_conll2003_hfeng_en_5.2.0_3.0_1699389315261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_conll2003_hfeng_en_5.2.0_3.0_1699389315261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_conll2003_hfeng","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_conll2003_hfeng", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_conll2003_hfeng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/hfeng/bert_base_uncased_conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_finetuned_scientific_eval_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_finetuned_scientific_eval_en.md new file mode 100644 index 000000000000..26ffc694f140 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_base_uncased_finetuned_scientific_eval_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_scientific_eval BertForTokenClassification from reyhanemyr +author: John Snow Labs +name: bert_base_uncased_finetuned_scientific_eval +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_scientific_eval` is an English model originally trained by reyhanemyr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_scientific_eval_en_5.2.0_3.0_1699384660893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_scientific_eval_en_5.2.0_3.0_1699384660893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_scientific_eval","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_scientific_eval", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_scientific_eval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/reyhanemyr/bert-base-uncased-finetuned-scientific-eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_animacy_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_animacy_en.md new file mode 100644 index 000000000000..d69b40b07758 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_animacy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_animacy BertForTokenClassification from andrewt-cam +author: John Snow Labs +name: bert_finetuned_animacy +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_animacy` is an English model originally trained by andrewt-cam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_animacy_en_5.2.0_3.0_1699390063073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_animacy_en_5.2.0_3.0_1699390063073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_animacy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_animacy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_animacy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/andrewt-cam/bert-finetuned-animacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_history_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_history_ner_en.md new file mode 100644 index 000000000000..b6c96110e4e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_history_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_history_ner BertForTokenClassification from QuanAI +author: John Snow Labs +name: bert_finetuned_history_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_history_ner` is an English model originally trained by QuanAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_history_ner_en_5.2.0_3.0_1699387810524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_history_ner_en_5.2.0_3.0_1699387810524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_history_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_history_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_history_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/QuanAI/bert-finetuned-history-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_n2c2_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_n2c2_ner_en.md new file mode 100644 index 000000000000..b43734091361 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_n2c2_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_n2c2_ner BertForTokenClassification from georgeleung30 +author: John Snow Labs +name: bert_finetuned_n2c2_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_n2c2_ner` is an English model originally trained by georgeleung30. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_n2c2_ner_en_5.2.0_3.0_1699389048043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_n2c2_ner_en_5.2.0_3.0_1699389048043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_n2c2_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_n2c2_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_n2c2_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/georgeleung30/bert-finetuned-n2c2-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_accelerate_sanjay7178_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_accelerate_sanjay7178_en.md new file mode 100644 index 000000000000..ef7d8472502e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_accelerate_sanjay7178_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_accelerate_sanjay7178 BertForTokenClassification from sanjay7178 +author: John Snow Labs +name: bert_finetuned_ner_accelerate_sanjay7178 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_accelerate_sanjay7178` is an English model originally trained by sanjay7178. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_sanjay7178_en_5.2.0_3.0_1699400446204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_sanjay7178_en_5.2.0_3.0_1699400446204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_accelerate_sanjay7178","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_accelerate_sanjay7178", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_accelerate_sanjay7178| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/sanjay7178/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_applemoon_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_applemoon_en.md new file mode 100644 index 000000000000..08d3c0f7adcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_applemoon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_applemoon BertForTokenClassification from Applemoon +author: John Snow Labs +name: bert_finetuned_ner_applemoon +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_applemoon` is an English model originally trained by Applemoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_applemoon_en_5.2.0_3.0_1699389684774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_applemoon_en_5.2.0_3.0_1699389684774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_applemoon","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_applemoon", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_applemoon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Applemoon/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_default_parameters_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_default_parameters_en.md new file mode 100644 index 000000000000..aa8d99a23ceb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_default_parameters_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_default_parameters BertForTokenClassification from Mabel465 +author: John Snow Labs +name: bert_finetuned_ner_default_parameters +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_default_parameters` is an English model originally trained by Mabel465. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_default_parameters_en_5.2.0_3.0_1699398163108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_default_parameters_en_5.2.0_3.0_1699398163108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_default_parameters","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_default_parameters", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_default_parameters| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Mabel465/bert-finetuned-ner.default_parameters \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_konic_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_konic_en.md new file mode 100644 index 000000000000..abc7493cb3bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_konic_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_konic BertForTokenClassification from Konic +author: John Snow Labs +name: bert_finetuned_ner_konic +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_konic` is an English model originally trained by Konic. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_konic_en_5.2.0_3.0_1699384414045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_konic_en_5.2.0_3.0_1699384414045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_konic","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_konic", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_konic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Konic/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lamthanhtin2811_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lamthanhtin2811_en.md new file mode 100644 index 000000000000..65e9b1c62b2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lamthanhtin2811_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_lamthanhtin2811 BertForTokenClassification from lamthanhtin2811 +author: John Snow Labs +name: bert_finetuned_ner_lamthanhtin2811 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_lamthanhtin2811` is an English model originally trained by lamthanhtin2811. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lamthanhtin2811_en_5.2.0_3.0_1699387480751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lamthanhtin2811_en_5.2.0_3.0_1699387480751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
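+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_lamthanhtin2811","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_lamthanhtin2811", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +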
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_lamthanhtin2811| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/lamthanhtin2811/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lightsaber689_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lightsaber689_en.md new file mode 100644 index 000000000000..3b5eb09aea04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_lightsaber689_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_lightsaber689 BertForTokenClassification from lightsaber689 +author: John Snow Labs +name: bert_finetuned_ner_lightsaber689 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_lightsaber689` is an English model originally trained by lightsaber689. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lightsaber689_en_5.2.0_3.0_1699385847274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lightsaber689_en_5.2.0_3.0_1699385847274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
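+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_lightsaber689","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_lightsaber689", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +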
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_lightsaber689| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/lightsaber689/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_minea_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_minea_en.md new file mode 100644 index 000000000000..b543b0b783d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_minea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_minea BertForTokenClassification from minea +author: John Snow Labs +name: bert_finetuned_ner_minea +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_minea` is an English model originally trained by minea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_minea_en_5.2.0_3.0_1699386438023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_minea_en_5.2.0_3.0_1699386438023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
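+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_minea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_minea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +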
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_minea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/minea/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_pii_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_pii_en.md new file mode 100644 index 000000000000..1f4a204dfcd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_pii_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_pii BertForTokenClassification from ArunaSaraswathy +author: John Snow Labs +name: bert_finetuned_ner_pii +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_pii` is an English model originally trained by ArunaSaraswathy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_pii_en_5.2.0_3.0_1699385100996.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_pii_en_5.2.0_3.0_1699385100996.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
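+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_pii","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_pii", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +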
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_pii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|404.0 MB| + +## References + +https://huggingface.co/ArunaSaraswathy/bert-finetuned-ner-pii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_rahulmukherji_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_rahulmukherji_en.md new file mode 100644 index 000000000000..2c4c94fa38f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_rahulmukherji_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_rahulmukherji BertForTokenClassification from rahulmukherji +author: John Snow Labs +name: bert_finetuned_ner_rahulmukherji +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_rahulmukherji` is an English model originally trained by rahulmukherji. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_rahulmukherji_en_5.2.0_3.0_1699399643914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_rahulmukherji_en_5.2.0_3.0_1699399643914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
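+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_rahulmukherji","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_rahulmukherji", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +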
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_rahulmukherji| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/rahulmukherji/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_vbhasin_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_vbhasin_en.md new file mode 100644 index 000000000000..c9462e658f07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_ner_vbhasin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_vbhasin BertForTokenClassification from vbhasin +author: John Snow Labs +name: bert_finetuned_ner_vbhasin +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_vbhasin` is an English model originally trained by vbhasin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_vbhasin_en_5.2.0_3.0_1699389493005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_vbhasin_en_5.2.0_3.0_1699389493005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
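+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_vbhasin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_vbhasin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +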
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_vbhasin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/vbhasin/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_tech_product_name_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_tech_product_name_ner_en.md new file mode 100644 index 000000000000..bc4df13b32f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_tech_product_name_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_tech_product_name_ner BertForTokenClassification from ashleyliu31 +author: John Snow Labs +name: bert_finetuned_tech_product_name_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_tech_product_name_ner` is an English model originally trained by ashleyliu31. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_tech_product_name_ner_en_5.2.0_3.0_1699383979008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_tech_product_name_ner_en_5.2.0_3.0_1699383979008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
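+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_tech_product_name_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_tech_product_name_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +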
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_tech_product_name_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ashleyliu31/bert-finetuned-tech-product-name-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_unpunctual_text_segmentation_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_unpunctual_text_segmentation_v2_en.md new file mode 100644 index 000000000000..52d0e8407d72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_finetuned_unpunctual_text_segmentation_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_unpunctual_text_segmentation_v2 BertForTokenClassification from TankuVie +author: John Snow Labs +name: bert_finetuned_unpunctual_text_segmentation_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_unpunctual_text_segmentation_v2` is an English model originally trained by TankuVie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_unpunctual_text_segmentation_v2_en_5.2.0_3.0_1699382983044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_unpunctual_text_segmentation_v2_en_5.2.0_3.0_1699382983044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
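+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_unpunctual_text_segmentation_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_unpunctual_text_segmentation_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +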
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_unpunctual_text_segmentation_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/TankuVie/bert-finetuned-unpunctual-text-segmentation-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_german_ler_de.md b/docs/_posts/ahmedlone127/2023-11-07-bert_german_ler_de.md new file mode 100644 index 000000000000..b4d5f58ed878 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_german_ler_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_german_ler BertForTokenClassification from elenanereiss +author: John Snow Labs +name: bert_german_ler +date: 2023-11-07 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_german_ler` is a German model originally trained by elenanereiss. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_german_ler_de_5.2.0_3.0_1699398163265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_german_ler_de_5.2.0_3.0_1699398163265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
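+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_german_ler","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_german_ler", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +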
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_german_ler| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|407.0 MB| + +## References + +https://huggingface.co/elenanereiss/bert-german-ler \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_large_cased_ft_ner_maplestory_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_large_cased_ft_ner_maplestory_en.md new file mode 100644 index 000000000000..cce057309575 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_large_cased_ft_ner_maplestory_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_cased_ft_ner_maplestory BertForTokenClassification from nxaliao +author: John Snow Labs +name: bert_large_cased_ft_ner_maplestory +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_cased_ft_ner_maplestory` is an English model originally trained by nxaliao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_cased_ft_ner_maplestory_en_5.2.0_3.0_1699388820984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_cased_ft_ner_maplestory_en_5.2.0_3.0_1699388820984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
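+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_large_cased_ft_ner_maplestory","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_large_cased_ft_ner_maplestory", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +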
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_cased_ft_ner_maplestory| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/nxaliao/bert-large-cased-ft-ner-maplestory \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_large_portuguese_ner_enamex_pt.md b/docs/_posts/ahmedlone127/2023-11-07-bert_large_portuguese_ner_enamex_pt.md new file mode 100644 index 000000000000..9be2e45f6888 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_large_portuguese_ner_enamex_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bert_large_portuguese_ner_enamex BertForTokenClassification from marcosgg +author: John Snow Labs +name: bert_large_portuguese_ner_enamex +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_portuguese_ner_enamex` is a Portuguese model originally trained by marcosgg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_ner_enamex_pt_5.2.0_3.0_1699388439394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_portuguese_ner_enamex_pt_5.2.0_3.0_1699388439394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
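+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_large_portuguese_ner_enamex","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_large_portuguese_ner_enamex", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +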
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_portuguese_ner_enamex| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|1.2 GB| + +## References + +https://huggingface.co/marcosgg/bert-large-pt-ner-enamex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_medical_ner_proj_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_medical_ner_proj_en.md new file mode 100644 index 000000000000..d87525a3f24d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_medical_ner_proj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_medical_ner_proj BertForTokenClassification from medical-ner-proj +author: John Snow Labs +name: bert_medical_ner_proj +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_medical_ner_proj` is an English model originally trained by medical-ner-proj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_medical_ner_proj_en_5.2.0_3.0_1699383204967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_medical_ner_proj_en_5.2.0_3.0_1699383204967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
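+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage produces it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_medical_ner_proj","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_medical_ner_proj", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +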
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_medical_ner_proj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/medical-ner-proj/bert-medical-ner-proj \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_ner_4_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_ner_4_en.md new file mode 100644 index 000000000000..27e12b48ecdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_ner_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_4 BertForTokenClassification from mpalaval +author: John Snow Labs +name: bert_ner_4 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_4` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_4_en_5.2.0_3.0_1699397402535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_4_en_5.2.0_3.0_1699397402535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/bert-ner-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_portuguese_ner_archive_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_portuguese_ner_archive_en.md new file mode 100644 index 000000000000..c261ccd5da8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_portuguese_ner_archive_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_portuguese_ner_archive BertForTokenClassification from lfcc +author: John Snow Labs +name: bert_portuguese_ner_archive +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_portuguese_ner_archive` is an English model originally trained by lfcc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_portuguese_ner_archive_en_5.2.0_3.0_1699383518668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_portuguese_ner_archive_en_5.2.0_3.0_1699383518668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_portuguese_ner_archive","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_portuguese_ner_archive", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_portuguese_ner_archive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/lfcc/bert-portuguese-ner-archive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_restore_punctuation_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-07-bert_restore_punctuation_turkish_tr.md new file mode 100644 index 000000000000..1b0916bec289 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_restore_punctuation_turkish_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish bert_restore_punctuation_turkish BertForTokenClassification from uygarkurt +author: John Snow Labs +name: bert_restore_punctuation_turkish +date: 2023-11-07 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_restore_punctuation_turkish` is a Turkish model originally trained by uygarkurt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_turkish_tr_5.2.0_3.0_1699385993721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_turkish_tr_5.2.0_3.0_1699385993721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_restore_punctuation_turkish","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_restore_punctuation_turkish", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_restore_punctuation_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/uygarkurt/bert-restore-punctuation-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_tagalog_base_uncased_sayula_popoluca_tagger_tl.md b/docs/_posts/ahmedlone127/2023-11-07-bert_tagalog_base_uncased_sayula_popoluca_tagger_tl.md new file mode 100644 index 000000000000..6fb4cf87d534 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_tagalog_base_uncased_sayula_popoluca_tagger_tl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Tagalog bert_tagalog_base_uncased_sayula_popoluca_tagger BertForTokenClassification from syke9p3 +author: John Snow Labs +name: bert_tagalog_base_uncased_sayula_popoluca_tagger +date: 2023-11-07 +tags: [bert, tl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tagalog_base_uncased_sayula_popoluca_tagger` is a Tagalog model originally trained by syke9p3. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tagalog_base_uncased_sayula_popoluca_tagger_tl_5.2.0_3.0_1699388445151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tagalog_base_uncased_sayula_popoluca_tagger_tl_5.2.0_3.0_1699388445151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tagalog_base_uncased_sayula_popoluca_tagger","tl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tagalog_base_uncased_sayula_popoluca_tagger", "tl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tagalog_base_uncased_sayula_popoluca_tagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tl| +|Size:|470.3 MB| + +## References + +https://huggingface.co/syke9p3/bert-tagalog-base-uncased-pos-tagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_chinese_ws_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_chinese_ws_zh.md new file mode 100644 index 000000000000..c61290efabee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_chinese_ws_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_tiny_chinese_ws BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_tiny_chinese_ws +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_chinese_ws` is a Chinese model originally trained by ckiplab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_chinese_ws_zh_5.2.0_3.0_1699383183911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_chinese_ws_zh_5.2.0_3.0_1699383183911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_chinese_ws","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_chinese_ws", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_chinese_ws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|43.0 MB| + +## References + +https://huggingface.co/ckiplab/bert-tiny-chinese-ws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_finer_139_full_intel_cpu_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_finer_139_full_intel_cpu_en.md new file mode 100644 index 000000000000..8b8340ad36ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_finer_139_full_intel_cpu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_finer_139_full_intel_cpu BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_finer_139_full_intel_cpu +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_finer_139_full_intel_cpu` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_139_full_intel_cpu_en_5.2.0_3.0_1699394753224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_139_full_intel_cpu_en_5.2.0_3.0_1699394753224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_finer_139_full_intel_cpu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_finer_139_full_intel_cpu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_finer_139_full_intel_cpu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/muhtasham/bert-tiny-finetuned-finer-139-full-intel-cpu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_ner_en.md new file mode 100644 index 000000000000..490eb18664ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_tiny_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_ner BertForTokenClassification from gagan3012 +author: John Snow Labs +name: bert_tiny_finetuned_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_ner` is an English model originally trained by gagan3012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_ner_en_5.2.0_3.0_1699386559106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_ner_en_5.2.0_3.0_1699386559106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/gagan3012/bert-tiny-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_arabic_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_arabic_ner_ar.md new file mode 100644 index 000000000000..0aa97b36f1e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_arabic_ner_ar.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Arabic BertForTokenClassification Cased model (from hatmimoha) +author: John Snow Labs +name: bert_token_classifier_arabic_ner +date: 2023-11-07 +tags: [ar, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic-ner` is an Arabic model originally trained by `hatmimoha`. 
+ +## Predicted Entities + +`PRODUCT`, `COMPETITION`, `DATE`, `LOCATION`, `PERSON`, `ORGANIZATION`, `DISEASE`, `PRICE`, `EVENT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_arabic_ner_ar_5.2.0_3.0_1699317634318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_arabic_ner_ar_5.2.0_3.0_1699317634318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_arabic_ner","ar") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_arabic_ner","ar") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_arabic_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hatmimoha/arabic-ner +- https://github.com/hatmimoha/arabic-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_autotrain_oms_ner_bislama_1044135953_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_autotrain_oms_ner_bislama_1044135953_en.md new file mode 100644 index 000000000000..7bc982877f57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_autotrain_oms_ner_bislama_1044135953_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_token_classifier_autotrain_oms_ner_bislama_1044135953 BertForTokenClassification from danielmantisnlp +author: John Snow Labs +name: bert_token_classifier_autotrain_oms_ner_bislama_1044135953 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_autotrain_oms_ner_bislama_1044135953` is an English model originally trained by danielmantisnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_oms_ner_bislama_1044135953_en_5.2.0_3.0_1699319157111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_autotrain_oms_ner_bislama_1044135953_en_5.2.0_3.0_1699319157111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_autotrain_oms_ner_bislama_1044135953","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_autotrain_oms_ner_bislama_1044135953", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_autotrain_oms_ner_bislama_1044135953| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielmantisnlp/autotrain-oms-ner-bi-1044135953 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_jindai_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_jindai_zh.md new file mode 100644 index 000000000000..a793b2f78ef8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_jindai_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_token_classifier_base_han_chinese_sayula_popoluca_jindai BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_sayula_popoluca_jindai +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_han_chinese_sayula_popoluca_jindai` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_jindai_zh_5.2.0_3.0_1699320699279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_jindai_zh_5.2.0_3.0_1699320699279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_jindai","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_jindai", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_sayula_popoluca_jindai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.7 MB| + +## References + +https://huggingface.co/ckiplab/bert-base-han-chinese-pos-jindai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu_zh.md new file mode 100644 index 000000000000..703590249fd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu_zh_5.2.0_3.0_1699322259275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu_zh_5.2.0_3.0_1699322259275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_sayula_popoluca_shanggu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|396.6 MB| + +## References + +https://huggingface.co/ckiplab/bert-base-han-chinese-pos-shanggu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai_zh.md new file mode 100644 index 000000000000..0161dd2ec7db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai BertForTokenClassification from ckiplab +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai` is a Chinese model originally trained by ckiplab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai_zh_5.2.0_3.0_1699323722187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai_zh_5.2.0_3.0_1699323722187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_sayula_popoluca_xiandai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.6 MB| + +## References + +https://huggingface.co/ckiplab/bert-base-han-chinese-pos-xiandai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_ws_zhonggu_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_ws_zhonggu_zh.md new file mode 100644 index 000000000000..ba80ea90e294 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_han_chinese_ws_zhonggu_zh.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Chinese BertForTokenClassification Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_token_classifier_base_han_chinese_ws_zhonggu +date: 2023-11-07 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-han-chinese-ws-zhonggu` is a Chinese model originally trained by `ckiplab`. 
+ +## Predicted Entities + +`B`, `I` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_zhonggu_zh_5.2.0_3.0_1699316982060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_han_chinese_ws_zhonggu_zh_5.2.0_3.0_1699316982060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_zhonggu","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_han_chinese_ws_zhonggu","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_han_chinese_ws_zhonggu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|395.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-han-chinese-ws-zhonggu +- https://github.com/ckiplab/han-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_swedish_cased_ner_sv.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_swedish_cased_ner_sv.md new file mode 100644 index 000000000000..26dd104b556a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_swedish_cased_ner_sv.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Swedish BertForTokenClassification Base Cased model (from KBLab) +author: John Snow Labs +name: bert_token_classifier_base_swedish_cased_ner +date: 2023-11-07 +tags: [sv, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-ner` is a Swedish model originally trained by `KBLab`. 
+ +## Predicted Entities + +`PER`, `LOC`, `TME`, `WRK`, `PRS/WRK`, `LOC/ORG`, `MSR`, `ORG`, `OBJ/ORG`, `ORG/PRS`, `OBJ`, `LOC/PRS`, `EVN` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_swedish_cased_ner_sv_5.2.0_3.0_1699316623845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_swedish_cased_ner_sv_5.2.0_3.0_1699316623845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_swedish_cased_ner","sv") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_swedish_cased_ner","sv") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_swedish_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/KBLab/bert-base-swedish-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc_en.md new file mode 100644 index 000000000000..eef00480a7c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc BertForTokenClassification from Jzuluaga +author: John Snow Labs +name: bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc` is an English model originally trained by Jzuluaga. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc_en_5.2.0_3.0_1699318386290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc_en_5.2.0_3.0_1699318386290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_base_token_classification_for_atc_english_uwb_atcc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Jzuluaga/bert-base-token-classification-for-atc-en-uwb-atcc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_berturk_uncased_keyword_discriminator_tr.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_berturk_uncased_keyword_discriminator_tr.md new file mode 100644 index 000000000000..82e15693e365 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_berturk_uncased_keyword_discriminator_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish BertForTokenClassification Uncased model (from yanekyuk) +author: John Snow Labs +name: bert_token_classifier_berturk_uncased_keyword_discriminator +date: 2023-11-07 +tags: [tr, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berturk-uncased-keyword-discriminator` is a Turkish model originally trained by `yanekyuk`. 
+ +## Predicted Entities + +`ENT`, `CON` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_uncased_keyword_discriminator_tr_5.2.0_3.0_1699330162546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_berturk_uncased_keyword_discriminator_tr_5.2.0_3.0_1699330162546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_uncased_keyword_discriminator","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_berturk_uncased_keyword_discriminator","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_berturk_uncased_keyword_discriminator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/yanekyuk/berturk-uncased-keyword-discriminator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_danish_ner_base_da.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_danish_ner_base_da.md new file mode 100644 index 000000000000..8f20202106a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_danish_ner_base_da.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Danish bert_token_classifier_danish_ner_base BertForTokenClassification from alexandrainst +author: John Snow Labs +name: bert_token_classifier_danish_ner_base +date: 2023-11-07 +tags: [bert, da, open_source, token_classification, onnx] +task: Named Entity Recognition +language: da +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_token_classifier_danish_ner_base` is a Danish model originally trained by alexandrainst. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_danish_ner_base_da_5.2.0_3.0_1699325255783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_danish_ner_base_da_5.2.0_3.0_1699325255783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_danish_ner_base","da") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_token_classifier_danish_ner_base", "da") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_danish_ner_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|da| +|Size:|412.3 MB| + +## References + +https://huggingface.co/alexandrainst/da-ner-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_german_intensifiers_tagging_de.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_german_intensifiers_tagging_de.md new file mode 100644 index 000000000000..71b98438a344 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_german_intensifiers_tagging_de.md @@ -0,0 +1,98 @@ +--- +layout: model +title: German BertForTokenClassification Cased model (from TariqYousef) +author: John Snow Labs +name: bert_token_classifier_german_intensifiers_tagging +date: 2023-11-07 +tags: [de, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german-intensifiers-tagging` is a German model originally trained by `TariqYousef`. 
+ +## Predicted Entities + +`INT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_german_intensifiers_tagging_de_5.2.0_3.0_1699382987270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_german_intensifiers_tagging_de_5.2.0_3.0_1699382987270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_german_intensifiers_tagging","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_german_intensifiers_tagging","de") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_german_intensifiers_tagging| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|409.9 MB| + +## References + +References + +- https://huggingface.co/TariqYousef/german-intensifiers-tagging \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_instafood_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_instafood_ner_en.md new file mode 100644 index 000000000000..6ec8f75c2785 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_instafood_ner_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from Dizex) +author: John Snow Labs +name: bert_token_classifier_instafood_ner +date: 2023-11-07 +tags: [en, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `InstaFoodBERT-NER` is an English model originally trained by `Dizex`. + +## Predicted Entities + +`FOOD` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_instafood_ner_en_5.2.0_3.0_1699383278855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_instafood_ner_en_5.2.0_3.0_1699383278855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_instafood_ner","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_instafood_ner","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_instafood_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Dizex/InstaFoodBERT-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_restore_punctuation_ptbr_pt.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_restore_punctuation_ptbr_pt.md new file mode 100644 index 000000000000..ad939d748b22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_restore_punctuation_ptbr_pt.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Portuguese BertForTokenClassification Cased model (from dominguesm) +author: John Snow Labs +name: bert_token_classifier_restore_punctuation_ptbr +date: 2023-11-07 +tags: [pt, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-restore-punctuation-ptbr` is a Portuguese model originally trained by `dominguesm`. 
+ +## Predicted Entities + +`.U`, `!O`, `:O`, `:U`, `;O`, `OU`, `?U`, `!U`, `OO`, `.O`, `-O`, `'O`, `?O` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_restore_punctuation_ptbr_pt_5.2.0_3.0_1699383762732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_restore_punctuation_ptbr_pt_5.2.0_3.0_1699383762732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_restore_punctuation_ptbr","pt") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_restore_punctuation_ptbr","pt") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_restore_punctuation_ptbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/dominguesm/bert-restore-punctuation-ptbr +- https://wandb.ai/dominguesm/RestorePunctuationPTBR +- https://github.com/DominguesM/respunct +- https://github.com/esdurmus/Wikilingua +- https://paperswithcode.com/sota?task=named-entity-recognition&dataset=wiki_lingua \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_sentcore_zh.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_sentcore_zh.md new file mode 100644 index 000000000000..b867a44b0f28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_sentcore_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForTokenClassification Cased model (from theta) +author: John Snow Labs +name: bert_token_classifier_sentcore +date: 2023-11-07 +tags: [zh, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentcore` is a Chinese model originally trained by `theta`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_sentcore_zh_5.2.0_3.0_1699329704958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_sentcore_zh_5.2.0_3.0_1699329704958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_sentcore","zh") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_sentcore","zh") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_sentcore| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/theta/sentcore \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_swedish_ner_sv.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_swedish_ner_sv.md new file mode 100644 index 000000000000..c77d138e155e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_swedish_ner_sv.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Swedish BertForTokenClassification Cased model (from hkaraoguz) +author: John Snow Labs +name: bert_token_classifier_swedish_ner +date: 2023-11-07 +tags: [sv, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BERT_swedish-ner` is a Swedish model originally trained by `hkaraoguz`. + +## Predicted Entities + +`LOC`, `ORG`, `PER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_swedish_ner_sv_5.2.0_3.0_1699384090991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_swedish_ner_sv_5.2.0_3.0_1699384090991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_swedish_ner","sv") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_swedish_ner","sv") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_swedish_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hkaraoguz/BERT_swedish-ner +- https://paperswithcode.com/sota?task=Token+Classification&dataset=wikiann \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_uncased_keyword_extractor_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_uncased_keyword_extractor_en.md new file mode 100644 index 000000000000..e2796954f3fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_uncased_keyword_extractor_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English BertForTokenClassification Uncased model (from yanekyuk) +author: John Snow Labs +name: bert_token_classifier_uncased_keyword_extractor +date: 2023-11-07 +tags: [en, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-uncased-keyword-extractor` is an English model originally trained by `yanekyuk`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_uncased_keyword_extractor_en_5.2.0_3.0_1699331251723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_uncased_keyword_extractor_en_5.2.0_3.0_1699331251723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_uncased_keyword_extractor","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_uncased_keyword_extractor","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_uncased_keyword_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +References + +- https://huggingface.co/yanekyuk/bert-uncased-keyword-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_wg_bert_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_wg_bert_en.md new file mode 100644 index 000000000000..72e11eed4e9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_token_classifier_wg_bert_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English BertForTokenClassification Cased model (from krishjothi) +author: John Snow Labs +name: bert_token_classifier_wg_bert +date: 2023-11-07 +tags: [en, open_source, bert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `WG_Bert` is an English model originally trained by `krishjothi`. + +## Predicted Entities + +`LOC`, `TYPE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_token_classifier_wg_bert_en_5.2.0_3.0_1699382984736.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_token_classifier_wg_bert_en_5.2.0_3.0_1699382984736.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_wg_bert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_wg_bert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_token_classifier_wg_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +References + +- https://huggingface.co/krishjothi/WG_Bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bert_uncased_keyword_extractor_en.md b/docs/_posts/ahmedlone127/2023-11-07-bert_uncased_keyword_extractor_en.md new file mode 100644 index 000000000000..1bd5aa890572 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bert_uncased_keyword_extractor_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_uncased_keyword_extractor BertForTokenClassification from Azma-AI +author: John Snow Labs +name: bert_uncased_keyword_extractor +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_keyword_extractor` is an English model originally trained by Azma-AI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_keyword_extractor_en_5.2.0_3.0_1699393569738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_keyword_extractor_en_5.2.0_3.0_1699393569738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
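+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_uncased_keyword_extractor","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_uncased_keyword_extractor", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+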
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_keyword_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Azma-AI/bert-uncased-keyword-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-berttest2_rtwc_en.md b/docs/_posts/ahmedlone127/2023-11-07-berttest2_rtwc_en.md new file mode 100644 index 000000000000..bb33b17d93b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-berttest2_rtwc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English berttest2_rtwc BertForTokenClassification from RtwC +author: John Snow Labs +name: berttest2_rtwc +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttest2_rtwc` is an English model originally trained by RtwC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttest2_rtwc_en_5.2.0_3.0_1699396604379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttest2_rtwc_en_5.2.0_3.0_1699396604379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
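+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("berttest2_rtwc","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("berttest2_rtwc", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+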
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttest2_rtwc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/RtwC/berttest2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-berturk_cased_ner_tr.md b/docs/_posts/ahmedlone127/2023-11-07-berturk_cased_ner_tr.md new file mode 100644 index 000000000000..0f3702c9bccd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-berturk_cased_ner_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish berturk_cased_ner BertForTokenClassification from alierenak +author: John Snow Labs +name: berturk_cased_ner +date: 2023-11-07 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berturk_cased_ner` is a Turkish model originally trained by alierenak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berturk_cased_ner_tr_5.2.0_3.0_1699393575749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berturk_cased_ner_tr_5.2.0_3.0_1699393575749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
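+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("berturk_cased_ner","tr") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("berturk_cased_ner", "tr")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+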
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berturk_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/alierenak/berturk-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_bc2gm_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_bc2gm_ner_en.md new file mode 100644 index 000000000000..08ebaa458f1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_bc2gm_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_base_cased_v1_2_bc2gm_ner BertForTokenClassification from chintagunta85 +author: John Snow Labs +name: biobert_base_cased_v1_2_bc2gm_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1_2_bc2gm_ner` is an English model originally trained by chintagunta85. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_bc2gm_ner_en_5.2.0_3.0_1699383762749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_bc2gm_ner_en_5.2.0_3.0_1699383762749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
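+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("biobert_base_cased_v1_2_bc2gm_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("biobert_base_cased_v1_2_bc2gm_ner", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+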
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_base_cased_v1_2_bc2gm_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/chintagunta85/biobert-base-cased-v1.2-bc2gm-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner_en.md new file mode 100644 index 000000000000..8955c1d65126 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner BertForTokenClassification from jordyvl +author: John Snow Labs +name: biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner` is an English model originally trained by jordyvl. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner_en_5.2.0_3.0_1699384773826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner_en_5.2.0_3.0_1699384773826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
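+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+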
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_base_cased_v1_2_ncbi_disease_softmax_labelall_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/jordyvl/biobert-base-cased-v1.2_ncbi_disease-softmax-labelall-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_alvaroalon2_en.md b/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_alvaroalon2_en.md new file mode 100644 index 000000000000..02bddb8a486f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_alvaroalon2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_diseases_ner_alvaroalon2 BertForTokenClassification from alvaroalon2 +author: John Snow Labs +name: biobert_diseases_ner_alvaroalon2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_diseases_ner_alvaroalon2` is an English model originally trained by alvaroalon2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_diseases_ner_alvaroalon2_en_5.2.0_3.0_1699384011144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_diseases_ner_alvaroalon2_en_5.2.0_3.0_1699384011144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
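+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("biobert_diseases_ner_alvaroalon2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("biobert_diseases_ner_alvaroalon2", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+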
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_diseases_ner_alvaroalon2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/alvaroalon2/biobert_diseases_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_sschet_en.md b/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_sschet_en.md new file mode 100644 index 000000000000..58b0eab74102 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biobert_diseases_ner_sschet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_diseases_ner_sschet BertForTokenClassification from sschet +author: John Snow Labs +name: biobert_diseases_ner_sschet +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_diseases_ner_sschet` is an English model originally trained by sschet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_diseases_ner_sschet_en_5.2.0_3.0_1699387452087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_diseases_ner_sschet_en_5.2.0_3.0_1699387452087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
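+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("biobert_diseases_ner_sschet","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("biobert_diseases_ner_sschet", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+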
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_diseases_ner_sschet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/sschet/biobert_diseases_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bioformer_8l_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-07-bioformer_8l_ncbi_disease_en.md new file mode 100644 index 000000000000..bb349619d43f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bioformer_8l_ncbi_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bioformer_8l_ncbi_disease BertForTokenClassification from bioformers +author: John Snow Labs +name: bioformer_8l_ncbi_disease +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer_8l_ncbi_disease` is an English model originally trained by bioformers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioformer_8l_ncbi_disease_en_5.2.0_3.0_1699325104597.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioformer_8l_ncbi_disease_en_5.2.0_3.0_1699325104597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
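+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bioformer_8l_ncbi_disease","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+# Assumes an active Spark session named `spark`, e.g. from sparknlp.start()
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so the pipeline needs a Tokenizer stage
+val tokenizer = new Tokenizer()
+  .setInputCols("documents")
+  .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bioformer_8l_ncbi_disease", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+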
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioformer_8l_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/bioformers/bioformer-8L-ncbi-disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biolinkbert_base_finetuned_n2c2_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-biolinkbert_base_finetuned_n2c2_ner_en.md new file mode 100644 index 000000000000..77e71bde34a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biolinkbert_base_finetuned_n2c2_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biolinkbert_base_finetuned_n2c2_ner BertForTokenClassification from georgeleung30 +author: John Snow Labs +name: biolinkbert_base_finetuned_n2c2_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biolinkbert_base_finetuned_n2c2_ner` is an English model originally trained by georgeleung30. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biolinkbert_base_finetuned_n2c2_ner_en_5.2.0_3.0_1699387642688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biolinkbert_base_finetuned_n2c2_ner_en_5.2.0_3.0_1699387642688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("biolinkbert_base_finetuned_n2c2_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("biolinkbert_base_finetuned_n2c2_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biolinkbert_base_finetuned_n2c2_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.5 MB| + +## References + +https://huggingface.co/georgeleung30/BioLinkBERT-base-finetuned-n2c2-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease_en.md new file mode 100644 index 000000000000..74f5cf4654d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease BertForTokenClassification from sarahmiller137 +author: John Snow Labs +name: biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease` is an English model originally trained by sarahmiller137. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease_en_5.2.0_3.0_1699392082410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease_en_5.2.0_3.0_1699392082410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_base_uncased_abstract_ft_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/sarahmiller137/BiomedNLP-PubMedBERT-base-uncased-abstract-ft-ncbi-disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease_en.md new file mode 100644 index 000000000000..f43fa7a4dc25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease BertForTokenClassification from sarahmiller137 +author: John Snow Labs +name: biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease` is an English model originally trained by sarahmiller137. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease_en_5.2.0_3.0_1699388250102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease_en_5.2.0_3.0_1699388250102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_base_uncased_abstract_fulltext_ft_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/sarahmiller137/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-ft-ncbi-disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bioner_en.md b/docs/_posts/ahmedlone127/2023-11-07-bioner_en.md new file mode 100644 index 000000000000..f3d8853080f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bioner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bioner BertForTokenClassification from MilosKosRad +author: John Snow Labs +name: bioner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioner` is an English model originally trained by MilosKosRad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioner_en_5.2.0_3.0_1699383388414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioner_en_5.2.0_3.0_1699383388414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bioner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bioner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/MilosKosRad/BioNER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-body_part_annotator_en.md b/docs/_posts/ahmedlone127/2023-11-07-body_part_annotator_en.md new file mode 100644 index 000000000000..efe52363d636 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-body_part_annotator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English body_part_annotator BertForTokenClassification from cp500 +author: John Snow Labs +name: body_part_annotator +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `body_part_annotator` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/body_part_annotator_en_5.2.0_3.0_1699385848533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/body_part_annotator_en_5.2.0_3.0_1699385848533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("body_part_annotator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("body_part_annotator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|body_part_annotator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.4 MB| + +## References + +https://huggingface.co/cp500/body_part_annotator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_en.md new file mode 100644 index 000000000000..aa4677a59567 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bpmn_information_extraction BertForTokenClassification from jtlicardo +author: John Snow Labs +name: bpmn_information_extraction +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bpmn_information_extraction` is an English model originally trained by jtlicardo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bpmn_information_extraction_en_5.2.0_3.0_1699383560746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bpmn_information_extraction_en_5.2.0_3.0_1699383560746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bpmn_information_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bpmn_information_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bpmn_information_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jtlicardo/bpmn-information-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_v2_en.md new file mode 100644 index 000000000000..276260efedf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bpmn_information_extraction_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bpmn_information_extraction_v2 BertForTokenClassification from jtlicardo +author: John Snow Labs +name: bpmn_information_extraction_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bpmn_information_extraction_v2` is an English model originally trained by jtlicardo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bpmn_information_extraction_v2_en_5.2.0_3.0_1699387855385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bpmn_information_extraction_v2_en_5.2.0_3.0_1699387855385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bpmn_information_extraction_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bpmn_information_extraction_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bpmn_information_extraction_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jtlicardo/bpmn-information-extraction-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-bulbert_ner_bsnlp_en.md b/docs/_posts/ahmedlone127/2023-11-07-bulbert_ner_bsnlp_en.md new file mode 100644 index 000000000000..94aea5de8ce7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-bulbert_ner_bsnlp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bulbert_ner_bsnlp BertForTokenClassification from mor40 +author: John Snow Labs +name: bulbert_ner_bsnlp +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bulbert_ner_bsnlp` is an English model originally trained by mor40. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bulbert_ner_bsnlp_en_5.2.0_3.0_1699386167194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bulbert_ner_bsnlp_en_5.2.0_3.0_1699386167194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bulbert_ner_bsnlp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bulbert_ner_bsnlp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bulbert_ner_bsnlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|306.1 MB| + +## References + +https://huggingface.co/mor40/BulBERT-ner-bsnlp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-chinese_address_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-chinese_address_ner_en.md new file mode 100644 index 000000000000..7c9f6dccbb8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-chinese_address_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chinese_address_ner BertForTokenClassification from jiaqianjing +author: John Snow Labs +name: chinese_address_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_address_ner` is an English model originally trained by jiaqianjing. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_address_ner_en_5.2.0_3.0_1699386241645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_address_ner_en_5.2.0_3.0_1699386241645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("chinese_address_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("chinese_address_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_address_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/jiaqianjing/chinese-address-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-chinese_wiki_punctuation_restore_zh.md b/docs/_posts/ahmedlone127/2023-11-07-chinese_wiki_punctuation_restore_zh.md new file mode 100644 index 000000000000..64ab390dd3e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-chinese_wiki_punctuation_restore_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese chinese_wiki_punctuation_restore BertForTokenClassification from p208p2002 +author: John Snow Labs +name: chinese_wiki_punctuation_restore +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_wiki_punctuation_restore` is a Chinese model originally trained by p208p2002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_wiki_punctuation_restore_zh_5.2.0_3.0_1699384662197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_wiki_punctuation_restore_zh_5.2.0_3.0_1699384662197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("chinese_wiki_punctuation_restore","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("chinese_wiki_punctuation_restore", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_wiki_punctuation_restore| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.0 MB| + +## References + +https://huggingface.co/p208p2002/zh-wiki-punctuation-restore \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-classical_chinese_punctuation_guwen_biaodian_zh.md b/docs/_posts/ahmedlone127/2023-11-07-classical_chinese_punctuation_guwen_biaodian_zh.md new file mode 100644 index 000000000000..d9877268df55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-classical_chinese_punctuation_guwen_biaodian_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese classical_chinese_punctuation_guwen_biaodian BertForTokenClassification from raynardj +author: John Snow Labs +name: classical_chinese_punctuation_guwen_biaodian +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `classical_chinese_punctuation_guwen_biaodian` is a Chinese model originally trained by raynardj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/classical_chinese_punctuation_guwen_biaodian_zh_5.2.0_3.0_1699386868878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/classical_chinese_punctuation_guwen_biaodian_zh_5.2.0_3.0_1699386868878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("classical_chinese_punctuation_guwen_biaodian","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("classical_chinese_punctuation_guwen_biaodian", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|classical_chinese_punctuation_guwen_biaodian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| + +## References + +https://huggingface.co/raynardj/classical-chinese-punctuation-guwen-biaodian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_chemical_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_chemical_pt.md new file mode 100644 index 000000000000..f7432e88155f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_chemical_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_chemical BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_chemical +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_chemical` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_chemical_pt_5.2.0_3.0_1699331473378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_chemical_pt_5.2.0_3.0_1699331473378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_chemical","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_chemical", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_chemical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-chemical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_diagnostic_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_diagnostic_pt.md new file mode 100644 index 000000000000..42bfe77f3b8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_diagnostic_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_diagnostic BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_diagnostic +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_diagnostic` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_diagnostic_pt_5.2.0_3.0_1699320480550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_diagnostic_pt_5.2.0_3.0_1699320480550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_diagnostic","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_diagnostic", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_diagnostic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-diagnostic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disease_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disease_pt.md new file mode 100644 index 000000000000..907b1d1ca279 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disease_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_disease BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_disease +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_disease` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_disease_pt_5.2.0_3.0_1699384452527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_disease_pt_5.2.0_3.0_1699384452527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_disease","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_disease", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disorder_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disorder_pt.md new file mode 100644 index 000000000000..e6ac7305d90e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_disorder_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_disorder BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_disorder +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_disorder` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_disorder_pt_5.2.0_3.0_1699318631290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_disorder_pt_5.2.0_3.0_1699318631290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_disorder","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_disorder", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_disorder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-disorder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_finding_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_finding_pt.md new file mode 100644 index 000000000000..a79fb357e325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_finding_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_finding BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_finding +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_finding` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_finding_pt_5.2.0_3.0_1699389281791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_finding_pt_5.2.0_3.0_1699389281791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_finding","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_finding", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_finding| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-finding \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_healthcare_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_healthcare_pt.md new file mode 100644 index 000000000000..8b26d9ad3a80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_healthcare_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_healthcare BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_healthcare +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_healthcare` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_healthcare_pt_5.2.0_3.0_1699395140561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_healthcare_pt_5.2.0_3.0_1699395140561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_healthcare","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_healthcare", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_healthcare| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-healthcare \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_laboratory_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_laboratory_pt.md new file mode 100644 index 000000000000..c11901f8700b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_laboratory_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_laboratory BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_laboratory +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_laboratory` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_laboratory_pt_5.2.0_3.0_1699384818446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_laboratory_pt_5.2.0_3.0_1699384818446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_laboratory","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_laboratory", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_laboratory| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-laboratory \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_medical_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_medical_pt.md new file mode 100644 index 000000000000..1fe515783d93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_medical_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_medical BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_medical +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_medical` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_medical_pt_5.2.0_3.0_1699385973342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_medical_pt_5.2.0_3.0_1699385973342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_medical","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_medical", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_medical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-medical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_pharmacologic_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_pharmacologic_pt.md new file mode 100644 index 000000000000..256430894d0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_pharmacologic_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_pharmacologic BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_pharmacologic +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_pharmacologic` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_pharmacologic_pt_5.2.0_3.0_1699388490264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_pharmacologic_pt_5.2.0_3.0_1699388490264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_pharmacologic","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_pharmacologic", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_pharmacologic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-pharmacologic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_sign_pt.md b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_sign_pt.md new file mode 100644 index 000000000000..fb184275143c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-clinicalnerpt_sign_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_sign BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_sign +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_sign` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_sign_pt_5.2.0_3.0_1699388993710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_sign_pt_5.2.0_3.0_1699388993710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_sign","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_sign", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_sign| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-sign \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-comp_seqlab_dslim_bert_en.md b/docs/_posts/ahmedlone127/2023-11-07-comp_seqlab_dslim_bert_en.md new file mode 100644 index 000000000000..91b72c2ee063 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-comp_seqlab_dslim_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English comp_seqlab_dslim_bert BertForTokenClassification from uhhlt +author: John Snow Labs +name: comp_seqlab_dslim_bert +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comp_seqlab_dslim_bert` is an English model originally trained by uhhlt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comp_seqlab_dslim_bert_en_5.2.0_3.0_1699387870385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comp_seqlab_dslim_bert_en_5.2.0_3.0_1699387870385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("comp_seqlab_dslim_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("comp_seqlab_dslim_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comp_seqlab_dslim_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/uhhlt/comp-seqlab-dslim-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-dark_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-dark_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..ce8c56c2b128 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-dark_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dark_bert_finetuned_ner BertForTokenClassification from pulkitkumar13 +author: John Snow Labs +name: dark_bert_finetuned_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dark_bert_finetuned_ner` is an English model originally trained by pulkitkumar13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dark_bert_finetuned_ner_en_5.2.0_3.0_1699387910801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dark_bert_finetuned_ner_en_5.2.0_3.0_1699387910801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
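+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("dark_bert_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("dark_bert_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+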
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dark_bert_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/pulkitkumar13/dark-bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-dbbert_pos_en.md b/docs/_posts/ahmedlone127/2023-11-07-dbbert_pos_en.md
new file mode 100644
index 000000000000..209eb8c7f19c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-dbbert_pos_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English dbbert_pos BertForTokenClassification from colinswaelens
+author: John Snow Labs
+name: dbbert_pos
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbbert_pos` is an English model originally trained by colinswaelens.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbbert_pos_en_5.2.0_3.0_1699387552772.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbbert_pos_en_5.2.0_3.0_1699387552772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
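+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("dbbert_pos","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("dbbert_pos", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+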
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dbbert_pos|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|408.4 MB|
+
+## References
+
+https://huggingface.co/colinswaelens/DBBErt_POS
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-deepct_en.md b/docs/_posts/ahmedlone127/2023-11-07-deepct_en.md
new file mode 100644
index 000000000000..c89265c54689
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-deepct_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English deepct BertForTokenClassification from macavaney
+author: John Snow Labs
+name: deepct
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepct` is an English model originally trained by macavaney.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deepct_en_5.2.0_3.0_1699326749454.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deepct_en_5.2.0_3.0_1699326749454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
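+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("deepct","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("deepct", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+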
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|deepct|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/macavaney/deepct
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-deprem_ner_tr.md b/docs/_posts/ahmedlone127/2023-11-07-deprem_ner_tr.md
new file mode 100644
index 000000000000..2eaafed6259a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-deprem_ner_tr.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Turkish deprem_ner BertForTokenClassification from deprem-ml
+author: John Snow Labs
+name: deprem_ner
+date: 2023-11-07
+tags: [bert, tr, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: tr
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deprem_ner` is a Turkish model originally trained by deprem-ml.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deprem_ner_tr_5.2.0_3.0_1699384420770.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deprem_ner_tr_5.2.0_3.0_1699384420770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
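+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("deprem_ner","tr") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("deprem_ner", "tr")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+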
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|deprem_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|tr|
+|Size:|412.3 MB|
+
+## References
+
+https://huggingface.co/deprem-ml/deprem-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-drbert_casm2_fr.md b/docs/_posts/ahmedlone127/2023-11-07-drbert_casm2_fr.md
new file mode 100644
index 000000000000..d8d990d5f63f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-drbert_casm2_fr.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: French drbert_casm2 BertForTokenClassification from camila-ud
+author: John Snow Labs
+name: drbert_casm2
+date: 2023-11-07
+tags: [bert, fr, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: fr
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `drbert_casm2` is a French model originally trained by camila-ud.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/drbert_casm2_fr_5.2.0_3.0_1699318515122.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/drbert_casm2_fr_5.2.0_3.0_1699318515122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
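+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("drbert_casm2","fr") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("drbert_casm2", "fr")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+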
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|drbert_casm2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|fr|
+|Size:|408.2 MB|
+
+## References
+
+https://huggingface.co/camila-ud/DrBERT-CASM2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-elhberteu_sayula_popoluca_ud1_2_eu.md b/docs/_posts/ahmedlone127/2023-11-07-elhberteu_sayula_popoluca_ud1_2_eu.md
new file mode 100644
index 000000000000..f6f01ad26ea8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-elhberteu_sayula_popoluca_ud1_2_eu.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Basque elhberteu_sayula_popoluca_ud1_2 BertForTokenClassification from orai-nlp
+author: John Snow Labs
+name: elhberteu_sayula_popoluca_ud1_2
+date: 2023-11-07
+tags: [bert, eu, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: eu
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `elhberteu_sayula_popoluca_ud1_2` is a Basque model originally trained by orai-nlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/elhberteu_sayula_popoluca_ud1_2_eu_5.2.0_3.0_1699384623242.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/elhberteu_sayula_popoluca_ud1_2_eu_5.2.0_3.0_1699384623242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
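+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("elhberteu_sayula_popoluca_ud1_2","eu") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("elhberteu_sayula_popoluca_ud1_2", "eu")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+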
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|elhberteu_sayula_popoluca_ud1_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|eu|
+|Size:|464.7 MB|
+
+## References
+
+https://huggingface.co/orai-nlp/ElhBERTeu-pos-ud1.2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_conference_token_classification_en.md b/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_conference_token_classification_en.md
new file mode 100644
index 000000000000..8bd7e889ad95
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_conference_token_classification_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English emscad_skill_extraction_conference_token_classification BertForTokenClassification from Ivo
+author: John Snow Labs
+name: emscad_skill_extraction_conference_token_classification
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emscad_skill_extraction_conference_token_classification` is an English model originally trained by Ivo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_conference_token_classification_en_5.2.0_3.0_1699385391119.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_conference_token_classification_en_5.2.0_3.0_1699385391119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
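+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("emscad_skill_extraction_conference_token_classification","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("emscad_skill_extraction_conference_token_classification", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+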
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emscad_skill_extraction_conference_token_classification|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Ivo/emscad-skill-extraction-conference-token-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_token_classification_en.md b/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_token_classification_en.md
new file mode 100644
index 000000000000..de4fac875af0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-emscad_skill_extraction_token_classification_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English emscad_skill_extraction_token_classification BertForTokenClassification from Ivo
+author: John Snow Labs
+name: emscad_skill_extraction_token_classification
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emscad_skill_extraction_token_classification` is an English model originally trained by Ivo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_token_classification_en_5.2.0_3.0_1699389758974.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emscad_skill_extraction_token_classification_en_5.2.0_3.0_1699389758974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
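+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("emscad_skill_extraction_token_classification","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("emscad_skill_extraction_token_classification", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+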
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emscad_skill_extraction_token_classification|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Ivo/emscad-skill-extraction-token-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-finance_ner_v0_0_9_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-finance_ner_v0_0_9_finetuned_ner_en.md
new file mode 100644
index 000000000000..a91647e8b1bb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-finance_ner_v0_0_9_finetuned_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English finance_ner_v0_0_9_finetuned_ner BertForTokenClassification from AhmedTaha012
+author: John Snow Labs
+name: finance_ner_v0_0_9_finetuned_ner
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_ner_v0_0_9_finetuned_ner` is an English model originally trained by AhmedTaha012.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_9_finetuned_ner_en_5.2.0_3.0_1699385195204.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_9_finetuned_ner_en_5.2.0_3.0_1699385195204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
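+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("finance_ner_v0_0_9_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("finance_ner_v0_0_9_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+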
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finance_ner_v0_0_9_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/AhmedTaha012/finance-ner-v0.0.9-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-finbert_ner_fi.md b/docs/_posts/ahmedlone127/2023-11-07-finbert_ner_fi.md
new file mode 100644
index 000000000000..9d77d7d8a32a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-finbert_ner_fi.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Finnish finbert_ner BertForTokenClassification from Kansallisarkisto
+author: John Snow Labs
+name: finbert_ner
+date: 2023-11-07
+tags: [bert, fi, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: fi
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finbert_ner` is a Finnish model originally trained by Kansallisarkisto.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finbert_ner_fi_5.2.0_3.0_1699385738219.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finbert_ner_fi_5.2.0_3.0_1699385738219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
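+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("finbert_ner","fi") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("finbert_ner", "fi")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+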
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finbert_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|fi|
+|Size:|464.7 MB|
+
+## References
+
+https://huggingface.co/Kansallisarkisto/finbert-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-fullstop_indonesian_punctuation_prediction_id.md b/docs/_posts/ahmedlone127/2023-11-07-fullstop_indonesian_punctuation_prediction_id.md
new file mode 100644
index 000000000000..015e64df5218
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-fullstop_indonesian_punctuation_prediction_id.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Indonesian fullstop_indonesian_punctuation_prediction BertForTokenClassification from Rizkinoor16
+author: John Snow Labs
+name: fullstop_indonesian_punctuation_prediction
+date: 2023-11-07
+tags: [bert, id, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: id
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fullstop_indonesian_punctuation_prediction` is an Indonesian model originally trained by Rizkinoor16.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fullstop_indonesian_punctuation_prediction_id_5.2.0_3.0_1699391589605.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fullstop_indonesian_punctuation_prediction_id_5.2.0_3.0_1699391589605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
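+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("fullstop_indonesian_punctuation_prediction","id") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("fullstop_indonesian_punctuation_prediction", "id")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+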
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|fullstop_indonesian_punctuation_prediction|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|id|
+|Size:|625.5 MB|
+
+## References
+
+https://huggingface.co/Rizkinoor16/fullstop-indonesian-punctuation-prediction
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-gbert_legal_ner_de.md b/docs/_posts/ahmedlone127/2023-11-07-gbert_legal_ner_de.md
new file mode 100644
index 000000000000..78c089ec9674
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-gbert_legal_ner_de.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: German gbert_legal_ner BertForTokenClassification from PaDaS-Lab
+author: John Snow Labs
+name: gbert_legal_ner
+date: 2023-11-07
+tags: [bert, de, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: de
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gbert_legal_ner` is a German model originally trained by PaDaS-Lab.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gbert_legal_ner_de_5.2.0_3.0_1699387731402.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gbert_legal_ner_de_5.2.0_3.0_1699387731402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
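+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column
+
+tokenClassifier = BertForTokenClassification.pretrained("gbert_legal_ner","de") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("gbert_legal_ner", "de")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+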
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|gbert_legal_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|de|
+|Size:|407.0 MB|
+
+## References
+
+https://huggingface.co/PaDaS-Lab/gbert-legal-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-07-german_english_code_switching_identification_en.md b/docs/_posts/ahmedlone127/2023-11-07-german_english_code_switching_identification_en.md
new file mode 100644
index 000000000000..134bdfd852c4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-07-german_english_code_switching_identification_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English german_english_code_switching_identification BertForTokenClassification from igorsterner
+author: John Snow Labs
+name: german_english_code_switching_identification
+date: 2023-11-07
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german_english_code_switching_identification` is an English model originally trained by igorsterner.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_english_code_switching_identification_en_5.2.0_3.0_1699388184147.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_english_code_switching_identification_en_5.2.0_3.0_1699388184147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("german_english_code_switching_identification","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("german_english_code_switching_identification", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_english_code_switching_identification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.7 MB| + +## References + +https://huggingface.co/igorsterner/german-english-code-switching-identification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-gilbert_en.md b/docs/_posts/ahmedlone127/2023-11-07-gilbert_en.md new file mode 100644 index 000000000000..bfcb510b766d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-gilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gilbert BertForTokenClassification from rajpurkarlab +author: John Snow Labs +name: gilbert +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gilbert` is an English model originally trained by rajpurkarlab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gilbert_en_5.2.0_3.0_1699315652248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gilbert_en_5.2.0_3.0_1699315652248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("gilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("gilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/rajpurkarlab/gilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-gp3_medical_token_classification_en.md b/docs/_posts/ahmedlone127/2023-11-07-gp3_medical_token_classification_en.md new file mode 100644 index 000000000000..4958b7a09110 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-gp3_medical_token_classification_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gp3_medical_token_classification BertForTokenClassification from parsi-ai-nlpclass +author: John Snow Labs +name: gp3_medical_token_classification +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gp3_medical_token_classification` is an English model originally trained by parsi-ai-nlpclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gp3_medical_token_classification_en_5.2.0_3.0_1699399263292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gp3_medical_token_classification_en_5.2.0_3.0_1699399263292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("gp3_medical_token_classification","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("gp3_medical_token_classification", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gp3_medical_token_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/parsi-ai-nlpclass/Gp3_medical_token_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v1_en.md b/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v1_en.md new file mode 100644 index 000000000000..9def0193375f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hebert_medical_ner_fixed_labels_v1 BertForTokenClassification from cp500 +author: John Snow Labs +name: hebert_medical_ner_fixed_labels_v1 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebert_medical_ner_fixed_labels_v1` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_fixed_labels_v1_en_5.2.0_3.0_1699388943686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_fixed_labels_v1_en_5.2.0_3.0_1699388943686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("hebert_medical_ner_fixed_labels_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("hebert_medical_ner_fixed_labels_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebert_medical_ner_fixed_labels_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.5 MB| + +## References + +https://huggingface.co/cp500/hebert_medical_ner_fixed_labels_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v3_en.md b/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v3_en.md new file mode 100644 index 000000000000..2a01f206d189 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-hebert_medical_ner_fixed_labels_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hebert_medical_ner_fixed_labels_v3 BertForTokenClassification from cp500 +author: John Snow Labs +name: hebert_medical_ner_fixed_labels_v3 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebert_medical_ner_fixed_labels_v3` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_fixed_labels_v3_en_5.2.0_3.0_1699383333088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_fixed_labels_v3_en_5.2.0_3.0_1699383333088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("hebert_medical_ner_fixed_labels_v3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("hebert_medical_ner_fixed_labels_v3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebert_medical_ner_fixed_labels_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.6 MB| + +## References + +https://huggingface.co/cp500/hebert_medical_ner_fixed_labels_v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-hindi_bert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-hindi_bert_ner_en.md new file mode 100644 index 000000000000..5c426a50c0ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-hindi_bert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hindi_bert_ner BertForTokenClassification from mirfan899 +author: John Snow Labs +name: hindi_bert_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hindi_bert_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hindi_bert_ner_en_5.2.0_3.0_1699389197767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hindi_bert_ner_en_5.2.0_3.0_1699389197767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("hindi_bert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("hindi_bert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hindi_bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/mirfan899/hindi-bert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-hotel_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-07-hotel_reviews_en.md new file mode 100644 index 000000000000..5d01884dae5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-hotel_reviews_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hotel_reviews BertForTokenClassification from MutazYoune +author: John Snow Labs +name: hotel_reviews +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hotel_reviews` is an English model originally trained by MutazYoune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hotel_reviews_en_5.2.0_3.0_1699387666666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hotel_reviews_en_5.2.0_3.0_1699387666666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("hotel_reviews","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("hotel_reviews", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hotel_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.4 MB| + +## References + +https://huggingface.co/MutazYoune/hotel_reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typebased_en.md b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typebased_en.md new file mode 100644 index 000000000000..89eaffa4873f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typebased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English idrisi_lmr_en_random_typebased BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: idrisi_lmr_en_random_typebased +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idrisi_lmr_en_random_typebased` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_random_typebased_en_5.2.0_3.0_1699385564570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_random_typebased_en_5.2.0_3.0_1699385564570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("idrisi_lmr_en_random_typebased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("idrisi_lmr_en_random_typebased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idrisi_lmr_en_random_typebased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-EN-random-typebased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typeless_en.md b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typeless_en.md new file mode 100644 index 000000000000..ea0f4a669af6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_random_typeless_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English idrisi_lmr_en_random_typeless BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: idrisi_lmr_en_random_typeless +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idrisi_lmr_en_random_typeless` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_random_typeless_en_5.2.0_3.0_1699383382404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_random_typeless_en_5.2.0_3.0_1699383382404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("idrisi_lmr_en_random_typeless","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("idrisi_lmr_en_random_typeless", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idrisi_lmr_en_random_typeless| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-EN-random-typeless \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typebased_en.md b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typebased_en.md new file mode 100644 index 000000000000..e209f099cf95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typebased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English idrisi_lmr_en_timebased_typebased BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: idrisi_lmr_en_timebased_typebased +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idrisi_lmr_en_timebased_typebased` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_timebased_typebased_en_5.2.0_3.0_1699387252805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_timebased_typebased_en_5.2.0_3.0_1699387252805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("idrisi_lmr_en_timebased_typebased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("idrisi_lmr_en_timebased_typebased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idrisi_lmr_en_timebased_typebased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-EN-timebased-typebased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typeless_en.md b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typeless_en.md new file mode 100644 index 000000000000..ec286ecb7698 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-idrisi_lmr_en_timebased_typeless_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English idrisi_lmr_en_timebased_typeless BertForTokenClassification from rsuwaileh +author: John Snow Labs +name: idrisi_lmr_en_timebased_typeless +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idrisi_lmr_en_timebased_typeless` is an English model originally trained by rsuwaileh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_timebased_typeless_en_5.2.0_3.0_1699390384142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idrisi_lmr_en_timebased_typeless_en_5.2.0_3.0_1699390384142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("idrisi_lmr_en_timebased_typeless","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("idrisi_lmr_en_timebased_typeless", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idrisi_lmr_en_timebased_typeless| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/rsuwaileh/IDRISI-LMR-EN-timebased-typeless \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-indobert_large_p2_finetuned_chunking_id.md b/docs/_posts/ahmedlone127/2023-11-07-indobert_large_p2_finetuned_chunking_id.md new file mode 100644 index 000000000000..3ee6fa71b62b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-indobert_large_p2_finetuned_chunking_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian indobert_large_p2_finetuned_chunking BertForTokenClassification from ageng-anugrah +author: John Snow Labs +name: indobert_large_p2_finetuned_chunking +date: 2023-11-07 +tags: [bert, id, open_source, token_classification, onnx] +task: Named Entity Recognition +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_large_p2_finetuned_chunking` is an Indonesian model originally trained by ageng-anugrah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_large_p2_finetuned_chunking_id_5.2.0_3.0_1699385766100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_large_p2_finetuned_chunking_id_5.2.0_3.0_1699385766100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("indobert_large_p2_finetuned_chunking","id") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("indobert_large_p2_finetuned_chunking", "id") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_large_p2_finetuned_chunking| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|id| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ageng-anugrah/indobert-large-p2-finetuned-chunking \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-indobertweet_finetuned_ijelid_en.md b/docs/_posts/ahmedlone127/2023-11-07-indobertweet_finetuned_ijelid_en.md new file mode 100644 index 000000000000..69e68ac318b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-indobertweet_finetuned_ijelid_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobertweet_finetuned_ijelid BertForTokenClassification from fathan +author: John Snow Labs +name: indobertweet_finetuned_ijelid +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobertweet_finetuned_ijelid` is an English model originally trained by fathan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobertweet_finetuned_ijelid_en_5.2.0_3.0_1699387609403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobertweet_finetuned_ijelid_en_5.2.0_3.0_1699387609403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("indobertweet_finetuned_ijelid","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("indobertweet_finetuned_ijelid", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobertweet_finetuned_ijelid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.8 MB| + +## References + +https://huggingface.co/fathan/indobertweet-finetuned-ijelid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-jira_bert_nerr_en.md b/docs/_posts/ahmedlone127/2023-11-07-jira_bert_nerr_en.md new file mode 100644 index 000000000000..94df20c22a07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-jira_bert_nerr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jira_bert_nerr BertForTokenClassification from rouabelgacem +author: John Snow Labs +name: jira_bert_nerr +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jira_bert_nerr` is an English model originally trained by rouabelgacem. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jira_bert_nerr_en_5.2.0_3.0_1699385661443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jira_bert_nerr_en_5.2.0_3.0_1699385661443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("jira_bert_nerr","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("jira_bert_nerr", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jira_bert_nerr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|404.0 MB| + +## References + +https://huggingface.co/rouabelgacem/jira-bert-nerr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-jobbert_base_cased_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-jobbert_base_cased_ner_en.md new file mode 100644 index 000000000000..3b8b5577a5c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-jobbert_base_cased_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobbert_base_cased_ner BertForTokenClassification from itsmeboris +author: John Snow Labs +name: jobbert_base_cased_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_base_cased_ner` is an English model originally trained by itsmeboris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_ner_en_5.2.0_3.0_1699389113523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_base_cased_ner_en_5.2.0_3.0_1699389113523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("jobbert_base_cased_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("jobbert_base_cased_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_base_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/itsmeboris/jobbert-base-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-legal_bert_ner_base_cased_ptbr_pt.md b/docs/_posts/ahmedlone127/2023-11-07-legal_bert_ner_base_cased_ptbr_pt.md new file mode 100644 index 000000000000..4c385926fa64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-legal_bert_ner_base_cased_ptbr_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese legal_bert_ner_base_cased_ptbr BertForTokenClassification from dominguesm +author: John Snow Labs +name: legal_bert_ner_base_cased_ptbr +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_bert_ner_base_cased_ptbr` is a Portuguese model originally trained by dominguesm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_ner_base_cased_ptbr_pt_5.2.0_3.0_1699388720293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_ner_base_cased_ptbr_pt_5.2.0_3.0_1699388720293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("legal_bert_ner_base_cased_ptbr","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("legal_bert_ner_base_cased_ptbr", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_bert_ner_base_cased_ptbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|405.9 MB| + +## References + +https://huggingface.co/dominguesm/legal-bert-ner-base-cased-ptbr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medical_collation_zh.md b/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medical_collation_zh.md new file mode 100644 index 000000000000..a0aa6d3b3cda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medical_collation_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese macbert_base_chinese_medical_collation BertForTokenClassification from 9pinus +author: John Snow Labs +name: macbert_base_chinese_medical_collation +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `macbert_base_chinese_medical_collation` is a Chinese model originally trained by 9pinus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/macbert_base_chinese_medical_collation_zh_5.2.0_3.0_1699392069704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/macbert_base_chinese_medical_collation_zh_5.2.0_3.0_1699392069704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("macbert_base_chinese_medical_collation","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("macbert_base_chinese_medical_collation", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|macbert_base_chinese_medical_collation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.0 MB| + +## References + +https://huggingface.co/9pinus/macbert-base-chinese-medical-collation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medicine_recognition_zh.md b/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medicine_recognition_zh.md new file mode 100644 index 000000000000..c783a642b837 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-macbert_base_chinese_medicine_recognition_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese macbert_base_chinese_medicine_recognition BertForTokenClassification from 9pinus +author: John Snow Labs +name: macbert_base_chinese_medicine_recognition +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `macbert_base_chinese_medicine_recognition` is a Chinese model originally trained by 9pinus. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/macbert_base_chinese_medicine_recognition_zh_5.2.0_3.0_1699400808366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/macbert_base_chinese_medicine_recognition_zh_5.2.0_3.0_1699400808366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("macbert_base_chinese_medicine_recognition","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("macbert_base_chinese_medicine_recognition", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|macbert_base_chinese_medicine_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|381.1 MB| + +## References + +https://huggingface.co/9pinus/macbert-base-chinese-medicine-recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-mbert_bengali_ner_bn.md b/docs/_posts/ahmedlone127/2023-11-07-mbert_bengali_ner_bn.md new file mode 100644 index 000000000000..ad0843cb0acc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-mbert_bengali_ner_bn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Bengali mbert_bengali_ner BertForTokenClassification from sagorsarker +author: John Snow Labs +name: mbert_bengali_ner +date: 2023-11-07 +tags: [bert, bn, open_source, token_classification, onnx] +task: Named Entity Recognition +language: bn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_bengali_ner` is a Bengali model originally trained by sagorsarker. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_bengali_ner_bn_5.2.0_3.0_1699386696050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_bengali_ner_bn_5.2.0_3.0_1699386696050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("mbert_bengali_ner","bn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("mbert_bengali_ner", "bn") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_bengali_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|bn| +|Size:|625.5 MB| + +## References + +https://huggingface.co/sagorsarker/mbert-bengali-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-mbert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-mbert_finetuned_ner_en.md new file mode 100644 index 000000000000..81e650df1f47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-mbert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_finetuned_ner BertForTokenClassification from Andrey1989 +author: John Snow Labs +name: mbert_finetuned_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_finetuned_ner` is an English model originally trained by Andrey1989. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_finetuned_ner_en_5.2.0_3.0_1699386433257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_finetuned_ner_en_5.2.0_3.0_1699386433257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("mbert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("mbert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Andrey1989/mbert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-med_ner_2_en.md b/docs/_posts/ahmedlone127/2023-11-07-med_ner_2_en.md new file mode 100644 index 000000000000..2579568a5710 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-med_ner_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English med_ner_2 BertForTokenClassification from m-aliabbas1 +author: John Snow Labs +name: med_ner_2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `med_ner_2` is an English model originally trained by m-aliabbas1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/med_ner_2_en_5.2.0_3.0_1699396604225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/med_ner_2_en_5.2.0_3.0_1699396604225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("med_ner_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("med_ner_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|med_ner_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/m-aliabbas1/med_ner_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-medical_condition_annotator_en.md b/docs/_posts/ahmedlone127/2023-11-07-medical_condition_annotator_en.md new file mode 100644 index 000000000000..a2e266d6a6eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-medical_condition_annotator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medical_condition_annotator BertForTokenClassification from cp500 +author: John Snow Labs +name: medical_condition_annotator +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medical_condition_annotator` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medical_condition_annotator_en_5.2.0_3.0_1699387343846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medical_condition_annotator_en_5.2.0_3.0_1699387343846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("medical_condition_annotator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("medical_condition_annotator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medical_condition_annotator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.5 MB| + +## References + +https://huggingface.co/cp500/Medical_condition_annotator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-multilingual_arabic_token_classification_model_xx.md b/docs/_posts/ahmedlone127/2023-11-07-multilingual_arabic_token_classification_model_xx.md new file mode 100644 index 000000000000..bd8df3c0f440 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-multilingual_arabic_token_classification_model_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual multilingual_arabic_token_classification_model BertForTokenClassification from Cabooose +author: John Snow Labs +name: multilingual_arabic_token_classification_model +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual_arabic_token_classification_model` is a Multilingual model originally trained by Cabooose. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multilingual_arabic_token_classification_model_xx_5.2.0_3.0_1699388993826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multilingual_arabic_token_classification_model_xx_5.2.0_3.0_1699388993826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("multilingual_arabic_token_classification_model","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("multilingual_arabic_token_classification_model", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multilingual_arabic_token_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Cabooose/multilingual_arabic_token_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-multilingual_english_token_classification_model_xx.md b/docs/_posts/ahmedlone127/2023-11-07-multilingual_english_token_classification_model_xx.md new file mode 100644 index 000000000000..8cdfd4a47dd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-multilingual_english_token_classification_model_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual multilingual_english_token_classification_model BertForTokenClassification from Cabooose +author: John Snow Labs +name: multilingual_english_token_classification_model +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual_english_token_classification_model` is a Multilingual model originally trained by Cabooose. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multilingual_english_token_classification_model_xx_5.2.0_3.0_1699388733526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multilingual_english_token_classification_model_xx_5.2.0_3.0_1699388733526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("multilingual_english_token_classification_model","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("multilingual_english_token_classification_model", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multilingual_english_token_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Cabooose/multilingual_english_token_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-named_entity_recognition_en.md b/docs/_posts/ahmedlone127/2023-11-07-named_entity_recognition_en.md new file mode 100644 index 000000000000..77d17c036004 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-named_entity_recognition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English named_entity_recognition BertForTokenClassification from mdarhri00 +author: John Snow Labs +name: named_entity_recognition +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `named_entity_recognition` is an English model originally trained by mdarhri00. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/named_entity_recognition_en_5.2.0_3.0_1699385114625.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/named_entity_recognition_en_5.2.0_3.0_1699385114625.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("named_entity_recognition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("named_entity_recognition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|named_entity_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/mdarhri00/named-entity-recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ncbi_bc5cdr_disease_en.md b/docs/_posts/ahmedlone127/2023-11-07-ncbi_bc5cdr_disease_en.md new file mode 100644 index 000000000000..9b8c3982f665 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ncbi_bc5cdr_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ncbi_bc5cdr_disease BertForTokenClassification from datummd +author: John Snow Labs +name: ncbi_bc5cdr_disease +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ncbi_bc5cdr_disease` is an English model originally trained by datummd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ncbi_bc5cdr_disease_en_5.2.0_3.0_1699323955872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ncbi_bc5cdr_disease_en_5.2.0_3.0_1699323955872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ncbi_bc5cdr_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ncbi_bc5cdr_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ncbi_bc5cdr_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/datummd/NCBI_BC5CDR_disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_bert_base_cased_ontonotesv5_englishv4_en.md b/docs/_posts/ahmedlone127/2023-11-07-ner_bert_base_cased_ontonotesv5_englishv4_en.md new file mode 100644 index 000000000000..647ecc4d2b0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_bert_base_cased_ontonotesv5_englishv4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_bert_base_cased_ontonotesv5_englishv4 BertForTokenClassification from djagatiya +author: John Snow Labs +name: ner_bert_base_cased_ontonotesv5_englishv4 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_bert_base_cased_ontonotesv5_englishv4` is an English model originally trained by djagatiya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_bert_base_cased_ontonotesv5_englishv4_en_5.2.0_3.0_1699384083694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_bert_base_cased_ontonotesv5_englishv4_en_5.2.0_3.0_1699384083694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_bert_base_cased_ontonotesv5_englishv4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_bert_base_cased_ontonotesv5_englishv4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_bert_base_cased_ontonotesv5_englishv4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/djagatiya/ner-bert-base-cased-ontonotesv5-englishv4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_bert_large_cased_portuguese_lenerbr_pt.md b/docs/_posts/ahmedlone127/2023-11-07-ner_bert_large_cased_portuguese_lenerbr_pt.md new file mode 100644 index 000000000000..15b6bf893055 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_bert_large_cased_portuguese_lenerbr_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese ner_bert_large_cased_portuguese_lenerbr BertForTokenClassification from pierreguillou +author: John Snow Labs +name: ner_bert_large_cased_portuguese_lenerbr +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_bert_large_cased_portuguese_lenerbr` is a Portuguese model originally trained by pierreguillou. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_bert_large_cased_portuguese_lenerbr_pt_5.2.0_3.0_1699384462079.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_bert_large_cased_portuguese_lenerbr_pt_5.2.0_3.0_1699384462079.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_bert_large_cased_portuguese_lenerbr","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_bert_large_cased_portuguese_lenerbr", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_bert_large_cased_portuguese_lenerbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|1.2 GB| + +## References + +https://huggingface.co/pierreguillou/ner-bert-large-cased-pt-lenerbr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_bio_annotated_7_1_en.md b/docs/_posts/ahmedlone127/2023-11-07-ner_bio_annotated_7_1_en.md new file mode 100644 index 000000000000..6d88251d4dcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_bio_annotated_7_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_bio_annotated_7_1 BertForTokenClassification from urbija +author: John Snow Labs +name: ner_bio_annotated_7_1 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_bio_annotated_7_1` is an English model originally trained by urbija. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_bio_annotated_7_1_en_5.2.0_3.0_1699399873485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_bio_annotated_7_1_en_5.2.0_3.0_1699399873485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_bio_annotated_7_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_bio_annotated_7_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_bio_annotated_7_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/urbija/ner-bio-annotated-7-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_en.md b/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_en.md new file mode 100644 index 000000000000..8b7d801139aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_fine_tune_bert BertForTokenClassification from cehongw +author: John Snow Labs +name: ner_fine_tune_bert +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_fine_tune_bert` is an English model originally trained by cehongw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_fine_tune_bert_en_5.2.0_3.0_1699385195759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_fine_tune_bert_en_5.2.0_3.0_1699385195759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_fine_tune_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_fine_tune_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_fine_tune_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/cehongw/ner-fine-tune-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_ner_en.md new file mode 100644 index 000000000000..62068cfd7d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_fine_tune_bert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_fine_tune_bert_ner BertForTokenClassification from cehongw +author: John Snow Labs +name: ner_fine_tune_bert_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_fine_tune_bert_ner` is an English model originally trained by cehongw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_fine_tune_bert_ner_en_5.2.0_3.0_1699401114697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_fine_tune_bert_ner_en_5.2.0_3.0_1699401114697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_fine_tune_bert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_fine_tune_bert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_fine_tune_bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/cehongw/ner-fine-tune-bert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations_en.md b/docs/_posts/ahmedlone127/2023-11-07-ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations_en.md new file mode 100644 index 000000000000..9cd05f97be78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations BertForTokenClassification from poodledude +author: John Snow Labs +name: ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations` is an English model originally trained by poodledude.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations_en_5.2.0_3.0_1699386946731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations_en_5.2.0_3.0_1699386946731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_test_bert_base_uncased_finetuned_500k_adamw_3_epoch_locations| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/poodledude/ner-test-bert-base-uncased-finetuned-500K-AdamW-3-epoch-locations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-11-07-nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased_xx.md new file mode 100644 index 000000000000..1070f399d5e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased BertForTokenClassification from GuCuChiara +author: John Snow Labs +name: nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased +date: 2023-11-07 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased` is a Multilingual model originally trained by GuCuChiara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased_xx_5.2.0_3.0_1699389484578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased_xx_5.2.0_3.0_1699389484578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_cic_wfu_distemist_fine_tuned_bert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/GuCuChiara/NLP-CIC-WFU_DisTEMIST_fine_tuned_bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-nlp_tokenclass_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-nlp_tokenclass_ner_en.md new file mode 100644 index 000000000000..f90bc6d00acf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-nlp_tokenclass_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp_tokenclass_ner BertForTokenClassification from Endika99 +author: John Snow Labs +name: nlp_tokenclass_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_tokenclass_ner` is an English model originally trained by Endika99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_tokenclass_ner_en_5.2.0_3.0_1699384183925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_tokenclass_ner_en_5.2.0_3.0_1699384183925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nlp_tokenclass_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nlp_tokenclass_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_tokenclass_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Endika99/NLP-TokenClass-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner_es.md b/docs/_posts/ahmedlone127/2023-11-07-nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner_es.md new file mode 100644 index 000000000000..2247efd28bdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner BertForTokenClassification from pineiden +author: John Snow Labs +name: nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner +date: 2023-11-07 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner` is a Castilian, Spanish model originally trained by pineiden. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner_es_5.2.0_3.0_1699389473204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner_es_5.2.0_3.0_1699389473204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nominal_groups_recognition_medical_disease_competencia2_bert_medical_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|407.2 MB| + +## References + +https://huggingface.co/pineiden/nominal-groups-recognition-medical-disease-competencia2-bert-medical-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-nyt_ingredient_tagger_gte_small_en.md b/docs/_posts/ahmedlone127/2023-11-07-nyt_ingredient_tagger_gte_small_en.md new file mode 100644 index 000000000000..4026a220a70e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-nyt_ingredient_tagger_gte_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nyt_ingredient_tagger_gte_small BertForTokenClassification from napsternxg +author: John Snow Labs +name: nyt_ingredient_tagger_gte_small +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nyt_ingredient_tagger_gte_small` is an English model originally trained by napsternxg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_gte_small_en_5.2.0_3.0_1699389758527.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_gte_small_en_5.2.0_3.0_1699389758527.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nyt_ingredient_tagger_gte_small","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nyt_ingredient_tagger_gte_small", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nyt_ingredient_tagger_gte_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|113.1 MB| + +## References + +https://huggingface.co/napsternxg/nyt-ingredient-tagger-gte-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-pashto_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-07-pashto_sayula_popoluca_en.md new file mode 100644 index 000000000000..3c87a81d68aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-pashto_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pashto_sayula_popoluca BertForTokenClassification from ijazulhaq +author: John Snow Labs +name: pashto_sayula_popoluca +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pashto_sayula_popoluca` is an English model originally trained by ijazulhaq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pashto_sayula_popoluca_en_5.2.0_3.0_1699386046291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pashto_sayula_popoluca_en_5.2.0_3.0_1699386046291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("pashto_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("pashto_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pashto_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.6 MB| + +## References + +https://huggingface.co/ijazulhaq/pashto-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-pashto_word_segmentation_en.md b/docs/_posts/ahmedlone127/2023-11-07-pashto_word_segmentation_en.md new file mode 100644 index 000000000000..40a3024f7b57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-pashto_word_segmentation_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pashto_word_segmentation BertForTokenClassification from ijazulhaq +author: John Snow Labs +name: pashto_word_segmentation +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pashto_word_segmentation` is an English model originally trained by ijazulhaq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pashto_word_segmentation_en_5.2.0_3.0_1699383974575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pashto_word_segmentation_en_5.2.0_3.0_1699383974575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("pashto_word_segmentation","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("pashto_word_segmentation", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pashto_word_segmentation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.5 MB| + +## References + +https://huggingface.co/ijazulhaq/pashto-word-segmentation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-personal_noun_detection_german_bert_de.md b/docs/_posts/ahmedlone127/2023-11-07-personal_noun_detection_german_bert_de.md new file mode 100644 index 000000000000..9cd79e25c8da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-personal_noun_detection_german_bert_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German personal_noun_detection_german_bert BertForTokenClassification from CarlaSoe +author: John Snow Labs +name: personal_noun_detection_german_bert +date: 2023-11-07 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `personal_noun_detection_german_bert` is a German model originally trained by CarlaSoe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/personal_noun_detection_german_bert_de_5.2.0_3.0_1699388076431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/personal_noun_detection_german_bert_de_5.2.0_3.0_1699388076431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("personal_noun_detection_german_bert","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("personal_noun_detection_german_bert", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|personal_noun_detection_german_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|406.9 MB| + +## References + +https://huggingface.co/CarlaSoe/personal-noun-detection-german-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-phibert_finetuned_ner_girinlp_i2i_en.md b/docs/_posts/ahmedlone127/2023-11-07-phibert_finetuned_ner_girinlp_i2i_en.md new file mode 100644 index 000000000000..c3f8c7fba04f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-phibert_finetuned_ner_girinlp_i2i_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English phibert_finetuned_ner_girinlp_i2i BertForTokenClassification from girinlp-i2i +author: John Snow Labs +name: phibert_finetuned_ner_girinlp_i2i +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phibert_finetuned_ner_girinlp_i2i` is an English model originally trained by girinlp-i2i. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phibert_finetuned_ner_girinlp_i2i_en_5.2.0_3.0_1699316986375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phibert_finetuned_ner_girinlp_i2i_en_5.2.0_3.0_1699316986375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("phibert_finetuned_ner_girinlp_i2i","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("phibert_finetuned_ner_girinlp_i2i", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phibert_finetuned_ner_girinlp_i2i| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.2 MB| + +## References + +https://huggingface.co/girinlp-i2i/phibert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-pico_ner_adapter_en.md b/docs/_posts/ahmedlone127/2023-11-07-pico_ner_adapter_en.md new file mode 100644 index 000000000000..ade310de2463 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-pico_ner_adapter_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pico_ner_adapter BertForTokenClassification from reginaboateng +author: John Snow Labs +name: pico_ner_adapter +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pico_ner_adapter` is an English model originally trained by reginaboateng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pico_ner_adapter_en_5.2.0_3.0_1699388021718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pico_ner_adapter_en_5.2.0_3.0_1699388021718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("pico_ner_adapter","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("pico_ner_adapter", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pico_ner_adapter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/reginaboateng/pico_ner_adapter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-pii_annotator_en.md b/docs/_posts/ahmedlone127/2023-11-07-pii_annotator_en.md new file mode 100644 index 000000000000..1b7a565156fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-pii_annotator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pii_annotator BertForTokenClassification from cp500 +author: John Snow Labs +name: pii_annotator +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pii_annotator` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pii_annotator_en_5.2.0_3.0_1699386518686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pii_annotator_en_5.2.0_3.0_1699386518686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("pii_annotator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("pii_annotator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pii_annotator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.5 MB| + +## References + +https://huggingface.co/cp500/PII_annotator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-polymerner_en.md b/docs/_posts/ahmedlone127/2023-11-07-polymerner_en.md new file mode 100644 index 000000000000..b7ffecf915cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-polymerner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English polymerner BertForTokenClassification from pranav-s +author: John Snow Labs +name: polymerner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `polymerner` is an English model originally trained by pranav-s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/polymerner_en_5.2.0_3.0_1699384979853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/polymerner_en_5.2.0_3.0_1699384979853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("polymerner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("polymerner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|polymerner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/pranav-s/PolymerNER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-porttagger_base_en.md b/docs/_posts/ahmedlone127/2023-11-07-porttagger_base_en.md new file mode 100644 index 000000000000..1fd27ae7ff14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-porttagger_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English porttagger_base BertForTokenClassification from Emanuel +author: John Snow Labs +name: porttagger_base +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `porttagger_base` is an English model originally trained by Emanuel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/porttagger_base_en_5.2.0_3.0_1699384183984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/porttagger_base_en_5.2.0_3.0_1699384183984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("porttagger_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("porttagger_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|porttagger_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Emanuel/porttagger-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-postagger_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-07-postagger_portuguese_pt.md new file mode 100644 index 000000000000..86394a95fd08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-postagger_portuguese_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese postagger_portuguese BertForTokenClassification from lisaterumi +author: John Snow Labs +name: postagger_portuguese +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `postagger_portuguese` is a Portuguese model originally trained by lisaterumi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/postagger_portuguese_pt_5.2.0_3.0_1699386278787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/postagger_portuguese_pt_5.2.0_3.0_1699386278787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("postagger_portuguese","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("postagger_portuguese", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|postagger_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|406.0 MB| + +## References + +https://huggingface.co/lisaterumi/postagger-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-products_ner8_en.md b/docs/_posts/ahmedlone127/2023-11-07-products_ner8_en.md new file mode 100644 index 000000000000..64c56a3451d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-products_ner8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English products_ner8 BertForTokenClassification from Atheer174 +author: John Snow Labs +name: products_ner8 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `products_ner8` is an English model originally trained by Atheer174. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/products_ner8_en_5.2.0_3.0_1699386899051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/products_ner8_en_5.2.0_3.0_1699386899051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("products_ner8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("products_ner8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|products_ner8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Atheer174/Products_NER8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-resumeparserbert_en.md b/docs/_posts/ahmedlone127/2023-11-07-resumeparserbert_en.md new file mode 100644 index 000000000000..32edff2c8e4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-resumeparserbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English resumeparserbert BertForTokenClassification from sravya-abburi +author: John Snow Labs +name: resumeparserbert +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`resumeparserbert` is a English model originally trained by sravya-abburi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/resumeparserbert_en_5.2.0_3.0_1699383698898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/resumeparserbert_en_5.2.0_3.0_1699383698898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("resumeparserbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("resumeparserbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|resumeparserbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/sravya-abburi/ResumeParserBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-roberta_finetuned_privacy_detection_zh.md b/docs/_posts/ahmedlone127/2023-11-07-roberta_finetuned_privacy_detection_zh.md new file mode 100644 index 000000000000..a10cdd8aaaab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-roberta_finetuned_privacy_detection_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese roberta_finetuned_privacy_detection BertForTokenClassification from gyr66 +author: John Snow Labs +name: roberta_finetuned_privacy_detection +date: 2023-11-07 +tags: [bert, zh, open_source, token_classification, onnx] +task: Named Entity Recognition +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_finetuned_privacy_detection` is a Chinese model originally trained by gyr66. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_privacy_detection_zh_5.2.0_3.0_1699386723039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_privacy_detection_zh_5.2.0_3.0_1699386723039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("roberta_finetuned_privacy_detection","zh") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("roberta_finetuned_privacy_detection", "zh") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_privacy_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|zh| +|Size:|1.2 GB| + +## References + +https://huggingface.co/gyr66/RoBERTa-finetuned-privacy-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-rubert_base_cased_conversational_ner_v1_en.md b/docs/_posts/ahmedlone127/2023-11-07-rubert_base_cased_conversational_ner_v1_en.md new file mode 100644 index 000000000000..762f44e1b9fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-rubert_base_cased_conversational_ner_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_base_cased_conversational_ner_v1 BertForTokenClassification from Data-Lab +author: John Snow Labs +name: rubert_base_cased_conversational_ner_v1 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_cased_conversational_ner_v1` is a English model originally trained by Data-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_cased_conversational_ner_v1_en_5.2.0_3.0_1699385391187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_cased_conversational_ner_v1_en_5.2.0_3.0_1699385391187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("rubert_base_cased_conversational_ner_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_base_cased_conversational_ner_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_cased_conversational_ner_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|662.2 MB| + +## References + +https://huggingface.co/Data-Lab/rubert-base-cased-conversational_ner-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-rubert_base_massive_ner_ru.md b/docs/_posts/ahmedlone127/2023-11-07-rubert_base_massive_ner_ru.md new file mode 100644 index 000000000000..90ecab1e70c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-rubert_base_massive_ner_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian rubert_base_massive_ner BertForTokenClassification from 0x7194633 +author: John Snow Labs +name: rubert_base_massive_ner +date: 2023-11-07 +tags: [bert, ru, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_base_massive_ner` is a Russian model originally trained by 0x7194633. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_massive_ner_ru_5.2.0_3.0_1699389487868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_massive_ner_ru_5.2.0_3.0_1699389487868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("rubert_base_massive_ner","ru") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_base_massive_ner", "ru") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_massive_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ru| +|Size:|664.6 MB| + +## References + +https://huggingface.co/0x7194633/rubert-base-massive-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-rubert_ext_sum_gazeta_ru.md b/docs/_posts/ahmedlone127/2023-11-07-rubert_ext_sum_gazeta_ru.md new file mode 100644 index 000000000000..d39b9e1760ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-rubert_ext_sum_gazeta_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian rubert_ext_sum_gazeta BertForTokenClassification from IlyaGusev +author: John Snow Labs +name: rubert_ext_sum_gazeta +date: 2023-11-07 +tags: [bert, ru, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_ext_sum_gazeta` is a Russian model originally trained by IlyaGusev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_ext_sum_gazeta_ru_5.2.0_3.0_1699383839435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_ext_sum_gazeta_ru_5.2.0_3.0_1699383839435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("rubert_ext_sum_gazeta","ru") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_ext_sum_gazeta", "ru") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_ext_sum_gazeta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ru| +|Size:|664.3 MB| + +## References + +https://huggingface.co/IlyaGusev/rubert_ext_sum_gazeta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-rubert_tiny_obj_asp_en.md b/docs/_posts/ahmedlone127/2023-11-07-rubert_tiny_obj_asp_en.md new file mode 100644 index 000000000000..4012884002b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-rubert_tiny_obj_asp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_tiny_obj_asp BertForTokenClassification from lilaspourpre +author: John Snow Labs +name: rubert_tiny_obj_asp +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`rubert_tiny_obj_asp` is a English model originally trained by lilaspourpre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny_obj_asp_en_5.2.0_3.0_1699384483183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny_obj_asp_en_5.2.0_3.0_1699384483183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("rubert_tiny_obj_asp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_tiny_obj_asp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny_obj_asp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|43.8 MB| + +## References + +https://huggingface.co/lilaspourpre/rubert-tiny-obj-asp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-russian_damage_trigger_effect_4_en.md b/docs/_posts/ahmedlone127/2023-11-07-russian_damage_trigger_effect_4_en.md new file mode 100644 index 000000000000..9e53b52b69aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-russian_damage_trigger_effect_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English russian_damage_trigger_effect_4 BertForTokenClassification from Lolimorimorf +author: John Snow Labs +name: russian_damage_trigger_effect_4 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`russian_damage_trigger_effect_4` is a English model originally trained by Lolimorimorf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/russian_damage_trigger_effect_4_en_5.2.0_3.0_1699387304257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/russian_damage_trigger_effect_4_en_5.2.0_3.0_1699387304257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("russian_damage_trigger_effect_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("russian_damage_trigger_effect_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|russian_damage_trigger_effect_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Lolimorimorf/russian_damage_trigger_effect_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-sayula_popoluca_thai_th.md b/docs/_posts/ahmedlone127/2023-11-07-sayula_popoluca_thai_th.md new file mode 100644 index 000000000000..26c9aad8e99a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-sayula_popoluca_thai_th.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Thai sayula_popoluca_thai BertForTokenClassification from lunarlist +author: John Snow Labs +name: sayula_popoluca_thai +date: 2023-11-07 +tags: [bert, th, open_source, token_classification, onnx] +task: Named Entity Recognition +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`sayula_popoluca_thai` is a Thai model originally trained by lunarlist. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sayula_popoluca_thai_th_5.2.0_3.0_1699388742737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sayula_popoluca_thai_th_5.2.0_3.0_1699388742737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("sayula_popoluca_thai","th") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("sayula_popoluca_thai", "th") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sayula_popoluca_thai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|th| +|Size:|344.8 MB| + +## References + +https://huggingface.co/lunarlist/pos_thai \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-scbert_ser3_en.md b/docs/_posts/ahmedlone127/2023-11-07-scbert_ser3_en.md new file mode 100644 index 000000000000..3583bd9643e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-scbert_ser3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scbert_ser3 BertForTokenClassification from havens2 +author: John Snow Labs +name: scbert_ser3 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scbert_ser3` is a English model originally trained by havens2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scbert_ser3_en_5.2.0_3.0_1699385594161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scbert_ser3_en_5.2.0_3.0_1699385594161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("scbert_ser3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("scbert_ser3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scbert_ser3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/havens2/scBERT_SER3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-scibert_finetuned_ner_eeshclusive_en.md b/docs/_posts/ahmedlone127/2023-11-07-scibert_finetuned_ner_eeshclusive_en.md new file mode 100644 index 000000000000..edea0f9ea5de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-scibert_finetuned_ner_eeshclusive_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_finetuned_ner_eeshclusive BertForTokenClassification from eeshclusive +author: John Snow Labs +name: scibert_finetuned_ner_eeshclusive +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scibert_finetuned_ner_eeshclusive` is a English model originally trained by eeshclusive. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_eeshclusive_en_5.2.0_3.0_1699397236557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_eeshclusive_en_5.2.0_3.0_1699397236557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("scibert_finetuned_ner_eeshclusive","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_finetuned_ner_eeshclusive", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_finetuned_ner_eeshclusive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/eeshclusive/scibert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-scibert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-scibert_ner_en.md new file mode 100644 index 000000000000..ef26b4f5402b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-scibert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_ner BertForTokenClassification from devanshrj +author: John Snow Labs +name: scibert_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scibert_ner` is a English model originally trained by devanshrj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_ner_en_5.2.0_3.0_1699387144110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_ner_en_5.2.0_3.0_1699387144110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("scibert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/devanshrj/scibert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_finetuned_ner_jsylee_en.md b/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_finetuned_ner_jsylee_en.md new file mode 100644 index 000000000000..6d8d7371ca16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_finetuned_ner_jsylee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_scivocab_uncased_finetuned_ner_jsylee BertForTokenClassification from jsylee +author: John Snow Labs +name: scibert_scivocab_uncased_finetuned_ner_jsylee +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`scibert_scivocab_uncased_finetuned_ner_jsylee` is a English model originally trained by jsylee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_finetuned_ner_jsylee_en_5.2.0_3.0_1699383047707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_finetuned_ner_jsylee_en_5.2.0_3.0_1699383047707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("scibert_scivocab_uncased_finetuned_ner_jsylee","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_scivocab_uncased_finetuned_ner_jsylee", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_scivocab_uncased_finetuned_ner_jsylee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/jsylee/scibert_scivocab_uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_ner_visbank_en.md b/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_ner_visbank_en.md new file mode 100644 index 000000000000..f3222e01561b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-scibert_scivocab_uncased_ner_visbank_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_scivocab_uncased_ner_visbank BertForTokenClassification from Yamei +author: John Snow Labs +name: scibert_scivocab_uncased_ner_visbank +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_scivocab_uncased_ner_visbank` is an English model originally trained by Yamei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_ner_visbank_en_5.2.0_3.0_1699401483817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_ner_visbank_en_5.2.0_3.0_1699401483817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("scibert_scivocab_uncased_ner_visbank","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_scivocab_uncased_ner_visbank", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_scivocab_uncased_ner_visbank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/Yamei/scibert_scivocab_uncased_NER_VISBank \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-sindhi_geneprod_roles_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-sindhi_geneprod_roles_v2_en.md new file mode 100644 index 000000000000..6fbbf2293c04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-sindhi_geneprod_roles_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sindhi_geneprod_roles_v2 BertForTokenClassification from EMBO +author: John Snow Labs +name: sindhi_geneprod_roles_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sindhi_geneprod_roles_v2` is an English model originally trained by EMBO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sindhi_geneprod_roles_v2_en_5.2.0_3.0_1699388297204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sindhi_geneprod_roles_v2_en_5.2.0_3.0_1699388297204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("sindhi_geneprod_roles_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("sindhi_geneprod_roles_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sindhi_geneprod_roles_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/EMBO/sd-geneprod-roles-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-sindhi_ner_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-sindhi_ner_v2_en.md new file mode 100644 index 000000000000..2afd4a307976 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-sindhi_ner_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sindhi_ner_v2 BertForTokenClassification from EMBO +author: John Snow Labs +name: sindhi_ner_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sindhi_ner_v2` is an English model originally trained by EMBO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sindhi_ner_v2_en_5.2.0_3.0_1699387293106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sindhi_ner_v2_en_5.2.0_3.0_1699387293106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("sindhi_ner_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("sindhi_ner_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sindhi_ner_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/EMBO/sd-ner-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-sindhi_panelization_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-sindhi_panelization_v2_en.md new file mode 100644 index 000000000000..e691e5e2eb55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-sindhi_panelization_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sindhi_panelization_v2 BertForTokenClassification from EMBO +author: John Snow Labs +name: sindhi_panelization_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sindhi_panelization_v2` is an English model originally trained by EMBO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sindhi_panelization_v2_en_5.2.0_3.0_1699388220382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sindhi_panelization_v2_en_5.2.0_3.0_1699388220382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("sindhi_panelization_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("sindhi_panelization_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sindhi_panelization_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/EMBO/sd-panelization-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-sindhi_smallmol_roles_v2_en.md b/docs/_posts/ahmedlone127/2023-11-07-sindhi_smallmol_roles_v2_en.md new file mode 100644 index 000000000000..b49b0b26f4d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-sindhi_smallmol_roles_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sindhi_smallmol_roles_v2 BertForTokenClassification from EMBO +author: John Snow Labs +name: sindhi_smallmol_roles_v2 +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sindhi_smallmol_roles_v2` is an English model originally trained by EMBO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sindhi_smallmol_roles_v2_en_5.2.0_3.0_1699387916066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sindhi_smallmol_roles_v2_en_5.2.0_3.0_1699387916066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("sindhi_smallmol_roles_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("sindhi_smallmol_roles_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sindhi_smallmol_roles_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/EMBO/sd-smallmol-roles-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-skill_role_mapper_en.md b/docs/_posts/ahmedlone127/2023-11-07-skill_role_mapper_en.md new file mode 100644 index 000000000000..ea3fb5ec6c57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-skill_role_mapper_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English skill_role_mapper BertForTokenClassification from MehdiHosseiniMoghadam +author: John Snow Labs +name: skill_role_mapper +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skill_role_mapper` is an English model originally trained by MehdiHosseiniMoghadam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skill_role_mapper_en_5.2.0_3.0_1699386711647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skill_role_mapper_en_5.2.0_3.0_1699386711647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("skill_role_mapper","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("skill_role_mapper", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skill_role_mapper| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.8 MB| + +## References + +https://huggingface.co/MehdiHosseiniMoghadam/skill-role-mapper \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-skillner_en.md b/docs/_posts/ahmedlone127/2023-11-07-skillner_en.md new file mode 100644 index 000000000000..4e1684837323 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-skillner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English skillner BertForTokenClassification from ihk +author: John Snow Labs +name: skillner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skillner` is an English model originally trained by ihk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skillner_en_5.2.0_3.0_1699329137492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skillner_en_5.2.0_3.0_1699329137492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("skillner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("skillner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skillner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/ihk/skillner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-spanish_capitalization_punctuation_restoration_es.md b/docs/_posts/ahmedlone127/2023-11-07-spanish_capitalization_punctuation_restoration_es.md new file mode 100644 index 000000000000..a7d2dac1df26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-spanish_capitalization_punctuation_restoration_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish spanish_capitalization_punctuation_restoration BertForTokenClassification from UMUTeam +author: John Snow Labs +name: spanish_capitalization_punctuation_restoration +date: 2023-11-07 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish_capitalization_punctuation_restoration` is a Castilian, Spanish model originally trained by UMUTeam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanish_capitalization_punctuation_restoration_es_5.2.0_3.0_1699330499358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanish_capitalization_punctuation_restoration_es_5.2.0_3.0_1699330499358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("spanish_capitalization_punctuation_restoration","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("spanish_capitalization_punctuation_restoration", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanish_capitalization_punctuation_restoration| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.6 MB| + +## References + +https://huggingface.co/UMUTeam/spanish_capitalization_punctuation_restoration \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-species_identification_mbert_fine_tuned_train_test_en.md b/docs/_posts/ahmedlone127/2023-11-07-species_identification_mbert_fine_tuned_train_test_en.md new file mode 100644 index 000000000000..9b4ada8ef994 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-species_identification_mbert_fine_tuned_train_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English species_identification_mbert_fine_tuned_train_test BertForTokenClassification from ajtamayoh +author: John Snow Labs +name: species_identification_mbert_fine_tuned_train_test +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `species_identification_mbert_fine_tuned_train_test` is an English model originally trained by ajtamayoh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/species_identification_mbert_fine_tuned_train_test_en_5.2.0_3.0_1699389436904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/species_identification_mbert_fine_tuned_train_test_en_5.2.0_3.0_1699389436904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("species_identification_mbert_fine_tuned_train_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("species_identification_mbert_fine_tuned_train_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|species_identification_mbert_fine_tuned_train_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.2 MB| + +## References + +https://huggingface.co/ajtamayoh/Species_Identification_mBERT_fine_tuned_Train_Test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-tempclin_biobertpt_all_pt.md b/docs/_posts/ahmedlone127/2023-11-07-tempclin_biobertpt_all_pt.md new file mode 100644 index 000000000000..0df4158c5741 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-tempclin_biobertpt_all_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese tempclin_biobertpt_all BertForTokenClassification from pucpr-br +author: John Snow Labs +name: tempclin_biobertpt_all +date: 2023-11-07 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tempclin_biobertpt_all` is a Portuguese model originally trained by pucpr-br. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tempclin_biobertpt_all_pt_5.2.0_3.0_1699386094949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tempclin_biobertpt_all_pt_5.2.0_3.0_1699386094949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("tempclin_biobertpt_all","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("tempclin_biobertpt_all", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tempclin_biobertpt_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.9 MB| + +## References + +https://huggingface.co/pucpr-br/tempclin-biobertpt-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-tiny_random_bertfortokenclassification_hf_internal_testing_en.md b/docs/_posts/ahmedlone127/2023-11-07-tiny_random_bertfortokenclassification_hf_internal_testing_en.md new file mode 100644 index 000000000000..d0e912198e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-tiny_random_bertfortokenclassification_hf_internal_testing_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_random_bertfortokenclassification_hf_internal_testing BertForTokenClassification from hf-internal-testing +author: John Snow Labs +name: tiny_random_bertfortokenclassification_hf_internal_testing +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_bertfortokenclassification_hf_internal_testing` is an English model originally trained by hf-internal-testing.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_bertfortokenclassification_hf_internal_testing_en_5.2.0_3.0_1699384782616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_bertfortokenclassification_hf_internal_testing_en_5.2.0_3.0_1699384782616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("tiny_random_bertfortokenclassification_hf_internal_testing","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("tiny_random_bertfortokenclassification_hf_internal_testing", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_bertfortokenclassification_hf_internal_testing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|349.9 KB| + +## References + +https://huggingface.co/hf-internal-testing/tiny-random-BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-toponym_19thc_english_en.md b/docs/_posts/ahmedlone127/2023-11-07-toponym_19thc_english_en.md new file mode 100644 index 000000000000..d11bd7d01f13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-toponym_19thc_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English toponym_19thc_english BertForTokenClassification from Livingwithmachines +author: John Snow Labs +name: toponym_19thc_english +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toponym_19thc_english` is an English model originally trained by Livingwithmachines. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toponym_19thc_english_en_5.2.0_3.0_1699388663159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toponym_19thc_english_en_5.2.0_3.0_1699388663159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("toponym_19thc_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("toponym_19thc_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toponym_19thc_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.0 MB| + +## References + +https://huggingface.co/Livingwithmachines/toponym-19thC-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-treatment_disease_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-treatment_disease_ner_en.md new file mode 100644 index 000000000000..1da0f0b39ee4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-treatment_disease_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English treatment_disease_ner BertForTokenClassification from jnferfer +author: John Snow Labs +name: treatment_disease_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `treatment_disease_ner` is an English model originally trained by jnferfer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/treatment_disease_ner_en_5.2.0_3.0_1699386357805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/treatment_disease_ner_en_5.2.0_3.0_1699386357805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("treatment_disease_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("treatment_disease_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|treatment_disease_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/jnferfer/treatment-disease-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-unbias_named_entity_recognition_en.md b/docs/_posts/ahmedlone127/2023-11-07-unbias_named_entity_recognition_en.md new file mode 100644 index 000000000000..f746cbe8c17f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-unbias_named_entity_recognition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English unbias_named_entity_recognition BertForTokenClassification from newsmediabias +author: John Snow Labs +name: unbias_named_entity_recognition +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbias_named_entity_recognition` is an English model originally trained by newsmediabias. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unbias_named_entity_recognition_en_5.2.0_3.0_1699386641857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unbias_named_entity_recognition_en_5.2.0_3.0_1699386641857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("unbias_named_entity_recognition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("unbias_named_entity_recognition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unbias_named_entity_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/newsmediabias/UnBIAS-Named-Entity-Recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-unbias_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-unbias_ner_en.md new file mode 100644 index 000000000000..74f70b4e4d97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-unbias_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English unbias_ner BertForTokenClassification from newsmediabias +author: John Snow Labs +name: unbias_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbias_ner` is an English model originally trained by newsmediabias. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unbias_ner_en_5.2.0_3.0_1699386172323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unbias_ner_en_5.2.0_3.0_1699386172323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("unbias_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("unbias_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unbias_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/newsmediabias/UnBIAS-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-unicausal_tok_baseline_en.md b/docs/_posts/ahmedlone127/2023-11-07-unicausal_tok_baseline_en.md new file mode 100644 index 000000000000..6aa83ed26f94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-unicausal_tok_baseline_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English unicausal_tok_baseline BertForTokenClassification from tanfiona +author: John Snow Labs +name: unicausal_tok_baseline +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unicausal_tok_baseline` is an English model originally trained by tanfiona. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unicausal_tok_baseline_en_5.2.0_3.0_1699387468133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unicausal_tok_baseline_en_5.2.0_3.0_1699387468133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("unicausal_tok_baseline","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("unicausal_tok_baseline", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unicausal_tok_baseline| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/tanfiona/unicausal-tok-baseline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-urdu_bert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-07-urdu_bert_ner_en.md new file mode 100644 index 000000000000..0c1c2af896f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-urdu_bert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English urdu_bert_ner BertForTokenClassification from mirfan899 +author: John Snow Labs +name: urdu_bert_ner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urdu_bert_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urdu_bert_ner_en_5.2.0_3.0_1699399089364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urdu_bert_ner_en_5.2.0_3.0_1699399089364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("urdu_bert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("urdu_bert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|urdu_bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/mirfan899/urdu-bert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-vila_scibert_cased_s2vl_en.md b/docs/_posts/ahmedlone127/2023-11-07-vila_scibert_cased_s2vl_en.md new file mode 100644 index 000000000000..5fea228cf27e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-vila_scibert_cased_s2vl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vila_scibert_cased_s2vl BertForTokenClassification from allenai +author: John Snow Labs +name: vila_scibert_cased_s2vl +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vila_scibert_cased_s2vl` is an English model originally trained by allenai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vila_scibert_cased_s2vl_en_5.2.0_3.0_1699385199476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vila_scibert_cased_s2vl_en_5.2.0_3.0_1699385199476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("vila_scibert_cased_s2vl","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("vila_scibert_cased_s2vl", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vila_scibert_cased_s2vl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/allenai/vila-scibert-cased-s2vl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_base_en.md b/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_base_en.md new file mode 100644 index 000000000000..c5eaae7f10f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wikiser_bert_base BertForTokenClassification from taidng +author: John Snow Labs +name: wikiser_bert_base +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikiser_bert_base` is an English model originally trained by taidng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wikiser_bert_base_en_5.2.0_3.0_1699384974903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wikiser_bert_base_en_5.2.0_3.0_1699384974903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("wikiser_bert_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("wikiser_bert_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wikiser_bert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/taidng/wikiser-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_large_en.md b/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_large_en.md new file mode 100644 index 000000000000..7d10346d6b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-wikiser_bert_large_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wikiser_bert_large BertForTokenClassification from taidng +author: John Snow Labs +name: wikiser_bert_large +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikiser_bert_large` is an English model originally trained by taidng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wikiser_bert_large_en_5.2.0_3.0_1699387224834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wikiser_bert_large_en_5.2.0_3.0_1699387224834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("wikiser_bert_large","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("wikiser_bert_large", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wikiser_bert_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/taidng/wikiser-bert-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-07-zeroshotbioner_en.md b/docs/_posts/ahmedlone127/2023-11-07-zeroshotbioner_en.md new file mode 100644 index 000000000000..38ec98ccaa8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-07-zeroshotbioner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English zeroshotbioner BertForTokenClassification from ProdicusII +author: John Snow Labs +name: zeroshotbioner +date: 2023-11-07 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `zeroshotbioner` is an English model originally trained by ProdicusII. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/zeroshotbioner_en_5.2.0_3.0_1699386730433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/zeroshotbioner_en_5.2.0_3.0_1699386730433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("zeroshotbioner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("zeroshotbioner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|zeroshotbioner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.2 MB| + +## References + +https://huggingface.co/ProdicusII/ZeroShotBioNER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-11_711_project_2_en.md b/docs/_posts/ahmedlone127/2023-11-08-11_711_project_2_en.md new file mode 100644 index 000000000000..5e34a0960c0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-11_711_project_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English 11_711_project_2 BertForTokenClassification from yitengm +author: John Snow Labs +name: 11_711_project_2 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `11_711_project_2` is an English model originally trained by yitengm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/11_711_project_2_en_5.2.0_3.0_1699431406724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/11_711_project_2_en_5.2.0_3.0_1699431406724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("11_711_project_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("11_711_project_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|11_711_project_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/yitengm/11-711-project-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-adres_ner_v2_bert_128k_tr.md b/docs/_posts/ahmedlone127/2023-11-08-adres_ner_v2_bert_128k_tr.md new file mode 100644 index 000000000000..5eb6db6fde19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-adres_ner_v2_bert_128k_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish adres_ner_v2_bert_128k BertForTokenClassification from deprem-ml +author: John Snow Labs +name: adres_ner_v2_bert_128k +date: 2023-11-08 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adres_ner_v2_bert_128k` is a Turkish model originally trained by deprem-ml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adres_ner_v2_bert_128k_tr_5.2.0_3.0_1699429202686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adres_ner_v2_bert_128k_tr_5.2.0_3.0_1699429202686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("adres_ner_v2_bert_128k","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("adres_ner_v2_bert_128k", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adres_ner_v2_bert_128k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|689.0 MB| + +## References + +https://huggingface.co/deprem-ml/adres_ner_v2_bert_128k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-aldi_token_di_en.md b/docs/_posts/ahmedlone127/2023-11-08-aldi_token_di_en.md new file mode 100644 index 000000000000..9ceea3604936 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-aldi_token_di_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English aldi_token_di BertForTokenClassification from AMR-KELEG +author: John Snow Labs +name: aldi_token_di +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aldi_token_di` is an English model originally trained by AMR-KELEG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aldi_token_di_en_5.2.0_3.0_1699435695095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aldi_token_di_en_5.2.0_3.0_1699435695095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("aldi_token_di","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("aldi_token_di", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aldi_token_di| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|608.7 MB| + +## References + +https://huggingface.co/AMR-KELEG/ALDi-Token-DI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-all_15_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-all_15_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..db05f7e24ab5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-all_15_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English all_15_bert_finetuned_ner BertForTokenClassification from leo93 +author: John Snow Labs +name: all_15_bert_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_15_bert_finetuned_ner` is an English model originally trained by leo93. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_15_bert_finetuned_ner_en_5.2.0_3.0_1699411430446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_15_bert_finetuned_ner_en_5.2.0_3.0_1699411430446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("all_15_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("all_15_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_15_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/leo93/all-15-bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-arabert_finetuned_caner_en.md b/docs/_posts/ahmedlone127/2023-11-08-arabert_finetuned_caner_en.md new file mode 100644 index 000000000000..8822850a1bd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-arabert_finetuned_caner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English arabert_finetuned_caner BertForTokenClassification from Montazer +author: John Snow Labs +name: arabert_finetuned_caner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert_finetuned_caner` is an English model originally trained by Montazer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabert_finetuned_caner_en_5.2.0_3.0_1699464760507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabert_finetuned_caner_en_5.2.0_3.0_1699464760507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("arabert_finetuned_caner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("arabert_finetuned_caner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabert_finetuned_caner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.2 MB| + +## References + +https://huggingface.co/Montazer/arabert-finetuned-caner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-archaeobert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-archaeobert_ner_en.md new file mode 100644 index 000000000000..943d00e0c708 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-archaeobert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English archaeobert_ner BertForTokenClassification from alexbrandsen +author: John Snow Labs +name: archaeobert_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `archaeobert_ner` is an English model originally trained by alexbrandsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/archaeobert_ner_en_5.2.0_3.0_1699419891519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/archaeobert_ner_en_5.2.0_3.0_1699419891519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("archaeobert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("archaeobert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|archaeobert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/alexbrandsen/ArchaeoBERT-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-archeological_ner_english_en.md b/docs/_posts/ahmedlone127/2023-11-08-archeological_ner_english_en.md new file mode 100644 index 000000000000..d19c16332789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-archeological_ner_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English archeological_ner_english BertForTokenClassification from nicolauduran45 +author: John Snow Labs +name: archeological_ner_english +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `archeological_ner_english` is an English model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/archeological_ner_english_en_5.2.0_3.0_1699444479821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/archeological_ner_english_en_5.2.0_3.0_1699444479821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("archeological_ner_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("archeological_ner_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|archeological_ner_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/nicolauduran45/archeological_ner_en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt10_en.md b/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt10_en.md new file mode 100644 index 000000000000..fb6213c446de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English assignment2_attempt10 BertForTokenClassification from mpalaval +author: John Snow Labs +name: assignment2_attempt10 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `assignment2_attempt10` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/assignment2_attempt10_en_5.2.0_3.0_1699418198251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/assignment2_attempt10_en_5.2.0_3.0_1699418198251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("assignment2_attempt10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("assignment2_attempt10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|assignment2_attempt10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/assignment2_attempt10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt11_en.md b/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt11_en.md new file mode 100644 index 000000000000..eef01be57d98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-assignment2_attempt11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English assignment2_attempt11 BertForTokenClassification from mpalaval +author: John Snow Labs +name: assignment2_attempt11 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `assignment2_attempt11` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/assignment2_attempt11_en_5.2.0_3.0_1699428569679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/assignment2_attempt11_en_5.2.0_3.0_1699428569679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("assignment2_attempt11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("assignment2_attempt11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|assignment2_attempt11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/assignment2_attempt11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-autotrain_ner_8_86129142996_en.md b/docs/_posts/ahmedlone127/2023-11-08-autotrain_ner_8_86129142996_en.md new file mode 100644 index 000000000000..f92a25de9fd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-autotrain_ner_8_86129142996_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_ner_8_86129142996 BertForTokenClassification from smirki +author: John Snow Labs +name: autotrain_ner_8_86129142996 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_ner_8_86129142996` is an English model originally trained by smirki. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_ner_8_86129142996_en_5.2.0_3.0_1699448171608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_ner_8_86129142996_en_5.2.0_3.0_1699448171608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("autotrain_ner_8_86129142996","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("autotrain_ner_8_86129142996", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_ner_8_86129142996| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/smirki/autotrain-ner-8-86129142996 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-autotrain_re_syn_cleanedtext_bert_55272128958_en.md b/docs/_posts/ahmedlone127/2023-11-08-autotrain_re_syn_cleanedtext_bert_55272128958_en.md new file mode 100644 index 000000000000..915982ba6be7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-autotrain_re_syn_cleanedtext_bert_55272128958_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_re_syn_cleanedtext_bert_55272128958 BertForTokenClassification from sxandie +author: John Snow Labs +name: autotrain_re_syn_cleanedtext_bert_55272128958 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_re_syn_cleanedtext_bert_55272128958` is an English model originally trained by sxandie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_re_syn_cleanedtext_bert_55272128958_en_5.2.0_3.0_1699416369016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_re_syn_cleanedtext_bert_55272128958_en_5.2.0_3.0_1699416369016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("autotrain_re_syn_cleanedtext_bert_55272128958","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("autotrain_re_syn_cleanedtext_bert_55272128958", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_re_syn_cleanedtext_bert_55272128958| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/sxandie/autotrain-re_syn_cleanedtext_bert-55272128958 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert4ner_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert4ner_base_uncased_en.md new file mode 100644 index 000000000000..4a7200f0817d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert4ner_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert4ner_base_uncased BertForTokenClassification from shibing624 +author: John Snow Labs +name: bert4ner_base_uncased +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert4ner_base_uncased` is an English model originally trained by shibing624. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert4ner_base_uncased_en_5.2.0_3.0_1699429469899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert4ner_base_uncased_en_5.2.0_3.0_1699429469899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert4ner_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert4ner_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert4ner_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/shibing624/bert4ner-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_bangla_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_bangla_ner_en.md new file mode 100644 index 000000000000..b303f659092c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_bangla_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_bangla_ner BertForTokenClassification from Kowsher +author: John Snow Labs +name: bert_base_bangla_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_bangla_ner` is an English model originally trained by Kowsher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_bangla_ner_en_5.2.0_3.0_1699472427679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_bangla_ner_en_5.2.0_3.0_1699472427679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_bangla_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_bangla_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_bangla_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|612.1 MB| + +## References + +https://huggingface.co/Kowsher/bert-base-bangla-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_finetuned_ner_intpc_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_finetuned_ner_intpc_en.md new file mode 100644 index 000000000000..a168be252181 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_finetuned_ner_intpc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_ner_intpc BertForTokenClassification from intpc +author: John Snow Labs +name: bert_base_cased_finetuned_ner_intpc +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_ner_intpc` is an English model originally trained by intpc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_intpc_en_5.2.0_3.0_1699465436849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_intpc_en_5.2.0_3.0_1699465436849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_finetuned_ner_intpc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_finetuned_ner_intpc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_ner_intpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/intpc/bert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_ner_conll2003_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_ner_conll2003_finetuned_ner_en.md new file mode 100644 index 000000000000..999429f73393 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_cased_ner_conll2003_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_ner_conll2003_finetuned_ner BertForTokenClassification from codenet +author: John Snow Labs +name: bert_base_cased_ner_conll2003_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ner_conll2003_finetuned_ner` is an English model originally trained by codenet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ner_conll2003_finetuned_ner_en_5.2.0_3.0_1699440670490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ner_conll2003_finetuned_ner_en_5.2.0_3.0_1699440670490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_ner_conll2003_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_ner_conll2003_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ner_conll2003_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/codenet/bert-base-cased-ner-conll2003-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_ner_agdsga_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_ner_agdsga_en.md new file mode 100644 index 000000000000..4c6a7334d6a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_ner_agdsga_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_ner_agdsga BertForTokenClassification from agdsga +author: John Snow Labs +name: bert_base_chinese_finetuned_ner_agdsga +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_ner_agdsga` is an English model originally trained by agdsga. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_agdsga_en_5.2.0_3.0_1699444328802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_ner_agdsga_en_5.2.0_3.0_1699444328802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_finetuned_ner_agdsga","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_chinese_finetuned_ner_agdsga", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_ner_agdsga| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.3 MB| + +## References + +https://huggingface.co/agdsga/bert-base-chinese-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_split_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_split_en.md new file mode 100644 index 000000000000..2fdfcab9c9c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_chinese_finetuned_split_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_split BertForTokenClassification from zhiguoxu +author: John Snow Labs +name: bert_base_chinese_finetuned_split +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_split` is an English model originally trained by zhiguoxu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_split_en_5.2.0_3.0_1699407784326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_split_en_5.2.0_3.0_1699407784326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_finetuned_split","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_chinese_finetuned_split", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_split| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.2 MB| + +## References + +https://huggingface.co/zhiguoxu/bert-base-chinese-finetuned-split \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_finetuned_ner_en.md new file mode 100644 index 000000000000..06a05e3c8272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_finetuned_ner BertForTokenClassification from eeshclusive +author: John Snow Labs +name: bert_base_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_ner` is an English model originally trained by eeshclusive. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ner_en_5.2.0_3.0_1699406360106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_ner_en_5.2.0_3.0_1699406360106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/eeshclusive/bert-base-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_german_finetuned_ler_de.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_german_finetuned_ler_de.md new file mode 100644 index 000000000000..6d1c11203b84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_german_finetuned_ler_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German bert_base_german_finetuned_ler BertForTokenClassification from mrm8488 +author: John Snow Labs +name: bert_base_german_finetuned_ler +date: 2023-11-08 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_finetuned_ler` is a German model originally trained by mrm8488. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_finetuned_ler_de_5.2.0_3.0_1699451946318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_finetuned_ler_de_5.2.0_3.0_1699451946318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_german_finetuned_ler","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_german_finetuned_ler", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_finetuned_ler| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|407.0 MB| + +## References + +https://huggingface.co/mrm8488/bert-base-german-finetuned-ler \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_conll03_dutch_xx.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_conll03_dutch_xx.md new file mode 100644 index 000000000000..19615fd21620 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_conll03_dutch_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_conll03_dutch BertForTokenClassification from dbmdz +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_conll03_dutch +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_conll03_dutch` is a Multilingual model originally trained by dbmdz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_conll03_dutch_xx_5.2.0_3.0_1699487190108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_conll03_dutch_xx_5.2.0_3.0_1699487190108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_conll03_dutch","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_finetuned_conll03_dutch", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_conll03_dutch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/dbmdz/bert-base-multilingual-cased-finetuned-conll03-dutch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_ner_mayagalvez_xx.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_ner_mayagalvez_xx.md new file mode 100644 index 000000000000..17dd7bc3a107 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_multilingual_cased_finetuned_ner_mayagalvez_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_ner_mayagalvez BertForTokenClassification from MayaGalvez +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_ner_mayagalvez +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_ner_mayagalvez` is a Multilingual model originally trained by MayaGalvez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_ner_mayagalvez_xx_5.2.0_3.0_1699417045903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_ner_mayagalvez_xx_5.2.0_3.0_1699417045903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_multilingual_cased_finetuned_ner_mayagalvez","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_multilingual_cased_finetuned_ner_mayagalvez", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_ner_mayagalvez| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/MayaGalvez/bert-base-multilingual-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_ner_theseus_bulgarian_bg.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_ner_theseus_bulgarian_bg.md new file mode 100644 index 000000000000..d8b96a28d2a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_ner_theseus_bulgarian_bg.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Bulgarian bert_base_ner_theseus_bulgarian BertForTokenClassification from rmihaylov +author: John Snow Labs +name: bert_base_ner_theseus_bulgarian +date: 2023-11-08 +tags: [bert, bg, open_source, token_classification, onnx] +task: Named Entity Recognition +language: bg +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_ner_theseus_bulgarian` is a Bulgarian model originally trained by rmihaylov. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_ner_theseus_bulgarian_bg_5.2.0_3.0_1699437046276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_ner_theseus_bulgarian_bg_5.2.0_3.0_1699437046276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_ner_theseus_bulgarian","bg") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_ner_theseus_bulgarian", "bg") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_ner_theseus_bulgarian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|bg| +|Size:|505.6 MB| + +## References + +https://huggingface.co/rmihaylov/bert-base-ner-theseus-bg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner_en.md new file mode 100644 index 000000000000..34d314af8445 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner BertForTokenClassification from jordyvl +author: John Snow Labs +name: bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner` is an English model originally trained by jordyvl. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner_en_5.2.0_3.0_1699429469981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner_en_5.2.0_3.0_1699429469981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_harem_selective_lowc_samoan_first_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|405.9 MB| + +## References + +https://huggingface.co/jordyvl/bert-base-portuguese-cased_harem-selective-lowC-sm-first-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_samoan_first_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_samoan_first_ner_en.md new file mode 100644 index 000000000000..d671ad9c6491 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_portuguese_cased_harem_selective_samoan_first_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_portuguese_cased_harem_selective_samoan_first_ner BertForTokenClassification from jordyvl +author: John Snow Labs +name: bert_base_portuguese_cased_harem_selective_samoan_first_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_harem_selective_samoan_first_ner` is an English model originally trained by jordyvl. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_harem_selective_samoan_first_ner_en_5.2.0_3.0_1699413077831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_harem_selective_samoan_first_ner_en_5.2.0_3.0_1699413077831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_portuguese_cased_harem_selective_samoan_first_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_portuguese_cased_harem_selective_samoan_first_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_harem_selective_samoan_first_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/jordyvl/bert-base-portuguese-cased_harem-selective-sm-first-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_spanish_wwm_uncased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_spanish_wwm_uncased_finetuned_ner_en.md new file mode 100644 index 000000000000..e48428c15828 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_spanish_wwm_uncased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_finetuned_ner BertForTokenClassification from dccuchile +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_finetuned_ner` is an English model originally trained by dccuchile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_ner_en_5.2.0_3.0_1699407784218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_ner_en_5.2.0_3.0_1699407784218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_spanish_wwm_uncased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_spanish_wwm_uncased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.7 MB| + +## References + +https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_tweetner7_2021_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_tweetner7_2021_en.md new file mode 100644 index 000000000000..c7cc2650ed21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_tweetner7_2021_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_tweetner7_2021 BertForTokenClassification from tner +author: John Snow Labs +name: bert_base_tweetner7_2021 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_tweetner7_2021` is an English model originally trained by tner. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_tweetner7_2021_en_5.2.0_3.0_1699463053326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_tweetner7_2021_en_5.2.0_3.0_1699463053326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_tweetner7_2021","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_tweetner7_2021", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_tweetner7_2021| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/tner/bert-base-tweetner7-2021 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_brands_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_brands_en.md new file mode 100644 index 000000000000..eebd2bcf60f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_brands_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_brands BertForTokenClassification from tp-runport +author: John Snow Labs +name: bert_base_uncased_brands +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_brands` is an English model originally trained by tp-runport. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_brands_en_5.2.0_3.0_1699463224622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_brands_en_5.2.0_3.0_1699463224622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_brands","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_brands", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_brands| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/tp-runport/bert-base-uncased-brands \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_arbert_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_arbert_en.md new file mode 100644 index 000000000000..ab71b8f4dd06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_arbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_ner_arbert BertForTokenClassification from ArBert +author: John Snow Labs +name: bert_base_uncased_finetuned_ner_arbert +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_ner_arbert` is an English model originally trained by ArBert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_arbert_en_5.2.0_3.0_1699439046014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_arbert_en_5.2.0_3.0_1699439046014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_ner_arbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_ner_arbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_ner_arbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ArBert/bert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_kmeans_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_kmeans_en.md new file mode 100644 index 000000000000..351b17a9c826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_kmeans_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_ner_kmeans BertForTokenClassification from ArBert +author: John Snow Labs +name: bert_base_uncased_finetuned_ner_kmeans +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_ner_kmeans` is an English model originally trained by ArBert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_kmeans_en_5.2.0_3.0_1699450340620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_kmeans_en_5.2.0_3.0_1699450340620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_ner_kmeans","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_ner_kmeans", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_ner_kmeans| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/ArBert/bert-base-uncased-finetuned-ner-kmeans \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_sohamtiwari3120_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_sohamtiwari3120_en.md new file mode 100644 index 000000000000..07ffa5b8fbe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_uncased_finetuned_ner_sohamtiwari3120_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_ner_sohamtiwari3120 BertForTokenClassification from sohamtiwari3120 +author: John Snow Labs +name: bert_base_uncased_finetuned_ner_sohamtiwari3120 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_ner_sohamtiwari3120` is an English model originally trained by sohamtiwari3120.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_sohamtiwari3120_en_5.2.0_3.0_1699406908805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_ner_sohamtiwari3120_en_5.2.0_3.0_1699406908805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_ner_sohamtiwari3120","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_ner_sohamtiwari3120", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_ner_sohamtiwari3120| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/sohamtiwari3120/bert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_base_vietnamese_ud_goeswith_vi.md b/docs/_posts/ahmedlone127/2023-11-08-bert_base_vietnamese_ud_goeswith_vi.md new file mode 100644 index 000000000000..62049ef58d83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_base_vietnamese_ud_goeswith_vi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Vietnamese bert_base_vietnamese_ud_goeswith BertForTokenClassification from KoichiYasuoka +author: John Snow Labs +name: bert_base_vietnamese_ud_goeswith +date: 2023-11-08 +tags: [bert, vi, open_source, token_classification, onnx] +task: Named Entity Recognition +language: vi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_vietnamese_ud_goeswith` is a Vietnamese model originally trained by KoichiYasuoka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_vietnamese_ud_goeswith_vi_5.2.0_3.0_1699480348773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_vietnamese_ud_goeswith_vi_5.2.0_3.0_1699480348773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_vietnamese_ud_goeswith","vi") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_vietnamese_ud_goeswith", "vi") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_vietnamese_ud_goeswith| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|vi| +|Size:|429.9 MB| + +## References + +https://huggingface.co/KoichiYasuoka/bert-base-vietnamese-ud-goeswith \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_based_multilingual_cased_finetuned_lid_xx.md b/docs/_posts/ahmedlone127/2023-11-08-bert_based_multilingual_cased_finetuned_lid_xx.md new file mode 100644 index 000000000000..ef948d453e13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_based_multilingual_cased_finetuned_lid_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_based_multilingual_cased_finetuned_lid BertForTokenClassification from ashnadua01 +author: John Snow Labs +name: bert_based_multilingual_cased_finetuned_lid +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_based_multilingual_cased_finetuned_lid` is a Multilingual model originally trained by ashnadua01.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_based_multilingual_cased_finetuned_lid_xx_5.2.0_3.0_1699442196693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_based_multilingual_cased_finetuned_lid_xx_5.2.0_3.0_1699442196693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_based_multilingual_cased_finetuned_lid","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_based_multilingual_cased_finetuned_lid", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_based_multilingual_cased_finetuned_lid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/ashnadua01/bert-based-multilingual-cased-finetuned-lid \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_bulgarian_ner_bg.md b/docs/_posts/ahmedlone127/2023-11-08-bert_bulgarian_ner_bg.md new file mode 100644 index 000000000000..dff6e1d8060e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_bulgarian_ner_bg.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Bulgarian bert_bulgarian_ner BertForTokenClassification from auhide +author: John Snow Labs +name: bert_bulgarian_ner +date: 2023-11-08 +tags: [bert, bg, open_source, token_classification, onnx] +task: Named Entity Recognition +language: bg +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_bulgarian_ner` is a Bulgarian model originally trained by auhide. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_bulgarian_ner_bg_5.2.0_3.0_1699476803591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_bulgarian_ner_bg_5.2.0_3.0_1699476803591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_bulgarian_ner","bg") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_bulgarian_ner", "bg") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_bulgarian_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|bg| +|Size:|665.0 MB| + +## References + +https://huggingface.co/auhide/bert-bg-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_cased_keyword_discriminator_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_cased_keyword_discriminator_en.md new file mode 100644 index 000000000000..7bd24d43a136 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_cased_keyword_discriminator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_cased_keyword_discriminator BertForTokenClassification from yanekyuk +author: John Snow Labs +name: bert_cased_keyword_discriminator +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cased_keyword_discriminator` is an English model originally trained by yanekyuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cased_keyword_discriminator_en_5.2.0_3.0_1699478026532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cased_keyword_discriminator_en_5.2.0_3.0_1699478026532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_cased_keyword_discriminator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_cased_keyword_discriminator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cased_keyword_discriminator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/yanekyuk/bert-cased-keyword-discriminator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_dbmdz_3760_split_by_sentence_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_dbmdz_3760_split_by_sentence_en.md new file mode 100644 index 000000000000..5bea4f6163ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_dbmdz_3760_split_by_sentence_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_dbmdz_3760_split_by_sentence BertForTokenClassification from Gurkan +author: John Snow Labs +name: bert_dbmdz_3760_split_by_sentence +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_dbmdz_3760_split_by_sentence` is an English model originally trained by Gurkan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_dbmdz_3760_split_by_sentence_en_5.2.0_3.0_1699480089184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_dbmdz_3760_split_by_sentence_en_5.2.0_3.0_1699480089184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_dbmdz_3760_split_by_sentence","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_dbmdz_3760_split_by_sentence", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_dbmdz_3760_split_by_sentence| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/Gurkan/bert_dbmdz_3760_split_by_sentence_ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_bpmn_jtlicardo_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_bpmn_jtlicardo_en.md new file mode 100644 index 000000000000..fe4c6e928517 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_bpmn_jtlicardo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_bpmn_jtlicardo BertForTokenClassification from jtlicardo +author: John Snow Labs +name: bert_finetuned_bpmn_jtlicardo +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_bpmn_jtlicardo` is an English model originally trained by jtlicardo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_bpmn_jtlicardo_en_5.2.0_3.0_1699476803115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_bpmn_jtlicardo_en_5.2.0_3.0_1699476803115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_bpmn_jtlicardo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_bpmn_jtlicardo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_bpmn_jtlicardo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jtlicardo/bert-finetuned-bpmn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_food_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_food_en.md new file mode 100644 index 000000000000..e7a24103c3e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_food_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_food BertForTokenClassification from ZachBeesley +author: John Snow Labs +name: bert_finetuned_food +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_food` is an English model originally trained by ZachBeesley. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_food_en_5.2.0_3.0_1699463440797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_food_en_5.2.0_3.0_1699463440797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_food","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_food", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_food| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ZachBeesley/bert-finetuned-food \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abkbvknv_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abkbvknv_en.md new file mode 100644 index 000000000000..5a9b9f0fed76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abkbvknv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_abkbvknv BertForTokenClassification from abkbvknv +author: John Snow Labs +name: bert_finetuned_ner_abkbvknv +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_abkbvknv` is an English model originally trained by abkbvknv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abkbvknv_en_5.2.0_3.0_1699476310537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abkbvknv_en_5.2.0_3.0_1699476310537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_abkbvknv","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_abkbvknv", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_abkbvknv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/abkbvknv/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abx_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abx_en.md new file mode 100644 index 000000000000..59086101249e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_abx_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_abx BertForTokenClassification from abx +author: John Snow Labs +name: bert_finetuned_ner_abx +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_abx` is an English model originally trained by abx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abx_en_5.2.0_3.0_1699471734172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abx_en_5.2.0_3.0_1699471734172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_abx","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_abx", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_abx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/abx/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_atajti_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_atajti_en.md new file mode 100644 index 000000000000..a09156078a89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_atajti_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_accelerate_atajti BertForTokenClassification from atajti +author: John Snow Labs +name: bert_finetuned_ner_accelerate_atajti +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_accelerate_atajti` is an English model originally trained by atajti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_atajti_en_5.2.0_3.0_1699424845393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_atajti_en_5.2.0_3.0_1699424845393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_accelerate_atajti","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_accelerate_atajti", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_accelerate_atajti| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/atajti/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_loganathanspr_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_loganathanspr_en.md new file mode 100644 index 000000000000..87a6b0e78664 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_loganathanspr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_accelerate_loganathanspr BertForTokenClassification from loganathanspr +author: John Snow Labs +name: bert_finetuned_ner_accelerate_loganathanspr +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_accelerate_loganathanspr` is an English model originally trained by loganathanspr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_loganathanspr_en_5.2.0_3.0_1699431095812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_loganathanspr_en_5.2.0_3.0_1699431095812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_accelerate_loganathanspr","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_accelerate_loganathanspr", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_accelerate_loganathanspr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/loganathanspr/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_schubertcarvalho_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_schubertcarvalho_en.md new file mode 100644 index 000000000000..d8ddf531e557 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_accelerate_schubertcarvalho_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_accelerate_schubertcarvalho BertForTokenClassification from schubertcarvalho +author: John Snow Labs +name: bert_finetuned_ner_accelerate_schubertcarvalho +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_accelerate_schubertcarvalho` is an English model originally trained by schubertcarvalho.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_schubertcarvalho_en_5.2.0_3.0_1699478541167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_accelerate_schubertcarvalho_en_5.2.0_3.0_1699478541167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_accelerate_schubertcarvalho","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_accelerate_schubertcarvalho", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_accelerate_schubertcarvalho| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/schubertcarvalho/bert-finetuned-ner-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_adeep028_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_adeep028_en.md new file mode 100644 index 000000000000..f39d7e70c663 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_adeep028_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_adeep028 BertForTokenClassification from adeep028 +author: John Snow Labs +name: bert_finetuned_ner_adeep028 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_adeep028` is an English model originally trained by adeep028. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_adeep028_en_5.2.0_3.0_1699484792071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_adeep028_en_5.2.0_3.0_1699484792071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_adeep028","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_adeep028", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_adeep028| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/adeep028/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_aiventurer_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_aiventurer_en.md new file mode 100644 index 000000000000..c82d285ebe7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_aiventurer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_aiventurer BertForTokenClassification from AIventurer +author: John Snow Labs +name: bert_finetuned_ner_aiventurer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_aiventurer` is an English model originally trained by AIventurer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_aiventurer_en_5.2.0_3.0_1699438549343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_aiventurer_en_5.2.0_3.0_1699438549343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_aiventurer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_aiventurer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_aiventurer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/AIventurer/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_andersonjas_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_andersonjas_en.md new file mode 100644 index 000000000000..9dad5ad5b31f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_andersonjas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_andersonjas BertForTokenClassification from andersonjas +author: John Snow Labs +name: bert_finetuned_ner_andersonjas +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_andersonjas` is an English model originally trained by andersonjas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_andersonjas_en_5.2.0_3.0_1699463210511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_andersonjas_en_5.2.0_3.0_1699463210511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_andersonjas","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_andersonjas", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_andersonjas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/andersonjas/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_3_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_3_en.md new file mode 100644 index 000000000000..ba42676d420a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_augment_3 BertForTokenClassification from lamthanhtin2811 +author: John Snow Labs +name: bert_finetuned_ner_augment_3 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_augment_3` is an English model originally trained by lamthanhtin2811. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_augment_3_en_5.2.0_3.0_1699452263781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_augment_3_en_5.2.0_3.0_1699452263781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_augment_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_augment_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_augment_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/lamthanhtin2811/bert_finetuned-ner-augment-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_5_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_5_en.md new file mode 100644 index 000000000000..62afe00b31d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_augment_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_augment_5 BertForTokenClassification from lamthanhtin2811 +author: John Snow Labs +name: bert_finetuned_ner_augment_5 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_augment_5` is an English model originally trained by lamthanhtin2811. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_augment_5_en_5.2.0_3.0_1699481374984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_augment_5_en_5.2.0_3.0_1699481374984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_augment_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_augment_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_augment_5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/lamthanhtin2811/bert_finetuned-ner-augment-5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_averageandyyy_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_averageandyyy_en.md
new file mode 100644
index 000000000000..664ceb73b556
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_averageandyyy_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_averageandyyy BertForTokenClassification from averageandyyy
+author: John Snow Labs
+name: bert_finetuned_ner_averageandyyy
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_averageandyyy` is an English model originally trained by averageandyyy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_averageandyyy_en_5.2.0_3.0_1699474095045.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_averageandyyy_en_5.2.0_3.0_1699474095045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
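+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_averageandyyy", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_averageandyyy", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+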
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_averageandyyy|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/averageandyyy/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_azuresonance_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_azuresonance_en.md
new file mode 100644
index 000000000000..cc5444bb17cd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_azuresonance_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_azuresonance BertForTokenClassification from azuresonance
+author: John Snow Labs
+name: bert_finetuned_ner_azuresonance
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_azuresonance` is an English model originally trained by azuresonance.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_azuresonance_en_5.2.0_3.0_1699485335096.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_azuresonance_en_5.2.0_3.0_1699485335096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
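+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_azuresonance", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_azuresonance", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+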
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_azuresonance|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/azuresonance/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_canliu_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_canliu_en.md
new file mode 100644
index 000000000000..59bf782c839d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_canliu_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_canliu BertForTokenClassification from canLiu
+author: John Snow Labs
+name: bert_finetuned_ner_canliu
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_canliu` is an English model originally trained by canLiu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_canliu_en_5.2.0_3.0_1699452264032.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_canliu_en_5.2.0_3.0_1699452264032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
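+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_canliu", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_canliu", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+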
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_canliu|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/canLiu/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_carmeco_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_carmeco_en.md
new file mode 100644
index 000000000000..9d4423a67d1e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_carmeco_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_carmeco BertForTokenClassification from carmeco
+author: John Snow Labs
+name: bert_finetuned_ner_carmeco
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_carmeco` is an English model originally trained by carmeco.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_carmeco_en_5.2.0_3.0_1699483849061.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_carmeco_en_5.2.0_3.0_1699483849061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
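+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_carmeco", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_carmeco", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+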
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_carmeco|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/carmeco/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chinese_people_daily_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chinese_people_daily_en.md
new file mode 100644
index 000000000000..6ab925d7f738
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chinese_people_daily_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_chinese_people_daily BertForTokenClassification from johnyyhk
+author: John Snow Labs
+name: bert_finetuned_ner_chinese_people_daily
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_chinese_people_daily` is an English model originally trained by johnyyhk.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chinese_people_daily_en_5.2.0_3.0_1699415561401.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chinese_people_daily_en_5.2.0_3.0_1699415561401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
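+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_chinese_people_daily", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_chinese_people_daily", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+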
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_chinese_people_daily|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|381.1 MB|
+
+## References
+
+https://huggingface.co/johnyyhk/bert-finetuned-ner-chinese-people-daily
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chunfengw_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chunfengw_en.md
new file mode 100644
index 000000000000..8cbdf50ad05d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_chunfengw_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_chunfengw BertForTokenClassification from chunfengw
+author: John Snow Labs
+name: bert_finetuned_ner_chunfengw
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_chunfengw` is an English model originally trained by chunfengw.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chunfengw_en_5.2.0_3.0_1699441333408.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chunfengw_en_5.2.0_3.0_1699441333408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
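+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_chunfengw", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_chunfengw", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+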
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_chunfengw|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/chunfengw/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_clarechen101_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_clarechen101_en.md
new file mode 100644
index 000000000000..45bdb00cdfed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_clarechen101_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_clarechen101 BertForTokenClassification from clarechen101
+author: John Snow Labs
+name: bert_finetuned_ner_clarechen101
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_clarechen101` is an English model originally trained by clarechen101.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_clarechen101_en_5.2.0_3.0_1699452733657.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_clarechen101_en_5.2.0_3.0_1699452733657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
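+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_clarechen101", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_clarechen101", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+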
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_clarechen101|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/clarechen101/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_cleandata_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_cleandata_en.md
new file mode 100644
index 000000000000..c46acab90000
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_cleandata_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_cleandata BertForTokenClassification from cleandata
+author: John Snow Labs
+name: bert_finetuned_ner_cleandata
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_cleandata` is an English model originally trained by cleandata.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cleandata_en_5.2.0_3.0_1699486241285.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cleandata_en_5.2.0_3.0_1699486241285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
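+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_cleandata", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_cleandata", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+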
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_cleandata|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/cleandata/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_dxiao_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_dxiao_en.md
new file mode 100644
index 000000000000..3c23cbabdfc5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_dxiao_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_dxiao BertForTokenClassification from dxiao
+author: John Snow Labs
+name: bert_finetuned_ner_dxiao
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_dxiao` is an English model originally trained by dxiao.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_dxiao_en_5.2.0_3.0_1699474723944.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_dxiao_en_5.2.0_3.0_1699474723944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
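+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_dxiao", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_dxiao", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+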
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_dxiao|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/dxiao/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_erickrribeiro_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_erickrribeiro_en.md
new file mode 100644
index 000000000000..b6a5631994cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_erickrribeiro_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_erickrribeiro BertForTokenClassification from erickrribeiro
+author: John Snow Labs
+name: bert_finetuned_ner_erickrribeiro
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_erickrribeiro` is an English model originally trained by erickrribeiro.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_erickrribeiro_en_5.2.0_3.0_1699420390755.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_erickrribeiro_en_5.2.0_3.0_1699420390755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
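+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_erickrribeiro", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_erickrribeiro", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+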
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_erickrribeiro|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/erickrribeiro/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_gabrielzang_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_gabrielzang_en.md
new file mode 100644
index 000000000000..52d5b9e6a363
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_gabrielzang_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_gabrielzang BertForTokenClassification from gabrielZang
+author: John Snow Labs
+name: bert_finetuned_ner_gabrielzang
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_gabrielzang` is an English model originally trained by gabrielZang.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_gabrielzang_en_5.2.0_3.0_1699480304465.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_gabrielzang_en_5.2.0_3.0_1699480304465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
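+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# Assumes `data` is a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, so tokenize the documents first.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_gabrielzang", "en") \
+    .setInputCols(["documents", "token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// Assumes `data` is a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, so tokenize the documents first.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_gabrielzang", "en")
+    .setInputCols(Array("documents", "token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+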
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_gabrielzang|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/gabrielZang/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_happy_ditto_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_happy_ditto_en.md
new file mode 100644
index 000000000000..86528e4b03eb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_happy_ditto_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_happy_ditto BertForTokenClassification from happy-ditto
+author: John Snow Labs
+name: bert_finetuned_ner_happy_ditto
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_happy_ditto` is an English model originally trained by happy-ditto.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_happy_ditto_en_5.2.0_3.0_1699408911205.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_happy_ditto_en_5.2.0_3.0_1699408911205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_happy_ditto","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_happy_ditto", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_happy_ditto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/happy-ditto/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heek_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heek_en.md new file mode 100644 index 000000000000..23eb4d45f8e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heek_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_heek BertForTokenClassification from HeeK +author: John Snow Labs +name: bert_finetuned_ner_heek +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_heek` is an English model originally trained by HeeK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_heek_en_5.2.0_3.0_1699470138038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_heek_en_5.2.0_3.0_1699470138038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_heek","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_heek", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_heek| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/HeeK/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heenamir_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heenamir_en.md new file mode 100644 index 000000000000..8c64a365272b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_heenamir_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_heenamir BertForTokenClassification from heenamir +author: John Snow Labs +name: bert_finetuned_ner_heenamir +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_heenamir` is an English model originally trained by heenamir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_heenamir_en_5.2.0_3.0_1699408228945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_heenamir_en_5.2.0_3.0_1699408228945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_heenamir","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_heenamir", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_heenamir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/heenamir/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ishantja_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ishantja_en.md new file mode 100644 index 000000000000..f367a9c2c5fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ishantja_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_ishantja BertForTokenClassification from ishantja +author: John Snow Labs +name: bert_finetuned_ner_ishantja +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_ishantja` is an English model originally trained by ishantja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ishantja_en_5.2.0_3.0_1699455206142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ishantja_en_5.2.0_3.0_1699455206142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_ishantja","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_ishantja", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_ishantja| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ishantja/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jake777_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jake777_en.md new file mode 100644 index 000000000000..d81db9dd5e00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jake777_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jake777 BertForTokenClassification from JAKE777 +author: John Snow Labs +name: bert_finetuned_ner_jake777 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_jake777` is an English model originally trained by JAKE777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jake777_en_5.2.0_3.0_1699483115053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jake777_en_5.2.0_3.0_1699483115053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_jake777","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_jake777", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jake777| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/JAKE777/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jmoraes_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jmoraes_en.md new file mode 100644 index 000000000000..693bafc543cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jmoraes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jmoraes BertForTokenClassification from jmoraes +author: John Snow Labs +name: bert_finetuned_ner_jmoraes +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_jmoraes` is an English model originally trained by jmoraes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jmoraes_en_5.2.0_3.0_1699463527900.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jmoraes_en_5.2.0_3.0_1699463527900.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_jmoraes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_jmoraes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jmoraes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jmoraes/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joannaandrews_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joannaandrews_en.md new file mode 100644 index 000000000000..3a79a2031fd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joannaandrews_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_joannaandrews BertForTokenClassification from JoannaAndrews +author: John Snow Labs +name: bert_finetuned_ner_joannaandrews +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_joannaandrews` is an English model originally trained by JoannaAndrews. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_joannaandrews_en_5.2.0_3.0_1699411150741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_joannaandrews_en_5.2.0_3.0_1699411150741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_joannaandrews","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_joannaandrews", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_joannaandrews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/JoannaAndrews/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joaomonteiro_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joaomonteiro_en.md new file mode 100644 index 000000000000..a1e7707d31c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_joaomonteiro_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_joaomonteiro BertForTokenClassification from joaomonteiro +author: John Snow Labs +name: bert_finetuned_ner_joaomonteiro +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_joaomonteiro` is an English model originally trained by joaomonteiro. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_joaomonteiro_en_5.2.0_3.0_1699463015152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_joaomonteiro_en_5.2.0_3.0_1699463015152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_joaomonteiro","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_joaomonteiro", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_joaomonteiro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/joaomonteiro/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_johnyyhk_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_johnyyhk_en.md new file mode 100644 index 000000000000..4256d3c14177 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_johnyyhk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_johnyyhk BertForTokenClassification from johnyyhk +author: John Snow Labs +name: bert_finetuned_ner_johnyyhk +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_johnyyhk` is an English model originally trained by johnyyhk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_johnyyhk_en_5.2.0_3.0_1699480210971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_johnyyhk_en_5.2.0_3.0_1699480210971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_johnyyhk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_johnyyhk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_johnyyhk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/johnyyhk/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jperezv_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jperezv_en.md new file mode 100644 index 000000000000..51d36f1d9d6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_jperezv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jperezv BertForTokenClassification from jperezv +author: John Snow Labs +name: bert_finetuned_ner_jperezv +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_jperezv` is an English model originally trained by jperezv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jperezv_en_5.2.0_3.0_1699485262486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jperezv_en_5.2.0_3.0_1699485262486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_jperezv","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_jperezv", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jperezv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jperezv/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_loganathanspr_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_loganathanspr_en.md new file mode 100644 index 000000000000..0e61da9c37e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_loganathanspr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_loganathanspr BertForTokenClassification from loganathanspr +author: John Snow Labs +name: bert_finetuned_ner_loganathanspr +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_loganathanspr` is an English model originally trained by loganathanspr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_loganathanspr_en_5.2.0_3.0_1699448171578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_loganathanspr_en_5.2.0_3.0_1699448171578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_loganathanspr","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_loganathanspr", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_loganathanspr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/loganathanspr/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_louislian2341_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_louislian2341_en.md new file mode 100644 index 000000000000..9262b72cad8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_louislian2341_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_louislian2341 BertForTokenClassification from louislian2341 +author: John Snow Labs +name: bert_finetuned_ner_louislian2341 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_louislian2341` is an English model originally trained by louislian2341. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_louislian2341_en_5.2.0_3.0_1699402395910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_louislian2341_en_5.2.0_3.0_1699402395910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_louislian2341","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_louislian2341", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_louislian2341| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/louislian2341/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_marcuslee_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_marcuslee_en.md new file mode 100644 index 000000000000..5b29ab3df8f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_marcuslee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_marcuslee BertForTokenClassification from MarcusLee +author: John Snow Labs +name: bert_finetuned_ner_marcuslee +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_marcuslee` is an English model originally trained by MarcusLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_marcuslee_en_5.2.0_3.0_1699473290406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_marcuslee_en_5.2.0_3.0_1699473290406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_marcuslee","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_marcuslee", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_marcuslee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/MarcusLee/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mie_zhz_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mie_zhz_en.md new file mode 100644 index 000000000000..b2d4c603a3dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mie_zhz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_mie_zhz BertForTokenClassification from mie-zhz +author: John Snow Labs +name: bert_finetuned_ner_mie_zhz +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_mie_zhz` is an English model originally trained by mie-zhz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mie_zhz_en_5.2.0_3.0_1699423718558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mie_zhz_en_5.2.0_3.0_1699423718558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_mie_zhz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_mie_zhz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_mie_zhz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mie-zhz/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_miguelangelocwb_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_miguelangelocwb_en.md new file mode 100644 index 000000000000..31844920a8c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_miguelangelocwb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_miguelangelocwb BertForTokenClassification from MiguelAngeloCwb +author: John Snow Labs +name: bert_finetuned_ner_miguelangelocwb +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_miguelangelocwb` is an English model originally trained by MiguelAngeloCwb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_miguelangelocwb_en_5.2.0_3.0_1699483849077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_miguelangelocwb_en_5.2.0_3.0_1699483849077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_miguelangelocwb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_miguelangelocwb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_miguelangelocwb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/MiguelAngeloCwb/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mmibrahim2006_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mmibrahim2006_en.md new file mode 100644 index 000000000000..2319d4073ac6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mmibrahim2006_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_mmibrahim2006 BertForTokenClassification from mmibrahim2006 +author: John Snow Labs +name: bert_finetuned_ner_mmibrahim2006 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_mmibrahim2006` is an English model originally trained by mmibrahim2006. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mmibrahim2006_en_5.2.0_3.0_1699465916697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mmibrahim2006_en_5.2.0_3.0_1699465916697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_mmibrahim2006","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_mmibrahim2006", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_mmibrahim2006| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mmibrahim2006/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mpalaval_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mpalaval_en.md new file mode 100644 index 000000000000..504ad7dc5fcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_mpalaval_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_mpalaval BertForTokenClassification from mpalaval +author: John Snow Labs +name: bert_finetuned_ner_mpalaval +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_mpalaval` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mpalaval_en_5.2.0_3.0_1699446270386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mpalaval_en_5.2.0_3.0_1699446270386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_mpalaval","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_mpalaval", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_mpalaval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_na20b039_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_na20b039_en.md new file mode 100644 index 000000000000..473b2494ca5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_na20b039_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_na20b039 BertForTokenClassification from na20b039 +author: John Snow Labs +name: bert_finetuned_ner_na20b039 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_na20b039` is an English model originally trained by na20b039. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_na20b039_en_5.2.0_3.0_1699412078538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_na20b039_en_5.2.0_3.0_1699412078538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_na20b039","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_na20b039", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_na20b039| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/na20b039/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_nourhanabosaeed_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_nourhanabosaeed_en.md new file mode 100644 index 000000000000..08e6a85d86d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_nourhanabosaeed_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_nourhanabosaeed BertForTokenClassification from NourhanAbosaeed +author: John Snow Labs +name: bert_finetuned_ner_nourhanabosaeed +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_nourhanabosaeed` is an English model originally trained by NourhanAbosaeed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_nourhanabosaeed_en_5.2.0_3.0_1699475223881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_nourhanabosaeed_en_5.2.0_3.0_1699475223881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_nourhanabosaeed","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_nourhanabosaeed", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_nourhanabosaeed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/NourhanAbosaeed/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ocm_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ocm_en.md new file mode 100644 index 000000000000..0fac4ba54da2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ocm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_ocm BertForTokenClassification from ocm +author: John Snow Labs +name: bert_finetuned_ner_ocm +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_ocm` is an English model originally trained by ocm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ocm_en_5.2.0_3.0_1699484562255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ocm_en_5.2.0_3.0_1699484562255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_ocm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_ocm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_ocm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ocm/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ontonotes_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ontonotes_en.md new file mode 100644 index 000000000000..39de3413011d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_ontonotes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_ontonotes BertForTokenClassification from nickprock +author: John Snow Labs +name: bert_finetuned_ner_ontonotes +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_ontonotes` is an English model originally trained by nickprock. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ontonotes_en_5.2.0_3.0_1699438831807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ontonotes_en_5.2.0_3.0_1699438831807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_ontonotes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_ontonotes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_ontonotes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/nickprock/bert-finetuned-ner-ontonotes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_phamvanlinh143_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_phamvanlinh143_en.md new file mode 100644 index 000000000000..28fa051adf33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_phamvanlinh143_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_phamvanlinh143 BertForTokenClassification from phamvanlinh143 +author: John Snow Labs +name: bert_finetuned_ner_phamvanlinh143 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_phamvanlinh143` is an English model originally trained by phamvanlinh143. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_phamvanlinh143_en_5.2.0_3.0_1699480205266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_phamvanlinh143_en_5.2.0_3.0_1699480205266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_phamvanlinh143","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_phamvanlinh143", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_phamvanlinh143| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/phamvanlinh143/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_rajknakka_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_rajknakka_en.md new file mode 100644 index 000000000000..e77557ba3475 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_rajknakka_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_rajknakka BertForTokenClassification from RajkNakka +author: John Snow Labs +name: bert_finetuned_ner_rajknakka +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_rajknakka` is an English model originally trained by RajkNakka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_rajknakka_en_5.2.0_3.0_1699467515672.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_rajknakka_en_5.2.0_3.0_1699467515672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_rajknakka","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_rajknakka", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_rajknakka| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/RajkNakka/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_roverandom95_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_roverandom95_en.md new file mode 100644 index 000000000000..62bab92b7c91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_roverandom95_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_roverandom95 BertForTokenClassification from Roverandom95 +author: John Snow Labs +name: bert_finetuned_ner_roverandom95 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_roverandom95` is an English model originally trained by Roverandom95. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_roverandom95_en_5.2.0_3.0_1699421456125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_roverandom95_en_5.2.0_3.0_1699421456125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_roverandom95","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_roverandom95", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_roverandom95| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Roverandom95/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_sanjay7178_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_sanjay7178_en.md new file mode 100644 index 000000000000..cf02c4754e49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_sanjay7178_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_sanjay7178 BertForTokenClassification from sanjay7178 +author: John Snow Labs +name: bert_finetuned_ner_sanjay7178 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_sanjay7178` is an English model originally trained by sanjay7178. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_sanjay7178_en_5.2.0_3.0_1699468887582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_sanjay7178_en_5.2.0_3.0_1699468887582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_sanjay7178","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_sanjay7178", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_sanjay7178| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/sanjay7178/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_shadowtwin41_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_shadowtwin41_en.md new file mode 100644 index 000000000000..c7949a156edd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_shadowtwin41_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_shadowtwin41 BertForTokenClassification from ShadowTwin41 +author: John Snow Labs +name: bert_finetuned_ner_shadowtwin41 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_shadowtwin41` is an English model originally trained by ShadowTwin41. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_shadowtwin41_en_5.2.0_3.0_1699454308894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_shadowtwin41_en_5.2.0_3.0_1699454308894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_shadowtwin41","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_shadowtwin41", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_shadowtwin41| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ShadowTwin41/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_suraj_yadav_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_suraj_yadav_en.md new file mode 100644 index 000000000000..006950e17bea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_suraj_yadav_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_suraj_yadav BertForTokenClassification from Suraj-Yadav +author: John Snow Labs +name: bert_finetuned_ner_suraj_yadav +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_suraj_yadav` is an English model originally trained by Suraj-Yadav. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_suraj_yadav_en_5.2.0_3.0_1699418864847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_suraj_yadav_en_5.2.0_3.0_1699418864847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_suraj_yadav","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_suraj_yadav", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_suraj_yadav| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Suraj-Yadav/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_test1_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_test1_en.md new file mode 100644 index 000000000000..99049e20d343 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_test1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_test1 BertForTokenClassification from jun-91 +author: John Snow Labs +name: bert_finetuned_ner_test1 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_test1` is an English model originally trained by jun-91. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_test1_en_5.2.0_3.0_1699478541245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_test1_en_5.2.0_3.0_1699478541245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_test1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_test1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_test1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jun-91/bert-finetuned-ner_test1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tofunumber1_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tofunumber1_en.md new file mode 100644 index 000000000000..9a99e859e434 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tofunumber1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_tofunumber1 BertForTokenClassification from TofuNumber1 +author: John Snow Labs +name: bert_finetuned_ner_tofunumber1 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_tofunumber1` is an English model originally trained by TofuNumber1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_tofunumber1_en_5.2.0_3.0_1699470419392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_tofunumber1_en_5.2.0_3.0_1699470419392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_tofunumber1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_tofunumber1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_tofunumber1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/TofuNumber1/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_torayeff_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_torayeff_en.md new file mode 100644 index 000000000000..dc26f7185c54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_torayeff_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_torayeff BertForTokenClassification from torayeff +author: John Snow Labs +name: bert_finetuned_ner_torayeff +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_torayeff` is an English model originally trained by torayeff. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_torayeff_en_5.2.0_3.0_1699475051489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_torayeff_en_5.2.0_3.0_1699475051489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_torayeff","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_torayeff", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_torayeff| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/torayeff/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainer_en.md new file mode 100644 index 000000000000..f55affab985e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_trainer BertForTokenClassification from marcellodomenis +author: John Snow Labs +name: bert_finetuned_ner_trainer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_trainer` is an English model originally trained by marcellodomenis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_trainer_en_5.2.0_3.0_1699472427597.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_trainer_en_5.2.0_3.0_1699472427597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_trainer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_trainer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_trainer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/marcellodomenis/bert-finetuned-ner-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainerapi_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainerapi_en.md new file mode 100644 index 000000000000..e81fba4ca85c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_trainerapi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_trainerapi BertForTokenClassification from HeeK +author: John Snow Labs +name: bert_finetuned_ner_trainerapi +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_trainerapi` is an English model originally trained by HeeK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_trainerapi_en_5.2.0_3.0_1699486804911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_trainerapi_en_5.2.0_3.0_1699486804911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_trainerapi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_trainerapi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_trainerapi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/HeeK/bert-finetuned-ner-trainerAPI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tw5n14_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tw5n14_en.md new file mode 100644 index 000000000000..169fbb4f0187 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_tw5n14_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_tw5n14 BertForTokenClassification from tw5n14 +author: John Snow Labs +name: bert_finetuned_ner_tw5n14 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_tw5n14` is an English model originally trained by tw5n14. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_tw5n14_en_5.2.0_3.0_1699418917060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_tw5n14_en_5.2.0_3.0_1699418917060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_tw5n14","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_tw5n14", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_tw5n14| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/tw5n14/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_v2_sayalik13_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_v2_sayalik13_en.md new file mode 100644 index 000000000000..05b48371b4b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_v2_sayalik13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_v2_sayalik13 BertForTokenClassification from sayalik13 +author: John Snow Labs +name: bert_finetuned_ner_v2_sayalik13 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_v2_sayalik13` is an English model originally trained by sayalik13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_v2_sayalik13_en_5.2.0_3.0_1699437498638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_v2_sayalik13_en_5.2.0_3.0_1699437498638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_v2_sayalik13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_v2_sayalik13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_v2_sayalik13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/sayalik13/bert-finetuned-ner-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xiajun2001_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xiajun2001_en.md new file mode 100644 index 000000000000..d0ff16103c37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xiajun2001_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_xiajun2001 BertForTokenClassification from xiajun2001 +author: John Snow Labs +name: bert_finetuned_ner_xiajun2001 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_xiajun2001` is an English model originally trained by xiajun2001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xiajun2001_en_5.2.0_3.0_1699440019645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xiajun2001_en_5.2.0_3.0_1699440019645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_xiajun2001","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_xiajun2001", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_xiajun2001| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/xiajun2001/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xinhui_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xinhui_en.md new file mode 100644 index 000000000000..d304d6c4d611 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xinhui_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_xinhui BertForTokenClassification from xinhui +author: John Snow Labs +name: bert_finetuned_ner_xinhui +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_xinhui` is an English model originally trained by xinhui. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xinhui_en_5.2.0_3.0_1699465195125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xinhui_en_5.2.0_3.0_1699465195125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_xinhui","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_xinhui", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_xinhui| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/xinhui/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xupengzeng_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xupengzeng_en.md new file mode 100644 index 000000000000..70e3cbb754b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_xupengzeng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_xupengzeng BertForTokenClassification from xupengzeng +author: John Snow Labs +name: bert_finetuned_ner_xupengzeng +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_xupengzeng` is an English model originally trained by xupengzeng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xupengzeng_en_5.2.0_3.0_1699476681135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_xupengzeng_en_5.2.0_3.0_1699476681135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_xupengzeng","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_xupengzeng", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_xupengzeng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/xupengzeng/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yitengm_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yitengm_en.md new file mode 100644 index 000000000000..d83954811db1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yitengm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_yitengm BertForTokenClassification from yitengm +author: John Snow Labs +name: bert_finetuned_ner_yitengm +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_yitengm` is an English model originally trained by yitengm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yitengm_en_5.2.0_3.0_1699447798020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yitengm_en_5.2.0_3.0_1699447798020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_yitengm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_yitengm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_yitengm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/yitengm/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yuto01_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yuto01_en.md new file mode 100644 index 000000000000..6b02e679c297 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_ner_yuto01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_yuto01 BertForTokenClassification from Yuto01 +author: John Snow Labs +name: bert_finetuned_ner_yuto01 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_yuto01` is an English model originally trained by Yuto01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yuto01_en_5.2.0_3.0_1699487566052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yuto01_en_5.2.0_3.0_1699487566052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_yuto01","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_yuto01", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_yuto01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Yuto01/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_requirements_andyuk_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_requirements_andyuk_en.md new file mode 100644 index 000000000000..3e28de80b774 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_requirements_andyuk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_requirements_andyuk BertForTokenClassification from andyuk +author: John Snow Labs +name: bert_finetuned_requirements_andyuk +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_requirements_andyuk` is an English model originally trained by andyuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_requirements_andyuk_en_5.2.0_3.0_1699463301160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_requirements_andyuk_en_5.2.0_3.0_1699463301160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_requirements_andyuk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_requirements_andyuk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_requirements_andyuk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/andyuk/bert-finetuned-requirements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_sst2_en.md new file mode 100644 index 000000000000..55a295ac6825 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_finetuned_sst2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_sst2 BertForTokenClassification from asimokby +author: John Snow Labs +name: bert_finetuned_sst2 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_sst2` is an English model originally trained by asimokby. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_sst2_en_5.2.0_3.0_1699416456457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_sst2_en_5.2.0_3.0_1699416456457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_sst2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_sst2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/asimokby/bert-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_for_job_descr_parsing_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_for_job_descr_parsing_en.md new file mode 100644 index 000000000000..fa3f0ae65999 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_for_job_descr_parsing_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_for_job_descr_parsing BertForTokenClassification from jfriduss +author: John Snow Labs +name: bert_for_job_descr_parsing +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_for_job_descr_parsing` is an English model originally trained by jfriduss. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_for_job_descr_parsing_en_5.2.0_3.0_1699427593459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_for_job_descr_parsing_en_5.2.0_3.0_1699427593459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_for_job_descr_parsing","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_for_job_descr_parsing", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_for_job_descr_parsing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/jfriduss/bert_for_job_descr_parsing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_german_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_german_ner_en.md new file mode 100644 index 000000000000..1fa29a312b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_german_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_german_ner BertForTokenClassification from lunesco +author: John Snow Labs +name: bert_german_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_german_ner` is an English model originally trained by lunesco. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_german_ner_en_5.2.0_3.0_1699433664820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_german_ner_en_5.2.0_3.0_1699433664820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_german_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_german_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_german_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/lunesco/bert-german-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_mini_finetuned_ner_chinese_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_mini_finetuned_ner_chinese_en.md new file mode 100644 index 000000000000..766ab3086869 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_mini_finetuned_ner_chinese_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_mini_finetuned_ner_chinese BertForTokenClassification from IcyKallen +author: John Snow Labs +name: bert_mini_finetuned_ner_chinese +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_mini_finetuned_ner_chinese` is an English model originally trained by IcyKallen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_mini_finetuned_ner_chinese_en_5.2.0_3.0_1699409083770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_mini_finetuned_ner_chinese_en_5.2.0_3.0_1699409083770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_mini_finetuned_ner_chinese","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_mini_finetuned_ner_chinese", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_mini_finetuned_ner_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|46.0 MB| + +## References + +https://huggingface.co/IcyKallen/bert-mini-finetuned-ner-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_history_ner_sub_ontology_xx.md b/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_history_ner_sub_ontology_xx.md new file mode 100644 index 000000000000..9c4768d91d54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_history_ner_sub_ontology_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_multilingual_finetuned_history_ner_sub_ontology BertForTokenClassification from QuanAI +author: John Snow Labs +name: bert_multilingual_finetuned_history_ner_sub_ontology +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_finetuned_history_ner_sub_ontology` is a multilingual model originally trained by QuanAI.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_finetuned_history_ner_sub_ontology_xx_5.2.0_3.0_1699424152414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_finetuned_history_ner_sub_ontology_xx_5.2.0_3.0_1699424152414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_multilingual_finetuned_history_ner_sub_ontology","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_multilingual_finetuned_history_ner_sub_ontology", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilingual_finetuned_history_ner_sub_ontology| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/QuanAI/bert-multilingual-finetuned-history-ner-sub-ontology \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_xtreme_tamil_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_xtreme_tamil_ner_xx.md new file mode 100644 index 000000000000..8787dd0d0751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_multilingual_finetuned_xtreme_tamil_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_multilingual_finetuned_xtreme_tamil_ner BertForTokenClassification from RamAnanth1 +author: John Snow Labs +name: bert_multilingual_finetuned_xtreme_tamil_ner +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_multilingual_finetuned_xtreme_tamil_ner` is a multilingual model originally trained by RamAnanth1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_finetuned_xtreme_tamil_ner_xx_5.2.0_3.0_1699444480863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_finetuned_xtreme_tamil_ner_xx_5.2.0_3.0_1699444480863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("bert_multilingual_finetuned_xtreme_tamil_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +// `data` is assumed to be a Spark DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_multilingual_finetuned_xtreme_tamil_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilingual_finetuned_xtreme_tamil_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/RamAnanth1/bert-multilingual-finetuned-xtreme-tamil-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_ner_2_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_ner_2_en.md new file mode 100644 index 000000000000..e11975405d9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_ner_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_2 BertForTokenClassification from mpalaval +author: John Snow Labs +name: bert_ner_2 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_2` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_2_en_5.2.0_3.0_1699446696052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_2_en_5.2.0_3.0_1699446696052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/bert-ner-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_ner_3_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_ner_3_en.md new file mode 100644 index 000000000000..c29388814e55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_ner_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner_3 BertForTokenClassification from mpalaval +author: John Snow Labs +name: bert_ner_3 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner_3` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_3_en_5.2.0_3.0_1699476531212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_3_en_5.2.0_3.0_1699476531212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_ner_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_ner_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/bert-ner-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_portuguese_event_trigger_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_portuguese_event_trigger_en.md new file mode 100644 index 000000000000..07896a2b3bdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_portuguese_event_trigger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_portuguese_event_trigger BertForTokenClassification from lfcc +author: John Snow Labs +name: bert_portuguese_event_trigger +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_portuguese_event_trigger` is an English model originally trained by lfcc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_portuguese_event_trigger_en_5.2.0_3.0_1699420311858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_portuguese_event_trigger_en_5.2.0_3.0_1699420311858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_portuguese_event_trigger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_portuguese_event_trigger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_portuguese_event_trigger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/lfcc/bert-portuguese-event-trigger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_st1992_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_st1992_en.md new file mode 100644 index 000000000000..052bebcc84cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_st1992_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_restore_punctuation_st1992 BertForTokenClassification from st1992 +author: John Snow Labs +name: bert_restore_punctuation_st1992 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_restore_punctuation_st1992` is an English model originally trained by st1992. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_st1992_en_5.2.0_3.0_1699413684970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_st1992_en_5.2.0_3.0_1699413684970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_restore_punctuation_st1992","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_restore_punctuation_st1992", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_restore_punctuation_st1992| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/st1992/bert-restore-punctuation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_turkish_legacy_tr.md b/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_turkish_legacy_tr.md new file mode 100644 index 000000000000..5ddde1d64f71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_restore_punctuation_turkish_legacy_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish bert_restore_punctuation_turkish_legacy BertForTokenClassification from uygarkurt +author: John Snow Labs +name: bert_restore_punctuation_turkish_legacy +date: 2023-11-08 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_restore_punctuation_turkish_legacy` is a Turkish model originally trained by uygarkurt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_turkish_legacy_tr_5.2.0_3.0_1699435292231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_restore_punctuation_turkish_legacy_tr_5.2.0_3.0_1699435292231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_restore_punctuation_turkish_legacy","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_restore_punctuation_turkish_legacy", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_restore_punctuation_turkish_legacy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/uygarkurt/bert-restore-punctuation-turkish-legacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_wnut17_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_wnut17_ner_en.md new file mode 100644 index 000000000000..7c1e04120c62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_wnut17_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_small_finetuned_wnut17_ner BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_wnut17_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_wnut17_ner` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_wnut17_ner_en_5.2.0_3.0_1699422386690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_wnut17_ner_en_5.2.0_3.0_1699422386690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_small_finetuned_wnut17_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_small_finetuned_wnut17_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_finetuned_wnut17_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|107.0 MB| + +## References + +https://huggingface.co/muhtasham/bert-small-finetuned-wnut17-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_xglue_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_xglue_ner_en.md new file mode 100644 index 000000000000..116e4499c852 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_small_finetuned_xglue_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_small_finetuned_xglue_ner BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_small_finetuned_xglue_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_small_finetuned_xglue_ner` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_xglue_ner_en_5.2.0_3.0_1699408776412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_small_finetuned_xglue_ner_en_5.2.0_3.0_1699408776412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_small_finetuned_xglue_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_small_finetuned_xglue_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_small_finetuned_xglue_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|107.0 MB| + +## References + +https://huggingface.co/muhtasham/bert-small-finetuned-xglue-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_tamil_ta.md b/docs/_posts/ahmedlone127/2023-11-08-bert_tamil_ta.md new file mode 100644 index 000000000000..1ba3b0ebfac9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_tamil_ta.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Tamil bert_tamil BertForTokenClassification from Ambareeshkumar +author: John Snow Labs +name: bert_tamil +date: 2023-11-08 +tags: [bert, ta, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ta +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tamil` is a Tamil model originally trained by Ambareeshkumar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tamil_ta_5.2.0_3.0_1699470976452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tamil_ta_5.2.0_3.0_1699470976452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tamil","ta") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tamil", "ta") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tamil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ta| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Ambareeshkumar/BERT-Tamil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_gpu_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_gpu_en.md new file mode 100644 index 000000000000..ef9f4487b378 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_gpu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_finer_accelerate_gpu BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_finer_accelerate_gpu +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_finer_accelerate_gpu` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_accelerate_gpu_en_5.2.0_3.0_1699464760074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_accelerate_gpu_en_5.2.0_3.0_1699464760074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_finer_accelerate_gpu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_finer_accelerate_gpu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_finer_accelerate_gpu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/muhtasham/bert-tiny-finetuned-finer-accelerate-gpu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_longer_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_longer_en.md new file mode 100644 index 000000000000..5e37d9e0ff40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_accelerate_longer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_finer_accelerate_longer BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_finer_accelerate_longer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_finer_accelerate_longer` is an English model originally trained by muhtasham.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_accelerate_longer_en_5.2.0_3.0_1699447139376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_accelerate_longer_en_5.2.0_3.0_1699447139376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_finer_accelerate_longer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_finer_accelerate_longer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_finer_accelerate_longer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/muhtasham/bert-tiny-finetuned-finer-accelerate-longer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_en.md new file mode 100644 index 000000000000..c1806f511a04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_finer BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_finer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_finer` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_en_5.2.0_3.0_1699409931982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_en_5.2.0_3.0_1699409931982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_finer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_finer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_finer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/muhtasham/bert-tiny-finetuned-finer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_longer_en.md b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_longer_en.md new file mode 100644 index 000000000000..c3b6db57208c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bert_tiny_finetuned_finer_longer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tiny_finetuned_finer_longer BertForTokenClassification from muhtasham +author: John Snow Labs +name: bert_tiny_finetuned_finer_longer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tiny_finetuned_finer_longer` is an English model originally trained by muhtasham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_longer_en_5.2.0_3.0_1699485522802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tiny_finetuned_finer_longer_en_5.2.0_3.0_1699485522802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_tiny_finetuned_finer_longer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_tiny_finetuned_finer_longer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tiny_finetuned_finer_longer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.8 MB| + +## References + +https://huggingface.co/muhtasham/bert-tiny-finetuned-finer-longer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bertfortokenclassification_en.md b/docs/_posts/ahmedlone127/2023-11-08-bertfortokenclassification_en.md new file mode 100644 index 000000000000..6d187fefffb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bertfortokenclassification_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bertfortokenclassification BertForTokenClassification from namitsingal +author: John Snow Labs +name: bertfortokenclassification +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertfortokenclassification` is an English model originally trained by namitsingal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertfortokenclassification_en_5.2.0_3.0_1699435292266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertfortokenclassification_en_5.2.0_3.0_1699435292266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
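+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bertfortokenclassification","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bertfortokenclassification", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +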
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertfortokenclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/namitsingal/BertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-beto_finetuned_fact_es.md b/docs/_posts/ahmedlone127/2023-11-08-beto_finetuned_fact_es.md new file mode 100644 index 000000000000..00cb119e1cc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-beto_finetuned_fact_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish beto_finetuned_fact BertForTokenClassification from filevich +author: John Snow Labs +name: beto_finetuned_fact +date: 2023-11-08 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_finetuned_fact` is a Castilian, Spanish model originally trained by filevich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_finetuned_fact_es_5.2.0_3.0_1699466387009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_finetuned_fact_es_5.2.0_3.0_1699466387009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
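+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("beto_finetuned_fact","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("beto_finetuned_fact", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +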
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_finetuned_fact| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/filevich/beto-finetuned-fact \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bio_clinicalbert_2e5_top10_20testset_en.md b/docs/_posts/ahmedlone127/2023-11-08-bio_clinicalbert_2e5_top10_20testset_en.md new file mode 100644 index 000000000000..1c90c9a04bf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bio_clinicalbert_2e5_top10_20testset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bio_clinicalbert_2e5_top10_20testset BertForTokenClassification from alecocc +author: John Snow Labs +name: bio_clinicalbert_2e5_top10_20testset +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bio_clinicalbert_2e5_top10_20testset` is an English model originally trained by alecocc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_2e5_top10_20testset_en_5.2.0_3.0_1699414566133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bio_clinicalbert_2e5_top10_20testset_en_5.2.0_3.0_1699414566133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
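+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bio_clinicalbert_2e5_top10_20testset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bio_clinicalbert_2e5_top10_20testset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +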
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bio_clinicalbert_2e5_top10_20testset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.4 MB| + +## References + +https://huggingface.co/alecocc/Bio_ClinicalBERT_2e5_top10_20testset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobert_base_cased_v1_2_finetuned_ner_sciarrilli_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobert_base_cased_v1_2_finetuned_ner_sciarrilli_en.md new file mode 100644 index 000000000000..054c3c51c5fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobert_base_cased_v1_2_finetuned_ner_sciarrilli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_base_cased_v1_2_finetuned_ner_sciarrilli BertForTokenClassification from sciarrilli +author: John Snow Labs +name: biobert_base_cased_v1_2_finetuned_ner_sciarrilli +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_base_cased_v1_2_finetuned_ner_sciarrilli` is an English model originally trained by sciarrilli.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_finetuned_ner_sciarrilli_en_5.2.0_3.0_1699456511532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_base_cased_v1_2_finetuned_ner_sciarrilli_en_5.2.0_3.0_1699456511532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
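+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobert_base_cased_v1_2_finetuned_ner_sciarrilli","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_base_cased_v1_2_finetuned_ner_sciarrilli", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +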
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_base_cased_v1_2_finetuned_ner_sciarrilli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/sciarrilli/biobert-base-cased-v1.2-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobert_finetuned_pico_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobert_finetuned_pico_en.md new file mode 100644 index 000000000000..f5c536511b86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobert_finetuned_pico_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_finetuned_pico BertForTokenClassification from Stardrums +author: John Snow Labs +name: biobert_finetuned_pico +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_finetuned_pico` is an English model originally trained by Stardrums. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_finetuned_pico_en_5.2.0_3.0_1699436694405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_finetuned_pico_en_5.2.0_3.0_1699436694405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
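+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobert_finetuned_pico","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_finetuned_pico", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +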
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_finetuned_pico| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/Stardrums/biobert-finetuned-pico \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi_en.md new file mode 100644 index 000000000000..86842712bc4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi BertForTokenClassification from Dogebooch +author: John Snow Labs +name: biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi` is an English model originally trained by Dogebooch.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi_en_5.2.0_3.0_1699474917036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi_en_5.2.0_3.0_1699474917036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
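+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +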
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_mnli_snli_scinli_scitail_mednli_stsb_ncbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/Dogebooch/BioBERT-mnli-snli-scinli-scitail-mednli-stsb-ncbi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobert_ner_diseases_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobert_ner_diseases_model_en.md new file mode 100644 index 000000000000..1ec12dc2c4b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobert_ner_diseases_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_ner_diseases_model BertForTokenClassification from rjac +author: John Snow Labs +name: biobert_ner_diseases_model +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_ner_diseases_model` is an English model originally trained by rjac. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_ner_diseases_model_en_5.2.0_3.0_1699411920789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_ner_diseases_model_en_5.2.0_3.0_1699411920789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
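+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobert_ner_diseases_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_ner_diseases_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +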
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_ner_diseases_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/rjac/biobert-ner-diseases-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobert_protein_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobert_protein_ner_en.md new file mode 100644 index 000000000000..b17906b91c64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobert_protein_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_protein_ner BertForTokenClassification from avishvj +author: John Snow Labs +name: biobert_protein_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_protein_ner` is an English model originally trained by avishvj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_protein_ner_en_5.2.0_3.0_1699406360113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_protein_ner_en_5.2.0_3.0_1699406360113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
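+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobert_protein_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_protein_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +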
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_protein_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/avishvj/biobert-protein-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biobertpt_all_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-biobertpt_all_finetuned_ner_en.md new file mode 100644 index 000000000000..fec4a6cef5da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biobertpt_all_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobertpt_all_finetuned_ner BertForTokenClassification from brunodorneles +author: John Snow Labs +name: biobertpt_all_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobertpt_all_finetuned_ner` is an English model originally trained by brunodorneles. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobertpt_all_finetuned_ner_en_5.2.0_3.0_1699454876413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobertpt_all_finetuned_ner_en_5.2.0_3.0_1699454876413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
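+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biobertpt_all_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biobertpt_all_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +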
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobertpt_all_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.9 MB| + +## References + +https://huggingface.co/brunodorneles/biobertpt-all-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bioformer_8l_bc2gm_en.md b/docs/_posts/ahmedlone127/2023-11-08-bioformer_8l_bc2gm_en.md new file mode 100644 index 000000000000..79951719b8e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bioformer_8l_bc2gm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bioformer_8l_bc2gm BertForTokenClassification from bioformers +author: John Snow Labs +name: bioformer_8l_bc2gm +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer_8l_bc2gm` is an English model originally trained by bioformers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioformer_8l_bc2gm_en_5.2.0_3.0_1699453114580.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioformer_8l_bc2gm_en_5.2.0_3.0_1699453114580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
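+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("bioformer_8l_bc2gm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("bioformer_8l_bc2gm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +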
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioformer_8l_bc2gm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/bioformers/bioformer-8L-bc2gm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biomedical_ner_maccrobat_bert_en.md b/docs/_posts/ahmedlone127/2023-11-08-biomedical_ner_maccrobat_bert_en.md new file mode 100644 index 000000000000..edaebeaa02b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biomedical_ner_maccrobat_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_maccrobat_bert BertForTokenClassification from vineetsharma +author: John Snow Labs +name: biomedical_ner_maccrobat_bert +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_maccrobat_bert` is an English model originally trained by vineetsharma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_maccrobat_bert_en_5.2.0_3.0_1699413863398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_maccrobat_bert_en_5.2.0_3.0_1699413863398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
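+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")  # supplies the "token" column the classifier reads + +tokenClassifier = BertForTokenClassification.pretrained("biomedical_ner_maccrobat_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data)  # data: Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = BertForTokenClassification + .pretrained("biomedical_ner_maccrobat_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +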
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_maccrobat_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/vineetsharma/BioMedical_NER-maccrobat-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope_en.md b/docs/_posts/ahmedlone127/2023-11-08-biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope_en.md new file mode 100644 index 000000000000..69b3af490c4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope BertForTokenClassification from PDBEurope +author: John Snow Labs +name: biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope` is an English model originally trained by PDBEurope.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope_en_5.2.0_3.0_1699429783769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope_en_5.2.0_3.0_1699429783769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomednlp_pubmedbert_proteinstructure_ner_v3_1_pdbeurope| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/PDBEurope/BiomedNLP-PubMedBERT-ProteinStructure-NER-v3.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_udep_5epochs_en.md b/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_udep_5epochs_en.md new file mode 100644 index 000000000000..4b1cfada4db5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_udep_5epochs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bulbert_ner_udep_5epochs BertForTokenClassification from mor40 +author: John Snow Labs +name: bulbert_ner_udep_5epochs +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bulbert_ner_udep_5epochs` is an English model originally trained by mor40. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bulbert_ner_udep_5epochs_en_5.2.0_3.0_1699456188489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bulbert_ner_udep_5epochs_en_5.2.0_3.0_1699456188489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bulbert_ner_udep_5epochs","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bulbert_ner_udep_5epochs", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bulbert_ner_udep_5epochs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|306.1 MB| + +## References + +https://huggingface.co/mor40/BulBERT-ner-udep-5epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_wikiann_en.md b/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_wikiann_en.md new file mode 100644 index 000000000000..83c39c7211fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-bulbert_ner_wikiann_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bulbert_ner_wikiann BertForTokenClassification from mor40 +author: John Snow Labs +name: bulbert_ner_wikiann +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bulbert_ner_wikiann` is an English model originally trained by mor40. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bulbert_ner_wikiann_en_5.2.0_3.0_1699402275572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bulbert_ner_wikiann_en_5.2.0_3.0_1699402275572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bulbert_ner_wikiann","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bulbert_ner_wikiann", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bulbert_ner_wikiann| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|306.1 MB| + +## References + +https://huggingface.co/mor40/BulBERT-ner-wikiann \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-burmese_awesome_wnut_model_bahador0101_en.md b/docs/_posts/ahmedlone127/2023-11-08-burmese_awesome_wnut_model_bahador0101_en.md new file mode 100644 index 000000000000..69f7bca21aa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-burmese_awesome_wnut_model_bahador0101_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_bahador0101 BertForTokenClassification from BahAdoR0101 +author: John Snow Labs +name: burmese_awesome_wnut_model_bahador0101 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_bahador0101` is an English model originally trained by BahAdoR0101. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_bahador0101_en_5.2.0_3.0_1699463744662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_bahador0101_en_5.2.0_3.0_1699463744662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("burmese_awesome_wnut_model_bahador0101","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("burmese_awesome_wnut_model_bahador0101", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_bahador0101| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|1.2 GB| + +## References + +https://huggingface.co/BahAdoR0101/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-clinicalnerpt_quantitative_pt.md b/docs/_posts/ahmedlone127/2023-11-08-clinicalnerpt_quantitative_pt.md new file mode 100644 index 000000000000..3efab3787f39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-clinicalnerpt_quantitative_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese clinicalnerpt_quantitative BertForTokenClassification from pucpr +author: John Snow Labs +name: clinicalnerpt_quantitative +date: 2023-11-08 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinicalnerpt_quantitative` is a Portuguese model originally trained by pucpr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinicalnerpt_quantitative_pt_5.2.0_3.0_1699423311702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinicalnerpt_quantitative_pt_5.2.0_3.0_1699423311702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("clinicalnerpt_quantitative","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("clinicalnerpt_quantitative", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinicalnerpt_quantitative| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|664.8 MB| + +## References + +https://huggingface.co/pucpr/clinicalnerpt-quantitative \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-darija_ner_ar.md b/docs/_posts/ahmedlone127/2023-11-08-darija_ner_ar.md new file mode 100644 index 000000000000..d44606f73e1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-darija_ner_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic darija_ner BertForTokenClassification from hananour +author: John Snow Labs +name: darija_ner +date: 2023-11-08 +tags: [bert, ar, open_source, token_classification, onnx] +task: Named Entity Recognition +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darija_ner` is an Arabic model originally trained by hananour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darija_ner_ar_5.2.0_3.0_1699425887867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darija_ner_ar_5.2.0_3.0_1699425887867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("darija_ner","ar") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("darija_ner", "ar") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|darija_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|ar| +|Size:|505.1 MB| + +## References + +https://huggingface.co/hananour/darija-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-dark_bert_finetuned_ner1_en.md b/docs/_posts/ahmedlone127/2023-11-08-dark_bert_finetuned_ner1_en.md new file mode 100644 index 000000000000..4d6f17fa7723 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-dark_bert_finetuned_ner1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dark_bert_finetuned_ner1 BertForTokenClassification from pulkitkumar13 +author: John Snow Labs +name: dark_bert_finetuned_ner1 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dark_bert_finetuned_ner1` is an English model originally trained by pulkitkumar13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dark_bert_finetuned_ner1_en_5.2.0_3.0_1699466097298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dark_bert_finetuned_ner1_en_5.2.0_3.0_1699466097298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("dark_bert_finetuned_ner1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("dark_bert_finetuned_ner1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dark_bert_finetuned_ner1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/pulkitkumar13/dark-bert-finetuned-ner1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-drone_term_extractor_en.md b/docs/_posts/ahmedlone127/2023-11-08-drone_term_extractor_en.md new file mode 100644 index 000000000000..cfdb2de27819 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-drone_term_extractor_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English drone_term_extractor BertForTokenClassification from swardiantara +author: John Snow Labs +name: drone_term_extractor +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `drone_term_extractor` is an English model originally trained by swardiantara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/drone_term_extractor_en_5.2.0_3.0_1699463054091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/drone_term_extractor_en_5.2.0_3.0_1699463054091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("drone_term_extractor","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("drone_term_extractor", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|drone_term_extractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/swardiantara/drone-term-extractor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-droner_en.md b/docs/_posts/ahmedlone127/2023-11-08-droner_en.md new file mode 100644 index 000000000000..a1e30b5843a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-droner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English droner BertForTokenClassification from swardiantara +author: John Snow Labs +name: droner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `droner` is an English model originally trained by swardiantara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/droner_en_5.2.0_3.0_1699463435799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/droner_en_5.2.0_3.0_1699463435799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("droner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("droner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|droner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/swardiantara/droner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-elastic_bert_chinese_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-elastic_bert_chinese_ner_en.md new file mode 100644 index 000000000000..31eaf9667b64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-elastic_bert_chinese_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English elastic_bert_chinese_ner BertForTokenClassification from xiaxy +author: John Snow Labs +name: elastic_bert_chinese_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `elastic_bert_chinese_ner` is an English model originally trained by xiaxy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/elastic_bert_chinese_ner_en_5.2.0_3.0_1699436694425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/elastic_bert_chinese_ner_en_5.2.0_3.0_1699436694425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("elastic_bert_chinese_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("elastic_bert_chinese_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|elastic_bert_chinese_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/xiaxy/elastic-bert-chinese-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-estbert_estonian_subtitles_token_classification_et.md b/docs/_posts/ahmedlone127/2023-11-08-estbert_estonian_subtitles_token_classification_et.md new file mode 100644 index 000000000000..a1b39db9075b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-estbert_estonian_subtitles_token_classification_et.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Estonian estbert_estonian_subtitles_token_classification BertForTokenClassification from IljaSamoilov +author: John Snow Labs +name: estbert_estonian_subtitles_token_classification +date: 2023-11-08 +tags: [bert, et, open_source, token_classification, onnx] +task: Named Entity Recognition +language: et +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `estbert_estonian_subtitles_token_classification` is an Estonian model originally trained by IljaSamoilov. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/estbert_estonian_subtitles_token_classification_et_5.2.0_3.0_1699450340690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/estbert_estonian_subtitles_token_classification_et_5.2.0_3.0_1699450340690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("estbert_estonian_subtitles_token_classification","et") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("estbert_estonian_subtitles_token_classification", "et") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|estbert_estonian_subtitles_token_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|et| +|Size:|463.4 MB| + +## References + +https://huggingface.co/IljaSamoilov/EstBERT-estonian-subtitles-token-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-expanded_multilingual_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-08-expanded_multilingual_ner_xx.md new file mode 100644 index 000000000000..0f63a05498d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-expanded_multilingual_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual expanded_multilingual_ner BertForTokenClassification from gamzenurmadan +author: John Snow Labs +name: expanded_multilingual_ner +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `expanded_multilingual_ner` is a Multilingual model originally trained by gamzenurmadan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/expanded_multilingual_ner_xx_5.2.0_3.0_1699473320084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/expanded_multilingual_ner_xx_5.2.0_3.0_1699473320084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("expanded_multilingual_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("expanded_multilingual_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|expanded_multilingual_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|xx|
+|Size:|665.1 MB|
+
+## References
+
+https://huggingface.co/gamzenurmadan/expanded-multilingual-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-fairness_lab_hana_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-fairness_lab_hana_ner_en.md
new file mode 100644
index 000000000000..da4df0dc3c25
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-fairness_lab_hana_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English fairness_lab_hana_ner BertForTokenClassification from Social-Media-Fairness
+author: John Snow Labs
+name: fairness_lab_hana_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fairness_lab_hana_ner` is an English model originally trained by Social-Media-Fairness.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fairness_lab_hana_ner_en_5.2.0_3.0_1699444479706.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fairness_lab_hana_ner_en_5.2.0_3.0_1699444479706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("fairness_lab_hana_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("fairness_lab_hana_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|fairness_lab_hana_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Social-Media-Fairness/Fairness-lab-Hana-NER
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_1_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_1_finetuned_ner_en.md
new file mode 100644
index 000000000000..f3da46ec60f5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_1_finetuned_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English finance_ner_v0_0_1_finetuned_ner BertForTokenClassification from AhmedTaha012
+author: John Snow Labs
+name: finance_ner_v0_0_1_finetuned_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_ner_v0_0_1_finetuned_ner` is an English model originally trained by AhmedTaha012.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_1_finetuned_ner_en_5.2.0_3.0_1699463210084.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_1_finetuned_ner_en_5.2.0_3.0_1699463210084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("finance_ner_v0_0_1_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("finance_ner_v0_0_1_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finance_ner_v0_0_1_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/AhmedTaha012/finance-ner-v0.0.1-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_8_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_8_finetuned_ner_en.md
new file mode 100644
index 000000000000..635dea30261e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-finance_ner_v0_0_8_finetuned_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English finance_ner_v0_0_8_finetuned_ner BertForTokenClassification from AhmedTaha012
+author: John Snow Labs
+name: finance_ner_v0_0_8_finetuned_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_ner_v0_0_8_finetuned_ner` is an English model originally trained by AhmedTaha012.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_8_finetuned_ner_en_5.2.0_3.0_1699449428232.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_ner_v0_0_8_finetuned_ner_en_5.2.0_3.0_1699449428232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("finance_ner_v0_0_8_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("finance_ner_v0_0_8_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finance_ner_v0_0_8_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/AhmedTaha012/finance-ner-v0.0.8-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-finer_139_xtremedistil_l12_h384_en.md b/docs/_posts/ahmedlone127/2023-11-08-finer_139_xtremedistil_l12_h384_en.md
new file mode 100644
index 000000000000..0bc38cabf43d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-finer_139_xtremedistil_l12_h384_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English finer_139_xtremedistil_l12_h384 BertForTokenClassification from nbroad
+author: John Snow Labs
+name: finer_139_xtremedistil_l12_h384
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finer_139_xtremedistil_l12_h384` is an English model originally trained by nbroad.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finer_139_xtremedistil_l12_h384_en_5.2.0_3.0_1699404859134.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finer_139_xtremedistil_l12_h384_en_5.2.0_3.0_1699404859134.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("finer_139_xtremedistil_l12_h384","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("finer_139_xtremedistil_l12_h384", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finer_139_xtremedistil_l12_h384|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|124.4 MB|
+
+## References
+
+https://huggingface.co/nbroad/finer-139-xtremedistil-l12-h384
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-fp_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-fp_ner_en.md
new file mode 100644
index 000000000000..be12a7ce4e6e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-fp_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English fp_ner BertForTokenClassification from zhenglianchi
+author: John Snow Labs
+name: fp_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fp_ner` is an English model originally trained by zhenglianchi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fp_ner_en_5.2.0_3.0_1699435888148.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fp_ner_en_5.2.0_3.0_1699435888148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("fp_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("fp_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|fp_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|381.3 MB|
+
+## References
+
+https://huggingface.co/zhenglianchi/fp-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_en.md
new file mode 100644
index 000000000000..c2622e5685b7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English greek_legal_bert_v2_finetuned_ner BertForTokenClassification from amichailidis
+author: John Snow Labs
+name: greek_legal_bert_v2_finetuned_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `greek_legal_bert_v2_finetuned_ner` is an English model originally trained by amichailidis.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/greek_legal_bert_v2_finetuned_ner_en_5.2.0_3.0_1699474919715.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/greek_legal_bert_v2_finetuned_ner_en_5.2.0_3.0_1699474919715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("greek_legal_bert_v2_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("greek_legal_bert_v2_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|greek_legal_bert_v2_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|421.0 MB|
+
+## References
+
+https://huggingface.co/amichailidis/greek_legal_bert_v2-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_v3_en.md b/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_v3_en.md
new file mode 100644
index 000000000000..82dbd46202cf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-greek_legal_bert_v2_finetuned_ner_v3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English greek_legal_bert_v2_finetuned_ner_v3 BertForTokenClassification from amichailidis
+author: John Snow Labs
+name: greek_legal_bert_v2_finetuned_ner_v3
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `greek_legal_bert_v2_finetuned_ner_v3` is an English model originally trained by amichailidis.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/greek_legal_bert_v2_finetuned_ner_v3_en_5.2.0_3.0_1699417541056.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/greek_legal_bert_v2_finetuned_ner_v3_en_5.2.0_3.0_1699417541056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("greek_legal_bert_v2_finetuned_ner_v3","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("greek_legal_bert_v2_finetuned_ner_v3", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|greek_legal_bert_v2_finetuned_ner_v3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|421.0 MB|
+
+## References
+
+https://huggingface.co/amichailidis/greek_legal_bert_v2-finetuned-ner-V3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-guj_sayula_popoluca_tagging_v2_en.md b/docs/_posts/ahmedlone127/2023-11-08-guj_sayula_popoluca_tagging_v2_en.md
new file mode 100644
index 000000000000..bf3453716cec
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-guj_sayula_popoluca_tagging_v2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English guj_sayula_popoluca_tagging_v2 BertForTokenClassification from om-ashish-soni
+author: John Snow Labs
+name: guj_sayula_popoluca_tagging_v2
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `guj_sayula_popoluca_tagging_v2` is an English model originally trained by om-ashish-soni.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/guj_sayula_popoluca_tagging_v2_en_5.2.0_3.0_1699403641206.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/guj_sayula_popoluca_tagging_v2_en_5.2.0_3.0_1699403641206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("guj_sayula_popoluca_tagging_v2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("guj_sayula_popoluca_tagging_v2", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|guj_sayula_popoluca_tagging_v2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|665.2 MB|
+
+## References
+
+https://huggingface.co/om-ashish-soni/guj-pos-tagging-v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-hana_unbias_ner_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-08-hana_unbias_ner_classifier_en.md
new file mode 100644
index 000000000000..1e2556178639
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-hana_unbias_ner_classifier_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English hana_unbias_ner_classifier BertForTokenClassification from Social-Media-Fairness
+author: John Snow Labs
+name: hana_unbias_ner_classifier
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hana_unbias_ner_classifier` is an English model originally trained by Social-Media-Fairness.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hana_unbias_ner_classifier_en_5.2.0_3.0_1699452263831.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hana_unbias_ner_classifier_en_5.2.0_3.0_1699452263831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("hana_unbias_ner_classifier","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("hana_unbias_ner_classifier", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hana_unbias_ner_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/Social-Media-Fairness/hana_unbias_NER_classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-heb_medical_baseline_en.md b/docs/_posts/ahmedlone127/2023-11-08-heb_medical_baseline_en.md
new file mode 100644
index 000000000000..a4d77d9503d6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-heb_medical_baseline_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English heb_medical_baseline BertForTokenClassification from cp500
+author: John Snow Labs
+name: heb_medical_baseline
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `heb_medical_baseline` is an English model originally trained by cp500.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/heb_medical_baseline_en_5.2.0_3.0_1699406769716.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/heb_medical_baseline_en_5.2.0_3.0_1699406769716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("heb_medical_baseline","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("heb_medical_baseline", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|heb_medical_baseline|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|690.5 MB|
+
+## References
+
+https://huggingface.co/cp500/heb_medical_baseline
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-idrisi_lmr_ar_random_typebased_en.md b/docs/_posts/ahmedlone127/2023-11-08-idrisi_lmr_ar_random_typebased_en.md
new file mode 100644
index 000000000000..dc5db06ee84a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-idrisi_lmr_ar_random_typebased_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English idrisi_lmr_ar_random_typebased BertForTokenClassification from rsuwaileh
+author: John Snow Labs
+name: idrisi_lmr_ar_random_typebased
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idrisi_lmr_ar_random_typebased` is an English model originally trained by rsuwaileh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idrisi_lmr_ar_random_typebased_en_5.2.0_3.0_1699482179555.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idrisi_lmr_ar_random_typebased_en_5.2.0_3.0_1699482179555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes the Spark NLP classes and pyspark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+tokenClassifier = BertForTokenClassification.pretrained("idrisi_lmr_ar_random_typebased","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// Assumes the Spark NLP classes and org.apache.spark.ml.Pipeline are imported, and that `data` is a DataFrame with a "text" column.
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("idrisi_lmr_ar_random_typebased", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|idrisi_lmr_ar_random_typebased|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|608.8 MB|
+
+## References
+
+https://huggingface.co/rsuwaileh/IDRISI-LMR-AR-random-typebased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-jobbert_en.md b/docs/_posts/ahmedlone127/2023-11-08-jobbert_en.md
new file mode 100644
index 000000000000..085fcd7a2df1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-jobbert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English jobbert BertForTokenClassification from Andrei95
+author: John Snow Labs
+name: jobbert
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert` is an English model originally trained by Andrei95.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_en_5.2.0_3.0_1699430866914.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_en_5.2.0_3.0_1699430866914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("jobbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("jobbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Andrei95/jobbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-klue_bert_base_ner_kluedata_en.md b/docs/_posts/ahmedlone127/2023-11-08-klue_bert_base_ner_kluedata_en.md new file mode 100644 index 000000000000..7fc0ad81cb07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-klue_bert_base_ner_kluedata_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English klue_bert_base_ner_kluedata BertForTokenClassification from datasciathlete +author: John Snow Labs +name: klue_bert_base_ner_kluedata +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue_bert_base_ner_kluedata` is an English model originally trained by datasciathlete. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_base_ner_kluedata_en_5.2.0_3.0_1699425754476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_base_ner_kluedata_en_5.2.0_3.0_1699425754476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("klue_bert_base_ner_kluedata","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("klue_bert_base_ner_kluedata", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_base_ner_kluedata| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/datasciathlete/KLUE-BERT-BASE-NER-kluedata \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-kr_finbert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-kr_finbert_finetuned_ner_en.md new file mode 100644 index 000000000000..00926e368139 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-kr_finbert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kr_finbert_finetuned_ner BertForTokenClassification from mepi +author: John Snow Labs +name: kr_finbert_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kr_finbert_finetuned_ner` is an English model originally trained by mepi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kr_finbert_finetuned_ner_en_5.2.0_3.0_1699468887942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kr_finbert_finetuned_ner_en_5.2.0_3.0_1699468887942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("kr_finbert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("kr_finbert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kr_finbert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|377.9 MB| + +## References + +https://huggingface.co/mepi/KR-FinBert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-marischal_college_hmbert_en.md b/docs/_posts/ahmedlone127/2023-11-08-marischal_college_hmbert_en.md new file mode 100644 index 000000000000..1bf517fe3c38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-marischal_college_hmbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English marischal_college_hmbert BertForTokenClassification from matthewleechen +author: John Snow Labs +name: marischal_college_hmbert +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marischal_college_hmbert` is an English model originally trained by matthewleechen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marischal_college_hmbert_en_5.2.0_3.0_1699468266317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marischal_college_hmbert_en_5.2.0_3.0_1699468266317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("marischal_college_hmbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("marischal_college_hmbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marischal_college_hmbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|504.6 MB| + +## References + +https://huggingface.co/matthewleechen/marischal_college_hmbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-mbert_bengali_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-mbert_bengali_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..4d18b5b501e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-mbert_bengali_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_bengali_ner_finetuned_ner BertForTokenClassification from BitanBiswas +author: John Snow Labs +name: mbert_bengali_ner_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_bengali_ner_finetuned_ner` is an English model originally trained by BitanBiswas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_bengali_ner_finetuned_ner_en_5.2.0_3.0_1699440756609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_bengali_ner_finetuned_ner_en_5.2.0_3.0_1699440756609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("mbert_bengali_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("mbert_bengali_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_bengali_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|625.5 MB| + +## References + +https://huggingface.co/BitanBiswas/mbert-bengali-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-mongolian_bert_base_demo_named_entity_mn.md b/docs/_posts/ahmedlone127/2023-11-08-mongolian_bert_base_demo_named_entity_mn.md new file mode 100644 index 000000000000..227535a9d67f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-mongolian_bert_base_demo_named_entity_mn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Mongolian mongolian_bert_base_demo_named_entity BertForTokenClassification from 2rtl3 +author: John Snow Labs +name: mongolian_bert_base_demo_named_entity +date: 2023-11-08 +tags: [bert, mn, open_source, token_classification, onnx] +task: Named Entity Recognition +language: mn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_bert_base_demo_named_entity` is a Mongolian model originally trained by 2rtl3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_bert_base_demo_named_entity_mn_5.2.0_3.0_1699404521755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_bert_base_demo_named_entity_mn_5.2.0_3.0_1699404521755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("mongolian_bert_base_demo_named_entity","mn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("mongolian_bert_base_demo_named_entity", "mn") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_bert_base_demo_named_entity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|mn| +|Size:|665.1 MB| + +## References + +https://huggingface.co/2rtl3/mn-bert-base-demo-named-entity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-multibertbestmodeloct11_en.md b/docs/_posts/ahmedlone127/2023-11-08-multibertbestmodeloct11_en.md new file mode 100644 index 000000000000..1d8bab19bac2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-multibertbestmodeloct11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English multibertbestmodeloct11 BertForTokenClassification from Tommert25 +author: John Snow Labs +name: multibertbestmodeloct11 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multibertbestmodeloct11` is an English model originally trained by Tommert25. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multibertbestmodeloct11_en_5.2.0_3.0_1699433704627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multibertbestmodeloct11_en_5.2.0_3.0_1699433704627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("multibertbestmodeloct11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("multibertbestmodeloct11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multibertbestmodeloct11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|625.5 MB| + +## References + +https://huggingface.co/Tommert25/MultiBERTBestModelOct11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-multilingual_bengali_token_classification_model_xx.md b/docs/_posts/ahmedlone127/2023-11-08-multilingual_bengali_token_classification_model_xx.md new file mode 100644 index 000000000000..5d96c75df5cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-multilingual_bengali_token_classification_model_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual multilingual_bengali_token_classification_model BertForTokenClassification from Cabooose +author: John Snow Labs +name: multilingual_bengali_token_classification_model +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual_bengali_token_classification_model` is a Multilingual model originally trained by Cabooose.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multilingual_bengali_token_classification_model_xx_5.2.0_3.0_1699408573039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multilingual_bengali_token_classification_model_xx_5.2.0_3.0_1699408573039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("multilingual_bengali_token_classification_model","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("multilingual_bengali_token_classification_model", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multilingual_bengali_token_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Cabooose/multilingual_bengali_token_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-multilingual_indonesian_token_classification_model_xx.md b/docs/_posts/ahmedlone127/2023-11-08-multilingual_indonesian_token_classification_model_xx.md new file mode 100644 index 000000000000..11e6cba26482 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-multilingual_indonesian_token_classification_model_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual multilingual_indonesian_token_classification_model BertForTokenClassification from Cabooose +author: John Snow Labs +name: multilingual_indonesian_token_classification_model +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual_indonesian_token_classification_model` is a Multilingual model originally trained by Cabooose.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multilingual_indonesian_token_classification_model_xx_5.2.0_3.0_1699410457762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multilingual_indonesian_token_classification_model_xx_5.2.0_3.0_1699410457762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("multilingual_indonesian_token_classification_model","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("multilingual_indonesian_token_classification_model", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multilingual_indonesian_token_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/Cabooose/multilingual_indonesian_token_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-name_anonymization_akoksal_tr.md b/docs/_posts/ahmedlone127/2023-11-08-name_anonymization_akoksal_tr.md new file mode 100644 index 000000000000..efc72c2c1861 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-name_anonymization_akoksal_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish name_anonymization_akoksal BertForTokenClassification from akoksal +author: John Snow Labs +name: name_anonymization_akoksal +date: 2023-11-08 +tags: [bert, tr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `name_anonymization_akoksal` is a Turkish model originally trained by akoksal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/name_anonymization_akoksal_tr_5.2.0_3.0_1699467515676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/name_anonymization_akoksal_tr_5.2.0_3.0_1699467515676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("name_anonymization_akoksal","tr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("name_anonymization_akoksal", "tr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|name_anonymization_akoksal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|tr| +|Size:|412.3 MB| + +## References + +https://huggingface.co/akoksal/name_anonymization \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-negation_bert_en.md b/docs/_posts/ahmedlone127/2023-11-08-negation_bert_en.md new file mode 100644 index 000000000000..b3b2782f2417 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-negation_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English negation_bert BertForTokenClassification from shoubhikc +author: John Snow Labs +name: negation_bert +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `negation_bert` is an English model originally trained by shoubhikc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/negation_bert_en_5.2.0_3.0_1699468736421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/negation_bert_en_5.2.0_3.0_1699468736421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("negation_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("negation_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|negation_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/shoubhikc/negation_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-ner_dependency_dutch_official_11_en.md b/docs/_posts/ahmedlone127/2023-11-08-ner_dependency_dutch_official_11_en.md new file mode 100644 index 000000000000..6c9db1e2390d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-ner_dependency_dutch_official_11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_dependency_dutch_official_11 BertForTokenClassification from Annemae +author: John Snow Labs +name: ner_dependency_dutch_official_11 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_dependency_dutch_official_11` is an English model originally trained by Annemae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_dependency_dutch_official_11_en_5.2.0_3.0_1699478541524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_dependency_dutch_official_11_en_5.2.0_3.0_1699478541524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_dependency_dutch_official_11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_dependency_dutch_official_11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_dependency_dutch_official_11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Annemae/ner_dependency_nl_official_11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-ner_gdpr_en.md b/docs/_posts/ahmedlone127/2023-11-08-ner_gdpr_en.md new file mode 100644 index 000000000000..95d9bf1cc15e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-ner_gdpr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_gdpr BertForTokenClassification from buerokratt +author: John Snow Labs +name: ner_gdpr +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_gdpr` is an English model originally trained by buerokratt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_gdpr_en_5.2.0_3.0_1699454308894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_gdpr_en_5.2.0_3.0_1699454308894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_gdpr","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_gdpr", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_gdpr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|463.5 MB| + +## References + +https://huggingface.co/buerokratt/ner_gdpr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-ner_old_en.md b/docs/_posts/ahmedlone127/2023-11-08-ner_old_en.md new file mode 100644 index 000000000000..2f90499f22b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-ner_old_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_old BertForTokenClassification from buerokratt +author: John Snow Labs +name: ner_old +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_old` is an English model originally trained by buerokratt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_old_en_5.2.0_3.0_1699453983143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_old_en_5.2.0_3.0_1699453983143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_old","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_old", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_old| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|378.1 MB| + +## References + +https://huggingface.co/buerokratt/ner_old \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-ner_resume_en.md b/docs/_posts/ahmedlone127/2023-11-08-ner_resume_en.md new file mode 100644 index 000000000000..7f606c07586f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-ner_resume_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_resume BertForTokenClassification from momo22 +author: John Snow Labs +name: ner_resume +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_resume` is an English model originally trained by momo22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_resume_en_5.2.0_3.0_1699421268824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_resume_en_5.2.0_3.0_1699421268824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ner_resume","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_resume", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_resume| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/momo22/ner_resume \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-nerkor_hubert_hu.md b/docs/_posts/ahmedlone127/2023-11-08-nerkor_hubert_hu.md new file mode 100644 index 000000000000..89b9213b76ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-nerkor_hubert_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian nerkor_hubert BertForTokenClassification from novakat +author: John Snow Labs +name: nerkor_hubert +date: 2023-11-08 +tags: [bert, hu, open_source, token_classification, onnx] +task: Named Entity Recognition +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nerkor_hubert` is a Hungarian model originally trained by novakat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerkor_hubert_hu_5.2.0_3.0_1699447797913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerkor_hubert_hu_5.2.0_3.0_1699447797913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nerkor_hubert","hu") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nerkor_hubert", "hu") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerkor_hubert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|hu| +|Size:|412.5 MB| + +## References + +https://huggingface.co/novakat/nerkor-hubert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-11-08-nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased_xx.md new file mode 100644 index 000000000000..85be49bd608e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased BertForTokenClassification from GuCuChiara +author: John Snow Labs +name: nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased` is a Multilingual model originally trained by GuCuChiara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased_xx_5.2.0_3.0_1699409457516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased_xx_5.2.0_3.0_1699409457516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_hiba_distemist_fine_tuned_bert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/GuCuChiara/NLP-HIBA_DisTEMIST_fine_tuned_bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-nyt_ingredient_tagger_gte_small_l3_ingredient_v2_en.md b/docs/_posts/ahmedlone127/2023-11-08-nyt_ingredient_tagger_gte_small_l3_ingredient_v2_en.md new file mode 100644 index 000000000000..ba1354c8e229 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-nyt_ingredient_tagger_gte_small_l3_ingredient_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nyt_ingredient_tagger_gte_small_l3_ingredient_v2 BertForTokenClassification from napsternxg +author: John Snow Labs +name: nyt_ingredient_tagger_gte_small_l3_ingredient_v2 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nyt_ingredient_tagger_gte_small_l3_ingredient_v2` is an English model originally trained by napsternxg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_gte_small_l3_ingredient_v2_en_5.2.0_3.0_1699429469946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_gte_small_l3_ingredient_v2_en_5.2.0_3.0_1699429469946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("nyt_ingredient_tagger_gte_small_l3_ingredient_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("nyt_ingredient_tagger_gte_small_l3_ingredient_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nyt_ingredient_tagger_gte_small_l3_ingredient_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|64.7 MB| + +## References + +https://huggingface.co/napsternxg/nyt-ingredient-tagger-gte-small-L3-ingredient-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-phibert_finetuned_ner_nepal_bhasa_1_en.md b/docs/_posts/ahmedlone127/2023-11-08-phibert_finetuned_ner_nepal_bhasa_1_en.md new file mode 100644 index 000000000000..463f7d27a27f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-phibert_finetuned_ner_nepal_bhasa_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English phibert_finetuned_ner_nepal_bhasa_1 BertForTokenClassification from RUKESH +author: John Snow Labs +name: phibert_finetuned_ner_nepal_bhasa_1 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`phibert_finetuned_ner_nepal_bhasa_1` is an English model originally trained by RUKESH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phibert_finetuned_ner_nepal_bhasa_1_en_5.2.0_3.0_1699483520154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phibert_finetuned_ner_nepal_bhasa_1_en_5.2.0_3.0_1699483520154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("phibert_finetuned_ner_nepal_bhasa_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("phibert_finetuned_ner_nepal_bhasa_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phibert_finetuned_ner_nepal_bhasa_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.2 MB| + +## References + +https://huggingface.co/RUKESH/phibert-finetuned-ner-new-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-political_entity_recognizer_en.md b/docs/_posts/ahmedlone127/2023-11-08-political_entity_recognizer_en.md new file mode 100644 index 000000000000..7e5f3fddc9e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-political_entity_recognizer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English political_entity_recognizer BertForTokenClassification from nlplab +author: John Snow Labs +name: political_entity_recognizer +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`political_entity_recognizer` is an English model originally trained by nlplab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/political_entity_recognizer_en_5.2.0_3.0_1699428569605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/political_entity_recognizer_en_5.2.0_3.0_1699428569605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("political_entity_recognizer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("political_entity_recognizer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|political_entity_recognizer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/nlplab/political-entity-recognizer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-porttagger_news_base_en.md b/docs/_posts/ahmedlone127/2023-11-08-porttagger_news_base_en.md new file mode 100644 index 000000000000..08d3f887d65b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-porttagger_news_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English porttagger_news_base BertForTokenClassification from Emanuel +author: John Snow Labs +name: porttagger_news_base +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`porttagger_news_base` is an English model originally trained by Emanuel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/porttagger_news_base_en_5.2.0_3.0_1699425406664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/porttagger_news_base_en_5.2.0_3.0_1699425406664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("porttagger_news_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("porttagger_news_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|porttagger_news_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Emanuel/porttagger-news-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-porttagger_tweets_base_en.md b/docs/_posts/ahmedlone127/2023-11-08-porttagger_tweets_base_en.md new file mode 100644 index 000000000000..643ca005b8b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-porttagger_tweets_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English porttagger_tweets_base BertForTokenClassification from Emanuel +author: John Snow Labs +name: porttagger_tweets_base +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`porttagger_tweets_base` is an English model originally trained by Emanuel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/porttagger_tweets_base_en_5.2.0_3.0_1699470138020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/porttagger_tweets_base_en_5.2.0_3.0_1699470138020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("porttagger_tweets_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("porttagger_tweets_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|porttagger_tweets_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Emanuel/porttagger-tweets-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-postagger_bio_english_en.md b/docs/_posts/ahmedlone127/2023-11-08-postagger_bio_english_en.md new file mode 100644 index 000000000000..df9c0408d581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-postagger_bio_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English postagger_bio_english BertForTokenClassification from pucpr-br +author: John Snow Labs +name: postagger_bio_english +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`postagger_bio_english` is an English model originally trained by pucpr-br. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/postagger_bio_english_en_5.2.0_3.0_1699404792031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/postagger_bio_english_en_5.2.0_3.0_1699404792031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("postagger_bio_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("postagger_bio_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|postagger_bio_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.8 MB| + +## References + +https://huggingface.co/pucpr-br/postagger-bio-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-postagger_south_azerbaijani_az.md b/docs/_posts/ahmedlone127/2023-11-08-postagger_south_azerbaijani_az.md new file mode 100644 index 000000000000..d8e1b896c54c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-postagger_south_azerbaijani_az.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Azerbaijani postagger_south_azerbaijani BertForTokenClassification from language-ml-lab +author: John Snow Labs +name: postagger_south_azerbaijani +date: 2023-11-08 +tags: [bert, az, open_source, token_classification, onnx] +task: Named Entity Recognition +language: az +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`postagger_south_azerbaijani` is an Azerbaijani model originally trained by language-ml-lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/postagger_south_azerbaijani_az_5.2.0_3.0_1699420138102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/postagger_south_azerbaijani_az_5.2.0_3.0_1699420138102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("postagger_south_azerbaijani","az") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("postagger_south_azerbaijani", "az") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|postagger_south_azerbaijani| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|az| +|Size:|347.5 MB| + +## References + +https://huggingface.co/language-ml-lab/postagger-azb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-products_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-products_ner_en.md new file mode 100644 index 000000000000..18ab58033ac6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-products_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English products_ner BertForTokenClassification from Atheer174 +author: John Snow Labs +name: products_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `products_ner` is an English model originally trained by Atheer174. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/products_ner_en_5.2.0_3.0_1699446270363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/products_ner_en_5.2.0_3.0_1699446270363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("products_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("products_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|products_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Atheer174/Products_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-pubmedbert_base_finetuned_n2c2_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-pubmedbert_base_finetuned_n2c2_ner_en.md new file mode 100644 index 000000000000..44531185a15f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-pubmedbert_base_finetuned_n2c2_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pubmedbert_base_finetuned_n2c2_ner BertForTokenClassification from georgeleung30 +author: John Snow Labs +name: pubmedbert_base_finetuned_n2c2_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmedbert_base_finetuned_n2c2_ner` is an English model originally trained by georgeleung30. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmedbert_base_finetuned_n2c2_ner_en_5.2.0_3.0_1699412158375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmedbert_base_finetuned_n2c2_ner_en_5.2.0_3.0_1699412158375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("pubmedbert_base_finetuned_n2c2_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("pubmedbert_base_finetuned_n2c2_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmedbert_base_finetuned_n2c2_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/georgeleung30/PubMedBERT-base-finetuned-n2c2-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-real_estate_financial_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-real_estate_financial_ner_en.md new file mode 100644 index 000000000000..c74fe43562ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-real_estate_financial_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English real_estate_financial_ner BertForTokenClassification from jnferfer +author: John Snow Labs +name: real_estate_financial_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `real_estate_financial_ner` is an English model originally trained by jnferfer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/real_estate_financial_ner_en_5.2.0_3.0_1699435250183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/real_estate_financial_ner_en_5.2.0_3.0_1699435250183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("real_estate_financial_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("real_estate_financial_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|real_estate_financial_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jnferfer/real_estate_financial_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-resume_ner_1_en.md b/docs/_posts/ahmedlone127/2023-11-08-resume_ner_1_en.md new file mode 100644 index 000000000000..d14296942e75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-resume_ner_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English resume_ner_1 BertForTokenClassification from QuanjieHan +author: John Snow Labs +name: resume_ner_1 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `resume_ner_1` is an English model originally trained by QuanjieHan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/resume_ner_1_en_5.2.0_3.0_1699410735265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/resume_ner_1_en_5.2.0_3.0_1699410735265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("resume_ner_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("resume_ner_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|resume_ner_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/QuanjieHan/resume_ner_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-rhenus_v1_0_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2023-11-08-rhenus_v1_0_bert_base_multilingual_uncased_xx.md new file mode 100644 index 000000000000..7d95272f0002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-rhenus_v1_0_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual rhenus_v1_0_bert_base_multilingual_uncased BertForTokenClassification from DataIntelligenceTeam +author: John Snow Labs +name: rhenus_v1_0_bert_base_multilingual_uncased +date: 2023-11-08 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rhenus_v1_0_bert_base_multilingual_uncased` is a Multilingual model originally trained by DataIntelligenceTeam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rhenus_v1_0_bert_base_multilingual_uncased_xx_5.2.0_3.0_1699410469287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rhenus_v1_0_bert_base_multilingual_uncased_xx_5.2.0_3.0_1699410469287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("rhenus_v1_0_bert_base_multilingual_uncased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("rhenus_v1_0_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rhenus_v1_0_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|625.7 MB| + +## References + +https://huggingface.co/DataIntelligenceTeam/rhenus_v1.0_bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-rubert_base_lesha17_punctuation_en.md b/docs/_posts/ahmedlone127/2023-11-08-rubert_base_lesha17_punctuation_en.md new file mode 100644 index 000000000000..e9010add488b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-rubert_base_lesha17_punctuation_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_base_lesha17_punctuation BertForTokenClassification from cointegrated +author: John Snow Labs +name: rubert_base_lesha17_punctuation +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_lesha17_punctuation` is an English model originally trained by cointegrated. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_lesha17_punctuation_en_5.2.0_3.0_1699441915557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_lesha17_punctuation_en_5.2.0_3.0_1699441915557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("rubert_base_lesha17_punctuation","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_base_lesha17_punctuation", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_lesha17_punctuation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/cointegrated/rubert-base-lesha17-punctuation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-rubert_tiny2_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-rubert_tiny2_finetuned_ner_en.md new file mode 100644 index 000000000000..7b45bacbf509 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-rubert_tiny2_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_tiny2_finetuned_ner BertForTokenClassification from Evolett +author: John Snow Labs +name: rubert_tiny2_finetuned_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_finetuned_ner` is an English model originally trained by Evolett. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_ner_en_5.2.0_3.0_1699427978197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_finetuned_ner_en_5.2.0_3.0_1699427978197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("rubert_tiny2_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_tiny2_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|109.1 MB| + +## References + +https://huggingface.co/Evolett/rubert-tiny2-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-rust_bert_base_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-rust_bert_base_ner_en.md new file mode 100644 index 000000000000..973557500c1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-rust_bert_base_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rust_bert_base_ner BertForTokenClassification from webmichaelnosenko +author: John Snow Labs +name: rust_bert_base_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rust_bert_base_ner` is an English model originally trained by webmichaelnosenko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rust_bert_base_ner_en_5.2.0_3.0_1699437361590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rust_bert_base_ner_en_5.2.0_3.0_1699437361590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("rust_bert_base_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("rust_bert_base_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rust_bert_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/webmichaelnosenko/rust-bert-base-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-sayula_popoluca_ner_tagging_en.md b/docs/_posts/ahmedlone127/2023-11-08-sayula_popoluca_ner_tagging_en.md new file mode 100644 index 000000000000..962a1c811cdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-sayula_popoluca_ner_tagging_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sayula_popoluca_ner_tagging BertForTokenClassification from om-ashish-soni +author: John Snow Labs +name: sayula_popoluca_ner_tagging +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sayula_popoluca_ner_tagging` is an English model originally trained by om-ashish-soni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sayula_popoluca_ner_tagging_en_5.2.0_3.0_1699442336748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sayula_popoluca_ner_tagging_en_5.2.0_3.0_1699442336748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("sayula_popoluca_ner_tagging","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("sayula_popoluca_ner_tagging", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sayula_popoluca_ner_tagging| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|404.9 MB| + +## References + +https://huggingface.co/om-ashish-soni/pos-ner-tagging \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_fl_en.md b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_fl_en.md new file mode 100644 index 000000000000..8f59d86bfe99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_fl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_finetuned_ner_fl BertForTokenClassification from vbhasin +author: John Snow Labs +name: scibert_finetuned_ner_fl +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_finetuned_ner_fl` is an English model originally trained by vbhasin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_fl_en_5.2.0_3.0_1699434573304.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_fl_en_5.2.0_3.0_1699434573304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("scibert_finetuned_ner_fl","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_finetuned_ner_fl", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_finetuned_ner_fl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/vbhasin/sciBERT-finetuned-ner-FL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_siddharthtumre_en.md b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_siddharthtumre_en.md new file mode 100644 index 000000000000..e034a285c15b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_siddharthtumre_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_finetuned_ner_siddharthtumre BertForTokenClassification from siddharthtumre +author: John Snow Labs +name: scibert_finetuned_ner_siddharthtumre +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_finetuned_ner_siddharthtumre` is an English model originally trained by siddharthtumre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_siddharthtumre_en_5.2.0_3.0_1699486801290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_siddharthtumre_en_5.2.0_3.0_1699486801290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("scibert_finetuned_ner_siddharthtumre","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_finetuned_ner_siddharthtumre", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scibert_finetuned_ner_siddharthtumre|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/siddharthtumre/scibert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_v1_en.md b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_v1_en.md
new file mode 100644
index 000000000000..386e476c3aa2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-scibert_finetuned_ner_v1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English scibert_finetuned_ner_v1 BertForTokenClassification from sayalik13
+author: John Snow Labs
+name: scibert_finetuned_ner_v1
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_finetuned_ner_v1` is an English model originally trained by sayalik13.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_v1_en_5.2.0_3.0_1699463473734.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_finetuned_ner_v1_en_5.2.0_3.0_1699463473734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("scibert_finetuned_ner_v1","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("scibert_finetuned_ner_v1", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scibert_finetuned_ner_v1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/sayalik13/scibert-finetuned-ner-v1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-scibert_scivocab_uncased_finetuned_ner_sschet_en.md b/docs/_posts/ahmedlone127/2023-11-08-scibert_scivocab_uncased_finetuned_ner_sschet_en.md
new file mode 100644
index 000000000000..e6a26e516b82
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-scibert_scivocab_uncased_finetuned_ner_sschet_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English scibert_scivocab_uncased_finetuned_ner_sschet BertForTokenClassification from sschet
+author: John Snow Labs
+name: scibert_scivocab_uncased_finetuned_ner_sschet
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_scivocab_uncased_finetuned_ner_sschet` is an English model originally trained by sschet.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_finetuned_ner_sschet_en_5.2.0_3.0_1699422037617.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_scivocab_uncased_finetuned_ner_sschet_en_5.2.0_3.0_1699422037617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("scibert_scivocab_uncased_finetuned_ner_sschet","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("scibert_scivocab_uncased_finetuned_ner_sschet", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scibert_scivocab_uncased_finetuned_ner_sschet|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/sschet/scibert_scivocab_uncased-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-scideberta_ser_en.md b/docs/_posts/ahmedlone127/2023-11-08-scideberta_ser_en.md
new file mode 100644
index 000000000000..2f1d0b40587c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-scideberta_ser_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English scideberta_ser BertForTokenClassification from havens2
+author: John Snow Labs
+name: scideberta_ser
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scideberta_ser` is an English model originally trained by havens2.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scideberta_ser_en_5.2.0_3.0_1699473723465.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scideberta_ser_en_5.2.0_3.0_1699473723465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("scideberta_ser","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("scideberta_ser", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scideberta_ser|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|409.9 MB|
+
+## References
+
+https://huggingface.co/havens2/scideberta_SER
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-scientific_paper_bert_fine_tuned_new_en.md b/docs/_posts/ahmedlone127/2023-11-08-scientific_paper_bert_fine_tuned_new_en.md
new file mode 100644
index 000000000000..477ab2c4a5b6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-scientific_paper_bert_fine_tuned_new_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English scientific_paper_bert_fine_tuned_new BertForTokenClassification from MrSoapman
+author: John Snow Labs
+name: scientific_paper_bert_fine_tuned_new
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scientific_paper_bert_fine_tuned_new` is an English model originally trained by MrSoapman.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scientific_paper_bert_fine_tuned_new_en_5.2.0_3.0_1699481850368.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scientific_paper_bert_fine_tuned_new_en_5.2.0_3.0_1699481850368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("scientific_paper_bert_fine_tuned_new","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("scientific_paper_bert_fine_tuned_new", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scientific_paper_bert_fine_tuned_new|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/MrSoapman/scientific-paper-BERT-fine-tuned-NEW
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-sec_bert_base_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-sec_bert_base_finetuned_ner_en.md
new file mode 100644
index 000000000000..da3f34bea083
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-sec_bert_base_finetuned_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English sec_bert_base_finetuned_ner BertForTokenClassification from elshehawy
+author: John Snow Labs
+name: sec_bert_base_finetuned_ner
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sec_bert_base_finetuned_ner` is an English model originally trained by elshehawy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sec_bert_base_finetuned_ner_en_5.2.0_3.0_1699469932167.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sec_bert_base_finetuned_ner_en_5.2.0_3.0_1699469932167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("sec_bert_base_finetuned_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("sec_bert_base_finetuned_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sec_bert_base_finetuned_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|406.5 MB|
+
+## References
+
+https://huggingface.co/elshehawy/sec-bert-base-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-shingazidja_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-08-shingazidja_sayula_popoluca_en.md
new file mode 100644
index 000000000000..e46775a46ef6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-shingazidja_sayula_popoluca_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English shingazidja_sayula_popoluca BertForTokenClassification from nairaxo
+author: John Snow Labs
+name: shingazidja_sayula_popoluca
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shingazidja_sayula_popoluca` is an English model originally trained by nairaxo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shingazidja_sayula_popoluca_en_5.2.0_3.0_1699419736974.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shingazidja_sayula_popoluca_en_5.2.0_3.0_1699419736974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("shingazidja_sayula_popoluca","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("shingazidja_sayula_popoluca", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|shingazidja_sayula_popoluca|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|752.9 MB|
+
+## References
+
+https://huggingface.co/nairaxo/shingazidja-pos
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-specific_arabic_language_token_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-specific_arabic_language_token_classification_model_en.md
new file mode 100644
index 000000000000..f5238fbebbc7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-specific_arabic_language_token_classification_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English specific_arabic_language_token_classification_model BertForTokenClassification from Cabooose
+author: John Snow Labs
+name: specific_arabic_language_token_classification_model
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `specific_arabic_language_token_classification_model` is an English model originally trained by Cabooose.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/specific_arabic_language_token_classification_model_en_5.2.0_3.0_1699466870338.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/specific_arabic_language_token_classification_model_en_5.2.0_3.0_1699466870338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("specific_arabic_language_token_classification_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("specific_arabic_language_token_classification_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|specific_arabic_language_token_classification_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|406.6 MB|
+
+## References
+
+https://huggingface.co/Cabooose/specific_arabic_language_token_classification_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-specific_bengali_language_token_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-specific_bengali_language_token_classification_model_en.md
new file mode 100644
index 000000000000..d823f2a6159e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-specific_bengali_language_token_classification_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English specific_bengali_language_token_classification_model BertForTokenClassification from Cabooose
+author: John Snow Labs
+name: specific_bengali_language_token_classification_model
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `specific_bengali_language_token_classification_model` is an English model originally trained by Cabooose.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/specific_bengali_language_token_classification_model_en_5.2.0_3.0_1699463126735.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/specific_bengali_language_token_classification_model_en_5.2.0_3.0_1699463126735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("specific_bengali_language_token_classification_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("specific_bengali_language_token_classification_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|specific_bengali_language_token_classification_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|890.6 MB|
+
+## References
+
+https://huggingface.co/Cabooose/specific_bengali_language_token_classification_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-swiss_german_sayula_popoluca_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-swiss_german_sayula_popoluca_model_en.md
new file mode 100644
index 000000000000..49cf093058f9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-swiss_german_sayula_popoluca_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English swiss_german_sayula_popoluca_model BertForTokenClassification from noeminaepli
+author: John Snow Labs
+name: swiss_german_sayula_popoluca_model
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `swiss_german_sayula_popoluca_model` is an English model originally trained by noeminaepli.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swiss_german_sayula_popoluca_model_en_5.2.0_3.0_1699438831413.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swiss_german_sayula_popoluca_model_en_5.2.0_3.0_1699438831413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("swiss_german_sayula_popoluca_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("swiss_german_sayula_popoluca_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|swiss_german_sayula_popoluca_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|409.8 MB|
+
+## References
+
+https://huggingface.co/noeminaepli/swiss_german_pos_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-swiss_german_stts_sayula_popoluca_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-swiss_german_stts_sayula_popoluca_model_en.md
new file mode 100644
index 000000000000..65f2a8aa46e4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-swiss_german_stts_sayula_popoluca_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English swiss_german_stts_sayula_popoluca_model BertForTokenClassification from noeminaepli
+author: John Snow Labs
+name: swiss_german_stts_sayula_popoluca_model
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `swiss_german_stts_sayula_popoluca_model` is an English model originally trained by noeminaepli.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/swiss_german_stts_sayula_popoluca_model_en_5.2.0_3.0_1699463356832.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/swiss_german_stts_sayula_popoluca_model_en_5.2.0_3.0_1699463356832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("swiss_german_stts_sayula_popoluca_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("swiss_german_stts_sayula_popoluca_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|swiss_german_stts_sayula_popoluca_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|410.0 MB|
+
+## References
+
+https://huggingface.co/noeminaepli/swiss_german_stts_pos_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-08-tagged_one_100v7_ner_model_3epochs_augmented_en.md b/docs/_posts/ahmedlone127/2023-11-08-tagged_one_100v7_ner_model_3epochs_augmented_en.md
new file mode 100644
index 000000000000..b69dada0dc7e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-08-tagged_one_100v7_ner_model_3epochs_augmented_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tagged_one_100v7_ner_model_3epochs_augmented BertForTokenClassification from DOOGLAK
+author: John Snow Labs
+name: tagged_one_100v7_ner_model_3epochs_augmented
+date: 2023-11-08
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tagged_one_100v7_ner_model_3epochs_augmented` is an English model originally trained by DOOGLAK.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tagged_one_100v7_ner_model_3epochs_augmented_en_5.2.0_3.0_1699433269132.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tagged_one_100v7_ner_model_3epochs_augmented_en_5.2.0_3.0_1699433269132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("tagged_one_100v7_ner_model_3epochs_augmented","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("tagged_one_100v7_ner_model_3epochs_augmented", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tagged_one_100v7_ner_model_3epochs_augmented| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/DOOGLAK/Tagged_One_100v7_NER_Model_3Epochs_AUGMENTED \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-tamil_ner_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-tamil_ner_model_en.md new file mode 100644 index 000000000000..b45557c820ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-tamil_ner_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tamil_ner_model BertForTokenClassification from sathishmahi +author: John Snow Labs +name: tamil_ner_model +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tamil_ner_model` is an English model originally trained by sathishmahi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tamil_ner_model_en_5.2.0_3.0_1699415561393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tamil_ner_model_en_5.2.0_3.0_1699415561393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("tamil_ner_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("tamil_ner_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tamil_ner_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/sathishmahi/tamil-ner-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-teste_tcc_en.md b/docs/_posts/ahmedlone127/2023-11-08-teste_tcc_en.md new file mode 100644 index 000000000000..569b85cacb7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-teste_tcc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English teste_tcc BertForTokenClassification from witalo +author: John Snow Labs +name: teste_tcc +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `teste_tcc` is an English model originally trained by witalo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/teste_tcc_en_5.2.0_3.0_1699446270323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/teste_tcc_en_5.2.0_3.0_1699446270323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("teste_tcc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("teste_tcc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|teste_tcc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/witalo/teste_tcc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-toxicbert_params_en.md b/docs/_posts/ahmedlone127/2023-11-08-toxicbert_params_en.md new file mode 100644 index 000000000000..e224c75b5e56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-toxicbert_params_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English toxicbert_params BertForTokenClassification from troesy +author: John Snow Labs +name: toxicbert_params +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicbert_params` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicbert_params_en_5.2.0_3.0_1699467199213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicbert_params_en_5.2.0_3.0_1699467199213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("toxicbert_params","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("toxicbert_params", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicbert_params| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/troesy/toxicBERT-params \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-turkish_ner_model_en.md b/docs/_posts/ahmedlone127/2023-11-08-turkish_ner_model_en.md new file mode 100644 index 000000000000..b8b2301cd564 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-turkish_ner_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English turkish_ner_model BertForTokenClassification from alpcansoydas +author: John Snow Labs +name: turkish_ner_model +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `turkish_ner_model` is an English model originally trained by alpcansoydas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/turkish_ner_model_en_5.2.0_3.0_1699485434405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/turkish_ner_model_en_5.2.0_3.0_1699485434405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("turkish_ner_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("turkish_ner_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|turkish_ner_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.3 MB| + +## References + +https://huggingface.co/alpcansoydas/tr-ner.model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-uner_muril_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-uner_muril_ner_en.md new file mode 100644 index 000000000000..218ce2f9c7a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-uner_muril_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English uner_muril_ner BertForTokenClassification from mirfan899 +author: John Snow Labs +name: uner_muril_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uner_muril_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uner_muril_ner_en_5.2.0_3.0_1699468736524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uner_muril_ner_en_5.2.0_3.0_1699468736524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("uner_muril_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("uner_muril_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uner_muril_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/mirfan899/uner-muril-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-v4_combined_ner_en.md b/docs/_posts/ahmedlone127/2023-11-08-v4_combined_ner_en.md new file mode 100644 index 000000000000..4c360708a8a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-v4_combined_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English v4_combined_ner BertForTokenClassification from cp500 +author: John Snow Labs +name: v4_combined_ner +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `v4_combined_ner` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/v4_combined_ner_en_5.2.0_3.0_1699433727244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/v4_combined_ner_en_5.2.0_3.0_1699433727244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("v4_combined_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("v4_combined_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|v4_combined_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.5 MB| + +## References + +https://huggingface.co/cp500/v4_combined_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-vietnamese_ner_v1_4_0a2_en.md b/docs/_posts/ahmedlone127/2023-11-08-vietnamese_ner_v1_4_0a2_en.md new file mode 100644 index 000000000000..50a2ef9235f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-vietnamese_ner_v1_4_0a2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vietnamese_ner_v1_4_0a2 BertForTokenClassification from rain1024 +author: John Snow Labs +name: vietnamese_ner_v1_4_0a2 +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vietnamese_ner_v1_4_0a2` is an English model originally trained by rain1024. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vietnamese_ner_v1_4_0a2_en_5.2.0_3.0_1699421616560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vietnamese_ner_v1_4_0a2_en_5.2.0_3.0_1699421616560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("vietnamese_ner_v1_4_0a2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("vietnamese_ner_v1_4_0a2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vietnamese_ner_v1_4_0a2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|428.8 MB| + +## References + +https://huggingface.co/rain1024/vietnamese-ner-v1.4.0a2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-08-wobert_bio_en.md b/docs/_posts/ahmedlone127/2023-11-08-wobert_bio_en.md new file mode 100644 index 000000000000..99bafd1d3aa1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-08-wobert_bio_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wobert_bio BertForTokenClassification from onefox +author: John Snow Labs +name: wobert_bio +date: 2023-11-08 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wobert_bio` is an English model originally trained by onefox. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wobert_bio_en_5.2.0_3.0_1699464213984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wobert_bio_en_5.2.0_3.0_1699464213984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("wobert_bio","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("wobert_bio", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wobert_bio| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/onefox/WoBERT-BIO \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-article_50v0_ner_model_3epochs_en.md b/docs/_posts/ahmedlone127/2023-11-09-article_50v0_ner_model_3epochs_en.md new file mode 100644 index 000000000000..690cd75ac762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-article_50v0_ner_model_3epochs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English article_50v0_ner_model_3epochs BertForTokenClassification from DOOGLAK +author: John Snow Labs +name: article_50v0_ner_model_3epochs +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `article_50v0_ner_model_3epochs` is an English model originally trained by DOOGLAK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/article_50v0_ner_model_3epochs_en_5.2.0_3.0_1699519099821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/article_50v0_ner_model_3epochs_en_5.2.0_3.0_1699519099821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("article_50v0_ner_model_3epochs","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("article_50v0_ner_model_3epochs", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|article_50v0_ner_model_3epochs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/DOOGLAK/Article_50v0_NER_Model_3Epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-assignment2_attempt12_en.md b/docs/_posts/ahmedlone127/2023-11-09-assignment2_attempt12_en.md new file mode 100644 index 000000000000..fc9c90dbe730 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-assignment2_attempt12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English assignment2_attempt12 BertForTokenClassification from mpalaval +author: John Snow Labs +name: assignment2_attempt12 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `assignment2_attempt12` is an English model originally trained by mpalaval. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/assignment2_attempt12_en_5.2.0_3.0_1699508615969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/assignment2_attempt12_en_5.2.0_3.0_1699508615969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("assignment2_attempt12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("assignment2_attempt12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|assignment2_attempt12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/mpalaval/assignment2_attempt12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_all_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_all_ner_en.md new file mode 100644 index 000000000000..771b6ceac833 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_all_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_conll2003_samoan_all_ner BertForTokenClassification from jordyvl +author: John Snow Labs +name: bert_base_cased_conll2003_samoan_all_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_conll2003_samoan_all_ner` is an English model originally trained by jordyvl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_conll2003_samoan_all_ner_en_5.2.0_3.0_1699495665394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_conll2003_samoan_all_ner_en_5.2.0_3.0_1699495665394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_conll2003_samoan_all_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_conll2003_samoan_all_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_conll2003_samoan_all_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jordyvl/bert-base-cased_conll2003-sm-all-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_first_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_first_ner_en.md new file mode 100644 index 000000000000..df5710535879 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_conll2003_samoan_first_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_conll2003_samoan_first_ner BertForTokenClassification from jordyvl +author: John Snow Labs +name: bert_base_cased_conll2003_samoan_first_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_conll2003_samoan_first_ner` is an English model originally trained by jordyvl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_conll2003_samoan_first_ner_en_5.2.0_3.0_1699509621560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_conll2003_samoan_first_ner_en_5.2.0_3.0_1699509621560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_conll2003_samoan_first_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_conll2003_samoan_first_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_conll2003_samoan_first_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jordyvl/bert-base-cased_conll2003-sm-first-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_finetuned_ner_guitap_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_finetuned_ner_guitap_en.md new file mode 100644 index 000000000000..a5aeff17e32b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_finetuned_ner_guitap_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_finetuned_ner_guitap BertForTokenClassification from GuiTap +author: John Snow Labs +name: bert_base_cased_finetuned_ner_guitap +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_finetuned_ner_guitap` is an English model originally trained by GuiTap. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_guitap_en_5.2.0_3.0_1699512873743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_finetuned_ner_guitap_en_5.2.0_3.0_1699512873743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_finetuned_ner_guitap","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_finetuned_ner_guitap", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_finetuned_ner_guitap| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/GuiTap/bert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_ner_trained_on_synthea_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_ner_trained_on_synthea_en.md new file mode 100644 index 000000000000..a75b3bc53589 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_cased_ner_trained_on_synthea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_ner_trained_on_synthea BertForTokenClassification from jage +author: John Snow Labs +name: bert_base_cased_ner_trained_on_synthea +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_ner_trained_on_synthea` is an English model originally trained by jage. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_ner_trained_on_synthea_en_5.2.0_3.0_1699511557797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_ner_trained_on_synthea_en_5.2.0_3.0_1699511557797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_cased_ner_trained_on_synthea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_cased_ner_trained_on_synthea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_ner_trained_on_synthea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jage/bert-base-cased-NER-trained-on-synthea \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_chinese_wikiann_chinese_ner_nepal_bhasa_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_chinese_wikiann_chinese_ner_nepal_bhasa_en.md new file mode 100644 index 000000000000..9dc25578ae6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_chinese_wikiann_chinese_ner_nepal_bhasa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_wikiann_chinese_ner_nepal_bhasa BertForTokenClassification from davidliu1110 +author: John Snow Labs +name: bert_base_chinese_wikiann_chinese_ner_nepal_bhasa +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_wikiann_chinese_ner_nepal_bhasa` is an English model originally trained by davidliu1110.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wikiann_chinese_ner_nepal_bhasa_en_5.2.0_3.0_1699492268394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_wikiann_chinese_ner_nepal_bhasa_en_5.2.0_3.0_1699492268394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_chinese_wikiann_chinese_ner_nepal_bhasa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_chinese_wikiann_chinese_ner_nepal_bhasa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_wikiann_chinese_ner_nepal_bhasa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/davidliu1110/bert-base-chinese-wikiann-zh-ner-new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_dutch_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_dutch_cased_finetuned_ner_en.md new file mode 100644 index 000000000000..596905175158 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_dutch_cased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_dutch_cased_finetuned_ner BertForTokenClassification from Matthijsvanhof +author: John Snow Labs +name: bert_base_dutch_cased_finetuned_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dutch_cased_finetuned_ner` is an English model originally trained by Matthijsvanhof. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_ner_en_5.2.0_3.0_1699495824524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_ner_en_5.2.0_3.0_1699495824524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_dutch_cased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_dutch_cased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_dutch_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Matthijsvanhof/bert-base-dutch-cased-finetuned-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_german_cased_noisy_pretrain_fine_tuned_v2_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_german_cased_noisy_pretrain_fine_tuned_v2_en.md new file mode 100644 index 000000000000..f2122ba85521 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_german_cased_noisy_pretrain_fine_tuned_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_german_cased_noisy_pretrain_fine_tuned_v2 BertForTokenClassification from tbosse +author: John Snow Labs +name: bert_base_german_cased_noisy_pretrain_fine_tuned_v2 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_german_cased_noisy_pretrain_fine_tuned_v2` is an English model originally trained by tbosse.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_noisy_pretrain_fine_tuned_v2_en_5.2.0_3.0_1699515722761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_german_cased_noisy_pretrain_fine_tuned_v2_en_5.2.0_3.0_1699515722761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_german_cased_noisy_pretrain_fine_tuned_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_german_cased_noisy_pretrain_fine_tuned_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_german_cased_noisy_pretrain_fine_tuned_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.9 MB| + +## References + +https://huggingface.co/tbosse/bert-base-german-cased-noisy-pretrain-fine-tuned_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_portuguese_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_portuguese_cased_finetuned_ner_en.md new file mode 100644 index 000000000000..fd10b3401f74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_portuguese_cased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_portuguese_cased_finetuned_ner BertForTokenClassification from tvtcm +author: John Snow Labs +name: bert_base_portuguese_cased_finetuned_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_portuguese_cased_finetuned_ner` is an English model originally trained by tvtcm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_finetuned_ner_en_5.2.0_3.0_1699513246007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_portuguese_cased_finetuned_ner_en_5.2.0_3.0_1699513246007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_portuguese_cased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_portuguese_cased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_portuguese_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/tvtcm/bert-base-portuguese-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_spanish_wwm_cased_finetuned_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_spanish_wwm_cased_finetuned_sayula_popoluca_en.md new file mode 100644 index 000000000000..c6e5a974eaf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_spanish_wwm_cased_finetuned_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_sayula_popoluca BertForTokenClassification from dccuchile +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_sayula_popoluca +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_sayula_popoluca` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_sayula_popoluca_en_5.2.0_3.0_1699502208620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_sayula_popoluca_en_5.2.0_3.0_1699502208620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_spanish_wwm_cased_finetuned_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_spanish_wwm_cased_finetuned_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_material_synthesis_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_material_synthesis_en.md new file mode 100644 index 000000000000..b5c48f166588 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_material_synthesis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_material_synthesis BertForTokenClassification from Dagobert42 +author: John Snow Labs +name: bert_base_uncased_finetuned_material_synthesis +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_material_synthesis` is an English model originally trained by Dagobert42.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_material_synthesis_en_5.2.0_3.0_1699517255786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_material_synthesis_en_5.2.0_3.0_1699517255786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_material_synthesis","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_material_synthesis", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_material_synthesis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.9 MB| + +## References + +https://huggingface.co/Dagobert42/bert-base-uncased-finetuned-material-synthesis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_math_punctuation_ignore_word_parts_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_math_punctuation_ignore_word_parts_en.md new file mode 100644 index 000000000000..efbc809a287e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_math_punctuation_ignore_word_parts_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_math_punctuation_ignore_word_parts BertForTokenClassification from JoshuaRubin +author: John Snow Labs +name: bert_base_uncased_finetuned_math_punctuation_ignore_word_parts +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_math_punctuation_ignore_word_parts` is an English model originally trained by JoshuaRubin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_math_punctuation_ignore_word_parts_en_5.2.0_3.0_1699505856150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_math_punctuation_ignore_word_parts_en_5.2.0_3.0_1699505856150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_math_punctuation_ignore_word_parts","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_math_punctuation_ignore_word_parts", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_math_punctuation_ignore_word_parts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/JoshuaRubin/bert-base-uncased-finetuned-math_punctuation-ignore_word_parts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_recruitment_exp_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_recruitment_exp_en.md new file mode 100644 index 000000000000..c89db3f6631e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_base_uncased_finetuned_recruitment_exp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_recruitment_exp BertForTokenClassification from reyhanemyr +author: John Snow Labs +name: bert_base_uncased_finetuned_recruitment_exp +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_recruitment_exp` is an English model originally trained by reyhanemyr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_recruitment_exp_en_5.2.0_3.0_1699497599843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_recruitment_exp_en_5.2.0_3.0_1699497599843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_base_uncased_finetuned_recruitment_exp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_base_uncased_finetuned_recruitment_exp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_recruitment_exp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/reyhanemyr/bert-base-uncased-finetuned-recruitment-exp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_cased_ner_fcit499_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_cased_ner_fcit499_en.md new file mode 100644 index 000000000000..5911845d3fb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_cased_ner_fcit499_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_cased_ner_fcit499 BertForTokenClassification from Ahmed87 +author: John Snow Labs +name: bert_cased_ner_fcit499 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cased_ner_fcit499` is an English model originally trained by Ahmed87. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cased_ner_fcit499_en_5.2.0_3.0_1699497599825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cased_ner_fcit499_en_5.2.0_3.0_1699497599825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
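+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_cased_ner_fcit499","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_cased_ner_fcit499", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+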
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_cased_ner_fcit499|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/Ahmed87/bert-cased-ner-fcit499
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_abhishekverma_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_abhishekverma_en.md
new file mode 100644
index 000000000000..945afd7faf87
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_abhishekverma_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_abhishekverma BertForTokenClassification from AbhishekVerma
+author: John Snow Labs
+name: bert_finetuned_ner_abhishekverma
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_abhishekverma` is an English model originally trained by AbhishekVerma.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abhishekverma_en_5.2.0_3.0_1699512991381.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_abhishekverma_en_5.2.0_3.0_1699512991381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
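+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_abhishekverma","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_abhishekverma", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+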
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_abhishekverma|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/AbhishekVerma/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_aimarsg_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_aimarsg_en.md
new file mode 100644
index 000000000000..a60b21147d58
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_aimarsg_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_aimarsg BertForTokenClassification from aimarsg
+author: John Snow Labs
+name: bert_finetuned_ner_aimarsg
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_aimarsg` is an English model originally trained by aimarsg.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_aimarsg_en_5.2.0_3.0_1699511281669.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_aimarsg_en_5.2.0_3.0_1699511281669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
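+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_aimarsg","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_aimarsg", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+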
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_aimarsg|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/aimarsg/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alokps_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alokps_en.md
new file mode 100644
index 000000000000..852795f4924a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alokps_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_alokps BertForTokenClassification from alokps
+author: John Snow Labs
+name: bert_finetuned_ner_alokps
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_alokps` is an English model originally trained by alokps.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alokps_en_5.2.0_3.0_1699489990217.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alokps_en_5.2.0_3.0_1699489990217.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
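+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_alokps","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_alokps", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+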
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_alokps|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/alokps/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alphasaber_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alphasaber_en.md
new file mode 100644
index 000000000000..c0ded97f4a5b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alphasaber_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_alphasaber BertForTokenClassification from alphaSaber
+author: John Snow Labs
+name: bert_finetuned_ner_alphasaber
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_alphasaber` is an English model originally trained by alphaSaber.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alphasaber_en_5.2.0_3.0_1699495802896.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alphasaber_en_5.2.0_3.0_1699495802896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
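+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_alphasaber","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_alphasaber", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+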
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_alphasaber|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/alphaSaber/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alvin0220_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alvin0220_en.md
new file mode 100644
index 000000000000..84a8b4c75df8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_alvin0220_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_alvin0220 BertForTokenClassification from alvin0220
+author: John Snow Labs
+name: bert_finetuned_ner_alvin0220
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_alvin0220` is an English model originally trained by alvin0220.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alvin0220_en_5.2.0_3.0_1699506652569.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_alvin0220_en_5.2.0_3.0_1699506652569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
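+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_alvin0220","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_alvin0220", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+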
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_alvin0220|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/alvin0220/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_amartyobanerjee_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_amartyobanerjee_en.md
new file mode 100644
index 000000000000..2c53a52eac29
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_amartyobanerjee_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_amartyobanerjee BertForTokenClassification from amartyobanerjee
+author: John Snow Labs
+name: bert_finetuned_ner_amartyobanerjee
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_amartyobanerjee` is an English model originally trained by amartyobanerjee.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_amartyobanerjee_en_5.2.0_3.0_1699516388390.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_amartyobanerjee_en_5.2.0_3.0_1699516388390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
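+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_amartyobanerjee","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_amartyobanerjee", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+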
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_amartyobanerjee|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/amartyobanerjee/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_atajti_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_atajti_en.md
new file mode 100644
index 000000000000..d8c8137cef67
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_atajti_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_atajti BertForTokenClassification from atajti
+author: John Snow Labs
+name: bert_finetuned_ner_atajti
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_atajti` is an English model originally trained by atajti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_atajti_en_5.2.0_3.0_1699488891291.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_atajti_en_5.2.0_3.0_1699488891291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
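+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_atajti","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_atajti", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+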
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_atajti|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/atajti/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bbbbearczx_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bbbbearczx_en.md
new file mode 100644
index 000000000000..13311b03d8c2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bbbbearczx_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_bbbbearczx BertForTokenClassification from bbbbearczx
+author: John Snow Labs
+name: bert_finetuned_ner_bbbbearczx
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_bbbbearczx` is an English model originally trained by bbbbearczx.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_bbbbearczx_en_5.2.0_3.0_1699499541218.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_bbbbearczx_en_5.2.0_3.0_1699499541218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
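+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_bbbbearczx","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_bbbbearczx", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+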
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_bbbbearczx|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/bbbbearczx/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_boaii_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_boaii_en.md
new file mode 100644
index 000000000000..7a96b876dc5c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_boaii_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_boaii BertForTokenClassification from boaii
+author: John Snow Labs
+name: bert_finetuned_ner_boaii
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_boaii` is an English model originally trained by boaii.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_boaii_en_5.2.0_3.0_1699509884210.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_boaii_en_5.2.0_3.0_1699509884210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
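+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_boaii","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_boaii", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+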
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_boaii|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/boaii/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bpatwa_shi_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bpatwa_shi_en.md
new file mode 100644
index 000000000000..a1e1ab664aea
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_bpatwa_shi_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_bpatwa_shi BertForTokenClassification from bpatwa-shi
+author: John Snow Labs
+name: bert_finetuned_ner_bpatwa_shi
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_bpatwa_shi` is an English model originally trained by bpatwa-shi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_bpatwa_shi_en_5.2.0_3.0_1699499541262.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_bpatwa_shi_en_5.2.0_3.0_1699499541262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
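+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_bpatwa_shi","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_bpatwa_shi", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+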
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_bpatwa_shi|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/bpatwa-shi/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_carlomax_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_carlomax_en.md
new file mode 100644
index 000000000000..608793e4ef59
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_carlomax_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_carlomax BertForTokenClassification from carlomax
+author: John Snow Labs
+name: bert_finetuned_ner_carlomax
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_carlomax` is an English model originally trained by carlomax.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_carlomax_en_5.2.0_3.0_1699504054894.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_carlomax_en_5.2.0_3.0_1699504054894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
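+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")  # supplies the "token" column the classifier reads
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_carlomax","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+pipelineModel = pipeline.fit(data)  # data: a Spark DataFrame with a "text" column
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("documents")
+val tokenizer = new Tokenizer()
+  .setInputCols(Array("documents"))
+  .setOutputCol("token") // supplies the "token" column the classifier reads
+val tokenClassifier = BertForTokenClassification
+  .pretrained("bert_finetuned_ner_carlomax", "en")
+  .setInputCols(Array("documents","token"))
+  .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column
+val pipelineDF = pipelineModel.transform(data)
+```
+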
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_carlomax| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/carlomax/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_chonlam_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_chonlam_en.md new file mode 100644 index 000000000000..6714e119046a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_chonlam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_chonlam BertForTokenClassification from chonlam +author: John Snow Labs +name: bert_finetuned_ner_chonlam +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_chonlam` is an English model originally trained by chonlam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chonlam_en_5.2.0_3.0_1699500752023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_chonlam_en_5.2.0_3.0_1699500752023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_chonlam","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_chonlam", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_chonlam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/chonlam/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cindymc_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cindymc_en.md new file mode 100644 index 000000000000..f3ce117d1159 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cindymc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_cindymc BertForTokenClassification from cindymc +author: John Snow Labs +name: bert_finetuned_ner_cindymc +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_cindymc` is an English model originally trained by cindymc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cindymc_en_5.2.0_3.0_1699506062294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cindymc_en_5.2.0_3.0_1699506062294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_cindymc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_cindymc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_cindymc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/cindymc/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cptbaas_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cptbaas_en.md new file mode 100644 index 000000000000..3325fad40c8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_cptbaas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_cptbaas BertForTokenClassification from CptBaas +author: John Snow Labs +name: bert_finetuned_ner_cptbaas +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_cptbaas` is an English model originally trained by CptBaas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cptbaas_en_5.2.0_3.0_1699511280012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_cptbaas_en_5.2.0_3.0_1699511280012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_cptbaas","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_cptbaas", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_cptbaas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/CptBaas/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_daethyra_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_daethyra_en.md new file mode 100644 index 000000000000..fef2f6cc06a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_daethyra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_daethyra BertForTokenClassification from daethyra +author: John Snow Labs +name: bert_finetuned_ner_daethyra +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_daethyra` is an English model originally trained by daethyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_daethyra_en_5.2.0_3.0_1699501984291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_daethyra_en_5.2.0_3.0_1699501984291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_daethyra","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_daethyra", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_daethyra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/daethyra/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fatmazahraz_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fatmazahraz_en.md new file mode 100644 index 000000000000..8d57d9b9d324 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fatmazahraz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_fatmazahraz BertForTokenClassification from FatmaZahraZ +author: John Snow Labs +name: bert_finetuned_ner_fatmazahraz +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_fatmazahraz` is an English model originally trained by FatmaZahraZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_fatmazahraz_en_5.2.0_3.0_1699497466595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_fatmazahraz_en_5.2.0_3.0_1699497466595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_fatmazahraz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_fatmazahraz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_fatmazahraz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/FatmaZahraZ/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fengi_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fengi_en.md new file mode 100644 index 000000000000..0bb7dbd70833 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_fengi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_fengi BertForTokenClassification from fengi +author: John Snow Labs +name: bert_finetuned_ner_fengi +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_fengi` is an English model originally trained by fengi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_fengi_en_5.2.0_3.0_1699491714551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_fengi_en_5.2.0_3.0_1699491714551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_fengi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_fengi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_fengi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/fengi/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_hatman_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_hatman_en.md new file mode 100644 index 000000000000..e797714819cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_hatman_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_hatman BertForTokenClassification from Hatman +author: John Snow Labs +name: bert_finetuned_ner_hatman +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_hatman` is an English model originally trained by Hatman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_hatman_en_5.2.0_3.0_1699515131416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_hatman_en_5.2.0_3.0_1699515131416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_hatman","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_hatman", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_hatman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/Hatman/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_ish97_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_ish97_en.md new file mode 100644 index 000000000000..45c19037dcb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_ish97_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_ish97 BertForTokenClassification from ish97 +author: John Snow Labs +name: bert_finetuned_ner_ish97 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_ish97` is an English model originally trained by ish97. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ish97_en_5.2.0_3.0_1699514322342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ish97_en_5.2.0_3.0_1699514322342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_ish97","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_ish97", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_ish97| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ish97/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jfarmerphd_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jfarmerphd_en.md new file mode 100644 index 000000000000..bc16b848404b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jfarmerphd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jfarmerphd BertForTokenClassification from jfarmerphd +author: John Snow Labs +name: bert_finetuned_ner_jfarmerphd +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_jfarmerphd` is an English model originally trained by jfarmerphd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jfarmerphd_en_5.2.0_3.0_1699502208660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jfarmerphd_en_5.2.0_3.0_1699502208660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_jfarmerphd","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_jfarmerphd", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jfarmerphd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jfarmerphd/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jimbung_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jimbung_en.md new file mode 100644 index 000000000000..17c6de84d8e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_jimbung_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jimbung BertForTokenClassification from jimbung +author: John Snow Labs +name: bert_finetuned_ner_jimbung +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_jimbung` is an English model originally trained by jimbung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jimbung_en_5.2.0_3.0_1699505163647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jimbung_en_5.2.0_3.0_1699505163647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_jimbung","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_jimbung", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jimbung| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/jimbung/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_krolis_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_krolis_en.md new file mode 100644 index 000000000000..2a4ef543fb5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_krolis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_krolis BertForTokenClassification from krolis +author: John Snow Labs +name: bert_finetuned_ner_krolis +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_krolis` is an English model originally trained by krolis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_krolis_en_5.2.0_3.0_1699514321195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_krolis_en_5.2.0_3.0_1699514321195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_krolis","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_ner_krolis", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_krolis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/krolis/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_lyk0013_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_lyk0013_en.md new file mode 100644 index 000000000000..f0db96a72682 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_lyk0013_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_lyk0013 BertForTokenClassification from lyk0013 +author: John Snow Labs +name: bert_finetuned_ner_lyk0013 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_lyk0013` is an English model originally trained by lyk0013. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lyk0013_en_5.2.0_3.0_1699489989478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_lyk0013_en_5.2.0_3.0_1699489989478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
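+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_lyk0013","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_lyk0013", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+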
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_lyk0013|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/lyk0013/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_manoharahuggingface_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_manoharahuggingface_en.md
new file mode 100644
index 000000000000..d0083eae3304
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_manoharahuggingface_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_manoharahuggingface BertForTokenClassification from manoharahuggingface
+author: John Snow Labs
+name: bert_finetuned_ner_manoharahuggingface
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_manoharahuggingface` is an English model originally trained by manoharahuggingface.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_manoharahuggingface_en_5.2.0_3.0_1699503716867.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_manoharahuggingface_en_5.2.0_3.0_1699503716867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
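+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_manoharahuggingface","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_manoharahuggingface", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+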
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_manoharahuggingface|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/manoharahuggingface/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_mholi_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_mholi_en.md
new file mode 100644
index 000000000000..bb2b12b2f6d5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_mholi_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_mholi BertForTokenClassification from mholi
+author: John Snow Labs
+name: bert_finetuned_ner_mholi
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_mholi` is an English model originally trained by mholi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mholi_en_5.2.0_3.0_1699507963413.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_mholi_en_5.2.0_3.0_1699507963413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
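+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_mholi","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_mholi", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+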
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_mholi|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/mholi/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_petros89_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_petros89_en.md
new file mode 100644
index 000000000000..6c73a49bf17d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_petros89_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_petros89 BertForTokenClassification from Petros89
+author: John Snow Labs
+name: bert_finetuned_ner_petros89
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_petros89` is an English model originally trained by Petros89.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_petros89_en_5.2.0_3.0_1699498767212.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_petros89_en_5.2.0_3.0_1699498767212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
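+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_petros89","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_petros89", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+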
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_petros89|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/Petros89/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_sdinger_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_sdinger_en.md
new file mode 100644
index 000000000000..00b49aad88bd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_sdinger_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_sdinger BertForTokenClassification from sdinger
+author: John Snow Labs
+name: bert_finetuned_ner_sdinger
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_sdinger` is an English model originally trained by sdinger.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_sdinger_en_5.2.0_3.0_1699489989902.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_sdinger_en_5.2.0_3.0_1699489989902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
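+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_sdinger","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_sdinger", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+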
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_sdinger|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/sdinger/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_shamweel_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_shamweel_en.md
new file mode 100644
index 000000000000..b361111115cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_shamweel_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_shamweel BertForTokenClassification from shamweel
+author: John Snow Labs
+name: bert_finetuned_ner_shamweel
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_shamweel` is an English model originally trained by shamweel.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_shamweel_en_5.2.0_3.0_1699513787914.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_shamweel_en_5.2.0_3.0_1699513787914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
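+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_shamweel","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_shamweel", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+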
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_shamweel|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/shamweel/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_steven_qi_zhao_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_steven_qi_zhao_en.md
new file mode 100644
index 000000000000..7eeb98f914cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_steven_qi_zhao_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_steven_qi_zhao BertForTokenClassification from steven-qi-zhao
+author: John Snow Labs
+name: bert_finetuned_ner_steven_qi_zhao
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_steven_qi_zhao` is an English model originally trained by steven-qi-zhao.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_steven_qi_zhao_en_5.2.0_3.0_1699519339158.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_steven_qi_zhao_en_5.2.0_3.0_1699519339158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
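+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_steven_qi_zhao","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_steven_qi_zhao", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+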
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_steven_qi_zhao|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/steven-qi-zhao/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_the_bee_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_the_bee_en.md
new file mode 100644
index 000000000000..e039abac121d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_the_bee_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_the_bee BertForTokenClassification from the-bee
+author: John Snow Labs
+name: bert_finetuned_ner_the_bee
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_the_bee` is an English model originally trained by the-bee.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_the_bee_en_5.2.0_3.0_1699501147930.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_the_bee_en_5.2.0_3.0_1699501147930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
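+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_the_bee","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_the_bee", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+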
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_the_bee|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/the-bee/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_thientran_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_thientran_en.md
new file mode 100644
index 000000000000..2b8f06e4a0be
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_thientran_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_thientran BertForTokenClassification from thientran
+author: John Snow Labs
+name: bert_finetuned_ner_thientran
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_thientran` is an English model originally trained by thientran.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_thientran_en_5.2.0_3.0_1699497264498.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_thientran_en_5.2.0_3.0_1699497264498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
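+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_thientran","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_thientran", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+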
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_thientran|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/thientran/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_v2_vbhasin_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_v2_vbhasin_en.md
new file mode 100644
index 000000000000..8b8c8e68d65f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_v2_vbhasin_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_v2_vbhasin BertForTokenClassification from vbhasin
+author: John Snow Labs
+name: bert_finetuned_ner_v2_vbhasin
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_v2_vbhasin` is an English model originally trained by vbhasin.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_v2_vbhasin_en_5.2.0_3.0_1699500752025.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_v2_vbhasin_en_5.2.0_3.0_1699500752025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
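+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_v2_vbhasin","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_v2_vbhasin", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+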
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_v2_vbhasin|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/vbhasin/bert-finetuned-ner-v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_viktordo_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_viktordo_en.md
new file mode 100644
index 000000000000..8390a280819b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_viktordo_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_viktordo BertForTokenClassification from ViktorDo
+author: John Snow Labs
+name: bert_finetuned_ner_viktordo
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_viktordo` is an English model originally trained by ViktorDo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_viktordo_en_5.2.0_3.0_1699488420472.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_viktordo_en_5.2.0_3.0_1699488420472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
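+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_viktordo","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_viktordo", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+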
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_finetuned_ner_viktordo|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|403.7 MB|
+
+## References
+
+https://huggingface.co/ViktorDo/bert-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_yixi_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_yixi_en.md
new file mode 100644
index 000000000000..73da8ce207f8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_ner_yixi_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_finetuned_ner_yixi BertForTokenClassification from yixi
+author: John Snow Labs
+name: bert_finetuned_ner_yixi
+date: 2023-11-09
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_yixi` is an English model originally trained by yixi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yixi_en_5.2.0_3.0_1699519339198.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_yixi_en_5.2.0_3.0_1699519339198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
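+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, BertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a "text" column.
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so a Tokenizer stage must produce it.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_ner_yixi","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a "text" column.
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so a Tokenizer stage must produce it.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = BertForTokenClassification
+    .pretrained("bert_finetuned_ner_yixi", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+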
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_yixi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/yixi/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_requirements_rodp_en.md b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_requirements_rodp_en.md new file mode 100644 index 000000000000..265baf95ac77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bert_finetuned_requirements_rodp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_requirements_rodp BertForTokenClassification from RodP +author: John Snow Labs +name: bert_finetuned_requirements_rodp +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_requirements_rodp` is an English model originally trained by RodP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_requirements_rodp_en_5.2.0_3.0_1699515407370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_requirements_rodp_en_5.2.0_3.0_1699515407370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bert_finetuned_requirements_rodp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bert_finetuned_requirements_rodp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_requirements_rodp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/RodP/bert-finetuned-requirements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-bertimbau_base_lener_breton_finetuned_lener_breton_pt.md b/docs/_posts/ahmedlone127/2023-11-09-bertimbau_base_lener_breton_finetuned_lener_breton_pt.md new file mode 100644 index 000000000000..8fb00faaa9e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-bertimbau_base_lener_breton_finetuned_lener_breton_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese bertimbau_base_lener_breton_finetuned_lener_breton BertForTokenClassification from Luciano +author: John Snow Labs +name: bertimbau_base_lener_breton_finetuned_lener_breton +date: 2023-11-09 +tags: [bert, pt, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau_base_lener_breton_finetuned_lener_breton` is a Portuguese model originally trained by Luciano. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertimbau_base_lener_breton_finetuned_lener_breton_pt_5.2.0_3.0_1699515722790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertimbau_base_lener_breton_finetuned_lener_breton_pt_5.2.0_3.0_1699515722790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("bertimbau_base_lener_breton_finetuned_lener_breton","pt") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("bertimbau_base_lener_breton_finetuned_lener_breton", "pt") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertimbau_base_lener_breton_finetuned_lener_breton| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pt| +|Size:|405.9 MB| + +## References + +https://huggingface.co/Luciano/bertimbau-base-lener-br-finetuned-lener-br \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-berttest2_classtest_en.md b/docs/_posts/ahmedlone127/2023-11-09-berttest2_classtest_en.md new file mode 100644 index 000000000000..5458fd658911 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-berttest2_classtest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English berttest2_classtest BertForTokenClassification from classtest +author: John Snow Labs +name: berttest2_classtest +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `berttest2_classtest` is an English model originally trained by classtest. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/berttest2_classtest_en_5.2.0_3.0_1699507827407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/berttest2_classtest_en_5.2.0_3.0_1699507827407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("berttest2_classtest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("berttest2_classtest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|berttest2_classtest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/classtest/berttest2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-beto_sentiment_analysis_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-beto_sentiment_analysis_finetuned_ner_en.md new file mode 100644 index 000000000000..5f21791f5bed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-beto_sentiment_analysis_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English beto_sentiment_analysis_finetuned_ner BertForTokenClassification from asdc +author: John Snow Labs +name: beto_sentiment_analysis_finetuned_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto_sentiment_analysis_finetuned_ner` is an English model originally trained by asdc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_finetuned_ner_en_5.2.0_3.0_1699505163606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/beto_sentiment_analysis_finetuned_ner_en_5.2.0_3.0_1699505163606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("beto_sentiment_analysis_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("beto_sentiment_analysis_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|beto_sentiment_analysis_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/asdc/beto-sentiment-analysis-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-biobert_finetuned_ner_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-09-biobert_finetuned_ner_conll2003_en.md new file mode 100644 index 000000000000..64302f8a1842 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-biobert_finetuned_ner_conll2003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biobert_finetuned_ner_conll2003 BertForTokenClassification from ViktorDo +author: John Snow Labs +name: biobert_finetuned_ner_conll2003 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_finetuned_ner_conll2003` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biobert_finetuned_ner_conll2003_en_5.2.0_3.0_1699517061564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biobert_finetuned_ner_conll2003_en_5.2.0_3.0_1699517061564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("biobert_finetuned_ner_conll2003","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("biobert_finetuned_ner_conll2003", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biobert_finetuned_ner_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ViktorDo/BioBERT-finetuned-ner-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-burmese_ner_model1_2_split_by_sentence_en.md b/docs/_posts/ahmedlone127/2023-11-09-burmese_ner_model1_2_split_by_sentence_en.md new file mode 100644 index 000000000000..89baff7379ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-burmese_ner_model1_2_split_by_sentence_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_ner_model1_2_split_by_sentence BertForTokenClassification from Gurkan +author: John Snow Labs +name: burmese_ner_model1_2_split_by_sentence +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_ner_model1_2_split_by_sentence` is an English model originally trained by Gurkan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_ner_model1_2_split_by_sentence_en_5.2.0_3.0_1699520490339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_ner_model1_2_split_by_sentence_en_5.2.0_3.0_1699520490339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("burmese_ner_model1_2_split_by_sentence","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("burmese_ner_model1_2_split_by_sentence", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_ner_model1_2_split_by_sentence| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/Gurkan/my_ner_model1_2_split_by_sentence_ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-burmese_xlm_roberta_large_finetuned_conlljob01_en.md b/docs/_posts/ahmedlone127/2023-11-09-burmese_xlm_roberta_large_finetuned_conlljob01_en.md new file mode 100644 index 000000000000..2a63a76d4c94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-burmese_xlm_roberta_large_finetuned_conlljob01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_xlm_roberta_large_finetuned_conlljob01 BertForTokenClassification from BahAdoR0101 +author: John Snow Labs +name: burmese_xlm_roberta_large_finetuned_conlljob01 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_xlm_roberta_large_finetuned_conlljob01` is an English model originally trained by BahAdoR0101. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_xlm_roberta_large_finetuned_conlljob01_en_5.2.0_3.0_1699502609190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_xlm_roberta_large_finetuned_conlljob01_en_5.2.0_3.0_1699502609190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("burmese_xlm_roberta_large_finetuned_conlljob01","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("burmese_xlm_roberta_large_finetuned_conlljob01", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_xlm_roberta_large_finetuned_conlljob01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/BahAdoR0101/my_xlm-roberta-large-finetuned-conlljob01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-cause_effect_detection_persian_en.md b/docs/_posts/ahmedlone127/2023-11-09-cause_effect_detection_persian_en.md new file mode 100644 index 000000000000..8cd7c04aa688 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-cause_effect_detection_persian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cause_effect_detection_persian BertForTokenClassification from Amin-Saeidi +author: John Snow Labs +name: cause_effect_detection_persian +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cause_effect_detection_persian` is an English model originally trained by Amin-Saeidi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cause_effect_detection_persian_en_5.2.0_3.0_1699491781196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cause_effect_detection_persian_en_5.2.0_3.0_1699491781196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("cause_effect_detection_persian","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("cause_effect_detection_persian", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cause_effect_detection_persian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|606.4 MB| + +## References + +https://huggingface.co/Amin-Saeidi/Cause_Effect_Detection_persian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-chinese_macbert_base_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-chinese_macbert_base_finetuned_ner_en.md new file mode 100644 index 000000000000..5cfb10a5cabd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-chinese_macbert_base_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chinese_macbert_base_finetuned_ner BertForTokenClassification from zhiguoxu +author: John Snow Labs +name: chinese_macbert_base_finetuned_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_macbert_base_finetuned_ner` is an English model originally trained by zhiguoxu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_finetuned_ner_en_5.2.0_3.0_1699512017261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_macbert_base_finetuned_ner_en_5.2.0_3.0_1699512017261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("chinese_macbert_base_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("chinese_macbert_base_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_macbert_base_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.2 MB| + +## References + +https://huggingface.co/zhiguoxu/chinese-macbert-base-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-ckiplab_bert_chinese_david_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-ckiplab_bert_chinese_david_ner_en.md new file mode 100644 index 000000000000..0f7412715568 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-ckiplab_bert_chinese_david_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ckiplab_bert_chinese_david_ner BertForTokenClassification from davidliu1110 +author: John Snow Labs +name: ckiplab_bert_chinese_david_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ckiplab_bert_chinese_david_ner` is an English model originally trained by davidliu1110. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ckiplab_bert_chinese_david_ner_en_5.2.0_3.0_1699510375633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ckiplab_bert_chinese_david_ner_en_5.2.0_3.0_1699510375633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("ckiplab_bert_chinese_david_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("ckiplab_bert_chinese_david_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ckiplab_bert_chinese_david_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/davidliu1110/ckiplab-bert-chinese-david-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-downstream_german_bert_en.md b/docs/_posts/ahmedlone127/2023-11-09-downstream_german_bert_en.md new file mode 100644 index 000000000000..1e49f7f2cc94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-downstream_german_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English downstream_german_bert BertForTokenClassification from codern +author: John Snow Labs +name: downstream_german_bert +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `downstream_german_bert` is an English model originally trained by codern. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/downstream_german_bert_en_5.2.0_3.0_1699508288253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/downstream_german_bert_en_5.2.0_3.0_1699508288253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("downstream_german_bert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("downstream_german_bert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|downstream_german_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/codern/downstream-german-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-english_bertsemtagger_gold_en.md b/docs/_posts/ahmedlone127/2023-11-09-english_bertsemtagger_gold_en.md new file mode 100644 index 000000000000..d1afb11dff1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-english_bertsemtagger_gold_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English english_bertsemtagger_gold BertForTokenClassification from hfunakura +author: John Snow Labs +name: english_bertsemtagger_gold +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_bertsemtagger_gold` is an English model originally trained by hfunakura. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_bertsemtagger_gold_en_5.2.0_3.0_1699488767757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_bertsemtagger_gold_en_5.2.0_3.0_1699488767757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("english_bertsemtagger_gold","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("english_bertsemtagger_gold", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_bertsemtagger_gold| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.4 MB| + +## References + +https://huggingface.co/hfunakura/en-bertsemtagger-gold \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-entity_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-09-entity_extraction_en.md new file mode 100644 index 000000000000..0dae1a27bb06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-entity_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English entity_extraction BertForTokenClassification from test123 +author: John Snow Labs +name: entity_extraction +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `entity_extraction` is an English model originally trained by test123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_extraction_en_5.2.0_3.0_1699499020342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/entity_extraction_en_5.2.0_3.0_1699499020342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("entity_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("entity_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|entity_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/test123/entity_extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-fin3_en.md b/docs/_posts/ahmedlone127/2023-11-09-fin3_en.md new file mode 100644 index 000000000000..1fbeb051e68a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-fin3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fin3 BertForTokenClassification from redevaaa +author: John Snow Labs +name: fin3 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fin3` is an English model originally trained by redevaaa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fin3_en_5.2.0_3.0_1699508290975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fin3_en_5.2.0_3.0_1699508290975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("fin3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("fin3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fin3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.5 MB| + +## References + +https://huggingface.co/redevaaa/fin3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100_en.md b/docs/_posts/ahmedlone127/2023-11-09-finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100_en.md new file mode 100644 index 000000000000..b3fa5a4a4b36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100 BertForTokenClassification from emilylearning +author: John Snow Labs +name: finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100` is an English model originally trained by emilylearning.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100_en_5.2.0_3.0_1699506462950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100_en_5.2.0_3.0_1699506462950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_cgp_added_birth_date__female_weight_1_5__test_rundi_false__p_dataset_100| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/emilylearning/finetuned_cgp_added_birth_date__female_weight_1.5__test_run_False__p_dataset_100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-gene_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-09-gene_finetuned_en.md new file mode 100644 index 000000000000..a56af11016e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-gene_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English gene_finetuned BertForTokenClassification from Randomui +author: John Snow Labs +name: gene_finetuned +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gene_finetuned` is an English model originally trained by Randomui. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gene_finetuned_en_5.2.0_3.0_1699513245987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gene_finetuned_en_5.2.0_3.0_1699513245987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("gene_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("gene_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gene_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/Randomui/gene_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-german_bert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-german_bert_finetuned_ner_en.md new file mode 100644 index 000000000000..c8e6b91e7e5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-german_bert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English german_bert_finetuned_ner BertForTokenClassification from herokiller +author: John Snow Labs +name: german_bert_finetuned_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german_bert_finetuned_ner` is an English model originally trained by herokiller. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_bert_finetuned_ner_en_5.2.0_3.0_1699520424997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_bert_finetuned_ner_en_5.2.0_3.0_1699520424997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("german_bert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("german_bert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_bert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/herokiller/german-bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-german_medbert_birads_ner100_de.md b/docs/_posts/ahmedlone127/2023-11-09-german_medbert_birads_ner100_de.md new file mode 100644 index 000000000000..6fb38adb4186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-german_medbert_birads_ner100_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German german_medbert_birads_ner100 BertForTokenClassification from BobbyG97 +author: John Snow Labs +name: german_medbert_birads_ner100 +date: 2023-11-09 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `german_medbert_birads_ner100` is a German model originally trained by BobbyG97. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/german_medbert_birads_ner100_de_5.2.0_3.0_1699521853207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/german_medbert_birads_ner100_de_5.2.0_3.0_1699521853207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("german_medbert_birads_ner100","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("german_medbert_birads_ner100", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|german_medbert_birads_ner100| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|406.9 MB| + +## References + +https://huggingface.co/BobbyG97/German-MedBERT-Birads-NER100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-hebert_medical_ner_non_pii_en.md b/docs/_posts/ahmedlone127/2023-11-09-hebert_medical_ner_non_pii_en.md new file mode 100644 index 000000000000..7d0448cefd2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-hebert_medical_ner_non_pii_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hebert_medical_ner_non_pii BertForTokenClassification from cp500 +author: John Snow Labs +name: hebert_medical_ner_non_pii +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebert_medical_ner_non_pii` is an English model originally trained by cp500. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_non_pii_en_5.2.0_3.0_1699496002641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hebert_medical_ner_non_pii_en_5.2.0_3.0_1699496002641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("hebert_medical_ner_non_pii","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("hebert_medical_ner_non_pii", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hebert_medical_ner_non_pii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|690.4 MB| + +## References + +https://huggingface.co/cp500/hebert_medical_ner_non_pii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-indobert_model_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-indobert_model_ner_en.md new file mode 100644 index 000000000000..f7ef92f20531 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-indobert_model_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobert_model_ner BertForTokenClassification from syafiqfaray +author: John Snow Labs +name: indobert_model_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_model_ner` is an English model originally trained by syafiqfaray. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_model_ner_en_5.2.0_3.0_1699502608859.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_model_ner_en_5.2.0_3.0_1699502608859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("indobert_model_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("indobert_model_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_model_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|411.8 MB| + +## References + +https://huggingface.co/syafiqfaray/indobert-model-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-jobbert_skill_en.md b/docs/_posts/ahmedlone127/2023-11-09-jobbert_skill_en.md new file mode 100644 index 000000000000..de35db84dc77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-jobbert_skill_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jobbert_skill BertForTokenClassification from Andrei95 +author: John Snow Labs +name: jobbert_skill +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jobbert_skill` is an English model originally trained by Andrei95. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jobbert_skill_en_5.2.0_3.0_1699490100170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jobbert_skill_en_5.2.0_3.0_1699490100170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("jobbert_skill","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("jobbert_skill", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jobbert_skill| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|402.2 MB| + +## References + +https://huggingface.co/Andrei95/jobbert-skill \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-lewit_informal_tagger_en.md b/docs/_posts/ahmedlone127/2023-11-09-lewit_informal_tagger_en.md new file mode 100644 index 000000000000..e2352ecca3f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-lewit_informal_tagger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lewit_informal_tagger BertForTokenClassification from s-nlp +author: John Snow Labs +name: lewit_informal_tagger +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lewit_informal_tagger` is an English model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lewit_informal_tagger_en_5.2.0_3.0_1699505290280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lewit_informal_tagger_en_5.2.0_3.0_1699505290280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("lewit_informal_tagger","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("lewit_informal_tagger", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lewit_informal_tagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/s-nlp/lewit-informal-tagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-marathi_ner_iob_mr.md b/docs/_posts/ahmedlone127/2023-11-09-marathi_ner_iob_mr.md new file mode 100644 index 000000000000..bbcc20c8898f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-marathi_ner_iob_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi marathi_ner_iob BertForTokenClassification from l3cube-pune +author: John Snow Labs +name: marathi_ner_iob +date: 2023-11-09 +tags: [bert, mr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: mr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `marathi_ner_iob` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/marathi_ner_iob_mr_5.2.0_3.0_1699488727458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/marathi_ner_iob_mr_5.2.0_3.0_1699488727458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
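+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("marathi_ner_iob","mr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("marathi_ner_iob", "mr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
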
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|marathi_ner_iob| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|mr| +|Size:|665.1 MB| + +## References + +https://huggingface.co/l3cube-pune/marathi-ner-iob \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-me_lid_bert_mr.md b/docs/_posts/ahmedlone127/2023-11-09-me_lid_bert_mr.md new file mode 100644 index 000000000000..dfed4f6577d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-me_lid_bert_mr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Marathi me_lid_bert BertForTokenClassification from l3cube-pune +author: John Snow Labs +name: me_lid_bert +date: 2023-11-09 +tags: [bert, mr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: mr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`me_lid_bert` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_lid_bert_mr_5.2.0_3.0_1699521268448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_lid_bert_mr_5.2.0_3.0_1699521268448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
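+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("me_lid_bert","mr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("me_lid_bert", "mr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
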
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_lid_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|mr| +|Size:|890.6 MB| + +## References + +https://huggingface.co/l3cube-pune/me-lid-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-medicare_ner_en.md b/docs/_posts/ahmedlone127/2023-11-09-medicare_ner_en.md new file mode 100644 index 000000000000..ecd6ac5cf01c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-medicare_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medicare_ner BertForTokenClassification from m-aliabbas1 +author: John Snow Labs +name: medicare_ner +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medicare_ner` is an English model originally trained by m-aliabbas1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medicare_ner_en_5.2.0_3.0_1699513787914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medicare_ner_en_5.2.0_3.0_1699513787914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
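+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("medicare_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("medicare_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
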
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medicare_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|16.7 MB| + +## References + +https://huggingface.co/m-aliabbas1/medicare_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-n2c2_container_bert_10ep_en.md b/docs/_posts/ahmedlone127/2023-11-09-n2c2_container_bert_10ep_en.md new file mode 100644 index 000000000000..75734e695cd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-n2c2_container_bert_10ep_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English n2c2_container_bert_10ep BertForTokenClassification from georgeleung30 +author: John Snow Labs +name: n2c2_container_bert_10ep +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `n2c2_container_bert_10ep` is an English model originally trained by georgeleung30. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/n2c2_container_bert_10ep_en_5.2.0_3.0_1699514320686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/n2c2_container_bert_10ep_en_5.2.0_3.0_1699514320686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
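+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("n2c2_container_bert_10ep","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("n2c2_container_bert_10ep", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
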
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|n2c2_container_bert_10ep| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/georgeleung30/n2c2_container_bert_10ep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-ner_bert_base_multilingual_uncased_xx.md b/docs/_posts/ahmedlone127/2023-11-09-ner_bert_base_multilingual_uncased_xx.md new file mode 100644 index 000000000000..bf766895a4e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-ner_bert_base_multilingual_uncased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual ner_bert_base_multilingual_uncased BertForTokenClassification from Jyotiyadav +author: John Snow Labs +name: ner_bert_base_multilingual_uncased +date: 2023-11-09 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ner_bert_base_multilingual_uncased` is a Multilingual model originally trained by Jyotiyadav. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_bert_base_multilingual_uncased_xx_5.2.0_3.0_1699511453699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_bert_base_multilingual_uncased_xx_5.2.0_3.0_1699511453699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
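+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("ner_bert_base_multilingual_uncased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_bert_base_multilingual_uncased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
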
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_bert_base_multilingual_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|625.6 MB| + +## References + +https://huggingface.co/Jyotiyadav/NER-bert-base-multilingual-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-ner_conditional_dutch_official_11_en.md b/docs/_posts/ahmedlone127/2023-11-09-ner_conditional_dutch_official_11_en.md new file mode 100644 index 000000000000..ed9f0b664b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-ner_conditional_dutch_official_11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_conditional_dutch_official_11 BertForTokenClassification from Annemae +author: John Snow Labs +name: ner_conditional_dutch_official_11 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_conditional_dutch_official_11` is an English model originally trained by Annemae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_conditional_dutch_official_11_en_5.2.0_3.0_1699497599770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_conditional_dutch_official_11_en_5.2.0_3.0_1699497599770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
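+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("ner_conditional_dutch_official_11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_conditional_dutch_official_11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
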
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_conditional_dutch_official_11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.8 MB| + +## References + +https://huggingface.co/Annemae/ner_conditional_nl_official_11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-ner_natfike_en.md b/docs/_posts/ahmedlone127/2023-11-09-ner_natfike_en.md new file mode 100644 index 000000000000..24f128990364 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-ner_natfike_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_natfike BertForTokenClassification from Natfike +author: John Snow Labs +name: ner_natfike +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_natfike` is an English model originally trained by Natfike. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_natfike_en_5.2.0_3.0_1699509621618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_natfike_en_5.2.0_3.0_1699509621618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
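+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("ner_natfike","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("ner_natfike", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
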
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_natfike| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/Natfike/NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas_es.md b/docs/_posts/ahmedlone127/2023-11-09-nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas_es.md new file mode 100644 index 000000000000..45da2a790568 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas BertForTokenClassification from pineiden +author: John Snow Labs +name: nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas +date: 2023-11-09 +tags: [bert, es, open_source, token_classification, onnx] +task: Named Entity Recognition +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas` is a Castilian, Spanish model originally trained by pineiden. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas_es_5.2.0_3.0_1699504237540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas_es_5.2.0_3.0_1699504237540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
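+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas","es") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas", "es") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
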
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nominal_groups_recognition_medical_disease_beto_cmm_competencia2_beto_prescripciones_medicas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|es| +|Size:|409.4 MB| + +## References + +https://huggingface.co/pineiden/nominal-groups-recognition-medical-disease-beto-cmm-competencia2-beto-prescripciones-medicas \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-novels_ner_model_pl.md b/docs/_posts/ahmedlone127/2023-11-09-novels_ner_model_pl.md new file mode 100644 index 000000000000..afcae88ca549 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-novels_ner_model_pl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Polish novels_ner_model BertForTokenClassification from ZuzannaH +author: John Snow Labs +name: novels_ner_model +date: 2023-11-09 +tags: [bert, pl, open_source, token_classification, onnx] +task: Named Entity Recognition +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`novels_ner_model` is a Polish model originally trained by ZuzannaH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/novels_ner_model_pl_5.2.0_3.0_1699501372607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/novels_ner_model_pl_5.2.0_3.0_1699501372607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
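+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("novels_ner_model","pl") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("novels_ner_model", "pl") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
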
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|novels_ner_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|pl| +|Size:|665.0 MB| + +## References + +https://huggingface.co/ZuzannaH/novels_ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-nyt_ingredient_tagger_paraphrase_minilm_l3_v2_en.md b/docs/_posts/ahmedlone127/2023-11-09-nyt_ingredient_tagger_paraphrase_minilm_l3_v2_en.md new file mode 100644 index 000000000000..21caebd51108 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-nyt_ingredient_tagger_paraphrase_minilm_l3_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nyt_ingredient_tagger_paraphrase_minilm_l3_v2 BertForTokenClassification from napsternxg +author: John Snow Labs +name: nyt_ingredient_tagger_paraphrase_minilm_l3_v2 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nyt_ingredient_tagger_paraphrase_minilm_l3_v2` is an English model originally trained by napsternxg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_paraphrase_minilm_l3_v2_en_5.2.0_3.0_1699506734376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nyt_ingredient_tagger_paraphrase_minilm_l3_v2_en_5.2.0_3.0_1699506734376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
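+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("nyt_ingredient_tagger_paraphrase_minilm_l3_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("nyt_ingredient_tagger_paraphrase_minilm_l3_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
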
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nyt_ingredient_tagger_paraphrase_minilm_l3_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|64.1 MB| + +## References + +https://huggingface.co/napsternxg/nyt-ingredient-tagger-paraphrase-MiniLM-L3-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-ote_domianadaption_absa_marbert2_hard_run1_en.md b/docs/_posts/ahmedlone127/2023-11-09-ote_domianadaption_absa_marbert2_hard_run1_en.md new file mode 100644 index 000000000000..1f6202ae79c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-ote_domianadaption_absa_marbert2_hard_run1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ote_domianadaption_absa_marbert2_hard_run1 BertForTokenClassification from salohnana2018 +author: John Snow Labs +name: ote_domianadaption_absa_marbert2_hard_run1 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ote_domianadaption_absa_marbert2_hard_run1` is an English model originally trained by salohnana2018. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ote_domianadaption_absa_marbert2_hard_run1_en_5.2.0_3.0_1699517348941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ote_domianadaption_absa_marbert2_hard_run1_en_5.2.0_3.0_1699517348941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
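+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("ote_domianadaption_absa_marbert2_hard_run1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("ote_domianadaption_absa_marbert2_hard_run1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
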
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ote_domianadaption_absa_marbert2_hard_run1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|606.5 MB| + +## References + +https://huggingface.co/salohnana2018/OTE-domianAdaption-ABSA-MARBERT2-HARD-run1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-porttagger_oilgas_base_en.md b/docs/_posts/ahmedlone127/2023-11-09-porttagger_oilgas_base_en.md new file mode 100644 index 000000000000..c84d8e2d2b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-porttagger_oilgas_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English porttagger_oilgas_base BertForTokenClassification from Emanuel +author: John Snow Labs +name: porttagger_oilgas_base +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `porttagger_oilgas_base` is an English model originally trained by Emanuel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/porttagger_oilgas_base_en_5.2.0_3.0_1699521038207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/porttagger_oilgas_base_en_5.2.0_3.0_1699521038207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
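+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, BertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage has to produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = BertForTokenClassification.pretrained("porttagger_oilgas_base","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage has to produce it +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = BertForTokenClassification + .pretrained("porttagger_oilgas_base", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
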
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|porttagger_oilgas_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|406.0 MB| + +## References + +https://huggingface.co/Emanuel/porttagger-oilgas-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-primary_outcome_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-09-primary_outcome_extraction_en.md new file mode 100644 index 000000000000..c9553142acd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-primary_outcome_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English primary_outcome_extraction BertForTokenClassification from aakorolyova +author: John Snow Labs +name: primary_outcome_extraction +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `primary_outcome_extraction` is an English model originally trained by aakorolyova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/primary_outcome_extraction_en_5.2.0_3.0_1699488767563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/primary_outcome_extraction_en_5.2.0_3.0_1699488767563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("primary_outcome_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("primary_outcome_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|primary_outcome_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.1 MB| + +## References + +https://huggingface.co/aakorolyova/primary_outcome_extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-rubert_tiny2_srl_en.md b/docs/_posts/ahmedlone127/2023-11-09-rubert_tiny2_srl_en.md new file mode 100644 index 000000000000..81758a9eeafa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-rubert_tiny2_srl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_tiny2_srl BertForTokenClassification from dl-ru +author: John Snow Labs +name: rubert_tiny2_srl +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_srl` is an English model originally trained by dl-ru. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_srl_en_5.2.0_3.0_1699489702161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_srl_en_5.2.0_3.0_1699489702161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("rubert_tiny2_srl","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("rubert_tiny2_srl", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_srl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|109.1 MB| + +## References + +https://huggingface.co/dl-ru/rubert-tiny2-srl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-scibert_allen_case_v3_en.md b/docs/_posts/ahmedlone127/2023-11-09-scibert_allen_case_v3_en.md new file mode 100644 index 000000000000..5f162bb51366 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-scibert_allen_case_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English scibert_allen_case_v3 BertForTokenClassification from sayalik13 +author: John Snow Labs +name: scibert_allen_case_v3 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_allen_case_v3` is an English model originally trained by sayalik13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scibert_allen_case_v3_en_5.2.0_3.0_1699503716877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scibert_allen_case_v3_en_5.2.0_3.0_1699503716877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("scibert_allen_case_v3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("scibert_allen_case_v3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scibert_allen_case_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/sayalik13/scibert-allen-case-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-software_benchmark_bio_en.md b/docs/_posts/ahmedlone127/2023-11-09-software_benchmark_bio_en.md new file mode 100644 index 000000000000..224f9056a581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-software_benchmark_bio_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English software_benchmark_bio BertForTokenClassification from oeg +author: John Snow Labs +name: software_benchmark_bio +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `software_benchmark_bio` is an English model originally trained by oeg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/software_benchmark_bio_en_5.2.0_3.0_1699495824568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/software_benchmark_bio_en_5.2.0_3.0_1699495824568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("software_benchmark_bio","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("software_benchmark_bio", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|software_benchmark_bio| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|410.0 MB| + +## References + +https://huggingface.co/oeg/software_benchmark_bio \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-test_nerv2_en.md b/docs/_posts/ahmedlone127/2023-11-09-test_nerv2_en.md new file mode 100644 index 000000000000..0c408b93fd50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-test_nerv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_nerv2 BertForTokenClassification from CarlosDataAnalysis +author: John Snow Labs +name: test_nerv2 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_nerv2` is an English model originally trained by CarlosDataAnalysis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_nerv2_en_5.2.0_3.0_1699521037511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_nerv2_en_5.2.0_3.0_1699521037511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("test_nerv2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("test_nerv2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_nerv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/CarlosDataAnalysis/test-NERv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-09-vietnamese_feedback_model_1_1_0_en.md b/docs/_posts/ahmedlone127/2023-11-09-vietnamese_feedback_model_1_1_0_en.md new file mode 100644 index 000000000000..0f937d346de6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-09-vietnamese_feedback_model_1_1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vietnamese_feedback_model_1_1_0 BertForTokenClassification from P829692 +author: John Snow Labs +name: vietnamese_feedback_model_1_1_0 +date: 2023-11-09 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vietnamese_feedback_model_1_1_0` is an English model originally trained by P829692. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vietnamese_feedback_model_1_1_0_en_5.2.0_3.0_1699517348693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vietnamese_feedback_model_1_1_0_en_5.2.0_3.0_1699517348693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = BertForTokenClassification.pretrained("vietnamese_feedback_model_1_1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = BertForTokenClassification + .pretrained("vietnamese_feedback_model_1_1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vietnamese_feedback_model_1_1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/P829692/vietnamese_feedback_model_1_1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_base_cased_qa_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_base_cased_qa_squad2_en.md new file mode 100644 index 000000000000..c98054bcc7d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_base_cased_qa_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_base_cased_qa_squad2 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-squad2` is an English model originally trained by `deepset`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_qa_squad2_en_5.2.0_3.0_1699785841705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_qa_squad2_en_5.2.0_3.0_1699785841705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_cased_qa_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_base_cased_qa_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.base_cased.by_deepset").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_qa_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/bert-base-cased-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_3lang_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_3lang_xx.md new file mode 100644 index 000000000000..4e6153e9cff6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_3lang_xx.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Cased model (from krinal214) +author: John Snow Labs +name: bert_qa_3lang +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-3lang` is a Multilingual model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_3lang_xx_5.2.0_3.0_1699786324189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_3lang_xx_5.2.0_3.0_1699786324189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_3lang","xx") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_3lang","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.bert.tydiqa.3lang").predict("""PUT YOUR QUESTION HERE|||PUT YOUR CONTEXT HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_3lang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-3lang \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_adars_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_adars_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..f1016ad30df9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_adars_base_cased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from Adars) +author: John Snow Labs +name: bert_qa_adars_base_cased_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `Adars`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_adars_base_cased_finetuned_squad_en_5.2.0_3.0_1699788427985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_adars_base_cased_finetuned_squad_en_5.2.0_3.0_1699788427985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_adars_base_cased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_adars_base_cased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_adars_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Adars/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad_en.md new file mode 100644 index 000000000000..a108fd3e5c23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from ahujaniharika95) +author: John Snow Labs +name: bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm-uncased-squad2-finetuned-squad` is an English model originally trained by `ahujaniharika95`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad_en_5.2.0_3.0_1699788587653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad_en_5.2.0_3.0_1699788587653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.uncased_mini_lm_mini_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ahujaniharika95_minilm_uncased_squad2_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ahujaniharika95/minilm-uncased-squad2-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ainize_klue_bert_base_mrc_ko.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ainize_klue_bert_base_mrc_ko.md new file mode 100644 index 000000000000..37271950166e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ainize_klue_bert_base_mrc_ko.md @@ -0,0 +1,113 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from ainize) +author: John Snow Labs +name: bert_qa_ainize_klue_bert_base_mrc +date: 2023-11-12 +tags: [ko, open_source, question_answering, bert, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue-bert-base-mrc` is a Korean model originally trained by `ainize`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ainize_klue_bert_base_mrc_ko_5.2.0_3.0_1699787802266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ainize_klue_bert_base_mrc_ko_5.2.0_3.0_1699787802266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ainize_klue_bert_base_mrc","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_ainize_klue_bert_base_mrc","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ainize_klue_bert_base_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ainize/klue-bert-base-mrc +- https://ainize.ai/ +- https://main-klue-mrc-bert-scy6500.endpoint.ainize.ai/ +- https://ainize.ai/scy6500/KLUE-MRC-BERT?branch=main +- https://ainize.ai/teachable-nlp +- https://link.ainize.ai/3FjvBVn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_aiyshwariya_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_aiyshwariya_finetuned_squad_en.md new file mode 100644 index 000000000000..81e655f413eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_aiyshwariya_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Aiyshwariya) +author: John Snow Labs +name: bert_qa_aiyshwariya_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Aiyshwariya`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_aiyshwariya_finetuned_squad_en_5.2.0_3.0_1699788850083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_aiyshwariya_finetuned_squad_en_5.2.0_3.0_1699788850083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_aiyshwariya_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_aiyshwariya_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_aiyshwariya_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Aiyshwariya/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_01_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_01_en.md new file mode 100644 index 000000000000..906550f0d7ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_01_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_ajuste_01 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ajuste_01` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ajuste_01_en_5.2.0_3.0_1699789090928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ajuste_01_en_5.2.0_3.0_1699789090928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ajuste_01","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ajuste_01","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ajuste_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/ajuste_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_02_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_02_en.md new file mode 100644 index 000000000000..e5f825c7f2aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ajuste_02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_ajuste_02 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ajuste_02` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ajuste_02_en_5.2.0_3.0_1699788063198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ajuste_02_en_5.2.0_3.0_1699788063198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ajuste_02","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ajuste_02","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ajuste_02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/ajuste_02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akihiro2_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akihiro2_finetuned_squad_en.md new file mode 100644 index 000000000000..a6e8766973b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akihiro2_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Akihiro2) +author: John Snow Labs +name: bert_qa_akihiro2_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Akihiro2`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_akihiro2_finetuned_squad_en_5.2.0_3.0_1699787030761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_akihiro2_finetuned_squad_en_5.2.0_3.0_1699787030761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_akihiro2_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_akihiro2_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_akihiro2_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Akihiro2/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akshay1791_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akshay1791_finetuned_squad_en.md new file mode 100644 index 000000000000..e1caefe29675 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_akshay1791_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Akshay1791) +author: John Snow Labs +name: bert_qa_akshay1791_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Akshay1791`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_akshay1791_finetuned_squad_en_5.2.0_3.0_1699788131498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_akshay1791_finetuned_squad_en_5.2.0_3.0_1699788131498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_akshay1791_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_akshay1791_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_akshay1791_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Akshay1791/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_alexander_learn_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_alexander_learn_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..0c7d42093ae7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_alexander_learn_bert_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_alexander_learn_bert_finetuned_squad BertForQuestionAnswering from Alexander-Learn +author: John Snow Labs +name: bert_qa_alexander_learn_bert_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_alexander_learn_bert_finetuned_squad` is an English model originally trained by Alexander-Learn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_alexander_learn_bert_finetuned_squad_en_5.2.0_3.0_1699785091187.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_alexander_learn_bert_finetuned_squad_en_5.2.0_3.0_1699785091187.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_alexander_learn_bert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_alexander_learn_bert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_alexander_learn_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Alexander-Learn/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_amartyobanerjee_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_amartyobanerjee_finetuned_squad_en.md new file mode 100644 index 000000000000..48d0a5d1036f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_amartyobanerjee_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from amartyobanerjee) +author: John Snow Labs +name: bert_qa_amartyobanerjee_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `amartyobanerjee`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_amartyobanerjee_finetuned_squad_en_5.2.0_3.0_1699787315808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_amartyobanerjee_finetuned_squad_en_5.2.0_3.0_1699787315808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_amartyobanerjee_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_amartyobanerjee_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_amartyobanerjee_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/amartyobanerjee/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ancient_chinese_base_ud_head_zh.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ancient_chinese_base_ud_head_zh.md new file mode 100644 index 000000000000..b65fb685bb07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ancient_chinese_base_ud_head_zh.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Base Cased model (from KoichiYasuoka) +author: John Snow Labs +name: bert_qa_ancient_chinese_base_ud_head +date: 2023-11-12 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-ancient-chinese-base-ud-head` is a Chinese model originally trained by `KoichiYasuoka`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ancient_chinese_base_ud_head_zh_5.2.0_3.0_1699788240227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ancient_chinese_base_ud_head_zh_5.2.0_3.0_1699788240227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ancient_chinese_base_ud_head","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ancient_chinese_base_ud_head","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ancient_chinese_base_ud_head| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|430.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KoichiYasuoka/bert-ancient-chinese-base-ud-head +- https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto +- https://pypi.org/project/ufal.chu-liu-edmonds/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..4d642478a0ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_base_cased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from andresestevez) +author: John Snow Labs +name: bert_qa_andresestevez_bert_base_cased_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `andresestevez`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_andresestevez_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699787555761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_andresestevez_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699787555761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_andresestevez_bert_base_cased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_andresestevez_bert_base_cased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_andresestevez_bert_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andresestevez/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..5c3636dd447a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_andresestevez_bert_finetuned_squad_accelerate_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from andresestevez) +author: John Snow Labs +name: bert_qa_andresestevez_bert_finetuned_squad_accelerate +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `andresestevez`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_andresestevez_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1699788526078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_andresestevez_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1699788526078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_andresestevez_bert_finetuned_squad_accelerate","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_andresestevez_bert_finetuned_squad_accelerate","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_andresestevez").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_andresestevez_bert_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andresestevez/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ankitkupadhyay_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ankitkupadhyay_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..c239103f34f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_ankitkupadhyay_bert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ankitkupadhyay) +author: John Snow Labs +name: bert_qa_ankitkupadhyay_bert_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `ankitkupadhyay`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ankitkupadhyay_bert_finetuned_squad_en_5.2.0_3.0_1699788452476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ankitkupadhyay_bert_finetuned_squad_en_5.2.0_3.0_1699788452476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ankitkupadhyay_bert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_ankitkupadhyay_bert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_ankitkupadhyay").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ankitkupadhyay_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ankitkupadhyay/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabert_finetuned_arcd_ar.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabert_finetuned_arcd_ar.md new file mode 100644 index 000000000000..0883a18c309d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabert_finetuned_arcd_ar.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Arabic BertForQuestionAnswering Cased model (from Sh3ra) +author: John Snow Labs +name: bert_qa_arabert_finetuned_arcd +date: 2023-11-12 +tags: [ar, open_source, bert, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabert-finetuned-arcd` is an Arabic model originally trained by `Sh3ra`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_arabert_finetuned_arcd_ar_5.2.0_3.0_1699788857557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_arabert_finetuned_arcd_ar_5.2.0_3.0_1699788857557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arabert_finetuned_arcd","ar") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arabert_finetuned_arcd","ar") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_arabert_finetuned_arcd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|504.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Sh3ra/arabert-finetuned-arcd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabic_ar.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabic_ar.md new file mode 100644 index 000000000000..a0effe3f3c59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arabic_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic bert_qa_arabic BertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: bert_qa_arabic +date: 2023-11-12 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_arabic` is an Arabic model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_arabic_ar_5.2.0_3.0_1699781377235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_arabic_ar_5.2.0_3.0_1699781377235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arabic","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_arabic", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_arabic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|412.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/abdalrahmanshahrour/ArabicQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_ar.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_ar.md new file mode 100644 index 000000000000..3075d4252c17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic bert_qa_arap BertForQuestionAnswering from gfdgdfgdg +author: John Snow Labs +name: bert_qa_arap +date: 2023-11-12 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_arap` is an Arabic model originally trained by gfdgdfgdg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_arap_ar_5.2.0_3.0_1699781625274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_arap_ar_5.2.0_3.0_1699781625274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arap","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_arap", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_arap| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/gfdgdfgdg/arap_qa_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_large_v2_ar.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_large_v2_ar.md new file mode 100644 index 000000000000..63ad04971313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_large_v2_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic bert_qa_arap_large_v2 BertForQuestionAnswering from gfdgdfgdg +author: John Snow Labs +name: bert_qa_arap_large_v2 +date: 2023-11-12 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_arap_large_v2` is an Arabic model originally trained by gfdgdfgdg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_arap_large_v2_ar_5.2.0_3.0_1699782151696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_arap_large_v2_ar_5.2.0_3.0_1699782151696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arap_large_v2","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_arap_large_v2", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_arap_large_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|1.4 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/gfdgdfgdg/arap_qa_bert_large_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_v2_ar.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_v2_ar.md new file mode 100644 index 000000000000..27e649703841 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_arap_v2_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic bert_qa_arap_v2 BertForQuestionAnswering from gfdgdfgdg +author: John Snow Labs +name: bert_qa_arap_v2 +date: 2023-11-12 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_arap_v2` is an Arabic model originally trained by gfdgdfgdg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_arap_v2_ar_5.2.0_3.0_1699781386386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_arap_v2_ar_5.2.0_3.0_1699781386386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_arap_v2","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_arap_v2", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_arap_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|504.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/gfdgdfgdg/arap_qa_bert_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_araspeedest_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_araspeedest_en.md new file mode 100644 index 000000000000..81ef73621c3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_araspeedest_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_araspeedest BertForQuestionAnswering from aymanm419 +author: John Snow Labs +name: bert_qa_araspeedest +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_araspeedest` is an English model originally trained by aymanm419. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_araspeedest_en_5.2.0_3.0_1699789370590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_araspeedest_en_5.2.0_3.0_1699789370590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_araspeedest","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_araspeedest", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_araspeedest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|504.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/aymanm419/araSpeedest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_en.md new file mode 100644 index 000000000000..c1326dace3bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_augmented +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augmented` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_augmented_en_5.2.0_3.0_1699788905018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_augmented_en_5.2.0_3.0_1699788905018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_augmented","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_augmented","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.augmented").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_augmented| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/augmented \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_squad_translated_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_squad_translated_en.md new file mode 100644 index 000000000000..af5fb81fd7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_augmented_squad_translated_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_augmented_squad_translated BertForQuestionAnswering from krinal214 +author: John Snow Labs +name: bert_qa_augmented_squad_translated +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_augmented_squad_translated` is an English model originally trained by krinal214. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_augmented_squad_translated_en_5.2.0_3.0_1699789286206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_augmented_squad_translated_en_5.2.0_3.0_1699789286206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Example input; replace with your own question/context pairs +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_augmented_squad_translated","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Example input; replace with your own question/context pairs +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_augmented_squad_translated", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_augmented_squad_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/krinal214/augmented_Squad_Translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_a3_1043835930_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_a3_1043835930_en.md new file mode 100644 index 000000000000..bbd476abc9ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_a3_1043835930_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from deepesh0x) +author: John Snow Labs +name: bert_qa_autotrain_a3_1043835930 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-a3-1043835930` is an English model originally trained by `deepesh0x`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_a3_1043835930_en_5.2.0_3.0_1699788524206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_a3_1043835930_en_5.2.0_3.0_1699788524206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_a3_1043835930","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_a3_1043835930","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_autotrain_a3_1043835930| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepesh0x/autotrain-a3-1043835930 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_small_qna_1380352953_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_small_qna_1380352953_en.md new file mode 100644 index 000000000000..1c0af074fce0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_small_qna_1380352953_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from gaaush) +author: John Snow Labs +name: bert_qa_autotrain_small_qna_1380352953 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-small-qna-1380352953` is an English model originally trained by `gaaush`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_small_qna_1380352953_en_5.2.0_3.0_1699788070058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_small_qna_1380352953_en_5.2.0_3.0_1699788070058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_small_qna_1380352953","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_small_qna_1380352953","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_autotrain_small_qna_1380352953| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/gaaush/autotrain-small-qna-1380352953 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_xlm_fine_tune_1380052948_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_xlm_fine_tune_1380052948_en.md new file mode 100644 index 000000000000..d9a2df74d4a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_autotrain_xlm_fine_tune_1380052948_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from tushar23) +author: John Snow Labs +name: bert_qa_autotrain_xlm_fine_tune_1380052948 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-xlm_bert_fine_tune-1380052948` is an English model originally trained by `tushar23`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_xlm_fine_tune_1380052948_en_5.2.0_3.0_1699789795351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_autotrain_xlm_fine_tune_1380052948_en_5.2.0_3.0_1699789795351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_xlm_fine_tune_1380052948","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_autotrain_xlm_fine_tune_1380052948","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_autotrain_xlm_fine_tune_1380052948| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tushar23/autotrain-xlm_bert_fine_tune-1380052948 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_baru98_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_baru98_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..e47d0cd8a0e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_baru98_base_cased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from baru98) +author: John Snow Labs +name: bert_qa_baru98_base_cased_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `baru98`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_baru98_base_cased_finetuned_squad_en_5.2.0_3.0_1699788813572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_baru98_base_cased_finetuned_squad_en_5.2.0_3.0_1699788813572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_baru98_base_cased_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_baru98_base_cased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.cased_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_baru98_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/baru98/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_1024_full_trivia_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_1024_full_trivia_en.md new file mode 100644 index 000000000000..f2f5e1106622 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_1024_full_trivia_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from MrAnderson) +author: John Snow Labs +name: bert_qa_base_1024_full_trivia +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-1024-full-trivia` is an English model originally trained by `MrAnderson`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_1024_full_trivia_en_5.2.0_3.0_1699789105143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_1024_full_trivia_en_5.2.0_3.0_1699789105143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_1024_full_trivia","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_1024_full_trivia","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.trivia.base_1024d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_1024_full_trivia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-1024-full-trivia \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_alian_uncased_squad_it.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_alian_uncased_squad_it.md new file mode 100644 index 000000000000..5fb7af492ae9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_alian_uncased_squad_it.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Italian bert_qa_base_alian_uncased_squad BertForQuestionAnswering from antoniocappiello +author: John Snow Labs +name: bert_qa_base_alian_uncased_squad +date: 2023-11-12 +tags: [bert, it, open_source, question_answering, onnx] +task: Question Answering +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_alian_uncased_squad` is an Italian model originally trained by antoniocappiello. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_alian_uncased_squad_it_5.2.0_3.0_1699781375077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_alian_uncased_squad_it_5.2.0_3.0_1699781375077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_alian_uncased_squad","it") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_alian_uncased_squad", "it") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_alian_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|it| +|Size:|409.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/antoniocappiello/bert-base-italian-uncased-squad-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_r3f_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_r3f_en.md new file mode 100644 index 000000000000..d814e86535bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_r3f_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_cased_finetuned_squad_r3f +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad-r3f` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_finetuned_squad_r3f_en_5.2.0_3.0_1699789462063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_finetuned_squad_r3f_en_5.2.0_3.0_1699789462063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_finetuned_squad_r3f","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_finetuned_squad_r3f","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.cased_base_finetuned.by_anas_awadalla").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_finetuned_squad_r3f| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-cased-finetuned-squad-r3f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_v2_en.md new file mode 100644 index 000000000000..d2db7bc891c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_finetuned_squad_v2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from victorlee071200) +author: John Snow Labs +name: bert_qa_base_cased_finetuned_squad_v2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad_v2` is an English model originally trained by `victorlee071200`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_finetuned_squad_v2_en_5.2.0_3.0_1699789748848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_finetuned_squad_v2_en_5.2.0_3.0_1699789748848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_finetuned_squad_v2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_finetuned_squad_v2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.cased_v2_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/victorlee071200/bert-base-cased-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022_en.md new file mode 100644 index 000000000000..e5b3ca5a2f9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from nntadotzip) +author: John Snow Labs +name: bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-IUChatbot-ontologyDts-bertBaseCased-bertTokenizer-12April2022` is an English model originally trained by `nntadotzip`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022_en_5.2.0_3.0_1699789068115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022_en_5.2.0_3.0_1699789068115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.cased_base").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_iuchatbot_ontologydts_berttokenizer_12april2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nntadotzip/bert-base-cased-IUChatbot-ontologyDts-bertBaseCased-bertTokenizer-12April2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad2_en.md new file mode 100644 index 000000000000..002eec855c8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_cased_squad2 BertForQuestionAnswering from Shobhank-iiitdwd +author: John Snow Labs +name: bert_qa_base_cased_squad2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_cased_squad2` is an English model originally trained by Shobhank-iiitdwd. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad2_en_5.2.0_3.0_1699781323466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad2_en_5.2.0_3.0_1699781323466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_cased_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Shobhank-iiitdwd/BERT-base-cased-squad2-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1.1_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1.1_portuguese_pt.md new file mode 100644 index 000000000000..41949376850c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1.1_portuguese_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese bert_qa_base_cased_squad_v1.1_portuguese BertForQuestionAnswering from pierreguillou +author: John Snow Labs +name: bert_qa_base_cased_squad_v1.1_portuguese +date: 2023-11-12 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_cased_squad_v1.1_portuguese` is a Portuguese model originally trained by pierreguillou. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad_v1.1_portuguese_pt_5.2.0_3.0_1699781642034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad_v1.1_portuguese_pt_5.2.0_3.0_1699781642034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_squad_v1.1_portuguese","pt") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_cased_squad_v1.1_portuguese", "pt") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_squad_v1.1_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/pierreguillou/bert-base-cased-squad-v1.1-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1_en.md new file mode 100644 index 000000000000..aa6a76ab67ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_cased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_cased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_base_cased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_cased_squad_v1` is an English model originally trained by batterydata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad_v1_en_5.2.0_3.0_1699781630158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_cased_squad_v1_en_5.2.0_3.0_1699781630158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_cased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_cased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_cased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/bert-base-cased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_finetuned_squad_zh.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_finetuned_squad_zh.md new file mode 100644 index 000000000000..c47ed62eee26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_finetuned_squad_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Base Cased model (from jimmy880219) +author: John Snow Labs +name: bert_qa_base_chinese_finetuned_squad +date: 2023-11-12 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-chinese-finetuned-squad` is a Chinese model originally trained by `jimmy880219`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_chinese_finetuned_squad_zh_5.2.0_3.0_1699791265983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_chinese_finetuned_squad_zh_5.2.0_3.0_1699791265983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_chinese_finetuned_squad","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_chinese_finetuned_squad","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_chinese_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jimmy880219/bert-base-chinese-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_zh.md new file mode 100644 index 000000000000..e8b6bac91ef5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_chinese_zh.md @@ -0,0 +1,99 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Base Cased model (from ckiplab) +author: John Snow Labs +name: bert_qa_base_chinese +date: 2023-11-12 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-chinese-qa` is a Chinese model originally trained by `ckiplab`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_chinese_zh_5.2.0_3.0_1699789991574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_chinese_zh_5.2.0_3.0_1699789991574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_chinese","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_chinese","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ckiplab/bert-base-chinese-qa +- https://github.com/ckiplab/ckip-transformers +- https://muyang.pro +- https://ckip.iis.sinica.edu.tw +- https://github.com/ckiplab/ckip-transformers +- https://github.com/ckiplab/ckip-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_finetuned_squad2_en.md new file mode 100644 index 000000000000..b686610782d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_finetuned_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_finetuned_squad2 BertForQuestionAnswering from phiyodr +author: John Snow Labs +name: bert_qa_base_finetuned_squad2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_finetuned_squad2` is an English model originally trained by phiyodr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_finetuned_squad2_en_5.2.0_3.0_1699781918539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_finetuned_squad2_en_5.2.0_3.0_1699781918539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_finetuned_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_finetuned_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/phiyodr/bert-base-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_for_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_for_question_answering_en.md new file mode 100644 index 000000000000..ce691323b96d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_for_question_answering_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from Zamachi) +author: John Snow Labs +name: bert_qa_base_for_question_answering +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-for-question-answering` is an English model originally trained by `Zamachi`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_for_question_answering_en_5.2.0_3.0_1699789672656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_for_question_answering_en_5.2.0_3.0_1699789672656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_for_question_answering","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_for_question_answering","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_for_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Zamachi/bert-base-for-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_indonesian_tydiqa_id.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_indonesian_tydiqa_id.md new file mode 100644 index 000000000000..72fcbb6111b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_indonesian_tydiqa_id.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Indonesian BertForQuestionAnswering Base Cased model (from cahya) +author: John Snow Labs +name: bert_qa_base_indonesian_tydiqa +date: 2023-11-12 +tags: [id, open_source, bert, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-indonesian-tydiqa` is an Indonesian model originally trained by `cahya`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_indonesian_tydiqa_id_5.2.0_3.0_1699789939394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_indonesian_tydiqa_id_5.2.0_3.0_1699789939394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_indonesian_tydiqa","id") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Siapa namaku?", "Nama saya Clara dan saya tinggal di Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_indonesian_tydiqa","id") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Siapa namaku?", "Nama saya Clara dan saya tinggal di Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.answer_question.bert.tydiqa.base").predict("""Siapa namaku?|||Nama saya Clara dan saya tinggal di Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_indonesian_tydiqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|412.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/cahya/bert-base-indonesian-tydiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_japanese_wikipedia_ud_head_ja.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_japanese_wikipedia_ud_head_ja.md new file mode 100644 index 000000000000..7cde7fbdc338 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_japanese_wikipedia_ud_head_ja.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Japanese BertForQuestionAnswering Base model (from KoichiYasuoka) +author: John Snow Labs +name: bert_qa_base_japanese_wikipedia_ud_head +date: 2023-11-12 +tags: [ja, open_source, bert, question_answering, onnx] +task: Question Answering +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-japanese-wikipedia-ud-head` is a Japanese model originally trained by `KoichiYasuoka`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_japanese_wikipedia_ud_head_ja_5.2.0_3.0_1699789292031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_japanese_wikipedia_ud_head_ja_5.2.0_3.0_1699789292031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_japanese_wikipedia_ud_head","ja") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["私の名前は何ですか?", "私の名前はクララで、私はバークレーに住んでいます。"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_japanese_wikipedia_ud_head","ja") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("私の名前は何ですか?", "私の名前はクララで、私はバークレーに住んでいます。")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ja.answer_question.wikipedia.bert.base").predict("""私の名前は何ですか?|||私の名前はクララで、私はバークレーに住んでいます。""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_japanese_wikipedia_ud_head| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ja| +|Size:|338.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KoichiYasuoka/bert-base-japanese-wikipedia-ud-head +- https://github.com/UniversalDependencies/UD_Japanese-GSDLUW \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_mlqa_dev_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_mlqa_dev_en.md new file mode 100644 index 000000000000..5c7a58aabaaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_mlqa_dev_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from roshnir) +author: John Snow Labs +name: bert_qa_base_multi_mlqa_dev +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multi-mlqa-dev-en` is an English model originally trained by `roshnir`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multi_mlqa_dev_en_5.2.0_3.0_1699791505023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multi_mlqa_dev_en_5.2.0_3.0_1699791505023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multi_mlqa_dev","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multi_mlqa_dev","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.mlqa.base").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multi_mlqa_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/roshnir/bert-base-multi-mlqa-dev-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_uncased_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_uncased_xx.md new file mode 100644 index 000000000000..cee66ea81d1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multi_uncased_xx.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from roshnir) +author: John Snow Labs +name: bert_qa_base_multi_uncased +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multi-uncased-en-hi` is a Multilingual model originally trained by `roshnir`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multi_uncased_xx_5.2.0_3.0_1699790863248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multi_uncased_xx_5.2.0_3.0_1699790863248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multi_uncased","xx") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multi_uncased","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.bert.uncased_base").predict("""PUT YOUR QUESTION HERE|||PUT YOUR CONTEXT HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multi_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/roshnir/bert-base-multi-uncased-en-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_dutch_squad2_nl.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_dutch_squad2_nl.md new file mode 100644 index 000000000000..743a007da2eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_dutch_squad2_nl.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Dutch, Flemish bert_qa_base_multilingual_cased_finetuned_dutch_squad2 BertForQuestionAnswering from henryk +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned_dutch_squad2 +date: 2023-11-12 +tags: [bert, nl, open_source, question_answering, onnx] +task: Question Answering +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_multilingual_cased_finetuned_dutch_squad2` is a Dutch, Flemish model originally trained by henryk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_dutch_squad2_nl_5.2.0_3.0_1699782486758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_dutch_squad2_nl_5.2.0_3.0_1699782486758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_dutch_squad2","nl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_multilingual_cased_finetuned_dutch_squad2", "nl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned_dutch_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|nl| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/henryk/bert-base-multilingual-cased-finetuned-dutch-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad1_pl.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad1_pl.md new file mode 100644 index 000000000000..74bbc88fa855 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad1_pl.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Polish bert_qa_base_multilingual_cased_finetuned_polish_squad1 BertForQuestionAnswering from henryk +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned_polish_squad1 +date: 2023-11-12 +tags: [bert, pl, open_source, question_answering, onnx] +task: Question Answering +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_multilingual_cased_finetuned_polish_squad1` is a Polish model originally trained by henryk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_polish_squad1_pl_5.2.0_3.0_1699782279830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_polish_squad1_pl_5.2.0_3.0_1699782279830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_polish_squad1","pl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_multilingual_cased_finetuned_polish_squad1", "pl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned_polish_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pl| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/henryk/bert-base-multilingual-cased-finetuned-polish-squad1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad2_pl.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad2_pl.md new file mode 100644 index 000000000000..9834d68c8200 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_cased_finetuned_polish_squad2_pl.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Polish bert_qa_base_multilingual_cased_finetuned_polish_squad2 BertForQuestionAnswering from henryk +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned_polish_squad2 +date: 2023-11-12 +tags: [bert, pl, open_source, question_answering, onnx] +task: Question Answering +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_multilingual_cased_finetuned_polish_squad2` is a Polish model originally trained by henryk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_polish_squad2_pl_5.2.0_3.0_1699781674966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_polish_squad2_pl_5.2.0_3.0_1699781674966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_polish_squad2","pl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_multilingual_cased_finetuned_polish_squad2", "pl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned_polish_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pl| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/henryk/bert-base-multilingual-cased-finetuned-polish-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squad_xx.md new file mode 100644 index 000000000000..e75bdb0df19d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squad_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from monakth) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_finetuned_squad +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squad` is a Multilingual model originally trained by `monakth`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squad_xx_5.2.0_3.0_1699788565904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squad_xx_5.2.0_3.0_1699788565904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squad","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squad","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monakth/bert-base-multilingual-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full_xx.md new file mode 100644 index 000000000000..b4ef46a27062 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from khoanvm) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squadv2-finetuned-vizalo-full` is a Multilingual model originally trained by `khoanvm`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full_xx_5.2.0_3.0_1699788894223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full_xx_5.2.0_3.0_1699788894223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_full| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/khoanvm/bert-base-multilingual-uncased-finetuned-squadv2-finetuned-vizalo-full \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_xx.md new file mode 100644 index 000000000000..fd32c6ace751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from khoanvm) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squadv2-finetuned-vizalo` is a Multilingual model originally trained by `khoanvm`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_xx_5.2.0_3.0_1699789612662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo_xx_5.2.0_3.0_1699789612662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_finetuned_squadv2_finetuned_vizalo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/khoanvm/bert-base-multilingual-uncased-finetuned-squadv2-finetuned-vizalo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2_xx.md new file mode 100644 index 000000000000..df8164ce9a3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from monakth) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2 +date: 2023-11-12 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squad-squadv2` is a Multilingual model originally trained by `monakth`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2_xx_5.2.0_3.0_1699789914874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2_xx_5.2.0_3.0_1699789914874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_mo_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monakth/bert-base-multilingual-uncased-finetuned-squad-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_nnish_cased_squad2_fi.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_nnish_cased_squad2_fi.md new file mode 100644 index 000000000000..99a54db2140e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_nnish_cased_squad2_fi.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Finnish BertForQuestionAnswering Base Cased model (from ilmariky) +author: John Snow Labs +name: bert_qa_base_nnish_cased_squad2 +date: 2023-11-12 +tags: [fi, open_source, bert, question_answering, onnx] +task: Question Answering +language: fi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-finnish-cased-squad2-fi` is a Finnish model originally trained by `ilmariky`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad2_fi_5.2.0_3.0_1699791134037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad2_fi_5.2.0_3.0_1699791134037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad2","fi")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad2","fi") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_nnish_cased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fi| +|Size:|464.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ilmariky/bert-base-finnish-cased-squad2-fi +- https://github.com/google-research-datasets/tydiqa +- https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_persian_fa.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_persian_fa.md new file mode 100644 index 000000000000..9a274580202d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_persian_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_persian +date: 2023-11-12 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_persian_qa` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_persian_fa_5.2.0_3.0_1699789257443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_persian_fa_5.2.0_3.0_1699789257443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_persian","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_persian","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_persian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_persian_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_pquad_and_persian_fa.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_pquad_and_persian_fa.md new file mode 100644 index 000000000000..8100e0deef6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_pars_uncased_pquad_and_persian_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad_and_persian +date: 2023-11-12 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad_and_persian_qa` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_and_persian_fa_5.2.0_3.0_1699797094922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_and_persian_fa_5.2.0_3.0_1699797094922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes an active Spark NLP session named `spark`, e.g. spark = sparknlp.start()
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_and_persian","fa")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+// Assumes an active SparkSession named `spark`
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_and_persian","fa")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+// One row with two columns: wrap the (question, context) pair in a tuple
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad_and_persian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad_and_persian_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_parsbert_uncased_finetuned_perqa_fa.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_parsbert_uncased_finetuned_perqa_fa.md new file mode 100644 index 000000000000..c6fe0a806783 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_parsbert_uncased_finetuned_perqa_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from aminnaghavi) +author: John Snow Labs +name: bert_qa_base_parsbert_uncased_finetuned_perqa +date: 2023-11-12 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased-finetuned-perQA` is a Persian model originally trained by `aminnaghavi`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_perqa_fa_5.2.0_3.0_1699804081424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_perqa_fa_5.2.0_3.0_1699804081424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes an active Spark NLP session named `spark`, e.g. spark = sparknlp.start()
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_perqa","fa") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+// Assumes an active SparkSession named `spark`
+import spark.implicits._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+// Input columns must match the assembler's output columns
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_perqa","fa")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+// One row with two columns: wrap the (question, context) pair in a tuple
+val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_parsbert_uncased_finetuned_perqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aminnaghavi/bert-base-parsbert-uncased-finetuned-perQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_portuguese_cased_finetuned_squad_v1_pt.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_portuguese_cased_finetuned_squad_v1_pt.md new file mode 100644 index 000000000000..197af35d79e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_portuguese_cased_finetuned_squad_v1_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese bert_qa_base_portuguese_cased_finetuned_squad_v1 BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_base_portuguese_cased_finetuned_squad_v1 +date: 2023-11-12 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_portuguese_cased_finetuned_squad_v1` is a Portuguese model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_portuguese_cased_finetuned_squad_v1_pt_5.2.0_3.0_1699781369340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_portuguese_cased_finetuned_squad_v1_pt_5.2.0_3.0_1699781369340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
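+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_portuguese_cased_finetuned_squad_v1","pt") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_portuguese_cased_finetuned_squad_v1", "pt")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+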
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_portuguese_cased_finetuned_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mrm8488/bert-base-portuguese-cased-finetuned-squad-v1-pt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_sinhala_si.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_sinhala_si.md new file mode 100644 index 000000000000..0b85fee212f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_sinhala_si.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Sinhala, Sinhalese bert_qa_base_sinhala BertForQuestionAnswering from sankhajay +author: John Snow Labs +name: bert_qa_base_sinhala +date: 2023-11-12 +tags: [bert, si, open_source, question_answering, onnx] +task: Question Answering +language: si +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_sinhala` is a Sinhala, Sinhalese model originally trained by sankhajay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_sinhala_si_5.2.0_3.0_1699782858660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_sinhala_si_5.2.0_3.0_1699782858660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
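+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_sinhala","si") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_sinhala", "si")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+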
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_sinhala| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|si| +|Size:|751.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/sankhajay/bert-base-sinhala-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_es.md new file mode 100644 index 000000000000..0720492d3dbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_es.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_s_c BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_s_c +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_s_c` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_es_5.2.0_3.0_1699781982780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_es_5.2.0_3.0_1699781982780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
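+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c","es") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c", "es")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+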
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_s_c| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2_es.md new file mode 100644 index 000000000000..d1ef9a19f5b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2_es.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2 BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2 +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2_es_5.2.0_3.0_1699782522865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2_es_5.2.0_3.0_1699782522865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2","es") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2", "es")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-sqac-finetuned-squad2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad_es.md new file mode 100644 index 000000000000..17b2bf7f59e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad_es.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad_es_5.2.0_3.0_1699783117208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad_es_5.2.0_3.0_1699783117208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
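+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad","es") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad", "es")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+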
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_s_c_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-sqac-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_es.md new file mode 100644 index 000000000000..067deb8da947 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_es.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2 BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2 +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2` is a Castilian, Spanish model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_es_5.2.0_3.0_1699782227618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_es_5.2.0_3.0_1699782227618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
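+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2","es") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2", "es")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+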
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c_es.md new file mode 100644 index 000000000000..044a037ce9cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c_es_5.2.0_3.0_1699783294681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c_es_5.2.0_3.0_1699783294681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import MultiDocumentAssembler
+from sparknlp.annotator import BertForQuestionAnswering
+from pyspark.ml import Pipeline
+
+# MultiDocumentAssembler takes several columns, so it uses the plural setters
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c","es") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// MultiDocumentAssembler takes several columns, so it uses the plural setters
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c", "es")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is not defined here: supply a DataFrame with string columns "question" and "context"
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_finetuned_s_c| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es-finetuned-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_es.md new file mode 100644 index 000000000000..3f2e3d0a6d72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_es.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_squad2 BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_squad2 +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_base_spanish_wwm_cased_finetuned_squad2` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_squad2_es_5.2.0_3.0_1699782470750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_squad2_es_5.2.0_3.0_1699782470750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_squad2","es") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_squad2", "es") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-squad2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c_es.md new file mode 100644 index 000000000000..f20257219e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c BertForQuestionAnswering from MMG +author: John Snow Labs +name: bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c +date: 2023-11-12 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c` is a Castilian, Spanish model originally trained by MMG. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c_es_5.2.0_3.0_1699783492585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c_es_5.2.0_3.0_1699783492585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c","es") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c", "es") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_cased_finetuned_squad2_spanish_finetuned_s_c| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/MMG/bert-base-spanish-wwm-cased-finetuned-squad2-es-finetuned-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_uncased_finetuned_squad_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_uncased_finetuned_squad_es.md new file mode 100644 index 000000000000..3cda5639f894 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_spanish_wwm_uncased_finetuned_squad_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering Base Uncased model (from stevemobs) +author: John Snow Labs +name: bert_qa_base_spanish_wwm_uncased_finetuned_squad +date: 2023-11-12 +tags: [es, open_source, bert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-uncased-finetuned-squad_es` is a Spanish model originally trained by `stevemobs`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_uncased_finetuned_squad_es_5.2.0_3.0_1699798973637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_spanish_wwm_uncased_finetuned_squad_es_5.2.0_3.0_1699798973637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_uncased_finetuned_squad","es") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_spanish_wwm_uncased_finetuned_squad","es") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.bert.squad_es.uncased_base_finetuned").predict("""¿Cuál es mi nombre?|||Mi nombre es Clara y vivo en Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_spanish_wwm_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/stevemobs/bert-base-spanish-wwm-uncased-finetuned-squad_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad2_en.md new file mode 100644 index 000000000000..77616c2788d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from ModelTC) +author: John Snow Labs +name: bert_qa_base_squad2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-squad2` is an English model originally trained by `ModelTC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad2_en_5.2.0_3.0_1699800909449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad2_en_5.2.0_3.0_1699800909449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ModelTC/bert-base-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_en.md new file mode 100644 index 000000000000..d44d790a41e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from ModelTC) +author: John Snow Labs +name: bert_qa_base_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-squad` is an English model originally trained by `ModelTC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad_en_5.2.0_3.0_1699805632427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad_en_5.2.0_3.0_1699805632427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ModelTC/bert-base-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_v2_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_v2_portuguese_pt.md new file mode 100644 index 000000000000..88be44dcfabc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_squad_v2_portuguese_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese BertForQuestionAnswering Base Cased model (from brianpaiva) +author: John Snow Labs +name: bert_qa_base_squad_v2_portuguese +date: 2023-11-12 +tags: [pt, open_source, bert, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-squad-v2-portuguese` is a Portuguese model originally trained by `brianpaiva`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad_v2_portuguese_pt_5.2.0_3.0_1699789510332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_squad_v2_portuguese_pt_5.2.0_3.0_1699789510332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad_v2_portuguese","pt")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_squad_v2_portuguese","pt") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_squad_v2_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/brianpaiva/bert-base-squad-v2-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_finetuned_squad_sv.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_finetuned_squad_sv.md new file mode 100644 index 000000000000..e7d75d8b8aa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_finetuned_squad_sv.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Swedish BertForQuestionAnswering Base Cased model (from miwink) +author: John Snow Labs +name: bert_qa_base_swedish_cased_finetuned_squad +date: 2023-11-12 +tags: [sv, open_source, bert, question_answering, onnx] +task: Question Answering +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-finetuned-squad` is a Swedish model originally trained by `miwink`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_cased_finetuned_squad_sv_5.2.0_3.0_1699802912894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_cased_finetuned_squad_sv_5.2.0_3.0_1699802912894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_swedish_cased_finetuned_squad","sv")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_swedish_cased_finetuned_squad","sv") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_swedish_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/miwink/bert-base-swedish-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_squad_experimental_sv.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_squad_experimental_sv.md new file mode 100644 index 000000000000..4d552e409337 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_cased_squad_experimental_sv.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Swedish BertForQuestionAnswering Base Cased model (from KBLab) +author: John Snow Labs +name: bert_qa_base_swedish_cased_squad_experimental +date: 2023-11-12 +tags: [sv, open_source, bert, question_answering, onnx] +task: Question Answering +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-squad-experimental` is a Swedish model originally trained by `KBLab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_cased_squad_experimental_sv_5.2.0_3.0_1699807628119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_cased_squad_experimental_sv_5.2.0_3.0_1699807628119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_swedish_cased_squad_experimental","sv") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Vad är mitt namn?", "Jag heter Clara och jag bor i Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_swedish_cased_squad_experimental","sv") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Vad är mitt namn?", "Jag heter Clara och jag bor i Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.answer_question.bert.squad.cased_base.by_KBLab").predict("""Vad är mitt namn?|||Jag heter Clara och jag bor i Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_swedish_cased_squad_experimental| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KBLab/bert-base-swedish-cased-squad-experimental \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_squad2_sv.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_squad2_sv.md new file mode 100644 index 000000000000..797608ead19d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_swedish_squad2_sv.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Swedish bert_qa_base_swedish_squad2 BertForQuestionAnswering from susumu2357 +author: John Snow Labs +name: bert_qa_base_swedish_squad2 +date: 2023-11-12 +tags: [bert, sv, open_source, question_answering, onnx] +task: Question Answering +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_swedish_squad2` is a Swedish model originally trained by susumu2357. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_squad2_sv_5.2.0_3.0_1699781919044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_swedish_squad2_sv_5.2.0_3.0_1699781919044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_swedish_squad2","sv") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_swedish_squad2", "sv") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_swedish_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/susumu2357/bert-base-swedish-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3_tr.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3_tr.md new file mode 100644 index 000000000000..2489a0ba71b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from husnu) +author: John Snow Labs +name: bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3 +date: 2023-11-12 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699789839321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699789839321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_turkish_128k_cased_finetuned_lr_2e_05_epochs_3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|tr|
+|Size:|688.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/husnu/bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_squad_tr.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_squad_tr.md
new file mode 100644
index 000000000000..f7904d923a53
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_turkish_squad_tr.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: Turkish bert_qa_base_turkish_squad BertForQuestionAnswering from savasy
+author: John Snow Labs
+name: bert_qa_base_turkish_squad
+date: 2023-11-12
+tags: [bert, tr, open_source, question_answering, onnx]
+task: Question Answering
+language: tr
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_turkish_squad` is a Turkish model originally trained by savasy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_squad_tr_5.2.0_3.0_1699782767219.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_squad_tr_5.2.0_3.0_1699782767219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context")  # illustrative input; replace with your own
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_squad","tr") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") // illustrative input; toDF needs import spark.implicits._
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_base_turkish_squad", "tr")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_turkish_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|tr|
+|Size:|412.3 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/savasy/bert-base-turkish-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0_en.md
new file mode 100644
index 000000000000..86ea3c8c9173
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0_en_5.2.0_3.0_1699797702107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0_en_5.2.0_3.0_1699797702107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_0_base_1024d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_0|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-0
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10_en.md
new file mode 100644
index 000000000000..a25cb5e061ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10_en_5.2.0_3.0_1699790580658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10_en_5.2.0_3.0_1699790580658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_1024d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_10|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-10
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2_en.md
new file mode 100644
index 000000000000..eb013da312dd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2_en_5.2.0_3.0_1699796153809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2_en_5.2.0_3.0_1699796153809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_1024d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6_en.md
new file mode 100644
index 000000000000..3d83abaa06bb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6_en_5.2.0_3.0_1699799571644.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6_en_5.2.0_3.0_1699799571644.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_1024d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_6|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-6
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8_en.md
new file mode 100644
index 000000000000..553d5fbe42c2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8_en_5.2.0_3.0_1699792540163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8_en_5.2.0_3.0_1699792540163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_1024d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_1024_finetuned_squad_seed_8|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-8
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10_en.md
new file mode 100644
index 000000000000..38649954a9d0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10_en_5.2.0_3.0_1699812103122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10_en_5.2.0_3.0_1699812103122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_10|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-10
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42_en.md
new file mode 100644
index 000000000000..25d8c6240018
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42_en_5.2.0_3.0_1699804991277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42_en_5.2.0_3.0_1699804991277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.uncased_seed_42_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_42|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-42
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4_en.md
new file mode 100644
index 000000000000..d262d8caffe2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla)
+author: John Snow Labs
+name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4
+date: 2023-11-12
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4_en_5.2.0_3.0_1699801625595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4_en_5.2.0_3.0_1699801625595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..b5d08b0e6f1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6_en_5.2.0_3.0_1699815637334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6_en_5.2.0_3.0_1699815637334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..cde9d5c14d3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8_en_5.2.0_3.0_1699800249353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8_en_5.2.0_3.0_1699800249353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..e20489a378ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0_en_5.2.0_3.0_1699817675983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0_en_5.2.0_3.0_1699817675983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_0_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..ffec2d16282a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10_en_5.2.0_3.0_1699802159274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10_en_5.2.0_3.0_1699802159274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..af721a2e3556 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1699794852712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1699794852712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..74940063368b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4_en_5.2.0_3.0_1699797006561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4_en_5.2.0_3.0_1699797006561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..7d7032a8a8e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6_en_5.2.0_3.0_1699798973643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6_en_5.2.0_3.0_1699798973643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..6462fc67df2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8_en_5.2.0_3.0_1699819269765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8_en_5.2.0_3.0_1699819269765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..722a34368ab8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10_en_5.2.0_3.0_1699806905048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10_en_5.2.0_3.0_1699806905048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..6b3e8b52de17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2_en_5.2.0_3.0_1699803627711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2_en_5.2.0_3.0_1699803627711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..3686f064729b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4_en_5.2.0_3.0_1699821131222.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4_en_5.2.0_3.0_1699821131222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..4df67c0a0dcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6_en_5.2.0_3.0_1699822967928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6_en_5.2.0_3.0_1699822967928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..7580d4e989c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8_en_5.2.0_3.0_1699808541247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8_en_5.2.0_3.0_1699808541247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_256_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..ba48c2d299de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10_en_5.2.0_3.0_1699810325201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10_en_5.2.0_3.0_1699810325201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..c17adeec7985 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2_en_5.2.0_3.0_1699803627728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2_en_5.2.0_3.0_1699803627728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..7678de92be3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4_en_5.2.0_3.0_1699824633822.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4_en_5.2.0_3.0_1699824633822.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..f4d8ad230e99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6_en_5.2.0_3.0_1699812114701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6_en_5.2.0_3.0_1699812114701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..8bed06eae488 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10_en_5.2.0_3.0_1699814236059.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10_en_5.2.0_3.0_1699814236059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..dd696e0a86a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2_en_5.2.0_3.0_1699805511614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2_en_5.2.0_3.0_1699805511614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..d9c113acc570 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4_en_5.2.0_3.0_1699802944662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4_en_5.2.0_3.0_1699802944662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..808635c14167 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6_en_5.2.0_3.0_1699826674380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6_en_5.2.0_3.0_1699826674380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_6_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..15cb47e47c5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8_en_5.2.0_3.0_1699805501759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8_en_5.2.0_3.0_1699805501759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_512_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..656dfcb1026e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-64-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10_en_5.2.0_3.0_1699828184256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10_en_5.2.0_3.0_1699828184256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_10_base_64d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-64-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..21cfe8d71327 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-64-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2_en_5.2.0_3.0_1699829742647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2_en_5.2.0_3.0_1699829742647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_64d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-64-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_news_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_news_en.md new file mode 100644 index 000000000000..b81aeb991753 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_news_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_finetuned_news BertForQuestionAnswering from mirbostani +author: John Snow Labs +name: bert_qa_base_uncased_finetuned_news +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_finetuned_news` is an English model originally trained by mirbostani. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_news_en_5.2.0_3.0_1699782754816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_news_en_5.2.0_3.0_1699782754816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_finetuned_news","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_finetuned_news", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_finetuned_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/mirbostani/bert-base-uncased-finetuned-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_squad_finetuned_trivia_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_squad_finetuned_trivia_en.md new file mode 100644 index 000000000000..eb260de13b14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_squad_finetuned_trivia_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from FabianWillner) +author: John Snow Labs +name: bert_qa_base_uncased_finetuned_squad_finetuned_trivia +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad-finetuned-triviaqa` is an English model originally trained by `FabianWillner`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_squad_finetuned_trivia_en_5.2.0_3.0_1699804899623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_squad_finetuned_trivia_en_5.2.0_3.0_1699804899623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_finetuned_squad_finetuned_trivia","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_finetuned_squad_finetuned_trivia","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_finetuned_squad_finetuned_trivia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FabianWillner/bert-base-uncased-finetuned-squad-finetuned-triviaqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_trivia_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_trivia_finetuned_squad_en.md new file mode 100644 index 000000000000..7a37538b9328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_finetuned_trivia_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from FabianWillner) +author: John Snow Labs +name: bert_qa_base_uncased_finetuned_trivia_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-triviaqa-finetuned-squad` is an English model originally trained by `FabianWillner`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_trivia_finetuned_squad_en_5.2.0_3.0_1699831353838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_finetuned_trivia_finetuned_squad_en_5.2.0_3.0_1699831353838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_finetuned_trivia_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_finetuned_trivia_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_finetuned_trivia_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FabianWillner/bert-base-uncased-finetuned-triviaqa-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_pretrain_finetuned_coqa_fal_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_pretrain_finetuned_coqa_fal_en.md new file mode 100644 index 000000000000..5e23489cf6b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_pretrain_finetuned_coqa_fal_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from alistvt) +author: John Snow Labs +name: bert_qa_base_uncased_pretrain_finetuned_coqa_fal +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-pretrain-finetuned-coqa-falt` is an English model originally trained by `alistvt`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_fal_en_5.2.0_3.0_1699833119214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_fal_en_5.2.0_3.0_1699833119214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_fal","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_fal","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.uncased_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_pretrain_finetuned_coqa_fal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/alistvt/bert-base-uncased-pretrain-finetuned-coqa-falt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_spanish_sign_language_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_spanish_sign_language_en.md new file mode 100644 index 000000000000..1a07c465d364 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_spanish_sign_language_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_base_uncased_spanish_sign_language BertForQuestionAnswering from michaelrglass +author: John Snow Labs +name: bert_qa_base_uncased_spanish_sign_language +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_spanish_sign_language` is an English model originally trained by michaelrglass. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_spanish_sign_language_en_5.2.0_3.0_1699815469233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_spanish_sign_language_en_5.2.0_3.0_1699815469233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_spanish_sign_language","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_spanish_sign_language", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_spanish_sign_language| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|258.6 MB| + +## References + +https://huggingface.co/michaelrglass/bert-base-uncased-sspt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1_en.md new file mode 100644 index 000000000000..45403e30f755 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1` is an English model originally trained by madlag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1_en_5.2.0_3.0_1699782975867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1_en_5.2.0_3.0_1699782975867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad1.1_block_sparse_0.07_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|132.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squad1.1-block-sparse-0.07-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1_en.md new file mode 100644 index 000000000000..3e1d65acaa86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1` is an English model originally trained by madlag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1_en_5.2.0_3.0_1699781603540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1_en_5.2.0_3.0_1699781603540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad1.1_block_sparse_0.13_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|148.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squad1.1-block-sparse-0.13-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1_en.md new file mode 100644 index 000000000000..5e50ade13aef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1` is an English model originally trained by madlag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1_en_5.2.0_3.0_1699783190361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1_en_5.2.0_3.0_1699783190361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad1.1_block_sparse_0.20_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|172.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squad1.1-block-sparse-0.20-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1_en.md new file mode 100644 index 000000000000..74c178d332c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1` is an English model originally trained by madlag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1_en_5.2.0_3.0_1699783735181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1_en_5.2.0_3.0_1699783735181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad1.1_block_sparse_0.32_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|207.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squad1.1-block-sparse-0.32-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1_en.md new file mode 100644 index 000000000000..7f6295b278a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from sguskin) +author: John Snow Labs +name: bert_qa_base_uncased_squad1 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad1` is an English model originally trained by `sguskin`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1_en_5.2.0_3.0_1699806905051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad1_en_5.2.0_3.0_1699806905051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sguskin/bert-base-uncased-squad1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1.0_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1.0_finetuned_en.md new file mode 100644 index 000000000000..79c238d175e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1.0_finetuned_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from kamalkraj) +author: John Snow Labs +name: bert_qa_base_uncased_squad_v1.0_finetuned +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad-v1.0-finetuned` is an English model originally trained by `kamalkraj`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1.0_finetuned_en_5.2.0_3.0_1699807969771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1.0_finetuned_en_5.2.0_3.0_1699807969771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v1.0_finetuned","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v1.0_finetuned","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad_v1.0_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kamalkraj/bert-base-uncased-squad-v1.0-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_en.md new file mode 100644 index 000000000000..f46908af2041 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_base_uncased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad_v1` is an English model originally trained by batterydata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1_en_5.2.0_3.0_1699782199497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1_en_5.2.0_3.0_1699782199497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/bert-base-uncased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_sparse0.25_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_sparse0.25_en.md new file mode 100644 index 000000000000..143e31771935 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squad_v1_sparse0.25_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squad_v1_sparse0.25 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squad_v1_sparse0.25 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squad_v1_sparse0.25` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1_sparse0.25_en_5.2.0_3.0_1699783981765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v1_sparse0.25_en_5.2.0_3.0_1699783981765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v1_sparse0.25","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squad_v1_sparse0.25", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad_v1_sparse0.25| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|194.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squad-v1-sparse0.25 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1_en.md new file mode 100644 index 000000000000..98f96456d419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1_en_5.2.0_3.0_1699782439487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1_en_5.2.0_3.0_1699782439487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x1.16_f88.1_d8_unstruct_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|145.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x1.16-f88.1-d8-unstruct-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1_en.md new file mode 100644 index 000000000000..8ae608e93c57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1_en_5.2.0_3.0_1699782682798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1_en_5.2.0_3.0_1699782682798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x1.84_f88.7_d36_hybrid_filled_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|205.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x1.84-f88.7-d36-hybrid-filled-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1_en.md new file mode 100644 index 000000000000..22a2c0e8583d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1_en_5.2.0_3.0_1699782925798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1_en_5.2.0_3.0_1699782925798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x1.96_f88.3_d27_hybrid_filled_opt_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|187.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x1.96-f88.3-d27-hybrid-filled-opt-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1_en.md new file mode 100644 index 000000000000..0e80146bb450 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1_en_5.2.0_3.0_1699781844341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1_en_5.2.0_3.0_1699781844341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x2.01_f89.2_d30_hybrid_rewind_opt_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|193.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1_en.md new file mode 100644 index 000000000000..5aae8945ef24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1_en_5.2.0_3.0_1699784261780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1_en_5.2.0_3.0_1699784261780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x2.32_f86.6_d15_hybrid_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|148.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1_en.md new file mode 100644 index 000000000000..b7d2d5e82e3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1_en_5.2.0_3.0_1699783123612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1_en_5.2.0_3.0_1699783123612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squadv1_x2.44_f87.7_d26_hybrid_filled_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|173.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-base-uncased-squadv1-x2.44-f87.7-d26-hybrid-filled-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_cased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_cased_squad_v1_en.md new file mode 100644 index 000000000000..c1488167a515 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_cased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_battery_cased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_battery_cased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_battery_cased_squad_v1` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_battery_cased_squad_v1_en_5.2.0_3.0_1699783036370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_battery_cased_squad_v1_en_5.2.0_3.0_1699783036370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes data is a DataFrame with "question" and "context" string columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_battery_cased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes data is a DataFrame with "question" and "context" string columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_battery_cased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_battery_cased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batterybert-cased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_uncased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_uncased_squad_v1_en.md new file mode 100644 index 000000000000..df5c31f5825f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_battery_uncased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_battery_uncased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_battery_uncased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_battery_uncased_squad_v1` is an English model originally trained by batterydata. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_battery_uncased_squad_v1_en_5.2.0_3.0_1699783423288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_battery_uncased_squad_v1_en_5.2.0_3.0_1699783423288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_battery_uncased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data)  # data: DataFrame with "question" and "context" columns + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_battery_uncased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) // data: DataFrame with "question" and "context" columns + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_battery_uncased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batterybert-uncased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_cased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_cased_squad_v1_en.md new file mode 100644 index 000000000000..9c781392d725 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_cased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_batteryonly_cased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_batteryonly_cased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_batteryonly_cased_squad_v1` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_batteryonly_cased_squad_v1_en_5.2.0_3.0_1699784577318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_batteryonly_cased_squad_v1_en_5.2.0_3.0_1699784577318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_batteryonly_cased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data)  # data: DataFrame with "question" and "context" columns + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_batteryonly_cased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) // data: DataFrame with "question" and "context" columns + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_batteryonly_cased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batteryonlybert-cased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_uncased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_uncased_squad_v1_en.md new file mode 100644 index 000000000000..c2e9a47aea3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batteryonly_uncased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_batteryonly_uncased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_batteryonly_uncased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_batteryonly_uncased_squad_v1` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_batteryonly_uncased_squad_v1_en_5.2.0_3.0_1699782126318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_batteryonly_uncased_squad_v1_en_5.2.0_3.0_1699782126318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_batteryonly_uncased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data)  # data: DataFrame with "question" and "context" columns + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_batteryonly_uncased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) // data: DataFrame with "question" and "context" columns + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_batteryonly_uncased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batteryonlybert-uncased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_cased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_cased_squad_v1_en.md new file mode 100644 index 000000000000..47ccc54327f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_cased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_batterysci_cased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_batterysci_cased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_batterysci_cased_squad_v1` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_batterysci_cased_squad_v1_en_5.2.0_3.0_1699783492180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_batterysci_cased_squad_v1_en_5.2.0_3.0_1699783492180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_batterysci_cased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data)  # data: DataFrame with "question" and "context" columns + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_batterysci_cased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) // data: DataFrame with "question" and "context" columns + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_batterysci_cased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batteryscibert-cased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_uncased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_uncased_squad_v1_en.md new file mode 100644 index 000000000000..1f9ee7d92de1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_batterysci_uncased_squad_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_batterysci_uncased_squad_v1 BertForQuestionAnswering from batterydata +author: John Snow Labs +name: bert_qa_batterysci_uncased_squad_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_batterysci_uncased_squad_v1` is an English model originally trained by batterydata.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_batterysci_uncased_squad_v1_en_5.2.0_3.0_1699783784910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_batterysci_uncased_squad_v1_en_5.2.0_3.0_1699783784910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_batterysci_uncased_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data)  # data: DataFrame with "question" and "context" columns + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_batterysci_uncased_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) // data: DataFrame with "question" and "context" columns + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_batterysci_uncased_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/batterydata/batteryscibert-uncased-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert001_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert001_en.md new file mode 100644 index 000000000000..d0140713ba9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert001_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_bert001 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert001` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert001_en_5.2.0_3.0_1699819796932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert001_en_5.2.0_3.0_1699819796932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bert001","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bert001","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert001| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/bert001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert003_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert003_en.md new file mode 100644 index 000000000000..a54b40704745 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert003_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_bert003 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert003` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert003_en_5.2.0_3.0_1699808541136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert003_en_5.2.0_3.0_1699808541136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bert003","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bert003","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/bert003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_1024_full_trivia_copied_embeddings_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_1024_full_trivia_copied_embeddings_en.md new file mode 100644 index 000000000000..d19a1bfffe98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_1024_full_trivia_copied_embeddings_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MrAnderson) +author: John Snow Labs +name: bert_qa_bert_base_1024_full_trivia_copied_embeddings +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-1024-full-trivia-copied-embeddings` is an English model originally trained by `MrAnderson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_1024_full_trivia_copied_embeddings_en_5.2.0_3.0_1699825506288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_1024_full_trivia_copied_embeddings_en_5.2.0_3.0_1699825506288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_1024_full_trivia_copied_embeddings","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_1024_full_trivia_copied_embeddings","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert.base_1024d").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
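The NLU snippet above passes the question and its context as a single string joined by `|||`. A small helper for building that string is sketched here; `to_nlu_input` is a hypothetical name and not part of the `nlu` package, and the separator is taken from the snippet above rather than from the NLU documentation.

```python
def to_nlu_input(question: str, context: str, sep: str = "|||") -> str:
    """Join a question and its context into one separator-delimited string."""
    # Reject inputs that would make the joined string ambiguous to split.
    if sep in question or sep in context:
        raise ValueError(f"question/context must not contain the separator {sep!r}")
    return f"{question.strip()}{sep}{context.strip()}"


text = to_nlu_input("What's my name?", "My name is Clara and I live in Berkeley.")
print(text)  # What's my name?|||My name is Clara and I live in Berkeley.
```

The resulting `text` can then be handed to `.predict(...)` in place of the hand-written literal, which avoids stray quotes or whitespace around the separator.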
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_1024_full_trivia_copied_embeddings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-1024-full-trivia-copied-embeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_4096_full_trivia_copied_embeddings_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_4096_full_trivia_copied_embeddings_en.md new file mode 100644 index 000000000000..2fc545583ebf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_4096_full_trivia_copied_embeddings_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MrAnderson) +author: John Snow Labs +name: bert_qa_bert_base_4096_full_trivia_copied_embeddings +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-4096-full-trivia-copied-embeddings` is an English model originally trained by `MrAnderson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_4096_full_trivia_copied_embeddings_en_5.2.0_3.0_1699812642173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_4096_full_trivia_copied_embeddings_en_5.2.0_3.0_1699812642173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_4096_full_trivia_copied_embeddings","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_4096_full_trivia_copied_embeddings","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert.base_4096.by_MrAnderson").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_4096_full_trivia_copied_embeddings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|417.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-4096-full-trivia-copied-embeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_512_full_trivia_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_512_full_trivia_en.md new file mode 100644 index 000000000000..44c133a86777 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_512_full_trivia_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MrAnderson) +author: John Snow Labs +name: bert_qa_bert_base_512_full_trivia +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-512-full-trivia` is an English model originally trained by `MrAnderson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_512_full_trivia_en_5.2.0_3.0_1699827497585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_512_full_trivia_en_5.2.0_3.0_1699827497585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_512_full_trivia","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_512_full_trivia","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert.base_512d").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_512_full_trivia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-512-full-trivia \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_chaii_en.md new file mode 100644 index 000000000000..5b65349ebcdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_base_cased_chaii +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-chaii` is an English model originally trained by `SauravMaheshkar`. 
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_cased_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_cased_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_cased_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-base-cased-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_finetuned_squad_test_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_finetuned_squad_test_en.md new file mode 100644 index 000000000000..3c108cd1a3e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_finetuned_squad_test_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ncduy) +author: John Snow Labs +name: bert_qa_bert_base_cased_finetuned_squad_test +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad-test` is an English model originally trained by `ncduy`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_finetuned_squad_test_en_5.2.0_3.0_1699830895202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_finetuned_squad_test_en_5.2.0_3.0_1699830895202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_cased_finetuned_squad_test","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_cased_finetuned_squad_test","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_cased.by_ncduy").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_cased_finetuned_squad_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ncduy/bert-base-cased-finetuned-squad-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_iuchatbot_ontologydts_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_iuchatbot_ontologydts_en.md new file mode 100644 index 000000000000..5d478935df76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_cased_iuchatbot_ontologydts_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_base_cased_iuchatbot_ontologydts BertForQuestionAnswering from nntadotzip +author: John Snow Labs +name: bert_qa_bert_base_cased_iuchatbot_ontologydts +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_cased_iuchatbot_ontologydts` is an English model originally trained by nntadotzip. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_iuchatbot_ontologydts_en_5.2.0_3.0_1699829027066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_iuchatbot_ontologydts_en_5.2.0_3.0_1699829027066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_cased_iuchatbot_ontologydts","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_cased_iuchatbot_ontologydts", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_cased_iuchatbot_ontologydts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/nntadotzip/bert-base-cased-IUChatbot-ontologyDts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_persian_farsi_qa_fa.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_persian_farsi_qa_fa.md new file mode 100644 index 000000000000..9ee910fb852d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_persian_farsi_qa_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian bert_qa_bert_base_persian_farsi_qa BertForQuestionAnswering from SajjadAyoubi +author: John Snow Labs +name: bert_qa_bert_base_persian_farsi_qa +date: 2023-11-12 +tags: [bert, fa, open_source, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_persian_farsi_qa` is a Persian model originally trained by SajjadAyoubi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_persian_farsi_qa_fa_5.2.0_3.0_1699814025272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_persian_farsi_qa_fa_5.2.0_3.0_1699814025272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_persian_farsi_qa","fa") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_persian_farsi_qa", "fa") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_persian_farsi_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| + +## References + +https://huggingface.co/SajjadAyoubi/bert-base-fa-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar_es.md new file mode 100644 index 000000000000..b3493c3179d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar +date: 2023-11-12 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-finetuned-qa-tar` is a Castilian, Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar_es_5.2.0_3.0_1699820076531.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar_es_5.2.0_3.0_1699820076531.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.bert.base_cased.by_CenIA").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_tar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-cased-finetuned-qa-tar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_es.md new file mode 100644 index 000000000000..1e3284189141 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa +date: 2023-11-12 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-uncased-finetuned-qa-mlqa` is a Castilian, Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_es_5.2.0_3.0_1699817108555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_es_5.2.0_3.0_1699817108555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.mlqa.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-uncased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac_es.md new file mode 100644 index 000000000000..dbe0359ffd58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac +date: 2023-11-12 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-uncased-finetuned-qa-sqac` is a Castilian, Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac_es_5.2.0_3.0_1699819125792.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac_es_5.2.0_3.0_1699819125792.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-uncased-finetuned-qa-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar_es.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar_es.md new file mode 100644 index 000000000000..4631b19d15c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar +date: 2023-11-12 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-uncased-finetuned-qa-tar` is a Castilian, Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar_es_5.2.0_3.0_1699822047235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar_es_5.2.0_3.0_1699822047235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.bert.base_uncased.by_CenIA").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_uncased_finetuned_qa_tar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-uncased-finetuned-qa-tar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_swedish_cased_squad_experimental_sv.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_swedish_cased_squad_experimental_sv.md new file mode 100644 index 000000000000..75f991866031 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_swedish_cased_squad_experimental_sv.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Swedish BertForQuestionAnswering model (from KB) +author: John Snow Labs +name: bert_qa_bert_base_swedish_cased_squad_experimental +date: 2023-11-12 +tags: [open_source, question_answering, bert, sv, onnx] +task: Question Answering +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-swedish-cased-squad-experimental` is a Swedish model originally trained by `KB`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_swedish_cased_squad_experimental_sv_5.2.0_3.0_1699819009756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_swedish_cased_squad_experimental_sv_5.2.0_3.0_1699819009756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_swedish_cased_squad_experimental","sv") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_swedish_cased_squad_experimental","sv") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("sv.answer_question.squad.bert.base_cased.by_KB").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_swedish_cased_squad_experimental| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|sv| +|Size:|465.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KB/bert-base-swedish-cased-squad-experimental \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md new file mode 100644 index 000000000000..ff2cffde5527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering model (from husnu) +author: John Snow Labs +name: bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3 +date: 2023-11-12 +tags: [open_source, question_answering, bert, tr, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-cased-finetuned_lr-2e-05_epochs-3` is a Turkish model originally trained by `husnu`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699821252993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699821252993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.bert.base_cased").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-cased-finetuned_lr-2e-05_epochs-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..596257d0b91f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42_en_5.2.0_3.0_1699823850067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42_en_5.2.0_3.0_1699823850067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_1024d_seed_42").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_1024_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-1024-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..4e676eb68949 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0_en_5.2.0_3.0_1699825506203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0_en_5.2.0_3.0_1699825506203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_128d_seed_0").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_128_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..f569a38b50b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42_en_5.2.0_3.0_1699821131822.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42_en_5.2.0_3.0_1699821131822.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_seed_42").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_16_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..cf5aa4dc2d48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-256-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0_en_5.2.0_3.0_1699823144608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0_en_5.2.0_3.0_1699823144608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_256d_seed_0").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_256_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-256-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..96df84acf132 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0_en_5.2.0_3.0_1699823217399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0_en_5.2.0_3.0_1699823217399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_32d_seed_0").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_32_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..9967a251ac34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-512-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0_en_5.2.0_3.0_1699825076044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0_en_5.2.0_3.0_1699825076044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_512d_seed_0").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_512_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-512-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..38972f7e0ba7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-64-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0_en_5.2.0_3.0_1699825076110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0_en_5.2.0_3.0_1699825076110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_64d_seed_0").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_few_shot_k_64_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-64-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_infovqa_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_infovqa_en.md new file mode 100644 index 000000000000..4dad5b8b17c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_infovqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from tiennvcs) +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_infovqa +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-infovqa` is an English model originally trained by `tiennvcs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_infovqa_en_5.2.0_3.0_1699827080467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_infovqa_en_5.2.0_3.0_1699827080467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_infovqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_finetuned_infovqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.infovqa.base_uncased.by_tiennvcs").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_infovqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/bert-base-uncased-finetuned-infovqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_squad_v1_en.md new file mode 100644 index 000000000000..4ddec6968b31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_squad_v1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from lewtun) +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_squad_v1 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad-v1` is an English model originally trained by `lewtun`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_squad_v1_en_5.2.0_3.0_1699827083140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_squad_v1_en_5.2.0_3.0_1699827083140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_squad_v1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_finetuned_squad_v1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_lewtun").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/lewtun/bert-base-uncased-finetuned-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa_vi.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa_vi.md new file mode 100644 index 000000000000..a2bf597901bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa_vi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Vietnamese bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa BertForQuestionAnswering from tiennvcs +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa +date: 2023-11-12 +tags: [bert, vi, open_source, question_answering, onnx] +task: Question Answering +language: vi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa` is a Vietnamese model originally trained by tiennvcs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa_vi_5.2.0_3.0_1699827289415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa_vi_5.2.0_3.0_1699827289415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa","vi") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa", "vi") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_vietnamese_infovqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|vi| +|Size:|407.2 MB| + +## References + +https://huggingface.co/tiennvcs/bert-base-uncased-finetuned-vi-infovqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_qa_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_qa_squad2_en.md new file mode 100644 index 000000000000..d16675dfb7dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_qa_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Vasanth) +author: John Snow Labs +name: bert_qa_bert_base_uncased_qa_squad2 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-qa-squad2` is an English model originally trained by `Vasanth`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_qa_squad2_en_5.2.0_3.0_1699828845456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_qa_squad2_en_5.2.0_3.0_1699828845456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_qa_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_qa_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.base_uncased.by_Vasanth").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_qa_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Vasanth/bert-base-uncased-qa-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2_en.md new file mode 100644 index 000000000000..a9aea21ec009 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from madlag) +author: John Snow Labs +name: bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad1.1-pruned-x3.2-v2` is an English model originally trained by `madlag`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2_en_5.2.0_3.0_1699830948837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2_en_5.2.0_3.0_1699830948837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squad1.1_pruned_x3.2_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|171.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/madlag/bert-base-uncased-squad1.1-pruned-x3.2-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad_l3_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad_l3_en.md new file mode 100644 index 000000000000..cf9dafe0f9b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squad_l3_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_base_uncased_squad_l3 BertForQuestionAnswering from howey +author: John Snow Labs +name: bert_qa_bert_base_uncased_squad_l3 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_uncased_squad_l3` is an English model originally trained by howey.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad_l3_en_5.2.0_3.0_1699828370740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad_l3_en_5.2.0_3.0_1699828370740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squad_l3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_uncased_squad_l3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squad_l3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|168.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/howey/bert_base_uncased_squad_L3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md new file mode 100644 index 000000000000..35a29c2fa86d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Intel) +author: John Snow Labs +name: bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squadv1.1-sparse-80-1x4-block-pruneofa` is an English model originally trained by `Intel`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1699828847199.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1699828847199.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_Intel").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squadv1.1_sparse_80_1x4_block_pruneofa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|178.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Intel/bert-base-uncased-squadv1.1-sparse-80-1x4-block-pruneofa +- https://arxiv.org/abs/2111.05754 +- https://github.com/IntelLabs/Model-Compression-Research-Package/tree/main/research/prune-once-for-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_chinese_finetuned_zh.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_chinese_finetuned_zh.md new file mode 100644 index 000000000000..a82db84a5dac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_chinese_finetuned_zh.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from jackh1995) +author: John Snow Labs +name: bert_qa_bert_chinese_finetuned +date: 2023-11-12 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-chinese-finetuned` is a Chinese model originally trained by `jackh1995`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_chinese_finetuned_zh_5.2.0_3.0_1699830895206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_chinese_finetuned_zh_5.2.0_3.0_1699830895206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_chinese_finetuned","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_chinese_finetuned","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.by_jackh1995").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_chinese_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|zh| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jackh1995/bert-chinese-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_en.md new file mode 100644 index 000000000000..83dc2643f16a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from nlpunibo) +author: John Snow Labs +name: bert_qa_bert +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert` is an English model originally trained by `nlpunibo`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_en_5.2.0_3.0_1699817957155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_en_5.2.0_3.0_1699817957155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_nlpunibo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nlpunibo/bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad1_en.md new file mode 100644 index 000000000000..91a32cff15c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Ghost1) +author: John Snow Labs +name: bert_qa_bert_finetuned_squad1 +date: 2023-11-12 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad1` is an English model originally trained by `Ghost1`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad1_en_5.2.0_3.0_1699829857657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad1_en_5.2.0_3.0_1699829857657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_squad1","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_bert_finetuned_squad1","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.bert.by_Ghost1").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_bert_finetuned_squad1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/Ghost1/bert-finetuned-squad1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen_en.md
new file mode 100644
index 000000000000..4cc26786e350
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from DaisyMak)
+author: John Snow Labs
+name: bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen
+date: 2023-11-12
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate-10epoch_transformerfrozen` is an English model originally trained by `DaisyMak`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen_en_5.2.0_3.0_1699831597359.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen_en_5.2.0_3.0_1699831597359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.bert.by_DaisyMak").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_bert_finetuned_squad_accelerate_10epoch_transformerfrozen|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/DaisyMak/bert-finetuned-squad-accelerate-10epoch_transformerfrozen
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_nepal_bhasa_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_nepal_bhasa_newsqa_en.md
new file mode 100644
index 000000000000..780bbfd266c3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_nepal_bhasa_newsqa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English bert_qa_bert_ft_nepal_bhasa_newsqa BertForQuestionAnswering from AnonymousSub
+author: John Snow Labs
+name: bert_qa_bert_ft_nepal_bhasa_newsqa
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_ft_nepal_bhasa_newsqa` is an English model originally trained by AnonymousSub.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1699810716617.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1699810716617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_ft_nepal_bhasa_newsqa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_bert_ft_nepal_bhasa_newsqa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
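The snippets above pass a `data` DataFrame to `fit` and `transform` without defining it. The following is a minimal sketch of one way to build it, assuming an active `SparkSession` (for example one returned by `sparknlp.start()`) and two string columns named `question` and `context`, which are the columns the `MultiDocumentAssembler` reads. `make_rows` and `build_data` are hypothetical helper names used for illustration only, not part of Spark NLP, and the sample pair is reused from the other model cards in this change.

```python
from typing import Iterable, List, Tuple

# Hypothetical sample input: one (question, context) pair.
SAMPLE_PAIRS = [
    ("What's my name?", "My name is Clara and I live in Berkeley."),
]


def make_rows(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Strip surrounding whitespace and reject empty questions or contexts."""
    rows = []
    for question, context in pairs:
        question, context = question.strip(), context.strip()
        if not question or not context:
            raise ValueError("question and context must both be non-empty")
        rows.append((question, context))
    return rows


def build_data(spark, pairs=SAMPLE_PAIRS):
    """Return a DataFrame with `question` and `context` columns.

    `spark` is assumed to be an active SparkSession; it is not created here.
    """
    return spark.createDataFrame(make_rows(pairs)).toDF("question", "context")
```

With a session in scope, `data = build_data(spark)` gives the input expected by `pipeline.fit(data)`. After the transform, the predicted spans can be inspected with `pipelineDF.select("answer.result").show(truncate=False)`.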
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_bert_ft_nepal_bhasa_newsqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+
+## References
+
+https://huggingface.co/AnonymousSub/bert_FT_new_newsqa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_newsqa_en.md
new file mode 100644
index 000000000000..95f8696efe7a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bert_ft_newsqa_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_bert_ft_newsqa BertForQuestionAnswering from AnonymousSub
+author: John Snow Labs
+name: bert_qa_bert_ft_newsqa
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_ft_newsqa` is an English model originally trained by AnonymousSub.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_ft_newsqa_en_5.2.0_3.0_1699810324949.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_ft_newsqa_en_5.2.0_3.0_1699810324949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_ft_newsqa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_bert_ft_newsqa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_bert_ft_newsqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/AnonymousSub/bert_FT_newsqa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bertv1_fine_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bertv1_fine_en.md
new file mode 100644
index 000000000000..084f9e220341
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_bertv1_fine_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_bertv1_fine BertForQuestionAnswering from JAlexis
+author: John Snow Labs
+name: bert_qa_bertv1_fine
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bertv1_fine` is an English model originally trained by JAlexis.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertv1_fine_en_5.2.0_3.0_1699783655316.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertv1_fine_en_5.2.0_3.0_1699783655316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertv1_fine","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_bertv1_fine", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_bertv1_fine|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/JAlexis/Bertv1_fine
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_berta_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_berta_en.md
new file mode 100644
index 000000000000..cedd9e3b8986
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_berta_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_covid_berta BertForQuestionAnswering from rahulkuruvilla
+author: John Snow Labs
+name: bert_qa_covid_berta
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_covid_berta` is an English model originally trained by rahulkuruvilla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_covid_berta_en_5.2.0_3.0_1699785648056.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_covid_berta_en_5.2.0_3.0_1699785648056.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_covid_berta","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_covid_berta", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_covid_berta|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.3 GB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/rahulkuruvilla/COVID-BERTa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertb_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertb_en.md
new file mode 100644
index 000000000000..c6331cf3939d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertb_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_covid_bertb BertForQuestionAnswering from rahulkuruvilla
+author: John Snow Labs
+name: bert_qa_covid_bertb
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_covid_bertb` is an English model originally trained by rahulkuruvilla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_covid_bertb_en_5.2.0_3.0_1699784684258.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_covid_bertb_en_5.2.0_3.0_1699784684258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_covid_bertb","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_covid_bertb", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_covid_bertb|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.3 GB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/rahulkuruvilla/COVID-BERTb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertc_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertc_en.md
new file mode 100644
index 000000000000..be216d548049
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_covid_bertc_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_covid_bertc BertForQuestionAnswering from rahulkuruvilla
+author: John Snow Labs
+name: bert_qa_covid_bertc
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_covid_bertc` is an English model originally trained by rahulkuruvilla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_covid_bertc_en_5.2.0_3.0_1699785238336.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_covid_bertc_en_5.2.0_3.0_1699785238336.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_covid_bertc","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_covid_bertc", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_covid_bertc|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.3 GB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/rahulkuruvilla/COVID-BERTc
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_dist_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_dist_squad2_en.md
new file mode 100644
index 000000000000..c134f5a36b16
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_dist_squad2_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_dist_squad2 BertForQuestionAnswering from Shobhank-iiitdwd
+author: John Snow Labs
+name: bert_qa_dist_squad2
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_dist_squad2` is an English model originally trained by Shobhank-iiitdwd.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dist_squad2_en_5.2.0_3.0_1699784824216.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dist_squad2_en_5.2.0_3.0_1699784824216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_dist_squad2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_dist_squad2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_dist_squad2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|248.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/Shobhank-iiitdwd/DistBERT-squad2-QA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_distiled_medium_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_distiled_medium_squad2_en.md
new file mode 100644
index 000000000000..07855ac7b4e0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_distiled_medium_squad2_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_distiled_medium_squad2 BertForQuestionAnswering from Shobhank-iiitdwd
+author: John Snow Labs
+name: bert_qa_distiled_medium_squad2
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_distiled_medium_squad2` is an English model originally trained by Shobhank-iiitdwd.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_distiled_medium_squad2_en_5.2.0_3.0_1699782361823.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_distiled_medium_squad2_en_5.2.0_3.0_1699782361823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_distiled_medium_squad2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_distiled_medium_squad2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_distiled_medium_squad2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|154.2 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/Shobhank-iiitdwd/Distiled-bert-medium-squad2-QA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_fardinsaboori_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_fardinsaboori_bert_finetuned_squad_en.md
new file mode 100644
index 000000000000..3588769bda77
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_fardinsaboori_bert_finetuned_squad_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English bert_qa_fardinsaboori_bert_finetuned_squad BertForQuestionAnswering from FardinSaboori
+author: John Snow Labs
+name: bert_qa_fardinsaboori_bert_finetuned_squad
+date: 2023-11-12
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fardinsaboori_bert_finetuned_squad` is an English model originally trained by FardinSaboori.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fardinsaboori_bert_finetuned_squad_en_5.2.0_3.0_1699785542325.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fardinsaboori_bert_finetuned_squad_en_5.2.0_3.0_1699785542325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+# with string columns "question" and "context".
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fardinsaboori_bert_finetuned_squad","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+// `data` is not defined in this snippet: it is assumed to be a Spark DataFrame
+// with string columns "question" and "context".
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_fardinsaboori_bert_finetuned_squad", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fardinsaboori_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/FardinSaboori/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_gbertqna_de.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_gbertqna_de.md new file mode 100644 index 000000000000..9323a8997eb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_gbertqna_de.md @@ -0,0 +1,95 @@ +--- +layout: model +title: German bert_qa_gbertqna BertForQuestionAnswering from Sahajtomar +author: John Snow Labs +name: bert_qa_gbertqna +date: 2023-11-12 +tags: [bert, de, open_source, question_answering, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_gbertqna` is a German model originally trained by Sahajtomar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_gbertqna_de_5.2.0_3.0_1699783578911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_gbertqna_de_5.2.0_3.0_1699783578911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_gbertqna","de") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_gbertqna", "de") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_gbertqna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|de| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Sahajtomar/GBERTQnA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_graphcore_bert_large_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_graphcore_bert_large_uncased_squad_en.md new file mode 100644 index 000000000000..f38101123741 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_graphcore_bert_large_uncased_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_graphcore_bert_large_uncased_squad BertForQuestionAnswering from Graphcore +author: John Snow Labs +name: bert_qa_graphcore_bert_large_uncased_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_graphcore_bert_large_uncased_squad` is an English model originally trained by Graphcore. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_graphcore_bert_large_uncased_squad_en_5.2.0_3.0_1699786550142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_graphcore_bert_large_uncased_squad_en_5.2.0_3.0_1699786550142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_graphcore_bert_large_uncased_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_graphcore_bert_large_uncased_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_graphcore_bert_large_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|797.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Graphcore/bert-large-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_harsit_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_harsit_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..af45519bae17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_harsit_bert_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_harsit_bert_finetuned_squad BertForQuestionAnswering from Harsit +author: John Snow Labs +name: bert_qa_harsit_bert_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_harsit_bert_finetuned_squad` is an English model originally trained by Harsit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_harsit_bert_finetuned_squad_en_5.2.0_3.0_1699786550713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_harsit_bert_finetuned_squad_en_5.2.0_3.0_1699786550713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_harsit_bert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_harsit_bert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_harsit_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Harsit/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_indo_id.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_indo_id.md new file mode 100644 index 000000000000..00c1d4133167 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_indo_id.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Indonesian bert_qa_indo BertForQuestionAnswering from Rifky +author: John Snow Labs +name: bert_qa_indo +date: 2023-11-12 +tags: [bert, id, open_source, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_indo` is an Indonesian model originally trained by Rifky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_indo_id_5.2.0_3.0_1699785132714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_indo_id_5.2.0_3.0_1699785132714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_indo","id") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_indo", "id") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_indo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Rifky/Indobert-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_italian_finedtuned_squadv1_italian_alfa_it.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_italian_finedtuned_squadv1_italian_alfa_it.md new file mode 100644 index 000000000000..2e00602638bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_italian_finedtuned_squadv1_italian_alfa_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_qa_italian_finedtuned_squadv1_italian_alfa BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_italian_finedtuned_squadv1_italian_alfa +date: 2023-11-12 +tags: [bert, it, open_source, question_answering, onnx] +task: Question Answering +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_italian_finedtuned_squadv1_italian_alfa` is an Italian model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_italian_finedtuned_squadv1_italian_alfa_it_5.2.0_3.0_1699782548719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_italian_finedtuned_squadv1_italian_alfa_it_5.2.0_3.0_1699782548719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_italian_finedtuned_squadv1_italian_alfa","it") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_italian_finedtuned_squadv1_italian_alfa", "it") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_italian_finedtuned_squadv1_italian_alfa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|it| +|Size:|409.6 MB| + +## References + +https://huggingface.co/mrm8488/bert-italian-finedtuned-squadv1-it-alfa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..eb2f64c6dc20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_accelerate_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_kevinchoi_bert_finetuned_squad_accelerate BertForQuestionAnswering from KevinChoi +author: John Snow Labs +name: bert_qa_kevinchoi_bert_finetuned_squad_accelerate +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_kevinchoi_bert_finetuned_squad_accelerate` is an English model originally trained by KevinChoi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kevinchoi_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1699786794507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kevinchoi_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1699786794507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kevinchoi_bert_finetuned_squad_accelerate","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_kevinchoi_bert_finetuned_squad_accelerate", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kevinchoi_bert_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/KevinChoi/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..9902d7976814 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_kevinchoi_bert_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_kevinchoi_bert_finetuned_squad BertForQuestionAnswering from KevinChoi +author: John Snow Labs +name: bert_qa_kevinchoi_bert_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_kevinchoi_bert_finetuned_squad` is an English model originally trained by KevinChoi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kevinchoi_bert_finetuned_squad_en_5.2.0_3.0_1699786850491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kevinchoi_bert_finetuned_squad_en_5.2.0_3.0_1699786850491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kevinchoi_bert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_kevinchoi_bert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kevinchoi_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/KevinChoi/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_klue_commonsense_model_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_klue_commonsense_model_en.md new file mode 100644 index 000000000000..dd8660ef00eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_klue_commonsense_model_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_klue_commonsense_model BertForQuestionAnswering from EasthShin +author: John Snow Labs +name: bert_qa_klue_commonsense_model +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_klue_commonsense_model` is an English model originally trained by EasthShin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_klue_commonsense_model_en_5.2.0_3.0_1699785924753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_klue_commonsense_model_en_5.2.0_3.0_1699785924753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_klue_commonsense_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_klue_commonsense_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_klue_commonsense_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/EasthShin/Klue-CommonSense-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_l_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_l_en.md new file mode 100644 index 000000000000..4148a4a26c18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_l_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_l BertForQuestionAnswering from Shobhank-iiitdwd +author: John Snow Labs +name: bert_qa_l +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_l` is an English model originally trained by Shobhank-iiitdwd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_l_en_5.2.0_3.0_1699784332739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_l_en_5.2.0_3.0_1699784332739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_l","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_l", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_l| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Shobhank-iiitdwd/BERT-L-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_squad_v1.1_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_squad_v1.1_portuguese_pt.md new file mode 100644 index 000000000000..3bd48924d816 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_squad_v1.1_portuguese_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese bert_qa_large_cased_squad_v1.1_portuguese BertForQuestionAnswering from pierreguillou +author: John Snow Labs +name: bert_qa_large_cased_squad_v1.1_portuguese +date: 2023-11-12 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_cased_squad_v1.1_portuguese` is a Portuguese model originally trained by pierreguillou. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_cased_squad_v1.1_portuguese_pt_5.2.0_3.0_1699784212057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_cased_squad_v1.1_portuguese_pt_5.2.0_3.0_1699784212057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_cased_squad_v1.1_portuguese","pt") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_cased_squad_v1.1_portuguese", "pt") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_cased_squad_v1.1_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/pierreguillou/bert-large-cased-squad-v1.1-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_whole_word_masking_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_whole_word_masking_finetuned_squad_en.md new file mode 100644 index 000000000000..3f39721f3178 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_cased_whole_word_masking_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_cased_whole_word_masking_finetuned_squad BertForQuestionAnswering from huggingface +author: John Snow Labs +name: bert_qa_large_cased_whole_word_masking_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_cased_whole_word_masking_finetuned_squad` is an English model originally trained by huggingface. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_cased_whole_word_masking_finetuned_squad_en_5.2.0_3.0_1699784210088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_cased_whole_word_masking_finetuned_squad_en_5.2.0_3.0_1699784210088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_cased_whole_word_masking_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_cased_whole_word_masking_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_cased_whole_word_masking_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_finetuned_squad2_en.md new file mode 100644 index 000000000000..532e392439cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_finetuned_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_finetuned_squad2 BertForQuestionAnswering from phiyodr +author: John Snow Labs +name: bert_qa_large_finetuned_squad2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_finetuned_squad2` is an English model originally trained by phiyodr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_finetuned_squad2_en_5.2.0_3.0_1699785750435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_finetuned_squad2_en_5.2.0_3.0_1699785750435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_finetuned_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_finetuned_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/phiyodr/bert-large-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_squadv1.1_sparse_90_unstructured_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_squadv1.1_sparse_90_unstructured_en.md new file mode 100644 index 000000000000..952bfbf14311 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_squadv1.1_sparse_90_unstructured_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_squadv1.1_sparse_90_unstructured BertForQuestionAnswering from Intel +author: John Snow Labs +name: bert_qa_large_uncased_squadv1.1_sparse_90_unstructured +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_squadv1.1_sparse_90_unstructured` is an English model originally trained by Intel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_squadv1.1_sparse_90_unstructured_en_5.2.0_3.0_1699784745302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_squadv1.1_sparse_90_unstructured_en_5.2.0_3.0_1699784745302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_squadv1.1_sparse_90_unstructured","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_squadv1.1_sparse_90_unstructured", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_squadv1.1_sparse_90_unstructured| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|362.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Intel/bert-large-uncased-squadv1.1-sparse-90-unstructured \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_finetuned_squad_en.md new file mode 100644 index 000000000000..c7b49265f719 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Uncased model (from Jiqing) +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_finetuned_squad +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-squad` is an English model originally trained by `Jiqing`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_finetuned_squad_en_5.2.0_3.0_1699784867260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_finetuned_squad_en_5.2.0_3.0_1699784867260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Jiqing/bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..54c28209bc59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat BertForQuestionAnswering from andi611 +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat` is an English model originally trained by andi611. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699786355324.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699786355324.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_squad2_with_ner_conll2003_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/andi611/bert-large-uncased-whole-word-masking-squad2-with-ner-conll2003-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..408d81068e9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat BertForQuestionAnswering from andi611 +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat` is an English model originally trained by andi611. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat_en_5.2.0_3.0_1699785429863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat_en_5.2.0_3.0_1699785429863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_movie_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/andi611/bert-large-uncased-whole-word-masking-squad2-with-ner-mit-movie-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..89e5f832aab2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat BertForQuestionAnswering from andi611 +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat` is an English model originally trained by 
andi611. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en_5.2.0_3.0_1699785324740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en_5.2.0_3.0_1699785324740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_squad2_with_ner_mit_restaurant_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/andi611/bert-large-uncased-whole-word-masking-squad2-with-ner-mit-restaurant-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..808381caeedd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat BertForQuestionAnswering from andi611 +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat` is 
an English model originally trained by andi611. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699783049982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699783049982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pistherea_conll2003_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/andi611/bert-large-uncased-whole-word-masking-squad2-with-ner-Pistherea-conll2003-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..8d440af9585b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat BertForQuestionAnswering from andi611 +author: John Snow Labs +name: bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat` is an English model originally trained by andi611. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699783587877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat_en_5.2.0_3.0_1699783587877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_whole_word_masking_squad2_with_ner_pwhatisthe_conll2003_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/andi611/bert-large-uncased-whole-word-masking-squad2-with-ner-Pwhatisthe-conll2003-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1_en.md new file mode 100644 index 000000000000..15ec31216a49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1` is an English model originally trained by madlag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1_en_5.2.0_3.0_1699784758641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1_en_5.2.0_3.0_1699784758641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_wwm_squadv2_x2.15_f83.2_d25_hybrid_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|452.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-large-uncased-wwm-squadv2-x2.15-f83.2-d25-hybrid-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1_en.md new file mode 100644 index 000000000000..9153ee20f794 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1 BertForQuestionAnswering from madlag +author: John Snow Labs +name: bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1` is an English model originally trained by madlag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1_en_5.2.0_3.0_1699785717960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1_en_5.2.0_3.0_1699785717960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_wwm_squadv2_x2.63_f82.6_d16_hybrid_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|346.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/madlag/bert-large-uncased-wwm-squadv2-x2.63-f82.6-d16-hybrid-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_manuert_for_xqua_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_manuert_for_xqua_en.md new file mode 100644 index 000000000000..075b9334b7eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_manuert_for_xqua_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_manuert_for_xqua BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_manuert_for_xqua +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_manuert_for_xqua` is an English model originally trained by mrm8488. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_manuert_for_xqua_en_5.2.0_3.0_1699786245896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_manuert_for_xqua_en_5.2.0_3.0_1699786245896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_manuert_for_xqua","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_manuert_for_xqua", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_manuert_for_xqua| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mrm8488/ManuERT-for-xqua \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_medium_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_medium_finetuned_squadv2_en.md new file mode 100644 index 000000000000..993512d64ae1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_medium_finetuned_squadv2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_medium_finetuned_squadv2 BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_medium_finetuned_squadv2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_medium_finetuned_squadv2` is an English model originally trained by mrm8488. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_medium_finetuned_squadv2_en_5.2.0_3.0_1699785901564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_medium_finetuned_squadv2_en_5.2.0_3.0_1699785901564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_medium_finetuned_squadv2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_medium_finetuned_squadv2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_medium_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|154.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mrm8488/bert-medium-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mini_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mini_finetuned_squadv2_en.md new file mode 100644 index 000000000000..969584bb713d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mini_finetuned_squadv2_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Cased model (from M-FAC) +author: John Snow Labs +name: bert_qa_mini_finetuned_squadv2 +date: 2023-11-12 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-finetuned-squadv2` is an English model originally trained by `M-FAC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mini_finetuned_squadv2_en_5.2.0_3.0_1699785590314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mini_finetuned_squadv2_en_5.2.0_3.0_1699785590314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mini_finetuned_squadv2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mini_finetuned_squadv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.v2_mini_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
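The NLU call above passes the question and its context as one string joined by `|||`. The sketch below builds that string from separate values, on the assumption that the separator must not occur inside either part. The helper names `nlu_qa_input` and `answer_with_nlu` are hypothetical and not part of NLU.

```python
def nlu_qa_input(question, context, separator="|||"):
    """Join a question and its context into the single string given to predict()."""
    if separator in question or separator in context:
        raise ValueError("question and context must not contain the separator")
    return f"{question.strip()}{separator}{context.strip()}"


def answer_with_nlu(question, context, model_ref):
    """Sketch: load an NLU question-answering reference and predict on one pair."""
    # Local import so nlu_qa_input stays usable without NLU installed.
    import nlu
    return nlu.load(model_ref).predict(nlu_qa_input(question, context))
```

With the reference used in this card, the call would be `answer_with_nlu("What is my name?", "My name is Clara and I live in Berkeley.", "en.answer_question.bert.squadv2.v2_mini_finetuned")`.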
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mini_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/M-FAC/bert-mini-finetuned-squadv2 +- https://arxiv.org/pdf/2107.03356.pdf +- https://github.com/IST-DASLab/M-FAC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_minilm_l12_h384_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_minilm_l12_h384_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..8ad7de3c888c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_minilm_l12_h384_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_minilm_l12_h384_uncased_finetuned_squad BertForQuestionAnswering from ncduy +author: John Snow Labs +name: bert_qa_minilm_l12_h384_uncased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_minilm_l12_h384_uncased_finetuned_squad` is an English model originally trained by ncduy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_l12_h384_uncased_finetuned_squad_en_5.2.0_3.0_1699787061428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_l12_h384_uncased_finetuned_squad_en_5.2.0_3.0_1699787061428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_minilm_l12_h384_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_minilm_l12_h384_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_minilm_l12_h384_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/ncduy/MiniLM-L12-H384-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mtl_bert_base_uncased_ww_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mtl_bert_base_uncased_ww_squad_en.md new file mode 100644 index 000000000000..d0b9690ea99f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_mtl_bert_base_uncased_ww_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_mtl_bert_base_uncased_ww_squad BertForQuestionAnswering from jgammack +author: John Snow Labs +name: bert_qa_mtl_bert_base_uncased_ww_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mtl_bert_base_uncased_ww_squad` is an English model originally trained by jgammack.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mtl_bert_base_uncased_ww_squad_en_5.2.0_3.0_1699785823473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mtl_bert_base_uncased_ww_squad_en_5.2.0_3.0_1699785823473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mtl_bert_base_uncased_ww_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mtl_bert_base_uncased_ww_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mtl_bert_base_uncased_ww_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/jgammack/MTL-bert-base-uncased-ww-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_multi_ling_bert_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_multi_ling_bert_en.md new file mode 100644 index 000000000000..38293711b3a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_multi_ling_bert_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_multi_ling_bert BertForQuestionAnswering from HankyStyle +author: John Snow Labs +name: bert_qa_multi_ling_bert +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_multi_ling_bert` is an English model originally trained by HankyStyle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multi_ling_bert_en_5.2.0_3.0_1699787188522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multi_ling_bert_en_5.2.0_3.0_1699787188522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multi_ling_bert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_multi_ling_bert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multi_ling_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/HankyStyle/Multi-ling-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_neulvo_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_neulvo_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..32e377594371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_neulvo_bert_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_neulvo_bert_finetuned_squad BertForQuestionAnswering from Neulvo +author: John Snow Labs +name: bert_qa_neulvo_bert_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_neulvo_bert_finetuned_squad` is an English model originally trained by Neulvo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_neulvo_bert_finetuned_squad_en_5.2.0_3.0_1699786256914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_neulvo_bert_finetuned_squad_en_5.2.0_3.0_1699786256914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_neulvo_bert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_neulvo_bert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_neulvo_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Neulvo/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2_en.md new file mode 100644 index 000000000000..c1bbc93d1c07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2 BertForQuestionAnswering from TobiasFrey98 +author: John Snow Labs +name: bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2` is an English model originally trained by TobiasFrey98.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2_en_5.2.0_3.0_1699786502052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2_en_5.2.0_3.0_1699786502052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_nlp4web_xtremedistil_l6_h256_uncased_trivia_group2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/TobiasFrey98/NLP4Web-xtremedistil-l6-h256-uncased-TriviaQA-Group2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e1_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e1_xx.md new file mode 100644 index 000000000000..8520bfeb668d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e1_xx.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Multilingual bert_qa_part_1_mbert_model_e1 BertForQuestionAnswering from horsbug98 +author: John Snow Labs +name: bert_qa_part_1_mbert_model_e1 +date: 2023-11-12 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_part_1_mbert_model_e1` is a multilingual model originally trained by horsbug98.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_part_1_mbert_model_e1_xx_5.2.0_3.0_1699786587415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_part_1_mbert_model_e1_xx_5.2.0_3.0_1699786587415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_1_mbert_model_e1","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_part_1_mbert_model_e1", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_part_1_mbert_model_e1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/horsbug98/Part_1_mBERT_Model_E1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e2_en.md new file mode 100644 index 000000000000..43ebec16b036 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_1_mbert_model_e2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_part_1_mbert_model_e2 BertForQuestionAnswering from horsbug98 +author: John Snow Labs +name: bert_qa_part_1_mbert_model_e2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_part_1_mbert_model_e2` is an English model originally trained by horsbug98. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_part_1_mbert_model_e2_en_5.2.0_3.0_1699787563345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_part_1_mbert_model_e2_en_5.2.0_3.0_1699787563345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_1_mbert_model_e2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_part_1_mbert_model_e2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_part_1_mbert_model_e2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/horsbug98/Part_1_mBERT_Model_E2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_bert_multilingual_dutch_model_e1_nl.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_bert_multilingual_dutch_model_e1_nl.md new file mode 100644 index 000000000000..1201ea94f266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_bert_multilingual_dutch_model_e1_nl.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Dutch, Flemish bert_qa_part_2_bert_multilingual_dutch_model_e1 BertForQuestionAnswering from horsbug98 +author: John Snow Labs +name: bert_qa_part_2_bert_multilingual_dutch_model_e1 +date: 2023-11-12 +tags: [bert, nl, open_source, question_answering, onnx] +task: Question Answering +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_part_2_bert_multilingual_dutch_model_e1` is a Dutch, Flemish model originally trained by horsbug98. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_bert_multilingual_dutch_model_e1_nl_5.2.0_3.0_1699786903306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_bert_multilingual_dutch_model_e1_nl_5.2.0_3.0_1699786903306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_2_bert_multilingual_dutch_model_e1","nl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_part_2_bert_multilingual_dutch_model_e1", "nl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_part_2_bert_multilingual_dutch_model_e1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|nl| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/horsbug98/Part_2_BERT_Multilingual_Dutch_Model_E1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_mbert_model_e2_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_mbert_model_e2_en.md new file mode 100644 index 000000000000..b95d615e465a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_part_2_mbert_model_e2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_part_2_mbert_model_e2 BertForQuestionAnswering from horsbug98 +author: John Snow Labs +name: bert_qa_part_2_mbert_model_e2 +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_part_2_mbert_model_e2` is an English model originally trained by horsbug98. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_mbert_model_e2_en_5.2.0_3.0_1699787364065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_mbert_model_e2_en_5.2.0_3.0_1699787364065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_2_mbert_model_e2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_part_2_mbert_model_e2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_part_2_mbert_model_e2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/horsbug98/Part_2_mBERT_Model_E2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad_xx.md new file mode 100644 index 000000000000..cf2e4c47275b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad_xx.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Multilingual bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad BertForQuestionAnswering from Paul-Vinh +author: John Snow Labs +name: bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad +date: 2023-11-12 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad` is a Multilingual model originally trained by Paul-Vinh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1699787269894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1699787269894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_paul_vinh_bert_base_multilingual_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Paul-Vinh/bert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_pruebabert_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_pruebabert_en.md new file mode 100644 index 000000000000..81c3423ca108 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_pruebabert_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_pruebabert BertForQuestionAnswering from JAlexis +author: John Snow Labs +name: bert_qa_pruebabert +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_pruebabert` is an English model originally trained by JAlexis. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pruebabert_en_5.2.0_3.0_1699783903394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pruebabert_en_5.2.0_3.0_1699783903394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_pruebabert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_pruebabert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pruebabert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/JAlexis/PruebaBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_question_answering_for_argriculture_zh.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_question_answering_for_argriculture_zh.md new file mode 100644 index 000000000000..95ad35d75735 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_question_answering_for_argriculture_zh.md @@ -0,0 +1,98 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Cased model (from HankyStyle) +author: John Snow Labs +name: bert_qa_question_answering_for_argriculture +date: 2023-11-12 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Question-Answering-for-Argriculture` is a Chinese model originally trained by `HankyStyle`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_for_argriculture_zh_5.2.0_3.0_1699786850722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_for_argriculture_zh_5.2.0_3.0_1699786850722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_question_answering_for_argriculture","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_question_answering_for_argriculture","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_question_answering_for_argriculture| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/HankyStyle/Question-Answering-for-Argriculture +- https://nlpnchu.org/ +- https://demo.nlpnchu.org/ +- https://github.com/NCHU-NLP-Lab +- https://paperswithcode.com/sota?task=Question+Answering&dataset=ArgricultureQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sci_squad_quac_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sci_squad_quac_en.md new file mode 100644 index 000000000000..ce72a9a9d72b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sci_squad_quac_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_sci_squad_quac BertForQuestionAnswering from ixa-ehu +author: John Snow Labs +name: bert_qa_sci_squad_quac +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sci_squad_quac` is an English model originally trained by ixa-ehu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sci_squad_quac_en_5.2.0_3.0_1699787138676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sci_squad_quac_en_5.2.0_3.0_1699787138676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sci_squad_quac","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sci_squad_quac", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sci_squad_quac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/ixa-ehu/SciBERT-SQuAD-QuAC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_seongkyu_bert_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_seongkyu_bert_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..ca032f50d600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_seongkyu_bert_base_cased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_seongkyu_bert_base_cased_finetuned_squad BertForQuestionAnswering from Seongkyu +author: John Snow Labs +name: bert_qa_seongkyu_bert_base_cased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_seongkyu_bert_base_cased_finetuned_squad` is an English model originally trained by Seongkyu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_seongkyu_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699787850251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_seongkyu_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699787850251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_seongkyu_bert_base_cased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_seongkyu_bert_base_cased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_seongkyu_bert_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Seongkyu/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md new file mode 100644 index 000000000000..d7f0d2f5d932 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert BertForQuestionAnswering from Shushant +author: John Snow Labs +name: bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert` is an English model originally trained by Shushant.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en_5.2.0_3.0_1699787514569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en_5.2.0_3.0_1699787514569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shushant_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|408.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Shushant/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-ContaminationQAmodel_PubmedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md new file mode 100644 index 000000000000..ec472b858073 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert BertForQuestionAnswering from Sotireas +author: John Snow Labs +name: bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert` is an English model originally trained by Sotireas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en_5.2.0_3.0_1699787495976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert_en_5.2.0_3.0_1699787495976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sotireas_biomednlp_pubmedbert_base_uncased_abstract_fulltext_contaminationqamodel_pubmedbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Sotireas/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-ContaminationQAmodel_PubmedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_spanbert_emotion_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_spanbert_emotion_extraction_en.md new file mode 100644 index 000000000000..792348ccab35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_spanbert_emotion_extraction_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_spanbert_emotion_extraction BertForQuestionAnswering from Nakul24 +author: John Snow Labs +name: bert_qa_spanbert_emotion_extraction +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_spanbert_emotion_extraction` is an English model originally trained by Nakul24.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_emotion_extraction_en_5.2.0_3.0_1699786510105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_emotion_extraction_en_5.2.0_3.0_1699786510105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_emotion_extraction","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_spanbert_emotion_extraction", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_emotion_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|384.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/Nakul24/Spanbert-emotion-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_srcocotero_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_srcocotero_en.md new file mode 100644 index 000000000000..9afd0ebe6d25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_srcocotero_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_srcocotero BertForQuestionAnswering from srcocotero +author: John Snow Labs +name: bert_qa_srcocotero +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_srcocotero` is an English model originally trained by srcocotero. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_srcocotero_en_5.2.0_3.0_1699784120796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_srcocotero_en_5.2.0_3.0_1699784120796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_srcocotero","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_srcocotero", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_srcocotero| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/srcocotero/bert-qa-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..a57e5d89d29c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad BertForQuestionAnswering from SreyanG-NVIDIA +author: John Snow Labs +name: bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad` is an English model originally trained by SreyanG-NVIDIA.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699786799320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad_en_5.2.0_3.0_1699786799320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sreyang_nvidia_bert_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/SreyanG-NVIDIA/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..892026183ee9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad BertForQuestionAnswering from SreyanG-NVIDIA +author: John Snow Labs +name: bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad` is an English model originally trained by SreyanG-NVIDIA.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787611987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787611987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sreyang_nvidia_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/SreyanG-NVIDIA/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_supriyaarun_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_supriyaarun_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..6bc2657ebad9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_supriyaarun_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_supriyaarun_bert_base_uncased_finetuned_squad BertForQuestionAnswering from SupriyaArun +author: John Snow Labs +name: bert_qa_supriyaarun_bert_base_uncased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_supriyaarun_bert_base_uncased_finetuned_squad` is an English model originally trained by SupriyaArun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_supriyaarun_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787853621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_supriyaarun_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787853621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_supriyaarun_bert_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_supriyaarun_bert_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_supriyaarun_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/SupriyaArun/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_tianle_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_tianle_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..9100c6b23839 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_tianle_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_tianle_bert_base_uncased_finetuned_squad BertForQuestionAnswering from Tianle +author: John Snow Labs +name: bert_qa_tianle_bert_base_uncased_finetuned_squad +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_tianle_bert_base_uncased_finetuned_squad` is an English model originally trained by Tianle.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tianle_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787796865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tianle_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699787796865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tianle_bert_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_tianle_bert_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tianle_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Tianle/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-12-bert_qa_trial_3_results_en.md b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_trial_3_results_en.md new file mode 100644 index 000000000000..498d20a1487b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-12-bert_qa_trial_3_results_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_trial_3_results BertForQuestionAnswering from sunitha +author: John Snow Labs +name: bert_qa_trial_3_results +date: 2023-11-12 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_trial_3_results` is an English model originally trained by sunitha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_trial_3_results_en_5.2.0_3.0_1699788149777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_trial_3_results_en_5.2.0_3.0_1699788149777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_trial_3_results","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_trial_3_results", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_trial_3_results| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/sunitha/Trial_3_Results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_base_uncased_squad_v2.0_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_base_uncased_squad_v2.0_finetuned_en.md new file mode 100644 index 000000000000..ac10d62206ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_base_uncased_squad_v2.0_finetuned_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from kamalkraj) +author: John Snow Labs +name: bert_qa_base_uncased_squad_v2.0_finetuned +date: 2023-11-13 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad-v2.0-finetuned` is an English model originally trained by `kamalkraj`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v2.0_finetuned_en_5.2.0_3.0_1699835177400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_squad_v2.0_finetuned_en_5.2.0_3.0_1699835177400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v2.0_finetuned","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_squad_v2.0_finetuned","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.uncased_v2_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_squad_v2.0_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +- https://huggingface.co/kamalkraj/bert-base-uncased-squad-v2.0-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_chinese_finetuned_squad_colab_zh.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_chinese_finetuned_squad_colab_zh.md new file mode 100644 index 000000000000..073e4c76bff3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_chinese_finetuned_squad_colab_zh.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from TingChenChang) +author: John Snow Labs +name: bert_qa_bert_base_chinese_finetuned_squad_colab +date: 2023-11-13 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-chinese-finetuned-squad-colab` is a Chinese model originally trained by `TingChenChang`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_chinese_finetuned_squad_colab_zh_5.2.0_3.0_1699843099568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_chinese_finetuned_squad_colab_zh_5.2.0_3.0_1699843099568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_chinese_finetuned_squad_colab","zh") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +    .setInputCols("question", "context") +    .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +    .pretrained("bert_qa_bert_base_chinese_finetuned_squad_colab","zh") +    .setInputCols(Array("document_question", "document_context")) +    .setOutputCol("answer") +    .setCaseSensitive(true) +    .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +    ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +    ("What's my name?", "My name is Clara and I live in Berkeley.")) +    .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.squad.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_chinese_finetuned_squad_colab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +- https://huggingface.co/TingChenChang/bert-base-chinese-finetuned-squad-colab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_squadv1_en.md new file mode 100644 index 000000000000..01b0d9a68959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_squadv1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_bert_base_squadv1 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-squadv1` is an English model originally trained by `vuiseng9`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_squadv1_en_5.2.0_3.0_1699845342258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_squadv1_en_5.2.0_3.0_1699845342258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-base-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_docvqa_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_docvqa_en.md new file mode 100644 index 000000000000..b241ff3f9321 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_docvqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from tiennvcs) +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_docvqa +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-docvqa` is an English model originally trained by `tiennvcs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_docvqa_en_5.2.0_3.0_1699843186356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_docvqa_en_5.2.0_3.0_1699843186356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_docvqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_finetuned_docvqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.docvqa.base_uncased.by_tiennvcs").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_docvqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/bert-base-uncased-finetuned-docvqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_duorc_bert_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_duorc_bert_en.md new file mode 100644 index 000000000000..9a45e68b7271 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_duorc_bert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from machine2049) +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_duorc_bert +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-duorc_bert` is an English model originally trained by `machine2049`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_duorc_bert_en_5.2.0_3.0_1699845004120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_duorc_bert_en_5.2.0_3.0_1699845004120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_duorc_bert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_finetuned_duorc_bert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base_uncased.by_machine2049").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_duorc_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/machine2049/bert-base-uncased-finetuned-duorc_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_squad_frozen_v2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_squad_frozen_v2_en.md new file mode 100644 index 000000000000..117a7c814f72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_finetuned_squad_frozen_v2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ericRosello) +author: John Snow Labs +name: bert_qa_bert_base_uncased_finetuned_squad_frozen_v2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad-frozen-v2` is an English model originally trained by `ericRosello`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_squad_frozen_v2_en_5.2.0_3.0_1699846639592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_finetuned_squad_frozen_v2_en_5.2.0_3.0_1699846639592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_finetuned_squad_frozen_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_finetuned_squad_frozen_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased_v2.by_ericRosello").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_finetuned_squad_frozen_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ericRosello/bert-base-uncased-finetuned-squad-frozen-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_fiqa_flm_albanian_flit_sq.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_fiqa_flm_albanian_flit_sq.md new file mode 100644 index 000000000000..6bc4f4358508 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_fiqa_flm_albanian_flit_sq.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Albanian bert_qa_bert_base_uncased_fiqa_flm_albanian_flit BertForQuestionAnswering from vanadhi +author: John Snow Labs +name: bert_qa_bert_base_uncased_fiqa_flm_albanian_flit +date: 2023-11-13 +tags: [bert, sq, open_source, question_answering, onnx] +task: Question Answering +language: sq +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_uncased_fiqa_flm_albanian_flit` is an Albanian model originally trained by vanadhi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_fiqa_flm_albanian_flit_sq_5.2.0_3.0_1699846216180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_fiqa_flm_albanian_flit_sq_5.2.0_3.0_1699846216180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +# `data` is a DataFrame with "question" and "context" columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_fiqa_flm_albanian_flit","sq") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +// `data` is a DataFrame with "question" and "context" columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols("question", "context") + .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_uncased_fiqa_flm_albanian_flit", "sq") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
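The snippet above fits and transforms a `data` DataFrame that it never defines. The following is a minimal sketch of one way to build it, assuming the same `question` and `context` column names that the other model cards in this change use; `build_rows` and `to_dataframe` are hypothetical helpers, and `spark` is an already running SparkSession passed in by the caller.

```python
from typing import Iterable, List, Tuple


def build_rows(pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Turn (question, context) pairs into rows, skipping blank entries."""
    rows = []
    for question, context in pairs:
        question, context = question.strip(), context.strip()
        if question and context:
            rows.append([question, context])
    return rows


def to_dataframe(spark, pairs: Iterable[Tuple[str, str]]):
    """Build the two-column input DataFrame from an existing SparkSession.

    Assumption: the column names match the input columns that
    MultiDocumentAssembler is configured with in the pipeline.
    """
    return spark.createDataFrame(build_rows(pairs)).toDF("question", "context")


rows = build_rows([("What's my name?", "My name is Clara and I live in Berkeley.")])
print(rows)
```

With a live session, `data = to_dataframe(spark, pairs)` can then be passed to `pipeline.fit(data)` and `pipelineModel.transform(data)`.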
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_fiqa_flm_albanian_flit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sq| +|Size:|407.1 MB| + +## References + +https://huggingface.co/vanadhi/bert-base-uncased-fiqa-flm-sq-flit \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_squad_l6_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_squad_l6_en.md new file mode 100644 index 000000000000..b47148892632 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_base_uncased_squad_l6_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_base_uncased_squad_l6 BertForQuestionAnswering from howey +author: John Snow Labs +name: bert_qa_bert_base_uncased_squad_l6 +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_base_uncased_squad_l6` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad_l6_en_5.2.0_3.0_1699847398609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad_l6_en_5.2.0_3.0_1699847398609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +# `data` is a DataFrame with "question" and "context" columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squad_l6","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +// `data` is a DataFrame with "question" and "context" columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols("question", "context") + .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_base_uncased_squad_l6", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squad_l6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|248.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/howey/bert-base-uncased-squad-L6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_finetuned_squad_pytorch_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_finetuned_squad_pytorch_en.md new file mode 100644 index 000000000000..168bebf95b37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_finetuned_squad_pytorch_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from stevemobs) +author: John Snow Labs +name: bert_qa_bert_finetuned_squad_pytorch +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-pytorch` is an English model originally trained by `stevemobs`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad_pytorch_en_5.2.0_3.0_1699849827618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_squad_pytorch_en_5.2.0_3.0_1699849827618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_squad_pytorch","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_finetuned_squad_pytorch","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_stevemobs").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_finetuned_squad_pytorch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/stevemobs/bert-finetuned-squad-pytorch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_pretrained_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_pretrained_finetuned_squad_en.md new file mode 100644 index 000000000000..1b393e013bf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_pretrained_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_medium_pretrained_finetuned_squad +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-medium-pretrained-finetuned-squad` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_pretrained_finetuned_squad_en_5.2.0_3.0_1699853638961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_pretrained_finetuned_squad_en_5.2.0_3.0_1699853638961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_medium_pretrained_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_medium_pretrained_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.medium_finetuned.by_anas-awadalla").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_medium_pretrained_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|154.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-medium-pretrained-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_wrslb_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_wrslb_finetuned_squadv1_en.md new file mode 100644 index 000000000000..358f59dad1ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_medium_wrslb_finetuned_squadv1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_medium_wrslb_finetuned_squadv1 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-medium-wrslb-finetuned-squadv1` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699835262254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699835262254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_medium_wrslb_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_medium_wrslb_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.medium").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_medium_wrslb_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|154.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-medium-wrslb-finetuned-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_mini_wrslb_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_mini_wrslb_finetuned_squadv1_en.md new file mode 100644 index 000000000000..efefb55f11f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_mini_wrslb_finetuned_squadv1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_mini_wrslb_finetuned_squadv1 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-wrslb-finetuned-squadv1` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_mini_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699836970376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_mini_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699836970376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_mini_wrslb_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_mini_wrslb_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_mini_wrslb_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-mini-wrslb-finetuned-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_multi_cased_squad_swedish_marbogusz_sv.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_multi_cased_squad_swedish_marbogusz_sv.md new file mode 100644 index 000000000000..d42754f1578f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_multi_cased_squad_swedish_marbogusz_sv.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Swedish bert_qa_bert_multi_cased_squad_swedish_marbogusz BertForQuestionAnswering from marbogusz +author: John Snow Labs +name: bert_qa_bert_multi_cased_squad_swedish_marbogusz +date: 2023-11-13 +tags: [bert, sv, open_source, question_answering, onnx] +task: Question Answering +language: sv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_multi_cased_squad_swedish_marbogusz` is a Swedish model originally trained by marbogusz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_squad_swedish_marbogusz_sv_5.2.0_3.0_1699846129239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_squad_swedish_marbogusz_sv_5.2.0_3.0_1699846129239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_squad_swedish_marbogusz","sv") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("bert_qa_bert_multi_cased_squad_swedish_marbogusz", "sv") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_squad_swedish_marbogusz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sv| +|Size:|465.2 MB| + +## References + +https://huggingface.co/marbogusz/bert-multi-cased-squad_sv \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_persian_farsi_qa_v1_fa.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_persian_farsi_qa_v1_fa.md new file mode 100644 index 000000000000..5f08df62b627 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_persian_farsi_qa_v1_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian bert_qa_bert_persian_farsi_qa_v1 BertForQuestionAnswering from ForutanRad +author: John Snow Labs +name: bert_qa_bert_persian_farsi_qa_v1 +date: 2023-11-13 +tags: [bert, fa, open_source, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_persian_farsi_qa_v1` is a Persian model originally trained by ForutanRad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_persian_farsi_qa_v1_fa_5.2.0_3.0_1699847918388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_persian_farsi_qa_v1_fa_5.2.0_3.0_1699847918388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_persian_farsi_qa_v1","fa") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("bert_qa_bert_persian_farsi_qa_v1", "fa") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_persian_farsi_qa_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| + +## References + +https://huggingface.co/ForutanRad/bert-fa-QA-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_qa_vietnamese_nvkha_vi.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_qa_vietnamese_nvkha_vi.md new file mode 100644 index 000000000000..61c3219205f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_qa_vietnamese_nvkha_vi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Vietnamese bert_qa_bert_qa_vietnamese_nvkha BertForQuestionAnswering from nvkha +author: John Snow Labs +name: bert_qa_bert_qa_vietnamese_nvkha +date: 2023-11-13 +tags: [bert, vi, open_source, question_answering, onnx] +task: Question Answering +language: vi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_qa_vietnamese_nvkha` is a Vietnamese model originally trained by nvkha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_qa_vietnamese_nvkha_vi_5.2.0_3.0_1699840938730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_qa_vietnamese_nvkha_vi_5.2.0_3.0_1699840938730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_qa_vietnamese_nvkha","vi") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("bert_qa_bert_qa_vietnamese_nvkha", "vi") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_qa_vietnamese_nvkha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|vi| +|Size:|665.0 MB| + +## References + +https://huggingface.co/nvkha/bert-qa-vi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_reader_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_reader_squad2_en.md new file mode 100644 index 000000000000..54c342ecb40b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_reader_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from pinecone) +author: John Snow Labs +name: bert_qa_bert_reader_squad2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-reader-squad2` is an English model originally trained by `pinecone`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_reader_squad2_en_5.2.0_3.0_1699847638974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_reader_squad2_en_5.2.0_3.0_1699847638974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_reader_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_reader_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.by_pinecone").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_reader_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/pinecone/bert-reader-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_2_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_2_finetuned_squadv2_en.md new file mode 100644 index 000000000000..e98a80cab564 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_2_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_small_2_finetuned_squadv2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-2-finetuned-squadv2` is an English model originally trained by `mrm8488`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_2_finetuned_squadv2_en_5.2.0_3.0_1699841053924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_2_finetuned_squadv2_en_5.2.0_3.0_1699841053924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_2_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_2_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.small_v2.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_2_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|130.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-small-2-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19_squad2_en.md new file mode 100644 index 000000000000..f538e92ca34e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from NeuML) +author: John Snow Labs +name: bert_qa_bert_small_cord19_squad2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-cord19-squad2` is an English model originally trained by `NeuML`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_cord19_squad2_en_5.2.0_3.0_1699842918116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_cord19_squad2_en_5.2.0_3.0_1699842918116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_cord19_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_cord19_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_cord19.bert.small").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_cord19_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|130.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/NeuML/bert-small-cord19-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19qa_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19qa_en.md new file mode 100644 index 000000000000..c895bb6b4480 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_cord19qa_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from NeuML) +author: John Snow Labs +name: bert_qa_bert_small_cord19qa +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-cord19qa` is an English model originally trained by `NeuML`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_cord19qa_en_5.2.0_3.0_1699844663819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_cord19qa_en_5.2.0_3.0_1699844663819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_cord19qa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_cord19qa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.cord19.bert.small").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_cord19qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|130.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/NeuML/bert-small-cord19qa +- https://www.kaggle.com/davidmezzetti/cord19-qa?select=cord19-qa.json +- https://www.semanticscholar.org/cord19 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_finetuned_squad_en.md new file mode 100644 index 000000000000..389f93023595 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_small_finetuned_squad +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-squad` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_finetuned_squad_en_5.2.0_3.0_1699842629949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_finetuned_squad_en_5.2.0_3.0_1699842629949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.small.by_anas-awadalla").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-small-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_pretrained_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_pretrained_finetuned_squad_en.md new file mode 100644 index 000000000000..538e14306a15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_pretrained_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_small_pretrained_finetuned_squad +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-pretrained-finetuned-squad` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_pretrained_finetuned_squad_en_5.2.0_3.0_1699844207778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_pretrained_finetuned_squad_en_5.2.0_3.0_1699844207778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_pretrained_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_pretrained_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.small_finetuned.by_anas-awadalla").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_pretrained_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|106.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-small-pretrained-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_wrslb_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_wrslb_finetuned_squadv1_en.md new file mode 100644 index 000000000000..baf78019034b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_small_wrslb_finetuned_squadv1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_small_wrslb_finetuned_squadv1 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-wrslb-finetuned-squadv1` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699848647000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_wrslb_finetuned_squadv1_en_5.2.0_3.0_1699848647000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_wrslb_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_wrslb_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.small").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_wrslb_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|106.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-small-wrslb-finetuned-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_3_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_3_finetuned_squadv2_en.md new file mode 100644 index 000000000000..97cd02ae1572 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_3_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_tiny_3_finetuned_squadv2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-3-finetuned-squadv2` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_3_finetuned_squadv2_en_5.2.0_3.0_1699845398939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_3_finetuned_squadv2_en_5.2.0_3.0_1699845398939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_3_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_3_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_v3.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_3_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|21.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-3-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_4_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_4_finetuned_squadv2_en.md new file mode 100644 index 000000000000..459194cadd36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_4_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_tiny_4_finetuned_squadv2 +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-4-finetuned-squadv2` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_4_finetuned_squadv2_en_5.2.0_3.0_1699849909461.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_4_finetuned_squadv2_en_5.2.0_3.0_1699849909461.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_4_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_4_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_v4.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_4_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|22.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-4-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_finetuned_squad_en.md new file mode 100644 index 000000000000..30cc4e33314e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_tiny_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_tiny_finetuned_squad +date: 2023-11-13 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-squad` is an English model originally trained by `anas-awadalla`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_finetuned_squad_en_5.2.0_3.0_1699845403647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_finetuned_squad_en_5.2.0_3.0_1699845403647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.tiny").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-tiny-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna_en.md new file mode 100644 index 000000000000..bcd620520594 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699846606190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699846606190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_10_h_512_a_8_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|177.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-10_H-512_A-8_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna_en.md new file mode 100644 index 000000000000..97faf59b406f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699853610007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699853610007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_2_h_512_a_8_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|83.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-2_H-512_A-8_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_en.md new file mode 100644 index 000000000000..58d307a1c8da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_2_h_512_a_8_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_2_h_512_a_8_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_2_h_512_a_8_squad2 +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_2_h_512_a_8_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_squad2_en_5.2.0_3.0_1699846397754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_squad2_en_5.2.0_3.0_1699846397754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_2_h_512_a_8_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|83.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-2_H-512_A-8_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..0b7fb7c8dbe0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1699848434078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1699848434078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-256_A-4_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..556fc1100c50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2 +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_en_5.2.0_3.0_1699847284144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2_en_5.2.0_3.0_1699847284144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_256_a_4_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-256_A-4_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_en.md new file mode 100644 index 000000000000..9d0b6617ec45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_256_a_4_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_256_a_4_squad2 +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_256_a_4_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_squad2_en_5.2.0_3.0_1699847283140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_squad2_en_5.2.0_3.0_1699847283140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_256_a_4_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-256_A-4_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna_en.md new file mode 100644 index 000000000000..fb6c985970db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699848465367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna_en_5.2.0_3.0_1699848465367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_512_a_8_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-512_A-8_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..482fadcd1562 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2 +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_en_5.2.0_3.0_1699849732825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_en_5.2.0_3.0_1699849732825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|194.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-768_A-12_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna_en.md new file mode 100644 index 000000000000..c5d3573dc916 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna +date: 2023-11-13 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna_en_5.2.0_3.0_1699851705959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna_en_5.2.0_3.0_1699851705959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_6_h_128_a_2_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|19.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-6_H-128_A-2_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bespin_global_klue_bert_base_mrc_ko.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bespin_global_klue_bert_base_mrc_ko.md new file mode 100644 index 000000000000..739202415476 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_bespin_global_klue_bert_base_mrc_ko.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from bespin-global) +author: John Snow Labs +name: bert_qa_bespin_global_klue_bert_base_mrc +date: 2023-11-13 +tags: [ko, open_source, question_answering, bert, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue-bert-base-mrc` is a Korean model originally trained by `bespin-global`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bespin_global_klue_bert_base_mrc_ko_5.2.0_3.0_1699850011426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bespin_global_klue_bert_base_mrc_ko_5.2.0_3.0_1699850011426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bespin_global_klue_bert_base_mrc","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bespin_global_klue_bert_base_mrc","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.bert.base.by_bespin-global").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bespin_global_klue_bert_base_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|ko| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bespin-global/klue-bert-base-mrc +- https://www.bespinglobal.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-13-bert_qa_beto_espanhol_squad2_es.md b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_beto_espanhol_squad2_es.md new file mode 100644 index 000000000000..3f9ff460463f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-13-bert_qa_beto_espanhol_squad2_es.md @@ -0,0 +1,104 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering Cased model (from Josue) +author: John Snow Labs +name: bert_qa_beto_espanhol_squad2 +date: 2023-11-13 +tags: [es, open_source, bert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `BETO-espanhol-Squad2` is a Spanish model originally trained by `Josue`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_beto_espanhol_squad2_es_5.2.0_3.0_1699851919323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_beto_espanhol_squad2_es_5.2.0_3.0_1699851919323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_beto_espanhol_squad2","es")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_beto_espanhol_squad2","es") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_beto_espanhol_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Josue/BETO-espanhol-Squad2 +- https://github.com/dccuchile/beto +- https://github.com/ccasimiro88/TranslateAlignRetrieve +- https://github.com/dccuchile/beto/blob/master/README.md +- https://github.com/google-research/bert +- https://github.com/josecannete/spanish-corpora +- https://github.com/google-research/bert/blob/master/multilingual.md +- https://github.com/ccasimiro88/TranslateAlignRetrieve +- https://media.giphy.com/media/mCIaBpfN0LQcuzkA2F/giphy.gif +- https://media.giphy.com/media/WT453aptcbCP7hxWTZ/giphy.gif +- https://twitter.com/Josuehu_ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_viquad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_viquad_en.md new file mode 100644 index 000000000000..b56f668a2dd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_viquad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from Khanh) +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned_viquad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and 
production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-viquad` is an English model originally trained by `Khanh`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_viquad_en_5.2.0_3.0_1699993581067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_viquad_en_5.2.0_3.0_1699993581067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_viquad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_viquad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.cased_multilingual_base_finetuned").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned_viquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Khanh/bert-base-multilingual-cased-finetuned-viquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_xx.md new file mode 100644 index 000000000000..c68bccc8198d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_cased_finetuned_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Cased model (from obokkkk) +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned +date: 2023-11-14 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned` is a Multilingual model originally trained by `obokkkk`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_xx_5.2.0_3.0_1699993678387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_xx_5.2.0_3.0_1699993678387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/bert-base-multilingual-cased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md new file mode 100644 index 000000000000..f9b91f4ba9bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from khoanvm) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_finetuned_squadv2 +date: 2023-11-14 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squadv2` is a Multilingual model originally trained by `khoanvm`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1699993920867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1699993920867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/khoanvm/bert-base-multilingual-uncased-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_nnish_cased_squad1_fi.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_nnish_cased_squad1_fi.md new file mode 100644 index 000000000000..eaf190971bc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_nnish_cased_squad1_fi.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Finnish BertForQuestionAnswering Base Cased model (from ilmariky) +author: John Snow Labs +name: bert_qa_base_nnish_cased_squad1 +date: 2023-11-14 +tags: [fi, open_source, bert, question_answering, onnx] +task: Question Answering +language: fi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-finnish-cased-squad1-fi` is a Finnish model originally trained by `ilmariky`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad1_fi_5.2.0_3.0_1699993974738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad1_fi_5.2.0_3.0_1699993974738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad1","fi")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad1","fi") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_nnish_cased_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fi| +|Size:|464.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ilmariky/bert-base-finnish-cased-squad1-fi +- https://github.com/google-research-datasets/tydiqa +- https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_parsquad_fa.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_parsquad_fa.md new file mode 100644 index 000000000000..7e5cda9101c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_parsquad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_parsquad +date: 2023-11-14 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_parsquad` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_parsquad_fa_5.2.0_3.0_1699994256254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_parsquad_fa_5.2.0_3.0_1699994256254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_parsquad","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_parsquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_parsquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_parsquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_1epoch_fa.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_1epoch_fa.md new file mode 100644 index 000000000000..29890b6ff0ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_1epoch_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad_1epoch +date: 2023-11-14 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad_1epoch` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_1epoch_fa_5.2.0_3.0_1699994627595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_1epoch_fa_5.2.0_3.0_1699994627595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_1epoch","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_1epoch","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad_1epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad_1epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_fa.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_fa.md new file mode 100644 index 000000000000..76e113873fca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad +date: 2023-11-14 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_fa_5.2.0_3.0_1699993573583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_fa_5.2.0_3.0_1699993573583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md new file mode 100644 index 000000000000..acfc528f7652 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad_lr1e_5 +date: 2023-11-14 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad_lr1e-5` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_lr1e_5_fa_5.2.0_3.0_1699994981140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_lr1e_5_fa_5.2.0_3.0_1699994981140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_lr1e_5","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_lr1e_5","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad_lr1e_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad_lr1e-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md new file mode 100644 index 000000000000..fed4852c1868 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mhmsadegh) +author: John Snow Labs +name: bert_qa_base_parsbert_uncased_finetuned_squad +date: 2023-11-14 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased-finetuned-squad` is a Persian model originally trained by `mhmsadegh`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_squad_fa_5.2.0_3.0_1699993442017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_squad_fa_5.2.0_3.0_1699993442017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_squad","fa") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_squad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_parsbert_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mhmsadegh/bert-base-parsbert-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md new file mode 100644 index 000000000000..0c55dcee49d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from husnu) +author: John Snow Labs +name: bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1 +date: 2023-11-14 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-1` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr_5.2.0_3.0_1699995382989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr_5.2.0_3.0_1699995382989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md new file mode 100644 index 000000000000..876a42a77bb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from husnu) +author: John Snow Labs +name: bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3 +date: 2023-11-14 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-3` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699993798250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1699993798250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..e2f65de0194d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en_5.2.0_3.0_1699995746222.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en_5.2.0_3.0_1699995746222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..22a3950d375b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1699993504295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1699993504295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..16961e4a5094 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-64-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1699994169986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1699994169986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_64d_finetuned_few_shot").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-64-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md new file mode 100644 index 000000000000..6d437c3d4dda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from alistvt) +author: John Snow Labs +name: bert_qa_base_uncased_pretrain_finetuned_coqa_falttened +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-pretrain-finetuned-coqa-falttened` is an English model originally trained by `alistvt`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en_5.2.0_3.0_1699994432857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en_5.2.0_3.0_1699994432857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_falttened","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_falttened","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.uncased_base_finetuned.by_alistvt").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_pretrain_finetuned_coqa_falttened| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/alistvt/bert-base-uncased-pretrain-finetuned-coqa-falttened \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..20958975177c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from bdickson) +author: John Snow Labs +name: bert_qa_bdickson_bert_base_uncased_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `bdickson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bdickson_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699996011166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bdickson_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1699996011166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bdickson_bert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bdickson_bert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_bdickson").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bdickson_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bdickson/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_en.md new file mode 100644 index 000000000000..449c217ecdfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_en_5.2.0_3.0_1699994404485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_en_5.2.0_3.0_1699994404485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.tydiqa.bert").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_all_translated_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_all_translated_en.md new file mode 100644 index 000000000000..63adf7243a19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_all_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_all_translated +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_all_translated` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_all_translated_en_5.2.0_3.0_1699994792788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_all_translated_en_5.2.0_3.0_1699994792788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_all_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_all_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_translated.bert.by_krinal214").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_all_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_all_translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_ben_tel_context_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_ben_tel_context_en.md new file mode 100644 index 000000000000..a50f3b6f1bba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_ben_tel_context_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_ben_tel_context +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_ben_tel_context` is an English model originally trained by `krinal214`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_ben_tel_context_en_5.2.0_3.0_1699993841524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_ben_tel_context_en_5.2.0_3.0_1699993841524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_ben_tel_context","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_ben_tel_context","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_ben_tel.bert.by_krinal214").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_ben_tel_context| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_ben_tel_context \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_que_translated_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_que_translated_en.md new file mode 100644 index 000000000000..e7d1a2b74fa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_squad_que_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_que_translated +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_que_translated` is an English model originally trained by `krinal214`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_que_translated_en_5.2.0_3.0_1699996407002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_que_translated_en_5.2.0_3.0_1699996407002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_que_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_que_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_translated.bert.que.by_krinal214").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_que_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_que_translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_translated_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_translated_en.md new file mode 100644 index 000000000000..999ccb618f1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_all_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_translated +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-translated` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_translated_en_5.2.0_3.0_1699994248431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_translated_en_5.2.0_3.0_1699994248431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_krinal214").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md new file mode 100644 index 000000000000..d59b5bc852b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MrAnderson) +author: John Snow Labs +name: bert_qa_bert_base_2048_full_trivia_copied_embeddings +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-2048-full-trivia-copied-embeddings` is an English model originally trained by `MrAnderson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_2048_full_trivia_copied_embeddings_en_5.2.0_3.0_1699996762251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_2048_full_trivia_copied_embeddings_en_5.2.0_3.0_1699996762251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_2048_full_trivia_copied_embeddings","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_2048_full_trivia_copied_embeddings","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert.base_2048.by_MrAnderson").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_2048_full_trivia_copied_embeddings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-2048-full-trivia-copied-embeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_cased_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_cased_chaii_en.md new file mode 100644 index 000000000000..f69cecb207c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_cased_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_base_cased_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_chaii_en_5.2.0_3.0_1699994506880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_chaii_en_5.2.0_3.0_1699994506880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_cased_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_cased_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_cased_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-base-cased-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_faquad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_faquad_en.md new file mode 100644 index 000000000000..9074135cff3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_faquad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ricardo-filho) +author: John Snow Labs +name: bert_qa_bert_base_faquad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_faquad` is an English model originally trained by `ricardo-filho`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_faquad_en_5.2.0_3.0_1699995091550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_faquad_en_5.2.0_3.0_1699995091550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_faquad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_faquad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_faquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_base_faquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md new file mode 100644 index 000000000000..cabb6bd23312 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Thai BertForQuestionAnswering model (from airesearch) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetune_qa +date: 2023-11-14 +tags: [th, open_source, question_answering, bert, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetune-qa` is a Thai model originally trained by `airesearch`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetune_qa_th_5.2.0_3.0_1699994217748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetune_qa_th_5.2.0_3.0_1699994217748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetune_qa","th") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetune_qa","th") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("th.answer_question.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetune_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/airesearch/bert-base-multilingual-cased-finetune-qa +- https://github.com/vistec-AI/thai2transformers/blob/dev/scripts/downstream/train_question_answering_lm_finetuning.py +- https://wandb.ai/cstorm125/wangchanberta-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md new file mode 100644 index 000000000000..1c6019679741 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Tamil BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetuned_chaii +date: 2023-11-14 +tags: [open_source, question_answering, bert, ta, onnx] +task: Question Answering +language: ta +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-chaii` is a Tamil model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta_5.2.0_3.0_1699994790461.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta_5.2.0_3.0_1699994790461.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_chaii","ta") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_chaii","ta") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ta.answer_question.chaii.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ta| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-base-multilingual-cased-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md new file mode 100644 index 000000000000..b1ff72947c31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from obokkkk) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetuned_klue +date: 2023-11-14 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-klue` is a Korean model originally trained by `obokkkk`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_klue_ko_5.2.0_3.0_1699994852278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_klue_ko_5.2.0_3.0_1699994852278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_klue","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_klue","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetuned_klue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/bert-base-multilingual-cased-finetuned-klue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_ko.md new file mode 100644 index 000000000000..bb13de932e50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from sangrimlee) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_korquad +date: 2023-11-14 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-korquad` is a Korean model originally trained by `sangrimlee`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_ko_5.2.0_3.0_1699995481370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_ko_5.2.0_3.0_1699995481370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_korquad","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_korquad","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.bert.multilingual_base_cased.by_sangrimlee").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sangrimlee/bert-base-multilingual-cased-korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md new file mode 100644 index 000000000000..9218d44efb48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from eliza-dukim) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_korquad_v1 +date: 2023-11-14 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased_korquad-v1` is a Korean model originally trained by `eliza-dukim`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_v1_ko_5.2.0_3.0_1699995262571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_v1_ko_5.2.0_3.0_1699995262571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_korquad_v1","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_korquad_v1","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_korquad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/eliza-dukim/bert-base-multilingual-cased_korquad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_xquad_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_xquad_xx.md new file mode 100644 index 000000000000..f5280bb4b4f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_multilingual_xquad_xx.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from alon-albalak) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_xquad +date: 2023-11-14 +tags: [open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-xquad` is a Multilingual model originally trained by `alon-albalak`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_xquad_xx_5.2.0_3.0_1699995793907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_xquad_xx_5.2.0_3.0_1699995793907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_xquad","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_xquad","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.xquad.bert.multilingual_base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_xquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/alon-albalak/bert-base-multilingual-xquad +- https://github.com/deepmind/xquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md new file mode 100644 index 000000000000..41c5f565dfc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa +date: 2023-11-14 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-finetuned-qa-mlqa` is a Castilian, Spanish model originally trained by `CenIA`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es_5.2.0_3.0_1699996044180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es_5.2.0_3.0_1699996044180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.mlqa.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-cased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md new file mode 100644 index 000000000000..48e56746ef97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac +date: 2023-11-14 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-finetuned-qa-sqac` is a Castilian, Spanish model originally trained by `CenIA`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es_5.2.0_3.0_1699995110057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es_5.2.0_3.0_1699995110057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-cased-finetuned-qa-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_coqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_coqa_en.md new file mode 100644 index 000000000000..dbeabc2db443 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_coqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peggyhuang) +author: John Snow Labs +name: bert_qa_bert_base_uncased_coqa +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-coqa` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_coqa_en_5.2.0_3.0_1699996350702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_coqa_en_5.2.0_3.0_1699996350702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_coqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_coqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_coqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/bert-base-uncased-coqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md new file mode 100644 index 000000000000..d958ce216134 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: bert_qa_bert_base_uncased_squad2_covid_qa_deepset +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1699994554541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1699994554541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squad2_covid_qa_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_squad2_covid_qa_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squad2_covid_qa_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/bert-base-uncased-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_jackh1995_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_jackh1995_en.md new file mode 100644 index 000000000000..043ab6e556da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_jackh1995_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from jackh1995) +author: John Snow Labs +name: bert_qa_bert_finetuned_jackh1995 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned` is an English model originally trained by `jackh1995`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_jackh1995_en_5.2.0_3.0_1699995632485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_jackh1995_en_5.2.0_3.0_1699995632485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_jackh1995","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_finetuned_jackh1995","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_jackh1995").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_finetuned_jackh1995| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jackh1995/bert-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md new file mode 100644 index 000000000000..12aacdaac30b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from motiondew) +author: John Snow Labs +name: bert_qa_bert_finetuned_lr2_e5_b16_ep2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-lr2-e5-b16-ep2` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_lr2_e5_b16_ep2_en_5.2.0_3.0_1699995956287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_lr2_e5_b16_ep2_en_5.2.0_3.0_1699995956287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_lr2_e5_b16_ep2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_finetuned_lr2_e5_b16_ep2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_motiondew").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_finetuned_lr2_e5_b16_ep2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-finetuned-lr2-e5-b16-ep2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl256_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl256_en.md new file mode 100644 index 000000000000..eeb18e3d80a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl256_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_bert_l_squadv1.1_sl256 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-l-squadv1.1-sl256` is an English model originally trained by `vuiseng9`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl256_en_5.2.0_3.0_1699996982091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl256_en_5.2.0_3.0_1699996982091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_l_squadv1.1_sl256","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_l_squadv1.1_sl256","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.sl256.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_l_squadv1.1_sl256| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-l-squadv1.1-sl256 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl384_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl384_en.md new file mode 100644 index 000000000000..feacfdb5b152 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_l_squadv1.1_sl384_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_bert_l_squadv1.1_sl384 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-l-squadv1.1-sl384` is an English model originally trained by `vuiseng9`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl384_en_5.2.0_3.0_1699997522825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl384_en_5.2.0_3.0_1699997522825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_l_squadv1.1_sl384","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_l_squadv1.1_sl384","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.sl384.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_l_squadv1.1_sl384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-l-squadv1.1-sl384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_faquad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_faquad_en.md new file mode 100644 index 000000000000..7c5ce55a9f59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_faquad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ricardo-filho) +author: John Snow Labs +name: bert_qa_bert_large_faquad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_faquad` is an English model originally trained by `ricardo-filho`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_faquad_en_5.2.0_3.0_1699998050521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_faquad_en_5.2.0_3.0_1699998050521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_faquad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_faquad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.large.by_ricardo-filho").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_faquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_large_faquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_finetuned_docvqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_finetuned_docvqa_en.md new file mode 100644 index 000000000000..6a367f41d3b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_finetuned_docvqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from tiennvcs) +author: John Snow Labs +name: bert_qa_bert_large_uncased_finetuned_docvqa +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-finetuned-docvqa` is an English model originally trained by `tiennvcs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_finetuned_docvqa_en_5.2.0_3.0_1699997434549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_finetuned_docvqa_en_5.2.0_3.0_1699997434549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_finetuned_docvqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_finetuned_docvqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.large_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_finetuned_docvqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/bert-large-uncased-finetuned-docvqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md new file mode 100644 index 000000000000..cd87c59feb62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squad2_covid_qa_deepset +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1699995085123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1699995085123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squad2_covid_qa_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squad2_covid_qa_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.bert.large_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squad2_covid_qa_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/bert-large-uncased-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md new file mode 100644 index 000000000000..8f8697f386be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Intel) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squadv1.1-sparse-80-1x4-block-pruneofa` is an English model originally trained by `Intel`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1699995636644.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1699995636644.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased_sparse_80_1x4_block_pruneofa.by_Intel").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|436.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Intel/bert-large-uncased-squadv1.1-sparse-80-1x4-block-pruneofa +- https://arxiv.org/abs/2111.05754 +- https://github.com/IntelLabs/Model-Compression-Research-Package/tree/main/research/prune-once-for-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv2_en.md new file mode 100644 index 000000000000..53f27792a25f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_squadv2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from madlag) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squadv2` is an English model originally trained by `madlag`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv2_en_5.2.0_3.0_1699996683751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv2_en_5.2.0_3.0_1699996683751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased_v2.by_madlag").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/madlag/bert-large-uncased-squadv2 +- https://arxiv.org/pdf/1810.04805v2.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md new file mode 100644 index 000000000000..385c8d449a59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_chaii_en_5.2.0_3.0_1699997266251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_chaii_en_5.2.0_3.0_1699997266251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.large_uncased_uncased_whole_word_masking.by_SauravMaheshkar").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-large-uncased-whole-word-masking-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md new file mode 100644 index 000000000000..31284903b34d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en_5.2.0_3.0_1699997828295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en_5.2.0_3.0_1699997828295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.large_uncased_uncased_whole_word_masking_finetuned.by_SauravMaheshkar").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-large-uncased-whole-word-masking-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md new file mode 100644 index 000000000000..d84c9dc8076f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from haddadalwi) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-islamic-squad` is an English model originally trained by `haddadalwi`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en_5.2.0_3.0_1699997999939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en_5.2.0_3.0_1699997999939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased.by_haddadalwi").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/haddadalwi/bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-islamic-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md new file mode 100644 index 000000000000..86d2df1efa56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from madlag) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-squadv2` is an English model originally trained by `madlag`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en_5.2.0_3.0_1699998620273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en_5.2.0_3.0_1699998620273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased_whole_word_masking_v2.by_madlag").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md new file mode 100644 index 000000000000..5162254c1c31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_squad2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-squad2` is an English model originally trained by `deepset`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1699995681298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1699995681298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased.by_deepset").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/bert-large-uncased-whole-word-masking-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_finetuned_squad_en.md new file mode 100644 index 000000000000..071a0739a79c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_bert_medium_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-medium-finetuned-squad` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_finetuned_squad_en_5.2.0_3.0_1699998837047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_finetuned_squad_en_5.2.0_3.0_1699998837047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_medium_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_medium_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.medium.by_anas-awadalla").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_medium_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|154.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-medium-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_squad2_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_squad2_distilled_en.md new file mode 100644 index 000000000000..d9e169cf8b4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_medium_squad2_distilled_en.md @@ -0,0 +1,118 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_bert_medium_squad2_distilled +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-medium-squad2-distilled` is an English model originally trained by `deepset`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_squad2_distilled_en_5.2.0_3.0_1699999003843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_medium_squad2_distilled_en_5.2.0_3.0_1699999003843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_medium_squad2_distilled","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_medium_squad2_distilled","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.distilled_medium").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_medium_squad2_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|154.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/bert-medium-squad2-distilled +- https://github.com/deepset-ai/haystack/discussions +- https://deepset.ai +- https://twitter.com/deepset_ai +- http://www.deepset.ai/jobs +- https://haystack.deepset.ai/community/join +- https://github.com/deepset-ai/haystack/ +- https://deepset.ai/german-bert +- https://www.linkedin.com/company/deepset-ai/ +- https://github.com/deepset-ai/FARM +- https://deepset.ai/germanquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_mini_5_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_mini_5_finetuned_squadv2_en.md new file mode 100644 index 000000000000..73375612864e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_mini_5_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_mini_5_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-5-finetuned-squadv2` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_mini_5_finetuned_squadv2_en_5.2.0_3.0_1699999165366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_mini_5_finetuned_squadv2_en_5.2.0_3.0_1699999165366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_mini_5_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_mini_5_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.base_v2_5.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_mini_5_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|65.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-mini-5-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_chaii_en.md new file mode 100644 index 000000000000..cd404986d0e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finedtuned_xquad_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finedtuned-xquad-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_chaii_en_5.2.0_3.0_1699999525919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_chaii_en_5.2.0_3.0_1699999525919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.xquad_chaii.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finedtuned_xquad_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-multi-cased-finedtuned-xquad-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md new file mode 100644 index 000000000000..5fe774c07229 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp +date: 2023-11-14 +tags: [te, en, open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finedtuned-xquad-tydiqa-goldp` is a Multilingual model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx_5.2.0_3.0_1699998394777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx_5.2.0_3.0_1699998394777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.xquad_tydiqa.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_chaii_en.md new file mode 100644 index 000000000000..ddfe6dd24aa0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finetuned_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finetuned-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_chaii_en_5.2.0_3.0_1699998193227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_chaii_en_5.2.0_3.0_1699998193227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finetuned_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finetuned_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-multi-cased-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md new file mode 100644 index 000000000000..c6cd3a8fca71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from TingChenChang) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finetuned-xquadv1-finetuned-squad-colab` is an English model originally trained by `TingChenChang`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en_5.2.0_3.0_1699998716394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en_5.2.0_3.0_1699998716394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.xquad_squad.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/TingChenChang/bert-multi-cased-finetuned-xquadv1-finetuned-squad-colab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_english_german_squad2_de.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_english_german_squad2_de.md new file mode 100644 index 000000000000..3de3ec27be75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_english_german_squad2_de.md @@ -0,0 +1,110 @@ +--- +layout: model +title: German BertForQuestionAnswering model (from deutsche-telekom) +author: John Snow Labs +name: bert_qa_bert_multi_english_german_squad2 +date: 2023-11-14 +tags: [de, open_source, question_answering, bert, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-english-german-squad2` is a German model originally trained by `deutsche-telekom`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_english_german_squad2_de_5.2.0_3.0_1699996034487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_english_german_squad2_de_5.2.0_3.0_1699996034487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_english_german_squad2","de") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_english_german_squad2","de") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.answer_question.squadv2.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_english_german_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|de| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deutsche-telekom/bert-multi-english-german-squad2 +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://github.com/google-research/bert/blob/master/multilingual.md \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_uncased_finetuned_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_uncased_finetuned_chaii_en.md new file mode 100644 index 000000000000..5127b5592c46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_multi_uncased_finetuned_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_multi_uncased_finetuned_chaii +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-uncased-finetuned-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_uncased_finetuned_chaii_en_5.2.0_3.0_1699999874498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_uncased_finetuned_chaii_en_5.2.0_3.0_1699999874498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_uncased_finetuned_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_uncased_finetuned_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_uncased_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-multi-uncased-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_qasper_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_qasper_en.md new file mode 100644 index 000000000000..27f9234aa15a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_qasper_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from z-uo) +author: John Snow Labs +name: bert_qa_bert_qasper +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-qasper` is an English model originally trained by `z-uo`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_qasper_en_5.2.0_3.0_1699998983862.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_qasper_en_5.2.0_3.0_1699998983862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_qasper","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_qasper","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_z-uo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_qasper| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/z-uo/bert-qasper \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4_en.md new file mode 100644 index 000000000000..e2c457e0877f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4` is an English model originally trained by motiondew. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1699999176084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1699999176084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_set_date_1_lr_2e_5_bosnian_32_ep_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_1-lr-2e-5-bs-32-ep-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_small_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_small_finetuned_squadv2_en.md new file mode 100644 index 000000000000..9687a398d315 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_small_finetuned_squadv2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_small_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-squadv2` is an English model originally trained by `mrm8488`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_finetuned_squadv2_en_5.2.0_3.0_1699999356346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_small_finetuned_squadv2_en_5.2.0_3.0_1699999356346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_small_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_small_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.small.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_small_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-small-finetuned-squadv2 +- https://twitter.com/mrm8488 +- https://github.com/google-research +- https://arxiv.org/abs/1908.08962 +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://github.com/google-research/bert/ +- https://www.linkedin.com/in/manuel-romero-cs/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_2_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_2_finetuned_squadv2_en.md new file mode 100644 index 000000000000..c9bf6053ea08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_2_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_tiny_2_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-2-finetuned-squadv2` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_2_finetuned_squadv2_en_5.2.0_3.0_1700000353666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_2_finetuned_squadv2_en_5.2.0_3.0_1700000353666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_2_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_2_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_v2.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_2_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|19.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-2-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_5_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_5_finetuned_squadv2_en.md new file mode 100644 index 000000000000..9bf06d89f7f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_5_finetuned_squadv2_en.md @@ -0,0 +1,113 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_tiny_5_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-5-finetuned-squadv2` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_5_finetuned_squadv2_en_5.2.0_3.0_1700000467523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_5_finetuned_squadv2_en_5.2.0_3.0_1700000467523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_5_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_5_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_v5.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_5_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|24.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-5-finetuned-squadv2 +- https://twitter.com/mrm8488 +- https://github.com/google-research +- https://arxiv.org/abs/1908.08962 +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://www.linkedin.com/in/manuel-romero-cs/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_finetuned_squadv2_en.md new file mode 100644 index 000000000000..584719dfc599 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_tiny_finetuned_squadv2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_tiny_finetuned_squadv2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-squadv2` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_finetuned_squadv2_en_5.2.0_3.0_1699999509086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_tiny_finetuned_squadv2_en_5.2.0_3.0_1699999509086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_tiny_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_tiny_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_.by_mrm8488").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_tiny_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-finetuned-squadv2 +- https://twitter.com/mrm8488 +- https://github.com/google-research +- https://arxiv.org/abs/1908.08962 +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://github.com/google-research/bert/ +- https://www.linkedin.com/in/manuel-romero-cs/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_turkish_question_answering_tr.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_turkish_question_answering_tr.md new file mode 100644 index 000000000000..2d7672c7085f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_turkish_question_answering_tr.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering model (from lserinol) +author: John Snow Labs +name: bert_qa_bert_turkish_question_answering +date: 2023-11-14 +tags: [tr, open_source, question_answering, bert, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-turkish-question-answering` is a Turkish model originally trained by `lserinol`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_turkish_question_answering_tr_5.2.0_3.0_1700000755160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_turkish_question_answering_tr_5.2.0_3.0_1700000755160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_turkish_question_answering","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_turkish_question_answering","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_turkish_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/lserinol/bert-turkish-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..1688aa879c9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1699999706791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1699999706791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|177.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-10_H-512_A-8_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..b1b24286279b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1699998392305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1699998392305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|177.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-10_H-512_A-8_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_en.md new file mode 100644 index 000000000000..b0225e113e9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_10_h_512_a_8_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_10_h_512_a_8_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_10_h_512_a_8_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_10_h_512_a_8_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_squad2_en_5.2.0_3.0_1699999910987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_squad2_en_5.2.0_3.0_1699999910987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_10_h_512_a_8_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|177.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-10_H-512_A-8_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..abe15ec05ab3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700000067171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700000067171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_2_h_512_a_8_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|83.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-2_H-512_A-8_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna_en.md new file mode 100644 index 000000000000..be0c7216fc3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna_en_5.2.0_3.0_1700000897157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna_en_5.2.0_3.0_1700000897157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_256_a_4_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-256_A-4_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..49c72cb6fd91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700000249414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700000249414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|106.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-512_A-8_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..5d7debe3a89e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1699998589593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1699998589593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|106.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-512_A-8_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_en.md new file mode 100644 index 000000000000..bf2e8d9c04c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_512_a_8_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_512_a_8_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_512_a_8_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_512_a_8_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_squad2_en_5.2.0_3.0_1700001069916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_squad2_en_5.2.0_3.0_1700001069916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_512_a_8_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-512_A-8_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..31b2b79d867a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700001266993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700001266993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_768_a_12_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|194.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-768_A-12_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna_en.md new file mode 100644 index 000000000000..f2ab964e7df9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna_en_5.2.0_3.0_1700000641079.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna_en_5.2.0_3.0_1700000641079.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_768_a_12_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|195.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-768_A-12_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_en.md new file mode 100644 index 000000000000..7715a805ce03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_4_h_768_a_12_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_768_a_12_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_768_a_12_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_768_a_12_squad2` is an English model originally trained by aodiniz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_squad2_en_5.2.0_3.0_1700000436986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_768_a_12_squad2_en_5.2.0_3.0_1700000436986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_768_a_12_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_768_a_12_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|195.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-768_A-12_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_en.md new file mode 100644 index 000000000000..3fe43f82a688 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bert_uncased_l_6_h_128_a_2_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_6_h_128_a_2_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_6_h_128_a_2_squad2 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_6_h_128_a_2_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_6_h_128_a_2_squad2_en_5.2.0_3.0_1700001385455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_6_h_128_a_2_squad2_en_5.2.0_3.0_1700001385455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_6_h_128_a_2_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_6_h_128_a_2_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_6_h_128_a_2_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|19.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-6_H-128_A-2_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_01_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_01_en.md new file mode 100644 index 000000000000..28a6ea272588 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_01_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_bertfast_01 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertFast_01` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_01_en_5.2.0_3.0_1699996341011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_01_en_5.2.0_3.0_1699996341011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_01","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_01","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertfast_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/bertFast_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_02_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_02_en.md new file mode 100644 index 000000000000..56a21f14ef05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertfast_02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_bertfast_02 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertFast_02` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_02_en_5.2.0_3.0_1700001601061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_02_en_5.2.0_3.0_1700001601061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_02","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_02","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertfast_02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/bertFast_02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertimbau_squad1.1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertimbau_squad1.1_en.md new file mode 100644 index 000000000000..4dd526732b52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertimbau_squad1.1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from hendrixcosta) +author: John Snow Labs +name: bert_qa_bertimbau_squad1.1 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau-squad1.1` is an English model originally trained by `hendrixcosta`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertimbau_squad1.1_en_5.2.0_3.0_1699996902121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertimbau_squad1.1_en_5.2.0_3.0_1699996902121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertimbau_squad1.1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bertimbau_squad1.1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_hendrixcosta").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertimbau_squad1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hendrixcosta/bertimbau-squad1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertlargeabsa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertlargeabsa_en.md new file mode 100644 index 000000000000..66a3dc2c2b61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertlargeabsa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Cased model (from LucasS) +author: John Snow Labs +name: bert_qa_bertlargeabsa +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertLargeABSA` is an English model originally trained by `LucasS`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertlargeabsa_en_5.2.0_3.0_1700001215917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertlargeabsa_en_5.2.0_3.0_1700001215917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertlargeabsa","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertlargeabsa","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.abs").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertlargeabsa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/LucasS/bertLargeABSA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_base_cmrc_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_base_cmrc_en.md new file mode 100644 index 000000000000..b2a481173560 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_base_cmrc_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from rsvp-ai) +author: John Snow Labs +name: bert_qa_bertserini_base_cmrc +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertserini-bert-base-cmrc` is an English model originally trained by `rsvp-ai`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_base_cmrc_en_5.2.0_3.0_1700001534997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_base_cmrc_en_5.2.0_3.0_1700001534997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertserini_base_cmrc","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertserini_base_cmrc","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base.serini.cmrc").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertserini_base_cmrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rsvp-ai/bertserini-bert-base-cmrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_base_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_base_squad_en.md new file mode 100644 index 000000000000..e169dd1d1950 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_base_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from rsvp-ai) +author: John Snow Labs +name: bert_qa_bertserini_bert_base_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertserini-bert-base-squad` is an English model originally trained by `rsvp-ai`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_base_squad_en_5.2.0_3.0_1700001867988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_base_squad_en_5.2.0_3.0_1700001867988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertserini_bert_base_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bertserini_bert_base_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base.by_rsvp-ai").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertserini_bert_base_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rsvp-ai/bertserini-bert-base-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_large_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_large_squad_en.md new file mode 100644 index 000000000000..49321ec74764 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bertserini_bert_large_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from rsvp-ai) +author: John Snow Labs +name: bert_qa_bertserini_bert_large_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertserini-bert-large-squad` is an English model originally trained by `rsvp-ai`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_large_squad_en_5.2.0_3.0_1699999112455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_large_squad_en_5.2.0_3.0_1699999112455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertserini_bert_large_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bertserini_bert_large_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large.by_rsvp-ai").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertserini_bert_large_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rsvp-ai/bertserini-bert-large-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_sqac_es.md new file mode 100644 index 000000000000..dc5bbe59d312 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_sqac_es.md @@ -0,0 +1,112 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering model (from IIC) +author: John Snow Labs +name: bert_qa_beto_base_spanish_sqac +date: 2023-11-14 +tags: [es, open_source, question_answering, bert, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-base-spanish-sqac` is a Spanish model originally trained by `IIC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_sqac_es_5.2.0_3.0_1699997233869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_sqac_es_5.2.0_3.0_1699997233869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_beto_base_spanish_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_beto_base_spanish_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.bert.base").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_beto_base_spanish_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/IIC/beto-base-spanish-sqac +- https://paperswithcode.com/sota?task=question-answering&dataset=PlanTL-GOB-ES%2FSQAC +- https://arxiv.org/abs/2107.07253 +- https://github.com/dccuchile/beto +- https://www.bsc.es/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_squades2_es.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_squades2_es.md new file mode 100644 index 000000000000..a444147e984a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_beto_base_spanish_squades2_es.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering Base Cased model (from inigopm) +author: John Snow Labs +name: bert_qa_beto_base_spanish_squades2 +date: 2023-11-14 +tags: [es, open_source, bert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-base-spanish-squades2` is a Spanish model originally trained by `inigopm`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_squades2_es_5.2.0_3.0_1700002133617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_squades2_es_5.2.0_3.0_1700002133617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_beto_base_spanish_squades2","es")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_beto_base_spanish_squades2","es") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_beto_base_spanish_squades2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/inigopm/beto-base-spanish-squades2 +- https://github.com/josecannete/spanish-corpora +- https://paperswithcode.com/sota?task=question-answering&dataset=squad_es+v2.0.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_en.md new file mode 100644 index 000000000000..004d22705671 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from dmis-lab) +author: John Snow Labs +name: bert_qa_biobert_base_cased_v1.1_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-base-cased-v1.1-squad` is an English model originally trained by `dmis-lab`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_en_5.2.0_3.0_1700001838261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_en_5.2.0_3.0_1700001838261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_base_cased_v1.1_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_base_cased_v1.1_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.biobert.base_cased.by_dmis-lab").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_base_cased_v1.1_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/dmis-lab/biobert-base-cased-v1.1-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert_en.md new file mode 100644 index 000000000000..c844456626a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from juliusco) +author: John Snow Labs +name: bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-base-cased-v1.1-squad-finetuned-biobert` is an English model originally trained by `juliusco`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert_en_5.2.0_3.0_1700002123070.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert_en_5.2.0_3.0_1700002123070.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.biobert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_base_cased_v1.1_squad_finetuned_biobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/juliusco/biobert-base-cased-v1.1-squad-finetuned-biobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert_en.md new file mode 100644 index 000000000000..1afee3a54cc8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from juliusco) +author: John Snow Labs +name: bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-base-cased-v1.1-squad-finetuned-covbiobert` is an English model originally trained by `juliusco`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert_en_5.2.0_3.0_1699997515541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert_en_5.2.0_3.0_1699997515541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.covid_biobert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_base_cased_v1.1_squad_finetuned_covbiobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/juliusco/biobert-base-cased-v1.1-squad-finetuned-covbiobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert_en.md new file mode 100644 index 000000000000..129803d50878 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from juliusco) +author: John Snow Labs +name: bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-base-cased-v1.1-squad-finetuned-covdrobert` is an English model originally trained by `juliusco`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert_en_5.2.0_3.0_1700002427396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert_en_5.2.0_3.0_1700002427396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.covid_roberta.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_base_cased_v1.1_squad_finetuned_covdrobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/juliusco/biobert-base-cased-v1.1-squad-finetuned-covdrobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_bioasq_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_bioasq_en.md new file mode 100644 index 000000000000..08ba3db02837 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_bioasq_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from gdario) +author: John Snow Labs +name: bert_qa_biobert_bioasq +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_bioasq` is an English model originally trained by `gdario`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_bioasq_en_5.2.0_3.0_1699999388451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_bioasq_en_5.2.0_3.0_1699999388451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_bioasq","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_bioasq","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.biobert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_bioasq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/gdario/biobert_bioasq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_large_cased_v1.1_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_large_cased_v1.1_squad_en.md new file mode 100644 index 000000000000..d1bbd37f4780 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_large_cased_v1.1_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Cased model (from dmis-lab) +author: John Snow Labs +name: bert_qa_biobert_large_cased_v1.1_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-large-cased-v1.1-squad` is an English model originally trained by `dmis-lab`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_large_cased_v1.1_squad_en_5.2.0_3.0_1700002976804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_large_cased_v1.1_squad_en_5.2.0_3.0_1700002976804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_large_cased_v1.1_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_large_cased_v1.1_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.biobert.squad.cased_large").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_large_cased_v1.1_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/dmis-lab/biobert-large-cased-v1.1-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_en.md new file mode 100644 index 000000000000..86a6c7d606a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from clagator) +author: John Snow Labs +name: bert_qa_biobert_squad2_cased +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_squad2_cased` is an English model originally trained by `clagator`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_squad2_cased_en_5.2.0_3.0_1699999675367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_squad2_cased_en_5.2.0_3.0_1699999675367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_squad2_cased","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_squad2_cased","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.biobert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_squad2_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/clagator/biobert_squad2_cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..b071e768823a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_squad2_cased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ptnv-s) +author: John Snow Labs +name: bert_qa_biobert_squad2_cased_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_squad2_cased-finetuned-squad` is an English model originally trained by `ptnv-s`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_squad2_cased_finetuned_squad_en_5.2.0_3.0_1699997805522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_squad2_cased_finetuned_squad_en_5.2.0_3.0_1699997805522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_squad2_cased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_squad2_cased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.biobert.cased.by_ptnv-s").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_squad2_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ptnv-s/biobert_squad2_cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_biomedicalquestionanswering_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_biomedicalquestionanswering_en.md new file mode 100644 index 000000000000..0e685c2866ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_biomedicalquestionanswering_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Shushant) +author: John Snow Labs +name: bert_qa_biobert_v1.1_biomedicalquestionanswering +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert-v1.1-biomedicalQuestionAnswering` is an English model originally trained by `Shushant`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_biomedicalquestionanswering_en_5.2.0_3.0_1699999913959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_biomedicalquestionanswering_en_5.2.0_3.0_1699999913959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_v1.1_biomedicalquestionanswering","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_v1.1_biomedicalquestionanswering","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.biobert.bio_medical.").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_v1.1_biomedicalquestionanswering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Shushant/biobert-v1.1-biomedicalQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_finetuned_squad_en.md new file mode 100644 index 000000000000..0d2b64fe6f65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from gerardozq) +author: John Snow Labs +name: bert_qa_biobert_v1.1_pubmed_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_v1.1_pubmed-finetuned-squad` is an English model originally trained by `gerardozq`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_pubmed_finetuned_squad_en_5.2.0_3.0_1700002425889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_pubmed_finetuned_squad_en_5.2.0_3.0_1700002425889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_v1.1_pubmed_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_v1.1_pubmed_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_pubmed.biobert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_v1.1_pubmed_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/gerardozq/biobert_v1.1_pubmed-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_squad_v2_en.md new file mode 100644 index 000000000000..d8aa557873ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobert_v1.1_pubmed_squad_v2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ktrapeznikov) +author: John Snow Labs +name: bert_qa_biobert_v1.1_pubmed_squad_v2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biobert_v1.1_pubmed_squad_v2` is an English model originally trained by `ktrapeznikov`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_pubmed_squad_v2_en_5.2.0_3.0_1700000208525.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobert_v1.1_pubmed_squad_v2_en_5.2.0_3.0_1700000208525.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobert_v1.1_pubmed_squad_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biobert_v1.1_pubmed_squad_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_pubmed.biobert.v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobert_v1.1_pubmed_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ktrapeznikov/biobert_v1.1_pubmed_squad_v2 +- https://rajpurkar.github.io/SQuAD-explorer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md new file mode 100644 index 000000000000..606c82766c64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese bert_qa_biobertpt_squad_v1.1_portuguese BertForQuestionAnswering from pucpr +author: John Snow Labs +name: bert_qa_biobertpt_squad_v1.1_portuguese +date: 2023-11-14 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_biobertpt_squad_v1.1_portuguese` is a Portuguese model originally trained by pucpr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobertpt_squad_v1.1_portuguese_pt_5.2.0_3.0_1699996422588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobertpt_squad_v1.1_portuguese_pt_5.2.0_3.0_1699996422588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Illustrative input, assumed for this example; any DataFrame with "question" and "context" columns works. +data = spark.createDataFrame([["Qual é o meu nome?", "Meu nome é Clara e moro em Berkeley."]]).toDF("question", "context") +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobertpt_squad_v1.1_portuguese","pt") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Illustrative input, assumed for this example; any DataFrame with "question" and "context" columns works. +val data = Seq(("Qual é o meu nome?", "Meu nome é Clara e moro em Berkeley.")).toDF("question", "context") +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_biobertpt_squad_v1.1_portuguese", "pt") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobertpt_squad_v1.1_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|664.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/pucpr/bioBERTpt-squad-v1.1-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bioformer_cased_v1.0_squad1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bioformer_cased_v1.0_squad1_en.md new file mode 100644 index 000000000000..41912fbbc29c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bioformer_cased_v1.0_squad1_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from bioformers) +author: John Snow Labs +name: bert_qa_bioformer_cased_v1.0_squad1 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer-cased-v1.0-squad1` is an English model originally trained by `bioformers`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bioformer_cased_v1.0_squad1_en_5.2.0_3.0_1700002661909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bioformer_cased_v1.0_squad1_en_5.2.0_3.0_1700002661909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bioformer_cased_v1.0_squad1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bioformer_cased_v1.0_squad1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bioformer.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bioformer_cased_v1.0_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|158.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bioformers/bioformer-cased-v1.0-squad1 +- https://rajpurkar.github.io/SQuAD-explorer +- https://arxiv.org/pdf/1910.01108.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_base_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_base_en.md new file mode 100644 index 000000000000..ec6f3f518624 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_base_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from healx) +author: John Snow Labs +name: bert_qa_biomedical_slot_filling_reader_base +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical-slot-filling-reader-base` is an English model originally trained by `healx`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_base_en_5.2.0_3.0_1699996757432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_base_en_5.2.0_3.0_1699996757432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biomedical_slot_filling_reader_base","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biomedical_slot_filling_reader_base","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bio_medical.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biomedical_slot_filling_reader_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/healx/biomedical-slot-filling-reader-base +- https://arxiv.org/abs/2109.08564 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_large_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_large_en.md new file mode 100644 index 000000000000..2a3adcbaa8e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_biomedical_slot_filling_reader_large_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from healx) +author: John Snow Labs +name: bert_qa_biomedical_slot_filling_reader_large +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical-slot-filling-reader-large` is an English model originally trained by `healx`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_large_en_5.2.0_3.0_1700003224568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_large_en_5.2.0_3.0_1700003224568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biomedical_slot_filling_reader_large","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biomedical_slot_filling_reader_large","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bio_medical.bert.large").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biomedical_slot_filling_reader_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/healx/biomedical-slot-filling-reader-large +- https://arxiv.org/abs/2109.08564 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_braquad_bert_qna_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_braquad_bert_qna_en.md new file mode 100644 index 000000000000..77dc455317b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_braquad_bert_qna_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from piEsposito) +author: John Snow Labs +name: bert_qa_braquad_bert_qna +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `braquad-bert-qna` is an English model originally trained by `piEsposito`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_braquad_bert_qna_en_5.2.0_3.0_1700003224896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_braquad_bert_qna_en_5.2.0_3.0_1700003224896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_braquad_bert_qna","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_braquad_bert_qna","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_piEsposito").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_braquad_bert_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/piEsposito/braquad-bert-qna +- https://github.com/piEsposito/br-quad-2.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bsnmldb_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bsnmldb_finetuned_squad_en.md new file mode 100644 index 000000000000..6c55472f5c6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_bsnmldb_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from bsnmldb) +author: John Snow Labs +name: bert_qa_bsnmldb_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `bsnmldb`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bsnmldb_finetuned_squad_en_5.2.0_3.0_1700000456714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bsnmldb_finetuned_squad_en_5.2.0_3.0_1700000456714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bsnmldb_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bsnmldb_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bsnmldb_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bsnmldb/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_case_base_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_case_base_en.md new file mode 100644 index 000000000000..049f19689290 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_case_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from srcocotero) +author: John Snow Labs +name: bert_qa_case_base +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-qa` is an English model originally trained by `srcocotero`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_case_base_en_5.2.0_3.0_1699998107954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_case_base_en_5.2.0_3.0_1699998107954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_case_base","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_case_base","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_case_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srcocotero/bert-base-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_causal_qa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_causal_qa_en.md new file mode 100644 index 000000000000..1f012b02e0d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_causal_qa_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from manav) +author: John Snow Labs +name: bert_qa_causal_qa +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `causal_qa` is an English model originally trained by `manav`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_causal_qa_en_5.2.0_3.0_1700003870784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_causal_qa_en_5.2.0_3.0_1700003870784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_causal_qa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_causal_qa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_manav").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_causal_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/manav/causal_qa +- https://github.com/kstats/CausalQG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cgt_roberta_wwm_ext_large_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cgt_roberta_wwm_ext_large_zh.md new file mode 100644 index 000000000000..302fb1563235 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cgt_roberta_wwm_ext_large_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Large Cased model (from cgt) +author: John Snow Labs +name: bert_qa_cgt_roberta_wwm_ext_large +date: 2023-11-14 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Roberta-wwm-ext-large-qa` is a Chinese model originally trained by `cgt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_cgt_roberta_wwm_ext_large_zh_5.2.0_3.0_1699998636857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_cgt_roberta_wwm_ext_large_zh_5.2.0_3.0_1699998636857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cgt_roberta_wwm_ext_large","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cgt_roberta_wwm_ext_large","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_cgt_roberta_wwm_ext_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/cgt/Roberta-wwm-ext-large-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chemical_bert_uncased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chemical_bert_uncased_squad2_en.md new file mode 100644 index 000000000000..5e6c07eaf348 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chemical_bert_uncased_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from recobo) +author: John Snow Labs +name: bert_qa_chemical_bert_uncased_squad2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemical-bert-uncased-squad2` is an English model originally trained by `recobo`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chemical_bert_uncased_squad2_en_5.2.0_3.0_1699998908659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chemical_bert_uncased_squad2_en_5.2.0_3.0_1699998908659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chemical_bert_uncased_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_chemical_bert_uncased_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_chemical.bert.uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chemical_bert_uncased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/recobo/chemical-bert-uncased-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_base_mrc_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_base_mrc_zh.md new file mode 100644 index 000000000000..274ffbf272f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_base_mrc_zh.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from hfl) +author: John Snow Labs +name: bert_qa_chinese_pert_base_mrc +date: 2023-11-14 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese-pert-base-mrc` is a Chinese model originally trained by `hfl`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_base_mrc_zh_5.2.0_3.0_1699997205949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_base_mrc_zh_5.2.0_3.0_1699997205949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pert_base_mrc","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_chinese_pert_base_mrc","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chinese_pert_base_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hfl/chinese-pert-base-mrc +- https://github.com/ymcui/PERT +- https://github.com/ymcui/Chinese-ELECTRA +- https://github.com/ymcui/Chinese-Minority-PLM +- https://github.com/ymcui/HFL-Anthology +- https://github.com/ymcui/Chinese-BERT-wwm +- https://github.com/ymcui/Chinese-XLNet +- https://github.com/airaria/TextBrewer +- https://github.com/ymcui/MacBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_mrc_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_mrc_zh.md new file mode 100644 index 000000000000..0264946e070a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_mrc_zh.md @@ -0,0 +1,116 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from hfl) +author: John Snow Labs +name: bert_qa_chinese_pert_large_mrc +date: 2023-11-14 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese-pert-large-mrc` is a Chinese model originally trained by `hfl`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_large_mrc_zh_5.2.0_3.0_1700003780458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_large_mrc_zh_5.2.0_3.0_1700003780458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pert_large_mrc","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_chinese_pert_large_mrc","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.large.by_hfl").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chinese_pert_large_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hfl/chinese-pert-large-mrc +- https://github.com/ymcui/PERT +- https://github.com/ymcui/Chinese-ELECTRA +- https://github.com/ymcui/Chinese-Minority-PLM +- https://github.com/ymcui/HFL-Anthology +- https://github.com/ymcui/Chinese-BERT-wwm +- https://github.com/ymcui/Chinese-XLNet +- https://github.com/airaria/TextBrewer +- https://github.com/ymcui/MacBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_open_domain_mrc_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_open_domain_mrc_zh.md new file mode 100644 index 000000000000..e559db969f20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pert_large_open_domain_mrc_zh.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from qalover) +author: John Snow Labs +name: bert_qa_chinese_pert_large_open_domain_mrc +date: 2023-11-14 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese-pert-large-open-domain-mrc` is a Chinese model originally trained by `qalover`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_large_open_domain_mrc_zh_5.2.0_3.0_1699999466545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pert_large_open_domain_mrc_zh_5.2.0_3.0_1699999466545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pert_large_open_domain_mrc","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pert_large_open_domain_mrc","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.large").predict("""PUT YOUR QUESTION HERE|||PUT YOUR CONTEXT HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chinese_pert_large_open_domain_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/qalover/chinese-pert-large-open-domain-mrc +- https://github.com/dbiir/UER-py/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_macbert_large_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_macbert_large_zh.md new file mode 100644 index 000000000000..cf5f8096b68a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_macbert_large_zh.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from luhua) +author: John Snow Labs +name: bert_qa_chinese_pretrain_mrc_macbert_large +date: 2023-11-14 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_pretrain_mrc_macbert_large` is a Chinese model originally trained by `luhua`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pretrain_mrc_macbert_large_zh_5.2.0_3.0_1700004370480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pretrain_mrc_macbert_large_zh_5.2.0_3.0_1700004370480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pretrain_mrc_macbert_large","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_chinese_pretrain_mrc_macbert_large","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.mac_bert.large").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chinese_pretrain_mrc_macbert_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/luhua/chinese_pretrain_mrc_macbert_large +- https://github.com/basketballandlearn/MRC_Competition_Dureader \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large_zh.md new file mode 100644 index 000000000000..183560d6b09e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large_zh.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from luhua) +author: John Snow Labs +name: bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large +date: 2023-11-14 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_pretrain_mrc_roberta_wwm_ext_large` is a Chinese model originally trained by `luhua`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large_zh_5.2.0_3.0_1700000001080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large_zh_5.2.0_3.0_1700000001080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.large.by_luhua").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_chinese_pretrain_mrc_roberta_wwm_ext_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/luhua/chinese_pretrain_mrc_roberta_wwm_ext_large +- https://github.com/basketballandlearn/MRC_Competition_Dureader \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_question_answering_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_question_answering_zh.md new file mode 100644 index 000000000000..c195789888de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinese_question_answering_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Cased model (from NchuNLP) +author: John Snow Labs +name: bert_qa_chinese_question_answering +date: 2023-11-14 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Chinese-Question-Answering` is a Chinese model originally trained by `NchuNLP`. 
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_question_answering_zh_5.2.0_3.0_1700000722663.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinese_question_answering_zh_5.2.0_3.0_1700000722663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_chinese_question_answering","zh")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_chinese_question_answering","zh")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_chinese_question_answering|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|zh|
+|Size:|381.0 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/NchuNLP/Chinese-Question-Answering
+- https://nlpnchu.org/
+- https://demo.nlpnchu.org/
+- https://github.com/NCHU-NLP-Lab
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinesebert_zh.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinesebert_zh.md
new file mode 100644
index 000000000000..a902430606f1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_chinesebert_zh.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: Chinese BertForQuestionAnswering Cased model (from dengwei072)
+author: John Snow Labs
+name: bert_qa_chinesebert
+date: 2023-11-14
+tags: [zh, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: zh
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ChineseBERT` is a Chinese model originally trained by `dengwei072`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_chinesebert_zh_5.2.0_3.0_1700000253518.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_chinesebert_zh_5.2.0_3.0_1700000253518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_chinesebert","zh")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_chinesebert","zh")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_chinesebert|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|zh|
+|Size:|381.0 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/dengwei072/ChineseBERT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covid_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covid_squad_en.md
new file mode 100644
index 000000000000..48e421202de0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covid_squad_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from graviraja)
+author: John Snow Labs
+name: bert_qa_covid_squad
+date: 2023-11-14
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_squad` is an English model originally trained by `graviraja`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_covid_squad_en_5.2.0_3.0_1699997520672.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_covid_squad_en_5.2.0_3.0_1699997520672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_covid_squad","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_covid_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad_covid.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_covid_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.1 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/graviraja/covid_squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covidbert_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covidbert_squad_en.md
new file mode 100644
index 000000000000..029ed8b23018
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_covidbert_squad_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from graviraja)
+author: John Snow Labs
+name: bert_qa_covidbert_squad
+date: 2023-11-14
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covidbert_squad` is an English model originally trained by `graviraja`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_covidbert_squad_en_5.2.0_3.0_1700004666926.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_covidbert_squad_en_5.2.0_3.0_1700004666926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_covidbert_squad","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_covidbert_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.covid_bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_covidbert_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.1 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/graviraja/covidbert_squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_csarron_bert_base_uncased_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_csarron_bert_base_uncased_squad_v1_en.md
new file mode 100644
index 000000000000..971e4b8c6525
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_csarron_bert_base_uncased_squad_v1_en.md
@@ -0,0 +1,114 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from csarron)
+author: John Snow Labs
+name: bert_qa_csarron_bert_base_uncased_squad_v1
+date: 2023-11-14
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad-v1` is an English model originally trained by `csarron`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_csarron_bert_base_uncased_squad_v1_en_5.2.0_3.0_1700004972342.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_csarron_bert_base_uncased_squad_v1_en_5.2.0_3.0_1700004972342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_csarron_bert_base_uncased_squad_v1","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(False)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_csarron_bert_base_uncased_squad_v1","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(false)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.bert.base_uncased.by_csarron").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_csarron_bert_base_uncased_squad_v1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|407.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/csarron/bert-base-uncased-squad-v1
+- https://twitter.com/sysnlp
+- https://awk.ai/
+- https://github.com/csarron
+- https://www.aclweb.org/anthology/N19-1423/
+- https://rajpurkar.github.io/SQuAD-explorer
+- https://www.aclweb.org/anthology/N19-1423.pdf
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_bad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_bad_en.md
new file mode 100644
index 000000000000..c6ff6bb5182c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_bad_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English BertForQuestionAnswering Cased model (from beautifulpichai)
+author: John Snow Labs
+name: bert_qa_cuad_pol_bad
+date: 2023-11-14
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cuad_pol_bad` is an English model originally trained by `beautifulpichai`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_cuad_pol_bad_en_5.2.0_3.0_1700001346182.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_cuad_pol_bad_en_5.2.0_3.0_1700001346182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cuad_pol_bad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cuad_pol_bad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_cuad_pol_bad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.2 GB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/beautifulpichai/cuad_pol_bad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_good_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_good_en.md
new file mode 100644
index 000000000000..5ea950297635
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cuad_pol_good_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English BertForQuestionAnswering Cased model (from beautifulpichai)
+author: John Snow Labs
+name: bert_qa_cuad_pol_good
+date: 2023-11-14
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cuad_pol_good` is an English model originally trained by `beautifulpichai`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_cuad_pol_good_en_5.2.0_3.0_1700000901597.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_cuad_pol_good_en_5.2.0_3.0_1700000901597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cuad_pol_good","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cuad_pol_good","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_cuad_pol_good|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.2 GB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/beautifulpichai/cuad_pol_good
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cyrusmv_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cyrusmv_finetuned_squad_en.md
new file mode 100644
index 000000000000..a25b02a08e42
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_cyrusmv_finetuned_squad_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English BertForQuestionAnswering Cased model (from cyrusmv)
+author: John Snow Labs
+name: bert_qa_cyrusmv_finetuned_squad
+date: 2023-11-14
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `cyrusmv`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_cyrusmv_finetuned_squad_en_5.2.0_3.0_1700004138417.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_cyrusmv_finetuned_squad_en_5.2.0_3.0_1700004138417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cyrusmv_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_cyrusmv_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_cyrusmv_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/cyrusmv/bert-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_danish_bert_botxo_qa_squad_da.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_danish_bert_botxo_qa_squad_da.md
new file mode 100644
index 000000000000..fe1ba0a3b1b4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_danish_bert_botxo_qa_squad_da.md
@@ -0,0 +1,111 @@
+---
+layout: model
+title: Danish BertForQuestionAnswering model (from jacobshein)
+author: John Snow Labs
+name: bert_qa_danish_bert_botxo_qa_squad
+date: 2023-11-14
+tags: [da, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: da
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish-bert-botxo-qa-squad` is a Danish model originally trained by `jacobshein`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_danish_bert_botxo_qa_squad_da_5.2.0_3.0_1700001626558.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_danish_bert_botxo_qa_squad_da_5.2.0_3.0_1700001626558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_danish_bert_botxo_qa_squad","da") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline().setStages([
+    document_assembler,
+    spanClassifier
+])
+
+example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document = new MultiDocumentAssembler()
+    .setInputCols("question", "context")
+    .setOutputCols("document_question", "document_context")
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("bert_qa_danish_bert_botxo_qa_squad","da")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+    .setMaxSentenceLength(512)
+
+val pipeline = new Pipeline().setStages(Array(document, spanClassifier))
+
+val example = Seq(
+    ("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."),
+    ("What's my name?", "My name is Clara and I live in Berkeley."))
+    .toDF("question", "context")
+
+val result = pipeline.fit(example).transform(example)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("da.answer_question.squad.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_danish_bert_botxo_qa_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|da|
+|Size:|412.3 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/jacobshein/danish-bert-botxo-qa-squad
+- https://jacobhein.com/#contact
+- https://github.com/botxo/nordic_bert
+- https://github.com/ccasimiro88/TranslateAlignRetrieve/tree/multilingual/squads-tar/da
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_darshana1406_base_multilingual_cased_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_darshana1406_base_multilingual_cased_finetuned_squad_xx.md
new file mode 100644
index 000000000000..ef70182fedf4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_darshana1406_base_multilingual_cased_finetuned_squad_xx.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: Multilingual BertForQuestionAnswering Base Cased model (from darshana1406)
+author: John Snow Labs
+name: bert_qa_darshana1406_base_multilingual_cased_finetuned_squad
+date: 2023-11-14
+tags: [xx, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-squad` is a Multilingual model originally trained by `darshana1406`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_darshana1406_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700001977877.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_darshana1406_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700001977877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_darshana1406_base_multilingual_cased_finetuned_squad","xx")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_darshana1406_base_multilingual_cased_finetuned_squad","xx")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_darshana1406_base_multilingual_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/darshana1406/bert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dbg_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dbg_finetuned_squad_en.md new file mode 100644 index 000000000000..520051cb620b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dbg_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Shanny) +author: John Snow Labs +name: bert_qa_dbg_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbgbert-finetuned-squad` is an English model originally trained by `Shanny`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dbg_finetuned_squad_en_5.2.0_3.0_1700002317552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dbg_finetuned_squad_en_5.2.0_3.0_1700002317552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dbg_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dbg_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_dbg_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Shanny/dbgbert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deberta_v3_base_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deberta_v3_base_en.md new file mode 100644 index 000000000000..60e852afc94d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deberta_v3_base_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from vvincentt) +author: John Snow Labs +name: bert_qa_deberta_v3_base +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deberta-v3-base` is an English model originally trained by `vvincentt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deberta_v3_base_en_5.2.0_3.0_1700005203611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deberta_v3_base_en_5.2.0_3.0_1700005203611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deberta_v3_base","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deberta_v3_base","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deberta_v3_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vvincentt/deberta-v3-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_debug_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_debug_squad_en.md new file mode 100644 index 000000000000..687023d74f7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_debug_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ArpanZS) +author: John Snow Labs +name: bert_qa_debug_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `debug_squad` is an English model originally trained by `ArpanZS`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_debug_squad_en_5.2.0_3.0_1700005482684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_debug_squad_en_5.2.0_3.0_1700005482684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_debug_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_debug_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_ArpanZS").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_debug_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ArpanZS/debug_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_2_ru.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_2_ru.md new file mode 100644 index 000000000000..1aa0189a11f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_2_ru.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Russian BertForQuestionAnswering Cased model (from ruselkomp) +author: John Snow Labs +name: bert_qa_deep_pavlov_full_2 +date: 2023-11-14 +tags: [ru, open_source, bert, question_answering, onnx] +task: Question Answering +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deep-pavlov-full-2` is a Russian model originally trained by `ruselkomp`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deep_pavlov_full_2_ru_5.2.0_3.0_1700005836423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deep_pavlov_full_2_ru_5.2.0_3.0_1700005836423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deep_pavlov_full_2","ru") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Как меня зовут?", "Меня зовут Клара, и я живу в Беркли."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deep_pavlov_full_2","ru") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Как меня зовут?", "Меня зовут Клара, и я живу в Беркли.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deep_pavlov_full_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ru| +|Size:|664.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ruselkomp/deep-pavlov-full-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_ru.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_ru.md new file mode 100644 index 000000000000..dc5934f45de4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deep_pavlov_full_ru.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Russian BertForQuestionAnswering Cased model (from ruselkomp) +author: John Snow Labs +name: bert_qa_deep_pavlov_full +date: 2023-11-14 +tags: [ru, open_source, bert, question_answering, onnx] +task: Question Answering +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deep-pavlov-full` is a Russian model originally trained by `ruselkomp`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deep_pavlov_full_ru_5.2.0_3.0_1700002688118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deep_pavlov_full_ru_5.2.0_3.0_1700002688118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deep_pavlov_full","ru") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Как меня зовут?", "Меня зовут Клара, и я живу в Беркли."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deep_pavlov_full","ru") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Как меня зовут?", "Меня зовут Клара, и я живу в Беркли.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deep_pavlov_full| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ru| +|Size:|664.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ruselkomp/deep-pavlov-full \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_bert_base_uncased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_bert_base_uncased_squad2_en.md new file mode 100644 index 000000000000..f50405c834a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_bert_base_uncased_squad2_en.md @@ -0,0 +1,118 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_deepset_bert_base_uncased_squad2 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad2` is an English model originally trained by `deepset`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_bert_base_uncased_squad2_en_5.2.0_3.0_1700001241191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_bert_base_uncased_squad2_en_5.2.0_3.0_1700001241191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deepset_bert_base_uncased_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_deepset_bert_base_uncased_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_bert_base_uncased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/bert-base-uncased-squad2 +- https://github.com/deepset-ai/haystack/discussions +- https://deepset.ai +- https://twitter.com/deepset_ai +- http://www.deepset.ai/jobs +- https://haystack.deepset.ai/community/join +- https://github.com/deepset-ai/haystack/ +- https://deepset.ai/german-bert +- https://www.linkedin.com/company/deepset-ai/ +- https://github.com/deepset-ai/FARM +- https://deepset.ai/germanquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4_en.md new file mode 100644 index 000000000000..ce91b0591cd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`deepset-minilm-uncased-squad2-orkg-how-1e-4` is an English model originally trained by `Moussab`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4_en_5.2.0_3.0_1700002894184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4_en_5.2.0_3.0_1700002894184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_how_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-how-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05_en.md new file mode 100644 index 000000000000..e96d7f9b812b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset-minilm-uncased-squad2-orkg-how-5e-05` is an English model originally trained by `Moussab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05_en_5.2.0_3.0_1700001415991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05_en_5.2.0_3.0_1700001415991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_how_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-how-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4_en.md new file mode 100644 index 000000000000..b0092ca467b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4 BertForQuestionAnswering from Moussab +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4` is an English model originally trained by Moussab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4_en_5.2.0_3.0_1700005977978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4_en_5.2.0_3.0_1700005977978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| + +## References + +https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-no-label-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05_en.md new file mode 100644 index 000000000000..54f1ceda2095 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05 BertForQuestionAnswering from Moussab +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05` is an English model originally trained by Moussab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05_en_5.2.0_3.0_1700003005536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05_en_5.2.0_3.0_1700003005536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data` is assumed to be a DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_norwegian_label_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| + +## References + +https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-no-label-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4_en.md new file mode 100644 index 000000000000..63c9ff642ef5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset-minilm-uncased-squad2-orkg-what-1e-4` is an English model originally trained by `Moussab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4_en_5.2.0_3.0_1700003146226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4_en_5.2.0_3.0_1700003146226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_what_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-what-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05_en.md new file mode 100644 index 000000000000..f1f94ed61724 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset-minilm-uncased-squad2-orkg-what-5e-05` is an English model originally trained by `Moussab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05_en_5.2.0_3.0_1700004329305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05_en_5.2.0_3.0_1700004329305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_what_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-what-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4_en.md new file mode 100644 index 000000000000..9f4df7d88b40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset-minilm-uncased-squad2-orkg-which-1e-4` is an English model originally trained by `Moussab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4_en_5.2.0_3.0_1700006148238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4_en_5.2.0_3.0_1700006148238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_which_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-which-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05_en.md new file mode 100644 index 000000000000..5bb4f65f4162 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from Moussab) +author: John Snow Labs +name: bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset-minilm-uncased-squad2-orkg-which-5e-05` is an English model originally trained by `Moussab`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05_en_5.2.0_3.0_1700004505412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05_en_5.2.0_3.0_1700004505412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_deepset_minilm_uncased_squad2_orkg_which_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Moussab/deepset-minilm-uncased-squad2-orkg-which-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_demo_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_demo_en.md new file mode 100644 index 000000000000..73ce15699db1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_demo_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from internetoftim) +author: John Snow Labs +name: bert_qa_demo +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo` is an English model originally trained by `internetoftim`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_demo_en_5.2.0_3.0_1700002057115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_demo_en_5.2.0_3.0_1700002057115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_demo","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_demo","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_internetoftim").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_demo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|797.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/internetoftim/demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_base_uncased_finetuned_custom_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_base_uncased_finetuned_custom_en.md new file mode 100644 index 000000000000..84546f5cf415 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_base_uncased_finetuned_custom_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from kamilali) +author: John Snow Labs +name: bert_qa_distilbert_base_uncased_finetuned_custom +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-custom` is an English model originally trained by `kamilali`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_distilbert_base_uncased_finetuned_custom_en_5.2.0_3.0_1700002555606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_distilbert_base_uncased_finetuned_custom_en_5.2.0_3.0_1700002555606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_distilbert_base_uncased_finetuned_custom","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_distilbert_base_uncased_finetuned_custom","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.distilled_base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_distilbert_base_uncased_finetuned_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kamilali/distilbert-base-uncased-finetuned-custom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_turkish_q_a_tr.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_turkish_q_a_tr.md new file mode 100644 index 000000000000..1d777e143b4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distilbert_turkish_q_a_tr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Turkish bert_qa_distilbert_turkish_q_a BertForQuestionAnswering from emre +author: John Snow Labs +name: bert_qa_distilbert_turkish_q_a +date: 2023-11-14 +tags: [bert, tr, open_source, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_distilbert_turkish_q_a` is a Turkish model originally trained by emre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_distilbert_turkish_q_a_tr_5.2.0_3.0_1699997767429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_distilbert_turkish_q_a_tr_5.2.0_3.0_1699997767429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data` is assumed to be a DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_distilbert_turkish_q_a","tr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_distilbert_turkish_q_a", "tr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_distilbert_turkish_q_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.0 MB| + +## References + +https://huggingface.co/emre/distilbert-tr-q-a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488_es.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488_es.md new file mode 100644 index 000000000000..6369c5ff8dcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488 BertForQuestionAnswering from mrm8488 +author: John Snow Labs +name: bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488 +date: 2023-11-14 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488` is a Castilian, Spanish model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488_es_5.2.0_3.0_1700004669248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488_es_5.2.0_3.0_1700004669248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data` is assumed to be a DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488","es") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488", "es") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_distill_bert_base_spanish_wwm_cased_finetuned_spa_squad2_spanish_mrm8488| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| + +## References + +https://huggingface.co/mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad_en.md new file mode 100644 index 000000000000..3ef2332cde79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from DL4NLP-Group11) +author: John Snow Labs +name: bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-squad` is an English model originally trained by `DL4NLP-Group11`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad_en_5.2.0_3.0_1700004805396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad_en_5.2.0_3.0_1700004805396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
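The `answer` column produced above holds a span of text taken from the context. As a rough illustration of how an extractive question-answering head typically picks that span, the sketch below scores every (start, end) token pair and keeps the best one. It is not Spark NLP's internal code: the function name `best_answer_span`, the `max_answer_len` parameter, the whitespace tokens, and all scores are made up for this example.

```python
from typing import List, Tuple


def best_answer_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 15,
) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_len tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        # Limit the search to spans of at most max_answer_len tokens.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Hypothetical whitespace tokens and invented scores, for illustration only.
tokens = "My name is Clara and I live in Berkeley .".split()
start_scores = [0.1, 0.2, 0.1, 4.0, 0.0, 0.3, 0.1, 0.2, 1.5, 0.0]
end_scores = [0.0, 0.1, 0.2, 3.5, 0.1, 0.0, 0.2, 0.1, 2.0, 0.1]

start, end = best_answer_span(start_scores, end_scores)
answer = " ".join(tokens[start : end + 1])
print(answer)
```

Searching over pairs rather than taking two independent maxima avoids returning a span whose end precedes its start.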
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_dl4nlp_group11_xtremedistil_l6_h256_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/DL4NLP-Group11/xtremedistil-l6-h256-uncased-squad +- https://github.com/mrqa/MRQA-Shared-Task-2019 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dry_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dry_finetuned_squad_en.md new file mode 100644 index 000000000000..f647e471fb32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dry_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from DrY) +author: John Snow Labs +name: bert_qa_dry_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `DrY`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dry_finetuned_squad_en_5.2.0_3.0_1700006372734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dry_finetuned_squad_en_5.2.0_3.0_1700006372734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dry_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dry_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_dry_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/DrY/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dylan1999_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dylan1999_finetuned_squad_en.md new file mode 100644 index 000000000000..e0203a4be669 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_dylan1999_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Dylan1999) +author: John Snow Labs +name: bert_qa_dylan1999_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Dylan1999`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dylan1999_finetuned_squad_en_5.2.0_3.0_1700005104544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dylan1999_finetuned_squad_en_5.2.0_3.0_1700005104544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dylan1999_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dylan1999_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_dylan1999_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Dylan1999/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fabianwillner_base_uncased_finetuned_trivia_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fabianwillner_base_uncased_finetuned_trivia_en.md new file mode 100644 index 000000000000..8ce9f5fea752 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fabianwillner_base_uncased_finetuned_trivia_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from FabianWillner) +author: John Snow Labs +name: bert_qa_fabianwillner_base_uncased_finetuned_trivia +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-triviaqa` is an English model originally trained by `FabianWillner`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fabianwillner_base_uncased_finetuned_trivia_en_5.2.0_3.0_1699998057296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fabianwillner_base_uncased_finetuned_trivia_en_5.2.0_3.0_1699998057296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fabianwillner_base_uncased_finetuned_trivia","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fabianwillner_base_uncased_finetuned_trivia","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fabianwillner_base_uncased_finetuned_trivia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FabianWillner/bert-base-uncased-finetuned-triviaqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_faquad_base_portuguese_cased_pt.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_faquad_base_portuguese_cased_pt.md new file mode 100644 index 000000000000..8f29534636df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_faquad_base_portuguese_cased_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese BertForQuestionAnswering Base Cased model (from eraldoluis) +author: John Snow Labs +name: bert_qa_faquad_base_portuguese_cased +date: 2023-11-14 +tags: [pt, open_source, bert, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `faquad-bert-base-portuguese-cased` is a Portuguese model originally trained by `eraldoluis`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_faquad_base_portuguese_cased_pt_5.2.0_3.0_1700003519252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_faquad_base_portuguese_cased_pt_5.2.0_3.0_1700003519252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_faquad_base_portuguese_cased","pt")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_faquad_base_portuguese_cased","pt") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_faquad_base_portuguese_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/eraldoluis/faquad-bert-base-portuguese-cased +- https://paperswithcode.com/sota?task=Extractive+Question-Answering&dataset=FaQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fewrel_zero_shot_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fewrel_zero_shot_en.md new file mode 100644 index 000000000000..19ac5324628b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fewrel_zero_shot_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from fractalego) +author: John Snow Labs +name: bert_qa_fewrel_zero_shot +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fewrel-zero-shot` is an English model originally trained by `fractalego`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fewrel_zero_shot_en_5.2.0_3.0_1700004037037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fewrel_zero_shot_en_5.2.0_3.0_1700004037037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fewrel_zero_shot","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_fewrel_zero_shot","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.zero_shot").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
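The NLU call above passes the question and the context as a single string joined by `|||`. A small helper can keep that format consistent when queries are built from separate fields. This is only a sketch: `build_qa_query` and `split_qa_query` are hypothetical names introduced here, not part of the `nlu` package.

```python
from typing import Tuple

SEPARATOR = "|||"  # delimiter used in the predict() string in this card


def build_qa_query(question: str, context: str) -> str:
    """Join a question and its context into one 'question|||context' string."""
    if SEPARATOR in question or SEPARATOR in context:
        raise ValueError("question and context must not contain '|||'")
    return f"{question.strip()}{SEPARATOR}{context.strip()}"


def split_qa_query(query: str) -> Tuple[str, str]:
    """Split a 'question|||context' string back into its two parts."""
    question, sep, context = query.partition(SEPARATOR)
    if not sep:
        raise ValueError("query does not contain the '|||' separator")
    return question, context


query = build_qa_query("What's my name?", "My name is Clara and I live in Berkeley.")
print(query)
```

The resulting string can then be passed to `predict` in place of a hand-written literal, which avoids stray quotes inside the query.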
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fewrel_zero_shot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/fractalego/fewrel-zero-shot +- https://www.aclweb.org/anthology/2020.coling-main.124 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_financial_v2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_financial_v2_en.md new file mode 100644 index 000000000000..9bc9c2cf36fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_financial_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from anablasi) +author: John Snow Labs +name: bert_qa_financial_v2 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_financial_v2` is an English model originally trained by `anablasi`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_financial_v2_en_5.2.0_3.0_1700002846482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_financial_v2_en_5.2.0_3.0_1700002846482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_financial_v2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_financial_v2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_financial_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anablasi/qa_financial_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_squad_aip_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_squad_aip_en.md new file mode 100644 index 000000000000..6ed8e11a8d74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_squad_aip_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Kutay) +author: John Snow Labs +name: bert_qa_fine_tuned_squad_aip +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_squad_aip` is an English model originally trained by `Kutay`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fine_tuned_squad_aip_en_5.2.0_3.0_1700004346563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fine_tuned_squad_aip_en_5.2.0_3.0_1700004346563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fine_tuned_squad_aip","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_fine_tuned_squad_aip","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_Kutay").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fine_tuned_squad_aip| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Kutay/fine_tuned_squad_aip \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_tweetqa_aip_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_tweetqa_aip_en.md new file mode 100644 index 000000000000..a593f1ec4817 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fine_tuned_tweetqa_aip_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Kutay) +author: John Snow Labs +name: bert_qa_fine_tuned_tweetqa_aip +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_tweetqa_aip` is an English model originally trained by `Kutay`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fine_tuned_tweetqa_aip_en_5.2.0_3.0_1699998354333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fine_tuned_tweetqa_aip_en_5.2.0_3.0_1699998354333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fine_tuned_tweetqa_aip","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_fine_tuned_tweetqa_aip","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fine_tuned_tweetqa_aip| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Kutay/fine_tuned_tweetqa_aip \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_bert_base_v1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_bert_base_v1_en.md new file mode 100644 index 000000000000..e88d596accca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_bert_base_v1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peggyhuang) +author: John Snow Labs +name: bert_qa_finetune_bert_base_v1 +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune-bert-base-v1` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v1_en_5.2.0_3.0_1700003144412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v1_en_5.2.0_3.0_1700003144412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetune_bert_base_v1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_finetune_bert_base_v1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base.by_peggyhuang").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetune_bert_base_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/finetune-bert-base-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_scibert_v2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_scibert_v2_en.md new file mode 100644 index 000000000000..7b048206f09a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetune_scibert_v2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from peggyhuang) +author: John Snow Labs +name: bert_qa_finetune_scibert_v2 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune-SciBert-v2` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_scibert_v2_en_5.2.0_3.0_1700003411813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_scibert_v2_en_5.2.0_3.0_1700003411813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetune_scibert_v2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetune_scibert_v2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.scibert.scibert.v2").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetune_scibert_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/finetune-SciBert-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_2_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_2_en.md new file mode 100644 index 000000000000..2234dd94b7be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from VedantS01) +author: John Snow Labs +name: bert_qa_finetuned_custom_2 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-custom-2` is an English model originally trained by `VedantS01`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_2_en_5.2.0_3.0_1700003752620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_2_en_5.2.0_3.0_1700003752620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom_2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetuned_custom_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/VedantS01/bert-finetuned-custom-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_en.md new file mode 100644 index 000000000000..9af8271ddf15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_custom_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from VedantS01) +author: John Snow Labs +name: bert_qa_finetuned_custom +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-custom` is an English model originally trained by `VedantS01`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_en_5.2.0_3.0_1700005469442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_en_5.2.0_3.0_1700005469442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetuned_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/VedantS01/bert-finetuned-custom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_squad_transformerfrozen_testtoken_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_squad_transformerfrozen_testtoken_en.md new file mode 100644 index 000000000000..fd4b5191cca8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_squad_transformerfrozen_testtoken_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from DaisyMak) +author: John Snow Labs +name: bert_qa_finetuned_squad_transformerfrozen_testtoken +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-transformerfrozen-testtoken` is an English model originally trained by `DaisyMak`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_squad_transformerfrozen_testtoken_en_5.2.0_3.0_1700005818068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_squad_transformerfrozen_testtoken_en_5.2.0_3.0_1700005818068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetuned_squad_transformerfrozen_testtoken","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetuned_squad_transformerfrozen_testtoken","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_DaisyMak").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetuned_squad_transformerfrozen_testtoken| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/DaisyMak/bert-finetuned-squad-transformerfrozen-testtoken \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_uia_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_uia_en.md new file mode 100644 index 000000000000..20dabb53f725 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_finetuned_uia_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from eibakke) +author: John Snow Labs +name: bert_qa_finetuned_uia +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-uia` is an English model originally trained by `eibakke`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_uia_en_5.2.0_3.0_1700004690933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_uia_en_5.2.0_3.0_1700004690933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_uia","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_uia","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetuned_uia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/eibakke/bert-finetuned-uia \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_firmanindolanguagemodel_id.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_firmanindolanguagemodel_id.md new file mode 100644 index 000000000000..35eb8791908e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_firmanindolanguagemodel_id.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Indonesian BertForQuestionAnswering Cased model (from FirmanBr) +author: John Snow Labs +name: bert_qa_firmanindolanguagemodel +date: 2023-11-14 +tags: [id, open_source, bert, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `FirmanIndoLanguageModel` is an Indonesian model originally trained by `FirmanBr`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_firmanindolanguagemodel_id_5.2.0_3.0_1700004939155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_firmanindolanguagemodel_id_5.2.0_3.0_1700004939155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_firmanindolanguagemodel","id") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Siapa namaku?", "Nama saya Clara dan saya tinggal di Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_firmanindolanguagemodel","id") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Siapa namaku?", "Nama saya Clara dan saya tinggal di Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.answer_question.bert.lang").predict("""Siapa namaku?|||Nama saya Clara dan saya tinggal di Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_firmanindolanguagemodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FirmanBr/FirmanIndoLanguageModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa_en.md new file mode 100644 index 000000000000..d19ae638daf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa` is an English model originally trained by AnonymousSub. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700005142570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700005142570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_bert_ft_nepal_bhasa_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/AnonymousSub/fpdm_bert_FT_new_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa_en.md new file mode 100644 index 000000000000..4e585185eb8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa` is an English model originally trained by AnonymousSub. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700005999345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700005999345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_hier_bert_ft_nepal_bhasa_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/AnonymousSub/fpdm_hier_bert_FT_new_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_newsqa_en.md new file mode 100644 index 000000000000..ba8855c0022d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_hier_bert_ft_newsqa_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_fpdm_hier_bert_ft_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_hier_bert_ft_newsqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_hier_bert_ft_newsqa` is an English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_hier_bert_ft_newsqa_en_5.2.0_3.0_1700004121609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_hier_bert_ft_newsqa_en_5.2.0_3.0_1700004121609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_hier_bert_ft_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_hier_bert_ft_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_hier_bert_ft_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/AnonymousSub/fpdm_hier_bert_FT_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa_en.md new file mode 100644 index 000000000000..1265b35e8e53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa` is an English model originally trained by AnonymousSub. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700006188989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700006188989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_triplet_bert_ft_nepal_bhasa_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/AnonymousSub/fpdm_triplet_bert_FT_new_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_newsqa_en.md new file mode 100644 index 000000000000..8a99d1b4b622 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_fpdm_triplet_bert_ft_newsqa_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_fpdm_triplet_bert_ft_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_triplet_bert_ft_newsqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_triplet_bert_ft_newsqa` is an English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_triplet_bert_ft_newsqa_en_5.2.0_3.0_1700005397148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_triplet_bert_ft_newsqa_en_5.2.0_3.0_1700005397148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_triplet_bert_ft_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_triplet_bert_ft_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_triplet_bert_ft_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/AnonymousSub/fpdm_triplet_bert_FT_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hebert_finetuned_hebrew_squad_he.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hebert_finetuned_hebrew_squad_he.md new file mode 100644 index 000000000000..d76340e5aca6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hebert_finetuned_hebrew_squad_he.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Hebrew BertForQuestionAnswering model (from tdklab) +author: John Snow Labs +name: bert_qa_hebert_finetuned_hebrew_squad +date: 2023-11-14 +tags: [he, open_source, question_answering, bert, onnx] +task: Question Answering +language: he +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hebert-finetuned-hebrew-squad` is a Hebrew model originally trained by `tdklab`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hebert_finetuned_hebrew_squad_he_5.2.0_3.0_1700004422238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hebert_finetuned_hebrew_squad_he_5.2.0_3.0_1700004422238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_hebert_finetuned_hebrew_squad","he") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_hebert_finetuned_hebrew_squad","he") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("he.answer_question.squad.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hebert_finetuned_hebrew_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|he| +|Size:|408.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tdklab/hebert-finetuned-hebrew-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hendrixcosta_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hendrixcosta_en.md new file mode 100644 index 000000000000..d3a51f983ed0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hendrixcosta_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from hendrixcosta) +author: John Snow Labs +name: bert_qa_hendrixcosta +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hendrixcosta` is an English model originally trained by `hendrixcosta`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hendrixcosta_en_5.2.0_3.0_1700005668071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hendrixcosta_en_5.2.0_3.0_1700005668071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_hendrixcosta","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_hendrixcosta","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_hendrixcosta").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hendrixcosta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|404.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hendrixcosta/hendrixcosta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hf_internal_testing_tiny_random_forquestionanswering_ja.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hf_internal_testing_tiny_random_forquestionanswering_ja.md new file mode 100644 index 000000000000..523a3ab58c14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hf_internal_testing_tiny_random_forquestionanswering_ja.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Japanese BertForQuestionAnswering Tiny Cased model (from hf-internal-testing) +author: John Snow Labs +name: bert_qa_hf_internal_testing_tiny_random_forquestionanswering +date: 2023-11-14 +tags: [ja, open_source, bert, question_answering, onnx] +task: Question Answering +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-random-BertForQuestionAnswering` is a Japanese model originally trained by `hf-internal-testing`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hf_internal_testing_tiny_random_forquestionanswering_ja_5.2.0_3.0_1700006333001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hf_internal_testing_tiny_random_forquestionanswering_ja_5.2.0_3.0_1700006333001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_hf_internal_testing_tiny_random_forquestionanswering","ja")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_hf_internal_testing_tiny_random_forquestionanswering","ja") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hf_internal_testing_tiny_random_forquestionanswering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ja| +|Size:|346.4 KB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hf-internal-testing/tiny-random-BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hkhkhkhk_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hkhkhkhk_finetuned_squad_en.md new file mode 100644 index 000000000000..507c72cd5b5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hkhkhkhk_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from HKHKHKHK) +author: John Snow Labs +name: bert_qa_hkhkhkhk_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `HKHKHKHK`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hkhkhkhk_finetuned_squad_en_5.2.0_3.0_1699998618156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hkhkhkhk_finetuned_squad_en_5.2.0_3.0_1699998618156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_hkhkhkhk_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_hkhkhkhk_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hkhkhkhk_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/HKHKHKHK/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huawei_noahtiny_general_6l_768_hotpot_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huawei_noahtiny_general_6l_768_hotpot_en.md new file mode 100644 index 000000000000..4aa9c963426a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huawei_noahtiny_general_6l_768_hotpot_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from DL4NLP-Group4) +author: John Snow Labs +name: bert_qa_huawei_noahtiny_general_6l_768_hotpot +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `huawei-noahTinyBERT_General_6L_768_HotpotQA` is an English model originally trained by `DL4NLP-Group4`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_huawei_noahtiny_general_6l_768_hotpot_en_5.2.0_3.0_1699999043426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_huawei_noahtiny_general_6l_768_hotpot_en_5.2.0_3.0_1699999043426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_huawei_noahtiny_general_6l_768_hotpot","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_huawei_noahtiny_general_6l_768_hotpot","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_huawei_noahtiny_general_6l_768_hotpot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|248.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/DL4NLP-Group4/huawei-noahTinyBERT_General_6L_768_HotpotQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..3b759547932e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_accelerate_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from huggingface-course) +author: John Snow Labs +name: bert_qa_huggingface_course_bert_finetuned_squad_accelerate +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `huggingface-course`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_huggingface_course_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700006192068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_huggingface_course_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700006192068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_huggingface_course_bert_finetuned_squad_accelerate","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_huggingface_course_bert_finetuned_squad_accelerate","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.accelerate.by_huggingface-course").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_huggingface_course_bert_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/huggingface-course/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..8d5d3fbef46d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_huggingface_course_bert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from huggingface-course) +author: John Snow Labs +name: bert_qa_huggingface_course_bert_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `huggingface-course`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_huggingface_course_bert_finetuned_squad_en_5.2.0_3.0_1700005924998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_huggingface_course_bert_finetuned_squad_en_5.2.0_3.0_1700005924998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_huggingface_course_bert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_huggingface_course_bert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_huggingface-course").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_huggingface_course_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/huggingface-course/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hungarian_fine_tuned_hungarian_squadv2_hu.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hungarian_fine_tuned_hungarian_squadv2_hu.md new file mode 100644 index 000000000000..c18ef79d3c60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_hungarian_fine_tuned_hungarian_squadv2_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian bert_qa_hungarian_fine_tuned_hungarian_squadv2 BertForQuestionAnswering from mcsabai +author: John Snow Labs +name: bert_qa_hungarian_fine_tuned_hungarian_squadv2 +date: 2023-11-14 +tags: [bert, hu, open_source, question_answering, onnx] +task: Question Answering +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_hungarian_fine_tuned_hungarian_squadv2` is a Hungarian model originally trained by mcsabai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hungarian_fine_tuned_hungarian_squadv2_hu_5.2.0_3.0_1699998824803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hungarian_fine_tuned_hungarian_squadv2_hu_5.2.0_3.0_1699998824803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_hungarian_fine_tuned_hungarian_squadv2","hu") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_hungarian_fine_tuned_hungarian_squadv2", "hu") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hungarian_fine_tuned_hungarian_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hu| +|Size:|412.4 MB| + +## References + +https://huggingface.co/mcsabai/huBert-fine-tuned-hungarian-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ixambert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ixambert_finetuned_squad_en.md new file mode 100644 index 000000000000..22ce3873434d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ixambert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MarcBrun) +author: John Snow Labs +name: bert_qa_ixambert_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ixambert-finetuned-squad` is an English model originally trained by `MarcBrun`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ixambert_finetuned_squad_en_5.2.0_3.0_1700004747005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ixambert_finetuned_squad_en_5.2.0_3.0_1700004747005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ixambert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_ixambert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.ixam_bert.by_MarcBrun").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ixambert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|661.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MarcBrun/ixambert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_jatinshah_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_jatinshah_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..c0eb458433f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_jatinshah_bert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from jatinshah) +author: John Snow Labs +name: bert_qa_jatinshah_bert_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `jatinshah`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_jatinshah_bert_finetuned_squad_en_5.2.0_3.0_1700005039392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_jatinshah_bert_finetuned_squad_en_5.2.0_3.0_1700005039392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_jatinshah_bert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_jatinshah_bert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_jatinshah").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_jatinshah_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jatinshah/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kd_squad1.1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kd_squad1.1_en.md new file mode 100644 index 000000000000..5194f4d09319 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kd_squad1.1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from maroo93) +author: John Snow Labs +name: bert_qa_kd_squad1.1 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kd_squad1.1` is an English model originally trained by `maroo93`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kd_squad1.1_en_5.2.0_3.0_1700005266222.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kd_squad1.1_en_5.2.0_3.0_1700005266222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kd_squad1.1","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kd_squad1.1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kd_squad1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|249.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/maroo93/kd_squad1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_accelera_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_accelera_en.md new file mode 100644 index 000000000000..3c577cb74e21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_accelera_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from KFlash) +author: John Snow Labs +name: bert_qa_kflash_finetuned_squad_accelera +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is a English model originally trained by `KFlash`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kflash_finetuned_squad_accelera_en_5.2.0_3.0_1700005786539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kflash_finetuned_squad_accelera_en_5.2.0_3.0_1700005786539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kflash_finetuned_squad_accelera","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kflash_finetuned_squad_accelera","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned_squad_accelera.by_KFlash").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kflash_finetuned_squad_accelera| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KFlash/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_en.md new file mode 100644 index 000000000000..11cb60134479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kflash_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from KFlash) +author: John Snow Labs +name: bert_qa_kflash_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is a English model originally trained by `KFlash`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kflash_finetuned_squad_en_5.2.0_3.0_1700005489077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kflash_finetuned_squad_en_5.2.0_3.0_1700005489077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kflash_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kflash_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned_squad.by_KFlash").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kflash_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KFlash/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kobert_finetuned_klue_v2_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kobert_finetuned_klue_v2_ko.md new file mode 100644 index 000000000000..7c9ccdfaecb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_kobert_finetuned_klue_v2_ko.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Korean BertForQuestionAnswering Cased model (from obokkkk) +author: John Snow Labs +name: bert_qa_kobert_finetuned_klue_v2 +date: 2023-11-14 +tags: [ko, open_source, bert, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kobert-finetuned-klue-v2` is a Korean model originally trained by `obokkkk`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kobert_finetuned_klue_v2_ko_5.2.0_3.0_1700006185949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kobert_finetuned_klue_v2_ko_5.2.0_3.0_1700006185949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kobert_finetuned_klue_v2","ko") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kobert_finetuned_klue_v2","ko") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kobert_finetuned_klue_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|342.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/kobert-finetuned-klue-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_komrc_train_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_komrc_train_ko.md new file mode 100644 index 000000000000..730b9d8c5874 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_komrc_train_ko.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Korean BertForQuestionAnswering Cased model (from Taekyoon) +author: John Snow Labs +name: bert_qa_komrc_train +date: 2023-11-14 +tags: [ko, open_source, bert, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `komrc_train` is a Korean model originally trained by `Taekyoon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_komrc_train_ko_5.2.0_3.0_1699999319111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_komrc_train_ko_5.2.0_3.0_1699999319111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_komrc_train","ko") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_komrc_train","ko") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_komrc_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|406.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Taekyoon/komrc_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_korean_finetuned_klue_v2_ko.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_korean_finetuned_klue_v2_ko.md new file mode 100644 index 000000000000..fd1afd3cac74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_korean_finetuned_klue_v2_ko.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Korean bert_qa_korean_finetuned_klue_v2 BertForQuestionAnswering from Seongmi +author: John Snow Labs +name: bert_qa_korean_finetuned_klue_v2 +date: 2023-11-14 +tags: [bert, ko, open_source, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_korean_finetuned_klue_v2` is a Korean model originally trained by Seongmi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_korean_finetuned_klue_v2_ko_5.2.0_3.0_1700005960456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_korean_finetuned_klue_v2_ko_5.2.0_3.0_1700005960456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_korean_finetuned_klue_v2","ko") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_korean_finetuned_klue_v2", "ko") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_korean_finetuned_klue_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|342.9 MB| + +## References + +https://huggingface.co/Seongmi/kobert-finetuned-klue-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_infovqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_infovqa_en.md new file mode 100644 index 000000000000..c6cd5cd3b584 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_infovqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Uncased model (from tiennvcs) +author: John Snow Labs +name: bert_qa_large_uncased_finetuned_infovqa +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-finetuned-infovqa` is a English model originally trained by `tiennvcs`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_infovqa_en_5.2.0_3.0_1699999822368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_infovqa_en_5.2.0_3.0_1699999822368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_finetuned_infovqa","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_finetuned_infovqa","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.uncased_large_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_finetuned_infovqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/bert-large-uncased-finetuned-infovqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_squadv1_en.md new file mode 100644 index 000000000000..d5c84f2e45e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_squadv1_en.md @@ -0,0 +1,96 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Uncased model (from neuralmagic) +author: John Snow Labs +name: bert_qa_large_uncased_finetuned_squadv1 +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-finetuned-squadv1` is a English model originally trained by `neuralmagic`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_squadv1_en_5.2.0_3.0_1700000343445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_squadv1_en_5.2.0_3.0_1700000343445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_finetuned_squadv1","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_finetuned_squadv1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/neuralmagic/bert-large-uncased-finetuned-squadv1 +- https://arxiv.org/abs/2203.07259 +- https://github.com/neuralmagic/sparseml/tree/main/research/optimal_BERT_surgeon_oBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_vietnamese_infovqa_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_vietnamese_infovqa_en.md new file mode 100644 index 000000000000..f2358b6349e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_large_uncased_finetuned_vietnamese_infovqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_large_uncased_finetuned_vietnamese_infovqa BertForQuestionAnswering from tiennvcs +author: John Snow Labs +name: bert_qa_large_uncased_finetuned_vietnamese_infovqa +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_finetuned_vietnamese_infovqa` is an English model originally trained by tiennvcs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_vietnamese_infovqa_en_5.2.0_3.0_1700000677852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_finetuned_vietnamese_infovqa_en_5.2.0_3.0_1700000677852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_finetuned_vietnamese_infovqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_finetuned_vietnamese_infovqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_finetuned_vietnamese_infovqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tiennvcs/bert-large-uncased-finetuned-vi-infovqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_lewtun_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_lewtun_finetuned_squad_en.md new file mode 100644 index 000000000000..c5d2acf4e4f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_lewtun_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from lewtun) +author: John Snow Labs +name: bert_qa_lewtun_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `lewtun`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_lewtun_finetuned_squad_en_5.2.0_3.0_1700000994244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_lewtun_finetuned_squad_en_5.2.0_3.0_1700000994244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_lewtun_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_lewtun_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_lewtun").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_lewtun_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/lewtun/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_linkbert_large_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_linkbert_large_finetuned_squad_en.md new file mode 100644 index 000000000000..81bd727cfd82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_linkbert_large_finetuned_squad_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from niklaspm) +author: John Snow Labs +name: bert_qa_linkbert_large_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `linkbert-large-finetuned-squad` is an English model originally trained by `niklaspm`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_linkbert_large_finetuned_squad_en_5.2.0_3.0_1700001527794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_linkbert_large_finetuned_squad_en_5.2.0_3.0_1700001527794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_linkbert_large_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_linkbert_large_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.link_bert.large").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_linkbert_large_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/niklaspm/linkbert-large-finetuned-squad +- https://arxiv.org/abs/2203.15827 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_m_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_m_xx.md new file mode 100644 index 000000000000..0e9a4c54dbc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_m_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Cased model (from sepiosky) +author: John Snow Labs +name: bert_qa_m +date: 2023-11-14 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MBERT_QA` is a Multilingual model originally trained by `sepiosky`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_m_xx_5.2.0_3.0_1700001968982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_m_xx_5.2.0_3.0_1700001968982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_m","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_m","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sepiosky/MBERT_QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_macsquad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_macsquad_en.md new file mode 100644 index 000000000000..02e277f0a5fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_macsquad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Nadav) +author: John Snow Labs +name: bert_qa_macsquad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MacSQuAD` is an English model originally trained by `Nadav`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_macsquad_en_5.2.0_3.0_1700002467032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_macsquad_en_5.2.0_3.0_1700002467032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_macsquad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_macsquad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.by_nadav").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_macsquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|406.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Nadav/MacSQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_all_tahitian_sqen_sq20_1_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_all_tahitian_sqen_sq20_1_en.md new file mode 100644 index 000000000000..b67f0f2ffd89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_all_tahitian_sqen_sq20_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_mbert_all_tahitian_sqen_sq20_1 BertForQuestionAnswering from krinal214 +author: John Snow Labs +name: bert_qa_mbert_all_tahitian_sqen_sq20_1 +date: 2023-11-14 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_all_tahitian_sqen_sq20_1` is an English model originally trained by krinal214.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_all_tahitian_sqen_sq20_1_en_5.2.0_3.0_1700002195661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_all_tahitian_sqen_sq20_1_en_5.2.0_3.0_1700002195661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_all_tahitian_sqen_sq20_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_all_tahitian_sqen_sq20_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_all_tahitian_sqen_sq20_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/krinal214/mBERT_all_ty_SQen_SQ20_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_english_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_english_hindi_dev_xx.md new file mode 100644 index 000000000000..1d3cfc2b5f07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_english_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_english_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_english_hindi_dev +date: 2023-11-14 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_english_hindi_dev` is a Multilingual model originally trained by roshnir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_english_hindi_dev_xx_5.2.0_3.0_1700002695260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_english_hindi_dev_xx_5.2.0_3.0_1700002695260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_english_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_english_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_english_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-en-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev_xx.md new file mode 100644 index 000000000000..f56371308442 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev +date: 2023-11-14 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev` is a Multilingual model originally trained by roshnir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev_xx_5.2.0_3.0_1700002933231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev_xx_5.2.0_3.0_1700002933231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_vietnamese_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-vi-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mkkc58_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mkkc58_finetuned_squad_en.md new file mode 100644 index 000000000000..370413c0dba0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mkkc58_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from mkkc58) +author: John Snow Labs +name: bert_qa_mkkc58_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `mkkc58`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mkkc58_finetuned_squad_en_5.2.0_3.0_1700003196849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mkkc58_finetuned_squad_en_5.2.0_3.0_1700003196849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mkkc58_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mkkc58_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_mkkc58").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mkkc58_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mkkc58/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_modelontquad_tr.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_modelontquad_tr.md new file mode 100644 index 000000000000..56925ddeed7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_modelontquad_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Cased model (from Aybars) +author: John Snow Labs +name: bert_qa_modelontquad +date: 2023-11-14 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ModelOnTquad` is a Turkish model originally trained by `Aybars`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_modelontquad_tr_5.2.0_3.0_1700003570293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_modelontquad_tr_5.2.0_3.0_1700003570293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_modelontquad","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_modelontquad","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_modelontquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Aybars/ModelOnTquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_monakth_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_monakth_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..55dfd71086e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_monakth_base_cased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from monakth) +author: John Snow Labs +name: bert_qa_monakth_base_cased_finetuned_squad +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `monakth`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_cased_finetuned_squad_en_5.2.0_3.0_1700003875248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_cased_finetuned_squad_en_5.2.0_3.0_1700003875248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_cased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_cased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_monakth_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monakth/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mqa_baseline_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mqa_baseline_en.md new file mode 100644 index 000000000000..0f205df21f23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_mqa_baseline_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from xraychen) +author: John Snow Labs +name: bert_qa_mqa_baseline +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mqa-baseline` is an English model originally trained by `xraychen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_baseline_en_5.2.0_3.0_1700004141822.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_baseline_en_5.2.0_3.0_1700004141822.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mqa_baseline","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mqa_baseline","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base.by_xraychen").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mqa_baseline| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/xraychen/mqa-baseline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_multilingual_bert_base_cased_english_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_multilingual_bert_base_cased_english_en.md new file mode 100644 index 000000000000..f688bf0eb120 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_multilingual_bert_base_cased_english_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_english +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-english` is an English model originally trained by `bhavikardeshna`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_english_en_5.2.0_3.0_1700004618287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_english_en_5.2.0_3.0_1700004618287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_english","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_english","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.multilingual_english_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_muril_large_cased_hita_qa_hi.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_muril_large_cased_hita_qa_hi.md new file mode 100644 index 000000000000..36e468e02be9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_muril_large_cased_hita_qa_hi.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Hindi BertForQuestionAnswering model (from Yuchen) +author: John Snow Labs +name: bert_qa_muril_large_cased_hita_qa +date: 2023-11-14 +tags: [open_source, question_answering, bert, hi, onnx] +task: Question Answering +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muril-large-cased-hita-qa` is a Hindi model originally trained by `Yuchen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_muril_large_cased_hita_qa_hi_5.2.0_3.0_1700005221724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_muril_large_cased_hita_qa_hi_5.2.0_3.0_1700005221724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_muril_large_cased_hita_qa","hi") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_muril_large_cased_hita_qa","hi") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("hi.answer_question.bert.large_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_muril_large_cased_hita_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hi| +|Size:|1.9 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Yuchen/muril-large-cased-hita-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_nausheen_finetuned_squad_accelera_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_nausheen_finetuned_squad_accelera_en.md new file mode 100644 index 000000000000..5c6defa86788 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_nausheen_finetuned_squad_accelera_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Nausheen) +author: John Snow Labs +name: bert_qa_nausheen_finetuned_squad_accelera +date: 2023-11-14 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `Nausheen`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_nausheen_finetuned_squad_accelera_en_5.2.0_3.0_1700005481407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_nausheen_finetuned_squad_accelera_en_5.2.0_3.0_1700005481407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_nausheen_finetuned_squad_accelera","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_nausheen_finetuned_squad_accelera","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_Nausheen").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_nausheen_finetuned_squad_accelera| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Nausheen/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_neuralmind_base_portuguese_squad_pt.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_neuralmind_base_portuguese_squad_pt.md new file mode 100644 index 000000000000..0a4a9bd179db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_neuralmind_base_portuguese_squad_pt.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Portuguese BertForQuestionAnswering Base Cased model (from p2o) +author: John Snow Labs +name: bert_qa_neuralmind_base_portuguese_squad +date: 2023-11-14 +tags: [pt, open_source, bert, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `neuralmind-bert-base-portuguese-squad` is a Portuguese model originally trained by `p2o`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_neuralmind_base_portuguese_squad_pt_5.2.0_3.0_1700005781942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_neuralmind_base_portuguese_squad_pt_5.2.0_3.0_1700005781942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_neuralmind_base_portuguese_squad","pt")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_neuralmind_base_portuguese_squad","pt") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_neuralmind_base_portuguese_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/p2o/neuralmind-bert-base-portuguese-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ofirzaf_bert_large_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ofirzaf_bert_large_uncased_squad_en.md new file mode 100644 index 000000000000..7b87797cf00e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-14-bert_qa_ofirzaf_bert_large_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ofirzaf) +author: John Snow Labs +name: bert_qa_ofirzaf_bert_large_uncased_squad +date: 2023-11-14 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squad` is an English model originally trained by `ofirzaf`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ofirzaf_bert_large_uncased_squad_en_5.2.0_3.0_1700006266704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ofirzaf_bert_large_uncased_squad_en_5.2.0_3.0_1700006266704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ofirzaf_bert_large_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_ofirzaf_bert_large_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased.by_ofirzaf").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ofirzaf_bert_large_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ofirzaf/bert-large-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_viquad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_viquad_en.md new file mode 100644 index 000000000000..7e94fb527211 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_viquad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from Khanh) +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned_viquad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-viquad` is an English model originally trained by `Khanh`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_viquad_en_5.2.0_3.0_1700058664073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_viquad_en_5.2.0_3.0_1700058664073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_viquad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned_viquad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.cased_multilingual_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned_viquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Khanh/bert-base-multilingual-cased-finetuned-viquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_xx.md new file mode 100644 index 000000000000..2c2b8e9f89c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_cased_finetuned_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Cased model (from obokkkk) +author: John Snow Labs +name: bert_qa_base_multilingual_cased_finetuned +date: 2023-11-15 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned` is a Multilingual model originally trained by `obokkkk`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_xx_5.2.0_3.0_1700058641683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_cased_finetuned_xx_5.2.0_3.0_1700058641683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_cased_finetuned","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_cased_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/bert-base-multilingual-cased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md new file mode 100644 index 000000000000..bbbb2d08769c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_multilingual_uncased_finetuned_squadv2_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Uncased model (from khoanvm) +author: John Snow Labs +name: bert_qa_base_multilingual_uncased_finetuned_squadv2 +date: 2023-11-15 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-uncased-finetuned-squadv2` is a Multilingual model originally trained by `khoanvm`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1700059023790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1700059023790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_multilingual_uncased_finetuned_squadv2","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_multilingual_uncased_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/khoanvm/bert-base-multilingual-uncased-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_nnish_cased_squad1_fi.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_nnish_cased_squad1_fi.md new file mode 100644 index 000000000000..1be2a525bc4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_nnish_cased_squad1_fi.md @@ -0,0 +1,96 @@ +--- +layout: model +title: Finnish BertForQuestionAnswering Base Cased model (from ilmariky) +author: John Snow Labs +name: bert_qa_base_nnish_cased_squad1 +date: 2023-11-15 +tags: [fi, open_source, bert, question_answering, onnx] +task: Question Answering +language: fi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-finnish-cased-squad1-fi` is a Finnish model originally trained by `ilmariky`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad1_fi_5.2.0_3.0_1700058973995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_nnish_cased_squad1_fi_5.2.0_3.0_1700058973995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad1","fi")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_nnish_cased_squad1","fi") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_nnish_cased_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fi| +|Size:|464.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ilmariky/bert-base-finnish-cased-squad1-fi +- https://github.com/google-research-datasets/tydiqa +- https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_parsquad_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_parsquad_fa.md new file mode 100644 index 000000000000..b31cf443f85f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_parsquad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_parsquad +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_parsquad` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_parsquad_fa_5.2.0_3.0_1700059423764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_parsquad_fa_5.2.0_3.0_1700059423764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_parsquad","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_parsquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_parsquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_parsquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_1epoch_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_1epoch_fa.md new file mode 100644 index 000000000000..104033122180 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_1epoch_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad_1epoch +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad_1epoch` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_1epoch_fa_5.2.0_3.0_1700059740761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_1epoch_fa_5.2.0_3.0_1700059740761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_1epoch","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_1epoch","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad_1epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad_1epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_fa.md new file mode 100644 index 000000000000..08080f3c56b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_fa_5.2.0_3.0_1700058636687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_fa_5.2.0_3.0_1700058636687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md new file mode 100644 index 000000000000..99a2e56652a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_pars_uncased_pquad_lr1e_5_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mohsenfayyaz) +author: John Snow Labs +name: bert_qa_base_pars_uncased_pquad_lr1e_5 +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased_pquad_lr1e-5` is a Persian model originally trained by `mohsenfayyaz`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_lr1e_5_fa_5.2.0_3.0_1700060064612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_pars_uncased_pquad_lr1e_5_fa_5.2.0_3.0_1700060064612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_lr1e_5","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_base_pars_uncased_pquad_lr1e_5","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_pars_uncased_pquad_lr1e_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/bert-base-parsbert-uncased_pquad_lr1e-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md new file mode 100644 index 000000000000..c50eca460165 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_parsbert_uncased_finetuned_squad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Base Uncased model (from mhmsadegh) +author: John Snow Labs +name: bert_qa_base_parsbert_uncased_finetuned_squad +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-parsbert-uncased-finetuned-squad` is a Persian model originally trained by `mhmsadegh`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_squad_fa_5.2.0_3.0_1700058633009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_parsbert_uncased_finetuned_squad_fa_5.2.0_3.0_1700058633009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_squad","fa") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_parsbert_uncased_finetuned_squad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_parsbert_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mhmsadegh/bert-base-parsbert-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md new file mode 100644 index 000000000000..c1cd39c000ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from husnu) +author: John Snow Labs +name: bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1 +date: 2023-11-15 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-1` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr_5.2.0_3.0_1700060421173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1_tr_5.2.0_3.0_1700060421173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md new file mode 100644 index 000000000000..6782fd8356d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from husnu) +author: John Snow Labs +name: bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3 +date: 2023-11-15 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-3` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1700059056675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1700059056675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_turkish_128k_cased_tquad2_finetuned_lr_2e_05_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-128k-cased-finetuned_lr-2e-05_epochs-3TQUAD2-finetuned_lr-2e-05_epochs-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..1c4e851167f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-128-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en_5.2.0_3.0_1700060723604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2_en_5.2.0_3.0_1700060723604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
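The NLU example passes the question and its context to `predict` as a single string, with the two parts separated by `|||`. As a minimal sketch of how such a string could be assembled, the helper below joins the two parts and rejects inputs that already contain the separator. `build_qa_input` and `SEPARATOR` are hypothetical names introduced here for illustration; they are not part of the NLU or Spark NLP APIs.

```python
# Hypothetical helper (not part of NLU): builds the single-string input
# shown in the NLU example, i.e. "<question>|||<context>".
SEPARATOR = "|||"


def build_qa_input(question: str, context: str) -> str:
    """Join a question and its context with the '|||' separator."""
    question, context = question.strip(), context.strip()
    # A separator inside either part would make the split ambiguous.
    if SEPARATOR in question or SEPARATOR in context:
        raise ValueError("question and context must not contain '|||'")
    return f"{question}{SEPARATOR}{context}"


qa_input = build_qa_input(
    "What is my name?",
    "My name is Clara and I live in Berkeley.",
)
print(qa_input)
```

The resulting string could then be passed to `predict` in place of the hand-written literal.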
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_128_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-128-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..8b0e30d7014b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-16-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1700058534351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1700058534351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_2_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_16_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-16-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..06bd72f330f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-32-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1700058852386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1700058852386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_8_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_32_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-32-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..212e326ccaed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-few-shot-k-64-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1700058924705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1700058924705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_seed_4_base_64d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_few_shot_k_64_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-base-uncased-few-shot-k-64-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md new file mode 100644 index 000000000000..94e76549a838 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from alistvt) +author: John Snow Labs +name: bert_qa_base_uncased_pretrain_finetuned_coqa_falttened +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-pretrain-finetuned-coqa-falttened` is an English model originally trained by `alistvt`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en_5.2.0_3.0_1700059206672.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_base_uncased_pretrain_finetuned_coqa_falttened_en_5.2.0_3.0_1700059206672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_falttened","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_base_uncased_pretrain_finetuned_coqa_falttened","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.uncased_base_finetuned.by_alistvt").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_base_uncased_pretrain_finetuned_coqa_falttened| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/alistvt/bert-base-uncased-pretrain-finetuned-coqa-falttened \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..17a76e335616 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bdickson_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from bdickson) +author: John Snow Labs +name: bert_qa_bdickson_bert_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `bdickson`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bdickson_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700061010787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bdickson_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700061010787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bdickson_bert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bdickson_bert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_bdickson").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bdickson_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bdickson/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_en.md new file mode 100644 index 000000000000..9e4f59fdbb84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_en_5.2.0_3.0_1700059293524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_en_5.2.0_3.0_1700059293524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.tydiqa.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_all_translated_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_all_translated_en.md new file mode 100644 index 000000000000..1da871f80432 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_all_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_all_translated +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_all_translated` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_all_translated_en_5.2.0_3.0_1700059641983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_all_translated_en_5.2.0_3.0_1700059641983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_all_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_all_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_translated.bert.by_krinal214").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_all_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_all_translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_ben_tel_context_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_ben_tel_context_en.md new file mode 100644 index 000000000000..c9a7bfeac9aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_ben_tel_context_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_ben_tel_context +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_ben_tel_context` is an English model originally trained by `krinal214`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_ben_tel_context_en_5.2.0_3.0_1700059176175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_ben_tel_context_en_5.2.0_3.0_1700059176175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_ben_tel_context","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_ben_tel_context","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_ben_tel.bert.by_krinal214").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_ben_tel_context| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_ben_tel_context \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_que_translated_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_que_translated_en.md new file mode 100644 index 000000000000..379dda9a1705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_squad_que_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_squad_que_translated +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-squad_que_translated` is an English model originally trained by `krinal214`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_que_translated_en_5.2.0_3.0_1700061367686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_squad_que_translated_en_5.2.0_3.0_1700061367686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_squad_que_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_squad_que_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad_translated.bert.que.by_krinal214").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_squad_que_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-squad_que_translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_translated_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_translated_en.md new file mode 100644 index 000000000000..4e140ac10533 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_all_translated_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_bert_all_translated +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-all-translated` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_translated_en_5.2.0_3.0_1700059452244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_all_translated_en_5.2.0_3.0_1700059452244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_all_translated","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_all_translated","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_krinal214").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_all_translated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/bert-all-translated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md new file mode 100644 index 000000000000..1f7fef1dae86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_2048_full_trivia_copied_embeddings_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MrAnderson) +author: John Snow Labs +name: bert_qa_bert_base_2048_full_trivia_copied_embeddings +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-2048-full-trivia-copied-embeddings` is an English model originally trained by `MrAnderson`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_2048_full_trivia_copied_embeddings_en_5.2.0_3.0_1700061631557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_2048_full_trivia_copied_embeddings_en_5.2.0_3.0_1700061631557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_2048_full_trivia_copied_embeddings","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_2048_full_trivia_copied_embeddings","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.bert.base_2048.by_MrAnderson").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_2048_full_trivia_copied_embeddings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|411.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MrAnderson/bert-base-2048-full-trivia-copied-embeddings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_cased_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_cased_chaii_en.md new file mode 100644 index 000000000000..b785051d9818 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_cased_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_base_cased_chaii +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_chaii_en_5.2.0_3.0_1700059727653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_cased_chaii_en_5.2.0_3.0_1700059727653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_cased_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_cased_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_cased_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-base-cased-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_faquad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_faquad_en.md new file mode 100644 index 000000000000..b2309b202800 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_faquad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ricardo-filho) +author: John Snow Labs +name: bert_qa_bert_base_faquad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_faquad` is an English model originally trained by `ricardo-filho`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_faquad_en_5.2.0_3.0_1700059918542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_faquad_en_5.2.0_3.0_1700059918542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_faquad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_faquad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_faquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|405.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_base_faquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md new file mode 100644 index 000000000000..b94f2ab8cd78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetune_qa_th.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Thai BertForQuestionAnswering model (from airesearch) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetune_qa +date: 2023-11-15 +tags: [th, open_source, question_answering, bert, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetune-qa` is a Thai model originally trained by `airesearch`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetune_qa_th_5.2.0_3.0_1700059563752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetune_qa_th_5.2.0_3.0_1700059563752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetune_qa","th") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetune_qa","th") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("th.answer_question.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetune_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/airesearch/bert-base-multilingual-cased-finetune-qa +- https://github.com/vistec-AI/thai2transformers/blob/dev/scripts/downstream/train_question_answering_lm_finetuning.py +- https://wandb.ai/cstorm125/wangchanberta-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md new file mode 100644 index 000000000000..e2314f5e6a17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Tamil BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetuned_chaii +date: 2023-11-15 +tags: [open_source, question_answering, bert, ta, onnx] +task: Question Answering +language: ta +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-chaii` is a Tamil model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta_5.2.0_3.0_1700059545429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_chaii_ta_5.2.0_3.0_1700059545429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_chaii","ta") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_chaii","ta") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ta.answer_question.chaii.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ta| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-base-multilingual-cased-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md new file mode 100644 index 000000000000..fe5d082c26ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_finetuned_klue_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from obokkkk) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_finetuned_klue +date: 2023-11-15 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-klue` is a Korean model originally trained by `obokkkk`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_klue_ko_5.2.0_3.0_1700060100566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_finetuned_klue_ko_5.2.0_3.0_1700060100566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_klue","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_finetuned_klue","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_finetuned_klue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/bert-base-multilingual-cased-finetuned-klue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_ko.md new file mode 100644 index 000000000000..fae9a84c7b30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from sangrimlee) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_korquad +date: 2023-11-15 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-korquad` is a Korean model originally trained by `sangrimlee`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_ko_5.2.0_3.0_1700060305908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_ko_5.2.0_3.0_1700060305908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_korquad","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_korquad","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.bert.multilingual_base_cased.by_sangrimlee").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sangrimlee/bert-base-multilingual-cased-korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md new file mode 100644 index 000000000000..93bd3a5f8018 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_cased_korquad_v1_ko.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from eliza-dukim) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_cased_korquad_v1 +date: 2023-11-15 +tags: [open_source, question_answering, bert, ko, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased_korquad-v1` is a Korean model originally trained by `eliza-dukim`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_v1_ko_5.2.0_3.0_1700060432159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_cased_korquad_v1_ko_5.2.0_3.0_1700060432159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_cased_korquad_v1","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_cased_korquad_v1","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_cased_korquad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|ko| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/eliza-dukim/bert-base-multilingual-cased_korquad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_xquad_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_xquad_xx.md new file mode 100644 index 000000000000..8fb375399347 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_multilingual_xquad_xx.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from alon-albalak) +author: John Snow Labs +name: bert_qa_bert_base_multilingual_xquad +date: 2023-11-15 +tags: [open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-xquad` is a Multilingual model originally trained by `alon-albalak`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_xquad_xx_5.2.0_3.0_1700060659109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_multilingual_xquad_xx_5.2.0_3.0_1700060659109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_multilingual_xquad","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_multilingual_xquad","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.xquad.bert.multilingual_base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_multilingual_xquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/alon-albalak/bert-base-multilingual-xquad +- https://github.com/deepmind/xquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md new file mode 100644 index 000000000000..319f983b4e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa +date: 2023-11-15 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-finetuned-qa-mlqa` is a Castilian, Spanish model originally trained by `CenIA`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es_5.2.0_3.0_1700060933708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa_es_5.2.0_3.0_1700060933708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.mlqa.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-cased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md new file mode 100644 index 000000000000..5dcea721431a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from CenIA) +author: John Snow Labs +name: bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac +date: 2023-11-15 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-spanish-wwm-cased-finetuned-qa-sqac` is a Castilian, Spanish model originally trained by `CenIA`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es_5.2.0_3.0_1700059818270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac_es_5.2.0_3.0_1700059818270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_spanish_wwm_cased_finetuned_qa_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/bert-base-spanish-wwm-cased-finetuned-qa-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md new file mode 100644 index 000000000000..3622d66291aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering model (from husnu) +author: John Snow Labs +name: bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3 +date: 2023-11-15 +tags: [open_source, question_answering, bert, tr, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-turkish-cased-finetuned_lr-2e-05_epochs-3` is a Turkish model originally trained by `husnu`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1700060105366.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3_tr_5.2.0_3.0_1700060105366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.bert.base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_turkish_cased_finetuned_lr_2e_05_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/bert-base-turkish-cased-finetuned_lr-2e-05_epochs-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_coqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_coqa_en.md new file mode 100644 index 000000000000..3ef3e5f157d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_coqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peggyhuang) +author: John Snow Labs +name: bert_qa_bert_base_uncased_coqa +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-coqa` is an English model originally trained by `peggyhuang`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_coqa_en_5.2.0_3.0_1700061211300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_coqa_en_5.2.0_3.0_1700061211300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_coqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_coqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_coqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/bert-base-uncased-coqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md new file mode 100644 index 000000000000..7ed52bfb2b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: bert_qa_bert_base_uncased_squad2_covid_qa_deepset +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1700059817938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1700059817938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_base_uncased_squad2_covid_qa_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_base_uncased_squad2_covid_qa_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_base_uncased_squad2_covid_qa_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/bert-base-uncased-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_jackh1995_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_jackh1995_en.md new file mode 100644 index 000000000000..2972ffc17097 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_jackh1995_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from jackh1995) +author: John Snow Labs +name: bert_qa_bert_finetuned_jackh1995 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned` is an English model originally trained by `jackh1995`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_jackh1995_en_5.2.0_3.0_1700060719811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_jackh1995_en_5.2.0_3.0_1700060719811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_jackh1995","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_finetuned_jackh1995","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_jackh1995").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_finetuned_jackh1995| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jackh1995/bert-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md new file mode 100644 index 000000000000..be38585f77da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_finetuned_lr2_e5_b16_ep2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from motiondew) +author: John Snow Labs +name: bert_qa_bert_finetuned_lr2_e5_b16_ep2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-lr2-e5-b16-ep2` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_lr2_e5_b16_ep2_en_5.2.0_3.0_1700061010799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_finetuned_lr2_e5_b16_ep2_en_5.2.0_3.0_1700061010799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_finetuned_lr2_e5_b16_ep2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_finetuned_lr2_e5_b16_ep2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_motiondew").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_finetuned_lr2_e5_b16_ep2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-finetuned-lr2-e5-b16-ep2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl256_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl256_en.md new file mode 100644 index 000000000000..af68cdbbb6a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl256_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_bert_l_squadv1.1_sl256 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-l-squadv1.1-sl256` is an English model originally trained by `vuiseng9`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl256_en_5.2.0_3.0_1700061732088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl256_en_5.2.0_3.0_1700061732088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_l_squadv1.1_sl256","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_l_squadv1.1_sl256","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.sl256.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_l_squadv1.1_sl256| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-l-squadv1.1-sl256 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl384_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl384_en.md new file mode 100644 index 000000000000..54c679cb71dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_l_squadv1.1_sl384_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_bert_l_squadv1.1_sl384 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-l-squadv1.1-sl384` is an English model originally trained by `vuiseng9`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl384_en_5.2.0_3.0_1700062421907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_l_squadv1.1_sl384_en_5.2.0_3.0_1700062421907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_l_squadv1.1_sl384","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_l_squadv1.1_sl384","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.sl384.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_l_squadv1.1_sl384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-l-squadv1.1-sl384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_faquad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_faquad_en.md new file mode 100644 index 000000000000..16eae79e99df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_faquad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ricardo-filho) +author: John Snow Labs +name: bert_qa_bert_large_faquad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_faquad` is an English model originally trained by `ricardo-filho`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_faquad_en_5.2.0_3.0_1700062980803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_faquad_en_5.2.0_3.0_1700062980803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_faquad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_faquad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.large.by_ricardo-filho").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_faquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ricardo-filho/bert_large_faquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_finetuned_docvqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_finetuned_docvqa_en.md new file mode 100644 index 000000000000..c7a148568b92 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_finetuned_docvqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from tiennvcs) +author: John Snow Labs +name: bert_qa_bert_large_uncased_finetuned_docvqa +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-finetuned-docvqa` is an English model originally trained by `tiennvcs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_finetuned_docvqa_en_5.2.0_3.0_1700062342574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_finetuned_docvqa_en_5.2.0_3.0_1700062342574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_finetuned_docvqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_finetuned_docvqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.large_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_finetuned_docvqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/bert-large-uncased-finetuned-docvqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md new file mode 100644 index 000000000000..5d66bcd0b1d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squad2_covid_qa_deepset +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1700060388720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1700060388720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squad2_covid_qa_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squad2_covid_qa_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.bert.large_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squad2_covid_qa_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/bert-large-uncased-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md new file mode 100644 index 000000000000..c73276c9e15a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Intel) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squadv1.1-sparse-80-1x4-block-pruneofa` is an English model originally trained by `Intel`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1700060896462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1700060896462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased_sparse_80_1x4_block_pruneofa.by_Intel").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squadv1.1_sparse_80_1x4_block_pruneofa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|436.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Intel/bert-large-uncased-squadv1.1-sparse-80-1x4-block-pruneofa +- https://arxiv.org/abs/2111.05754 +- https://github.com/IntelLabs/Model-Compression-Research-Package/tree/main/research/prune-once-for-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv2_en.md new file mode 100644 index 000000000000..1de5303da08f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_squadv2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from madlag) +author: John Snow Labs +name: bert_qa_bert_large_uncased_squadv2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squadv2` is an English model originally trained by `madlag`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv2_en_5.2.0_3.0_1700061553399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_squadv2_en_5.2.0_3.0_1700061553399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased_v2.by_madlag").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/madlag/bert-large-uncased-squadv2 +- https://arxiv.org/pdf/1810.04805v2.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md new file mode 100644 index 000000000000..63dde9a6bf33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_chaii +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_chaii_en_5.2.0_3.0_1700062243886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_chaii_en_5.2.0_3.0_1700062243886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.large_uncased_uncased_whole_word_masking.by_SauravMaheshkar").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-large-uncased-whole-word-masking-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md new file mode 100644 index 000000000000..5b6dc827f5da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en_5.2.0_3.0_1700062797772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii_en_5.2.0_3.0_1700062797772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.large_uncased_uncased_whole_word_masking_finetuned.by_SauravMaheshkar").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-large-uncased-whole-word-masking-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md new file mode 100644 index 000000000000..40bbc9e1f22a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from haddadalwi) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-islamic-squad` is an English model originally trained by `haddadalwi`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en_5.2.0_3.0_1700062882246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad_en_5.2.0_3.0_1700062882246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased.by_haddadalwi").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_squad_finetuned_islamic_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/haddadalwi/bert-large-uncased-whole-word-masking-finetuned-squad-finetuned-islamic-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md new file mode 100644 index 000000000000..13c4d7d0defe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from madlag) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-finetuned-squadv2` is an English model originally trained by `madlag`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en_5.2.0_3.0_1700063498100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2_en_5.2.0_3.0_1700063498100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased_whole_word_masking_v2.by_madlag").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md new file mode 100644 index 000000000000..726df0b42269 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_large_uncased_whole_word_masking_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_bert_large_uncased_whole_word_masking_squad2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-whole-word-masking-squad2` is an English model originally trained by `deepset`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1700060629483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1700060629483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_large_uncased_whole_word_masking_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_large_uncased_whole_word_masking_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased.by_deepset").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_large_uncased_whole_word_masking_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/bert-large-uncased-whole-word-masking-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md new file mode 100644 index 000000000000..1ebde69b795a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp +date: 2023-11-15 +tags: [te, en, open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finedtuned-xquad-tydiqa-goldp` is a Multilingual model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx_5.2.0_3.0_1700063247297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp_xx_5.2.0_3.0_1700063247297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.xquad_tydiqa.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finedtuned_xquad_tydiqa_goldp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_chaii_en.md new file mode 100644 index 000000000000..c7c5e50fe633 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_chaii_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SauravMaheshkar) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finetuned_chaii +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finetuned-chaii` is an English model originally trained by `SauravMaheshkar`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_chaii_en_5.2.0_3.0_1700063164747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_chaii_en_5.2.0_3.0_1700063164747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finetuned_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finetuned_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finetuned_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/bert-multi-cased-finetuned-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md new file mode 100644 index 000000000000..0c281bcb536c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from TingChenChang) +author: John Snow Labs +name: bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-cased-finetuned-xquadv1-finetuned-squad-colab` is an English model originally trained by `TingChenChang`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en_5.2.0_3.0_1700063606759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab_en_5.2.0_3.0_1700063606759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.xquad_squad.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_cased_finetuned_xquadv1_finetuned_squad_colab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/TingChenChang/bert-multi-cased-finetuned-xquadv1-finetuned-squad-colab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_english_german_squad2_de.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_english_german_squad2_de.md new file mode 100644 index 000000000000..2233512e1902 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_multi_english_german_squad2_de.md @@ -0,0 +1,110 @@ +--- +layout: model +title: German BertForQuestionAnswering model (from deutsche-telekom) +author: John Snow Labs +name: bert_qa_bert_multi_english_german_squad2 +date: 2023-11-15 +tags: [de, open_source, question_answering, bert, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-english-german-squad2` is a German model originally trained by `deutsche-telekom`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_english_german_squad2_de_5.2.0_3.0_1700060982961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_multi_english_german_squad2_de_5.2.0_3.0_1700060982961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_multi_english_german_squad2","de") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bert_multi_english_german_squad2","de") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.answer_question.squadv2.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_multi_english_german_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|de| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deutsche-telekom/bert-multi-english-german-squad2 +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://github.com/google-research/bert/blob/master/multilingual.md \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..8bb795364055 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700063358643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700063358643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_10_h_512_a_8_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|177.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-10_H-512_A-8_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..9dace6c0e2dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2 BertForQuestionAnswering from aodiniz +author: John Snow Labs +name: bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2` is an English model originally trained by aodiniz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700063519961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2_en_5.2.0_3.0_1700063519961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bert_uncased_l_4_h_512_a_8_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|106.9 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/aodiniz/bert_uncased_L-4_H-512_A-8_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertfast_01_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertfast_01_en.md new file mode 100644 index 000000000000..5c84066b3a08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertfast_01_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_bertfast_01 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertFast_01` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_01_en_5.2.0_3.0_1700061584041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertfast_01_en_5.2.0_3.0_1700061584041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_01","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_bertfast_01","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertfast_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/bertFast_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertimbau_squad1.1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertimbau_squad1.1_en.md new file mode 100644 index 000000000000..bb343a2af856 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertimbau_squad1.1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from hendrixcosta) +author: John Snow Labs +name: bert_qa_bertimbau_squad1.1 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertimbau-squad1.1` is an English model originally trained by `hendrixcosta`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertimbau_squad1.1_en_5.2.0_3.0_1700062328092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertimbau_squad1.1_en_5.2.0_3.0_1700062328092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertimbau_squad1.1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bertimbau_squad1.1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_hendrixcosta").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertimbau_squad1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hendrixcosta/bertimbau-squad1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertserini_bert_large_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertserini_bert_large_squad_en.md new file mode 100644 index 000000000000..49a6e71f334a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_bertserini_bert_large_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from rsvp-ai) +author: John Snow Labs +name: bert_qa_bertserini_bert_large_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertserini-bert-large-squad` is an English model originally trained by `rsvp-ai`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_large_squad_en_5.2.0_3.0_1700064063590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_bertserini_bert_large_squad_en_5.2.0_3.0_1700064063590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_bertserini_bert_large_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_bertserini_bert_large_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large.by_rsvp-ai").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_bertserini_bert_large_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rsvp-ai/bertserini-bert-large-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_beto_base_spanish_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_beto_base_spanish_sqac_es.md new file mode 100644 index 000000000000..7298a72104e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_beto_base_spanish_sqac_es.md @@ -0,0 +1,112 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering model (from IIC) +author: John Snow Labs +name: bert_qa_beto_base_spanish_sqac +date: 2023-11-15 +tags: [es, open_source, question_answering, bert, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `beto-base-spanish-sqac` is a Spanish model originally trained by `IIC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_sqac_es_5.2.0_3.0_1700062655500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_beto_base_spanish_sqac_es_5.2.0_3.0_1700062655500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_beto_base_spanish_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_beto_base_spanish_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_beto_base_spanish_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/IIC/beto-base-spanish-sqac +- https://paperswithcode.com/sota?task=question-answering&dataset=PlanTL-GOB-ES%2FSQAC +- https://arxiv.org/abs/2107.07253 +- https://github.com/dccuchile/beto +- https://www.bsc.es/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md new file mode 100644 index 000000000000..b3b73c5151a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biobertpt_squad_v1.1_portuguese_pt.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Portuguese bert_qa_biobertpt_squad_v1.1_portuguese BertForQuestionAnswering from pucpr +author: John Snow Labs +name: bert_qa_biobertpt_squad_v1.1_portuguese +date: 2023-11-15 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_biobertpt_squad_v1.1_portuguese` is a Portuguese model originally trained by pucpr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biobertpt_squad_v1.1_portuguese_pt_5.2.0_3.0_1700061364212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biobertpt_squad_v1.1_portuguese_pt_5.2.0_3.0_1700061364212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biobertpt_squad_v1.1_portuguese","pt") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_biobertpt_squad_v1.1_portuguese", "pt") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biobertpt_squad_v1.1_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|664.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/pucpr/bioBERTpt-squad-v1.1-portuguese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biomedical_slot_filling_reader_base_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biomedical_slot_filling_reader_base_en.md new file mode 100644 index 000000000000..4af56e67551e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_biomedical_slot_filling_reader_base_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from healx) +author: John Snow Labs +name: bert_qa_biomedical_slot_filling_reader_base +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical-slot-filling-reader-base` is an English model originally trained by `healx`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_base_en_5.2.0_3.0_1700061626085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_biomedical_slot_filling_reader_base_en_5.2.0_3.0_1700061626085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_biomedical_slot_filling_reader_base","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_biomedical_slot_filling_reader_base","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bio_medical.bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_biomedical_slot_filling_reader_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/healx/biomedical-slot-filling-reader-base +- https://arxiv.org/abs/2109.08564 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_burmese_model_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_burmese_model_en.md new file mode 100644 index 000000000000..f745156e9025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_burmese_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_burmese_model BertForQuestionAnswering from Shredder +author: John Snow Labs +name: bert_qa_burmese_model +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_burmese_model` is an English model originally trained by Shredder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_burmese_model_en_5.2.0_3.0_1700008721426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_burmese_model_en_5.2.0_3.0_1700008721426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_burmese_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_burmese_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_burmese_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Shredder/My_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_dylan1999_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_dylan1999_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..1a10af8ecf77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_dylan1999_finetuned_squad_accelerate_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Dylan1999) +author: John Snow Labs +name: bert_qa_dylan1999_finetuned_squad_accelerate +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `Dylan1999`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_dylan1999_finetuned_squad_accelerate_en_5.2.0_3.0_1700006698837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_dylan1999_finetuned_squad_accelerate_en_5.2.0_3.0_1700006698837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dylan1999_finetuned_squad_accelerate","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_dylan1999_finetuned_squad_accelerate","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_dylan1999_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Dylan1999/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fabianwillner_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fabianwillner_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..d9393080c547 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fabianwillner_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from FabianWillner) +author: John Snow Labs +name: bert_qa_fabianwillner_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `FabianWillner`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fabianwillner_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007113810.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fabianwillner_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007113810.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fabianwillner_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fabianwillner_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fabianwillner_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FabianWillner/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v2_en.md new file mode 100644 index 000000000000..f2217b19f815 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peggyhuang) +author: John Snow Labs +name: bert_qa_finetune_bert_base_v2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune-bert-base-v2` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v2_en_5.2.0_3.0_1700007361101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v2_en_5.2.0_3.0_1700007361101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetune_bert_base_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_finetune_bert_base_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base_v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetune_bert_base_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/finetune-bert-base-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v3_en.md new file mode 100644 index 000000000000..a51295841b71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetune_bert_base_v3_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peggyhuang) +author: John Snow Labs +name: bert_qa_finetune_bert_base_v3 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune-bert-base-v3` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v3_en_5.2.0_3.0_1700007635634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetune_bert_base_v3_en_5.2.0_3.0_1700007635634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_finetune_bert_base_v3","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_finetune_bert_base_v3","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.base_v3.by_peggyhuang").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetune_bert_base_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/finetune-bert-base-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetuned_custom_1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetuned_custom_1_en.md new file mode 100644 index 000000000000..5396a5841b4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_finetuned_custom_1_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from VedantS01) +author: John Snow Labs +name: bert_qa_finetuned_custom_1 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-custom-1` is an English model originally trained by `VedantS01`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_1_en_5.2.0_3.0_1700007920448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_finetuned_custom_1_en_5.2.0_3.0_1700007920448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom_1","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_finetuned_custom_1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_finetuned_custom_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/VedantS01/bert-finetuned-custom-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_bert_ft_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_bert_ft_newsqa_en.md new file mode 100644 index 000000000000..391f61aa216e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_bert_ft_newsqa_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_fpdm_bert_ft_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_fpdm_bert_ft_newsqa +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_fpdm_bert_ft_newsqa` is an English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_bert_ft_newsqa_en_5.2.0_3.0_1700008214320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_bert_ft_newsqa_en_5.2.0_3.0_1700008214320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_fpdm_bert_ft_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_fpdm_bert_ft_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_bert_ft_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/AnonymousSub/fpdm_bert_FT_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_pert_sent_0.01_squad2.0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_pert_sent_0.01_squad2.0_en.md new file mode 100644 index 000000000000..f315005649db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_fpdm_pert_sent_0.01_squad2.0_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_fpdm_pert_sent_0.01_squad2.0 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fpdm_bert_pert_sent_0.01_squad2.0` is an English model originally trained by `AnonymousSub`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_pert_sent_0.01_squad2.0_en_5.2.0_3.0_1700008478770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_fpdm_pert_sent_0.01_squad2.0_en_5.2.0_3.0_1700008478770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fpdm_pert_sent_0.01_squad2.0","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_fpdm_pert_sent_0.01_squad2.0","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_fpdm_pert_sent_0.01_squad2.0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/fpdm_bert_pert_sent_0.01_squad2.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_howey_bert_large_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_howey_bert_large_uncased_squad_en.md new file mode 100644 index 000000000000..3454e69377b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_howey_bert_large_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from howey) +author: John Snow Labs +name: bert_qa_howey_bert_large_uncased_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squad` is an English model originally trained by `howey`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_howey_bert_large_uncased_squad_en_5.2.0_3.0_1700006901799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_howey_bert_large_uncased_squad_en_5.2.0_3.0_1700006901799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_howey_bert_large_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_howey_bert_large_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased.by_howey").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_howey_bert_large_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/howey/bert-large-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_hubert_fine_tuned_hungarian_squadv1_hu.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_hubert_fine_tuned_hungarian_squadv1_hu.md new file mode 100644 index 000000000000..5bc11238850a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_hubert_fine_tuned_hungarian_squadv1_hu.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Hungarian bert_qa_hubert_fine_tuned_hungarian_squadv1 BertForQuestionAnswering from mcsabai +author: John Snow Labs +name: bert_qa_hubert_fine_tuned_hungarian_squadv1 +date: 2023-11-15 +tags: [bert, hu, open_source, question_answering, onnx] +task: Question Answering +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_hubert_fine_tuned_hungarian_squadv1` is a Hungarian model originally trained by mcsabai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_hubert_fine_tuned_hungarian_squadv1_hu_5.2.0_3.0_1700007175058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_hubert_fine_tuned_hungarian_squadv1_hu_5.2.0_3.0_1700007175058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_hubert_fine_tuned_hungarian_squadv1","hu") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_hubert_fine_tuned_hungarian_squadv1", "hu") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_hubert_fine_tuned_hungarian_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hu| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mcsabai/huBert-fine-tuned-hungarian-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_indonesian_finetune_idk_mrc_id.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_indonesian_finetune_idk_mrc_id.md new file mode 100644 index 000000000000..363130a9c69d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_indonesian_finetune_idk_mrc_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian bert_qa_indo_base_indonesian_finetune_idk_mrc BertForQuestionAnswering from rifkiaputri +author: John Snow Labs +name: bert_qa_indo_base_indonesian_finetune_idk_mrc +date: 2023-11-15 +tags: [bert, id, open_source, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_indo_base_indonesian_finetune_idk_mrc` is an Indonesian model originally trained by rifkiaputri.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_indo_base_indonesian_finetune_idk_mrc_id_5.2.0_3.0_1700008695029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_indo_base_indonesian_finetune_idk_mrc_id_5.2.0_3.0_1700008695029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_indo_base_indonesian_finetune_idk_mrc","id") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_indo_base_indonesian_finetune_idk_mrc", "id") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_indo_base_indonesian_finetune_idk_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|464.2 MB| + +## References + +https://huggingface.co/rifkiaputri/indobert-base-id-finetune-idk-mrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_uncased_finetuned_tydi_indo_in.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_uncased_finetuned_tydi_indo_in.md new file mode 100644 index 000000000000..a8a56c8209cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_base_uncased_finetuned_tydi_indo_in.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian BertForQuestionAnswering Base Uncased model (from jakartaresearch) +author: John Snow Labs +name: bert_qa_indo_base_uncased_finetuned_tydi_indo +date: 2023-11-15 +tags: [in, open_source, bert, question_answering, onnx] +task: Question Answering +language: in +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert-base-uncased-finetuned-tydiqa-indoqa` is an Indonesian model originally trained by `jakartaresearch`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_indo_base_uncased_finetuned_tydi_indo_in_5.2.0_3.0_1700007414125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_indo_base_uncased_finetuned_tydi_indo_in_5.2.0_3.0_1700007414125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_base_uncased_finetuned_tydi_indo","in")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_base_uncased_finetuned_tydi_indo","in") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_indo_base_uncased_finetuned_tydi_indo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|in| +|Size:|411.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jakartaresearch/indobert-base-uncased-finetuned-tydiqa-indoqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetune_tydi_transfer_indo_in.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetune_tydi_transfer_indo_in.md new file mode 100644 index 000000000000..9ba70beb6022 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetune_tydi_transfer_indo_in.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian BertForQuestionAnswering Cased model (from andreaschandra) +author: John Snow Labs +name: bert_qa_indo_finetune_tydi_transfer_indo +date: 2023-11-15 +tags: [in, open_source, bert, question_answering, onnx] +task: Question Answering +language: in +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert-finetune-tydiqa-transfer-indoqa` is an Indonesian model originally trained by `andreaschandra`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_indo_finetune_tydi_transfer_indo_in_5.2.0_3.0_1700006631538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_indo_finetune_tydi_transfer_indo_in_5.2.0_3.0_1700006631538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_finetune_tydi_transfer_indo","in")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_finetune_tydi_transfer_indo","in") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_indo_finetune_tydi_transfer_indo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|in| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andreaschandra/indobert-finetune-tydiqa-transfer-indoqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetuned_squad_id.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetuned_squad_id.md new file mode 100644 index 000000000000..a942b535a2aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_indo_finetuned_squad_id.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Indonesian BertForQuestionAnswering Cased model (from botika) +author: John Snow Labs +name: bert_qa_indo_finetuned_squad +date: 2023-11-15 +tags: [id, open_source, bert, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Indobert-QA-finetuned-squad` is an Indonesian model originally trained by `botika`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_indo_finetuned_squad_id_5.2.0_3.0_1700008964605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_indo_finetuned_squad_id_5.2.0_3.0_1700008964605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_finetuned_squad","id")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_indo_finetuned_squad","id") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_indo_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|411.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/botika/Indobert-QA-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_internetoftim_bert_large_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_internetoftim_bert_large_uncased_squad_en.md new file mode 100644 index 000000000000..9bf97784216c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_internetoftim_bert_large_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from internetoftim) +author: John Snow Labs +name: bert_qa_internetoftim_bert_large_uncased_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-uncased-squad` is an English model originally trained by `internetoftim`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_internetoftim_bert_large_uncased_squad_en_5.2.0_3.0_1700007331979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_internetoftim_bert_large_uncased_squad_en_5.2.0_3.0_1700007331979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_internetoftim_bert_large_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_internetoftim_bert_large_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.large_uncased.by_internetoftim").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_internetoftim_bert_large_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|797.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/internetoftim/bert-large-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_irenelizihui_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_irenelizihui_finetuned_squad_en.md new file mode 100644 index 000000000000..bee046bbcff5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_irenelizihui_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from irenelizihui) +author: John Snow Labs +name: bert_qa_irenelizihui_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `irenelizihui`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_irenelizihui_finetuned_squad_en_5.2.0_3.0_1700009279150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_irenelizihui_finetuned_squad_en_5.2.0_3.0_1700009279150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_irenelizihui_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_irenelizihui_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_irenelizihui").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_irenelizihui_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/irenelizihui/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ixambert_finetuned_squad_basque_marcbrun_eu.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ixambert_finetuned_squad_basque_marcbrun_eu.md new file mode 100644 index 000000000000..5dd42be70cbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ixambert_finetuned_squad_basque_marcbrun_eu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Basque bert_qa_ixambert_finetuned_squad_basque_marcbrun BertForQuestionAnswering from MarcBrun +author: John Snow Labs +name: bert_qa_ixambert_finetuned_squad_basque_marcbrun +date: 2023-11-15 +tags: [bert, eu, open_source, question_answering, onnx] +task: Question Answering +language: eu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_ixambert_finetuned_squad_basque_marcbrun` is a Basque model originally trained by MarcBrun. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ixambert_finetuned_squad_basque_marcbrun_eu_5.2.0_3.0_1700009508595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ixambert_finetuned_squad_basque_marcbrun_eu_5.2.0_3.0_1700009508595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a DataFrame with "question" and "context" columns. + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ixambert_finetuned_squad_basque_marcbrun","eu") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a DataFrame with "question" and "context" columns. + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_ixambert_finetuned_squad_basque_marcbrun", "eu") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ixambert_finetuned_squad_basque_marcbrun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|eu| +|Size:|661.1 MB| + +## References + +https://huggingface.co/MarcBrun/ixambert-finetuned-squad-eu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_jimypbr_bert_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_jimypbr_bert_base_uncased_squad_en.md new file mode 100644 index 000000000000..3c4e0d94b3d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_jimypbr_bert_base_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from jimypbr) +author: John Snow Labs +name: bert_qa_jimypbr_bert_base_uncased_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad` is an English model originally trained by `jimypbr`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_jimypbr_bert_base_uncased_squad_en_5.2.0_3.0_1700007658852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_jimypbr_bert_base_uncased_squad_en_5.2.0_3.0_1700007658852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_jimypbr_bert_base_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_jimypbr_bert_base_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_jimypbr").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_jimypbr_bert_base_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|258.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jimypbr/bert-base-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kamilali_distilbert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kamilali_distilbert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..c99949192f7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kamilali_distilbert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from kamilali) +author: John Snow Labs +name: bert_qa_kamilali_distilbert_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `kamilali`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kamilali_distilbert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007960980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kamilali_distilbert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007960980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kamilali_distilbert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_kamilali_distilbert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.distilled_base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kamilali_distilbert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kamilali/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kaporter_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kaporter_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..71b7489a394c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kaporter_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from kaporter) +author: John Snow Labs +name: bert_qa_kaporter_bert_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `kaporter`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kaporter_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700008245981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kaporter_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700008245981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kaporter_bert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_kaporter_bert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_kaporter").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kaporter_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kaporter/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kcbert_base_finetuned_squad_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kcbert_base_finetuned_squad_ko.md new file mode 100644 index 000000000000..45bc9ce7f75f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kcbert_base_finetuned_squad_ko.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Korean BertForQuestionAnswering Base Cased model (from tucan9389) +author: John Snow Labs +name: bert_qa_kcbert_base_finetuned_squad +date: 2023-11-15 +tags: [ko, open_source, bert, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kcbert-base-finetuned-squad` is a Korean model originally trained by `tucan9389`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kcbert_base_finetuned_squad_ko_5.2.0_3.0_1700008506310.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kcbert_base_finetuned_squad_ko_5.2.0_3.0_1700008506310.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kcbert_base_finetuned_squad","ko") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kcbert_base_finetuned_squad","ko") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kcbert_base_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|406.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tucan9389/kcbert-base-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_keepitreal_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_keepitreal_finetuned_squad_en.md new file mode 100644 index 000000000000..5ffb340213ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_keepitreal_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from keepitreal) +author: John Snow Labs +name: bert_qa_keepitreal_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `keepitreal`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_keepitreal_finetuned_squad_en_5.2.0_3.0_1700008749370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_keepitreal_finetuned_squad_en_5.2.0_3.0_1700008749370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_keepitreal_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_keepitreal_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_keepitreal_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/keepitreal/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_khanh_base_multilingual_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_khanh_base_multilingual_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..473270e6716d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_khanh_base_multilingual_cased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from Khanh) +author: John Snow Labs +name: bert_qa_khanh_base_multilingual_cased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-squad` is an English model originally trained by `Khanh`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_khanh_base_multilingual_cased_finetuned_squad_en_5.2.0_3.0_1700009076961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_khanh_base_multilingual_cased_finetuned_squad_en_5.2.0_3.0_1700009076961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_khanh_base_multilingual_cased_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_khanh_base_multilingual_cased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.cased_multilingual_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_khanh_base_multilingual_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Khanh/bert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_klue_bert_base_aihub_mrc_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_klue_bert_base_aihub_mrc_ko.md new file mode 100644 index 000000000000..444ff065fc82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_klue_bert_base_aihub_mrc_ko.md @@ -0,0 +1,111 @@ +--- +layout: model +title: Korean BertForQuestionAnswering model (from bespin-global) +author: John Snow Labs +name: bert_qa_klue_bert_base_aihub_mrc +date: 2023-11-15 +tags: [ko, open_source, question_answering, bert, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue-bert-base-aihub-mrc` is a Korean model originally trained by `bespin-global`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_klue_bert_base_aihub_mrc_ko_5.2.0_3.0_1700009374093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_klue_bert_base_aihub_mrc_ko_5.2.0_3.0_1700009374093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_klue_bert_base_aihub_mrc","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_klue_bert_base_aihub_mrc","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.bert.base_aihub.by_bespin-global").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_klue_bert_base_aihub_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|412.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bespin-global/klue-bert-base-aihub-mrc +- https://github.com/KLUE-benchmark/KLUE +- https://www.bespinglobal.com/ +- https://aihub.or.kr/aidata/86 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kobert_finetuned_squad_kor_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kobert_finetuned_squad_kor_v1_ko.md new file mode 100644 index 000000000000..97fac200dca5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_kobert_finetuned_squad_kor_v1_ko.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Korean BertForQuestionAnswering Cased model (from arogyaGurkha) +author: John Snow Labs +name: bert_qa_kobert_finetuned_squad_kor_v1 +date: 2023-11-15 +tags: [ko, open_source, bert, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kobert-finetuned-squad_kor_v1` is a Korean model originally trained by `arogyaGurkha`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_kobert_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700009633696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_kobert_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700009633696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kobert_finetuned_squad_kor_v1","ko") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_kobert_finetuned_squad_kor_v1","ko") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_kobert_finetuned_squad_kor_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|342.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/arogyaGurkha/kobert-finetuned-squad_kor_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_korean_lm_finetuned_klue_v2_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_korean_lm_finetuned_klue_v2_ko.md new file mode 100644 index 000000000000..095d97ae6ef4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_korean_lm_finetuned_klue_v2_ko.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Korean bert_qa_korean_lm_finetuned_klue_v2 BertForQuestionAnswering from 2tina +author: John Snow Labs +name: bert_qa_korean_lm_finetuned_klue_v2 +date: 2023-11-15 +tags: [bert, ko, open_source, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_korean_lm_finetuned_klue_v2` is a Korean model originally trained by 2tina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_korean_lm_finetuned_klue_v2_ko_5.2.0_3.0_1700009647081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_korean_lm_finetuned_klue_v2_ko_5.2.0_3.0_1700009647081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_korean_lm_finetuned_klue_v2","ko") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_korean_lm_finetuned_klue_v2", "ko") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_korean_lm_finetuned_klue_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|342.7 MB| + +## References + +https://huggingface.co/2tina/kobert-lm-finetuned-klue-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_en.md new file mode 100644 index 000000000000..92099edc8acc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Cased model (from srcocotero) +author: John Snow Labs +name: bert_qa_large +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-qa` is an English model originally trained by `srcocotero`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_en_5.2.0_3.0_1700010204357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_en_5.2.0_3.0_1700010204357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srcocotero/bert-large-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_japanese_wikipedia_ud_head_ja.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_japanese_wikipedia_ud_head_ja.md new file mode 100644 index 000000000000..105944d750ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_japanese_wikipedia_ud_head_ja.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Japanese BertForQuestionAnswering Large model (from KoichiYasuoka) +author: John Snow Labs +name: bert_qa_large_japanese_wikipedia_ud_head +date: 2023-11-15 +tags: [ja, open_source, bert, question_answering, onnx] +task: Question Answering +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-japanese-wikipedia-ud-head` is a Japanese model originally trained by `KoichiYasuoka`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_japanese_wikipedia_ud_head_ja_5.2.0_3.0_1700006723855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_japanese_wikipedia_ud_head_ja_5.2.0_3.0_1700006723855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_japanese_wikipedia_ud_head","ja") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["私の名前は何ですか?", "私の名前はクララで、私はバークレーに住んでいます。"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_japanese_wikipedia_ud_head","ja") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("私の名前は何ですか?", "私の名前はクララで、私はバークレーに住んでいます。")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ja.answer_question.wikipedia.bert.large").predict("""私の名前は何ですか?|||私の名前はクララで、私はバークレーに住んでいます。""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_japanese_wikipedia_ud_head| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ja| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/KoichiYasuoka/bert-large-japanese-wikipedia-ud-head +- https://github.com/UniversalDependencies/UD_Japanese-GSDLUW \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_squad_en.md new file mode 100644 index 000000000000..b1a80652182f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Cased model (from jaimin) +author: John Snow Labs +name: bert_qa_large_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-large-squad` is an English model originally trained by `jaimin`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_squad_en_5.2.0_3.0_1700008165848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_squad_en_5.2.0_3.0_1700008165848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_large_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jaimin/bert-large-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_uncased_spanish_sign_language_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_uncased_spanish_sign_language_en.md new file mode 100644 index 000000000000..7c7e2d92c883 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_large_uncased_spanish_sign_language_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_large_uncased_spanish_sign_language BertForQuestionAnswering from michaelrglass +author: John Snow Labs +name: bert_qa_large_uncased_spanish_sign_language +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_large_uncased_spanish_sign_language` is an English model originally trained by michaelrglass. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_spanish_sign_language_en_5.2.0_3.0_1700010812943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_large_uncased_spanish_sign_language_en_5.2.0_3.0_1700010812943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_large_uncased_spanish_sign_language","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_large_uncased_spanish_sign_language", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_large_uncased_spanish_sign_language| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|795.1 MB| + +## References + +https://huggingface.co/michaelrglass/bert-large-uncased-sspt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_linkbert_base_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_linkbert_base_finetuned_squad_en.md new file mode 100644 index 000000000000..865fdd6d5234 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_linkbert_base_finetuned_squad_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from niklaspm) +author: John Snow Labs +name: bert_qa_linkbert_base_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `linkbert-base-finetuned-squad` is an English model originally trained by `niklaspm`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_linkbert_base_finetuned_squad_en_5.2.0_3.0_1700008477891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_linkbert_base_finetuned_squad_en_5.2.0_3.0_1700008477891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_linkbert_base_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_linkbert_base_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.link_bert.squad.base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_linkbert_base_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/niklaspm/linkbert-base-finetuned-squad +- https://arxiv.org/abs/2203.15827 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_logo_qna_model_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_logo_qna_model_tr.md new file mode 100644 index 000000000000..632d80dc16f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_logo_qna_model_tr.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering model (from yunusemreemik) +author: John Snow Labs +name: bert_qa_logo_qna_model +date: 2023-11-15 +tags: [tr, open_source, question_answering, bert, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `logo-qna-model` is a Turkish model originally trained by `yunusemreemik`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_logo_qna_model_tr_5.2.0_3.0_1700011067624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_logo_qna_model_tr_5.2.0_3.0_1700011067624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_logo_qna_model","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_logo_qna_model","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.bert.by_yunusemreemik").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_logo_qna_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/yunusemreemik/logo-qna-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_loodos_bert_base_uncased_qa_fine_tuned_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_loodos_bert_base_uncased_qa_fine_tuned_tr.md new file mode 100644 index 000000000000..83a98c69997e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_loodos_bert_base_uncased_qa_fine_tuned_tr.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Turkish bert_qa_loodos_bert_base_uncased_qa_fine_tuned BertForQuestionAnswering from oguzhanolm +author: John Snow Labs +name: bert_qa_loodos_bert_base_uncased_qa_fine_tuned +date: 2023-11-15 +tags: [bert, tr, open_source, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_loodos_bert_base_uncased_qa_fine_tuned` is a Turkish model originally trained by oguzhanolm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_loodos_bert_base_uncased_qa_fine_tuned_tr_5.2.0_3.0_1700008767611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_loodos_bert_base_uncased_qa_fine_tuned_tr_5.2.0_3.0_1700008767611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_loodos_bert_base_uncased_qa_fine_tuned","tr") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_loodos_bert_base_uncased_qa_fine_tuned", "tr") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_loodos_bert_base_uncased_qa_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/oguzhanolm/loodos-bert-base-uncased-QA-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_bengali_tydiqa_qa_bn.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_bengali_tydiqa_qa_bn.md new file mode 100644 index 000000000000..783a28a8ade8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_bengali_tydiqa_qa_bn.md @@ -0,0 +1,113 @@ +--- +layout: model +title: Bangla BertForQuestionAnswering model (from sagorsarker) +author: John Snow Labs +name: bert_qa_mbert_bengali_tydiqa_qa +date: 2023-11-15 +tags: [bn, open_source, question_answering, bert, onnx] +task: Question Answering +language: bn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert-bengali-tydiqa-qa` is a Bangla model originally trained by `sagorsarker`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_bengali_tydiqa_qa_bn_5.2.0_3.0_1700011657711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_bengali_tydiqa_qa_bn_5.2.0_3.0_1700011657711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_bengali_tydiqa_qa","bn") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mbert_bengali_tydiqa_qa","bn") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("bn.answer_question.tydiqa.multi_lingual_bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_bengali_tydiqa_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|bn| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sagorsarker/mbert-bengali-tydiqa-qa +- https://github.com/sagorbrur +- https://github.com/sagorbrur/bntransformer +- https://github.com/google-research-datasets/tydiqa +- https://www.linkedin.com/in/sagor-sarker/ +- https://www.kaggle.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev_xx.md new file mode 100644 index 000000000000..264ff2b0e642 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev +date: 2023-11-15 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev` is a Multilingual model originally trained by roshnir. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev_xx_5.2.0_3.0_1700006955082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev_xx_5.2.0_3.0_1700006955082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_arabic_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-ar-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev_xx.md new file mode 100644 index 000000000000..384c899d4cf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev +date: 2023-11-15 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev` is a Multilingual model originally trained by roshnir. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev_xx_5.2.0_3.0_1700007162317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev_xx_5.2.0_3.0_1700007162317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_chinese_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-zh-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_dev_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_dev_en.md new file mode 100644 index 000000000000..2f463d194d8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_dev_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from roshnir) +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_dev +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mBert-finetuned-mlqa-dev-en` is an English model originally trained by `roshnir`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_dev_en_5.2.0_3.0_1700009322395.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_dev_en_5.2.0_3.0_1700009322395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_dev","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_dev","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.mlqa.finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev_xx.md new file mode 100644 index 000000000000..21d722fa2ecd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev +date: 2023-11-15 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev` is a Multilingual model originally trained by roshnir. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev_xx_5.2.0_3.0_1700009561613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev_xx_5.2.0_3.0_1700009561613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_english_chinese_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-en-zh-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_german_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_german_hindi_dev_xx.md new file mode 100644 index 000000000000..1518486c758a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_german_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_german_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_german_hindi_dev +date: 2023-11-15 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_german_hindi_dev` is a Multilingual model originally trained by roshnir. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_german_hindi_dev_xx_5.2.0_3.0_1700011875967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_german_hindi_dev_xx_5.2.0_3.0_1700011875967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_german_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_german_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_german_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-de-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev_xx.md new file mode 100644 index 000000000000..a94a2ec1bd0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev BertForQuestionAnswering from roshnir +author: John Snow Labs +name: bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev +date: 2023-11-15 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev` is a Multilingual model originally trained by roshnir. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev_xx_5.2.0_3.0_1700009787692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev_xx_5.2.0_3.0_1700009787692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mbert_finetuned_mlqa_spanish_hindi_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/roshnir/mBert-finetuned-mlqa-dev-es-hi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_en.md new file mode 100644 index 000000000000..b7f20c49a8bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Cased model (from srcocotero) +author: John Snow Labs +name: bert_qa_mini +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mini-bert-qa` is an English model originally trained by `srcocotero`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mini_en_5.2.0_3.0_1700012043479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mini_en_5.2.0_3.0_1700012043479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_mini","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_mini","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mini| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srcocotero/mini-bert-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_finetuned_squad_en.md new file mode 100644 index 000000000000..713265f9883f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mini_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_mini_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-mini-finetuned-squad` is an English model originally trained by `anas-awadalla`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mini_finetuned_squad_en_5.2.0_3.0_1700007275952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mini_finetuned_squad_en_5.2.0_3.0_1700007275952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mini_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mini_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.mini_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
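The NLU call above passes the question and its context as one string joined by `|||`. A minimal sketch of building that string in plain Python follows. `build_qa_input` is a hypothetical helper written for illustration; it is not part of NLU or Spark NLP, and the `|||` separator is taken from the example above rather than from any documented constant.

```python
def build_qa_input(question: str, context: str, sep: str = "|||") -> str:
    """Join a question and its context into one separator-delimited string."""
    question = question.strip()
    context = context.strip()
    # Reject inputs that already contain the separator, since the
    # resulting string could no longer be split unambiguously.
    if sep in question or sep in context:
        raise ValueError(f"input must not contain the separator {sep!r}")
    return f"{question}{sep}{context}"


text = build_qa_input("What is my name?", "My name is Clara and I live in Berkeley.")
print(text)
```

The resulting string could then be handed to `predict(...)` in place of the hand-written literal, which avoids stray quotes or whitespace around the separator.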
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mini_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|41.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/bert-mini-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_l12_h384_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_l12_h384_uncased_squad_en.md new file mode 100644 index 000000000000..fdf35476072e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_l12_h384_uncased_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Mini Uncased model (from haritzpuerto) +author: John Snow Labs +name: bert_qa_minilm_l12_h384_uncased_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L12-H384-uncased-squad` is an English model originally trained by `haritzpuerto`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_l12_h384_uncased_squad_en_5.2.0_3.0_1700012168987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_l12_h384_uncased_squad_en_5.2.0_3.0_1700012168987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_minilm_l12_h384_uncased_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_minilm_l12_h384_uncased_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.uncased_mini_lm_mini").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_minilm_l12_h384_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.8 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/haritzpuerto/MiniLM-L12-H384-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_uncased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_uncased_squad2_en.md new file mode 100644 index 000000000000..17a50e215e43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_minilm_uncased_squad2_en.md @@ -0,0 +1,120 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_minilm_uncased_squad2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm-uncased-squad2` is an English model originally trained by `deepset`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_uncased_squad2_en_5.2.0_3.0_1700009958896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_minilm_uncased_squad2_en_5.2.0_3.0_1700009958896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_minilm_uncased_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_minilm_uncased_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.mini_lm_base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_minilm_uncased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/minilm-uncased-squad2 +- https://github.com/deepset-ai/haystack/discussions +- https://deepset.ai +- https://github.com/deepset-ai/FARM/blob/master/examples/question_answering.py +- https://twitter.com/deepset_ai +- http://www.deepset.ai/jobs +- https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/ +- https://haystack.deepset.ai/community/join +- https://github.com/deepset-ai/haystack/ +- https://deepset.ai/german-bert +- https://www.linkedin.com/company/deepset-ai/ +- https://github.com/deepset-ai/FARM +- https://deepset.ai/germanquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mod_7_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mod_7_squad_en.md new file mode 100644 index 000000000000..825aa3a9ab16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mod_7_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Go2Heart) +author: John Snow Labs +name: bert_qa_mod_7_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`BERT_Mod_7_Squad` is an English model originally trained by `Go2Heart`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mod_7_squad_en_5.2.0_3.0_1700012402588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mod_7_squad_en_5.2.0_3.0_1700012402588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_mod_7_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_mod_7_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mod_7_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|406.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Go2Heart/BERT_Mod_7_Squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_model_output_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_model_output_en.md new file mode 100644 index 000000000000..261e7cce9f4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_model_output_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from SanayCo) +author: John Snow Labs +name: bert_qa_model_output +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_output` is an English model originally trained by `SanayCo`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_model_output_en_5.2.0_3.0_1700012692609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_model_output_en_5.2.0_3.0_1700012692609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_model_output","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_model_output","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_SanayCo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_model_output| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SanayCo/model_output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelbin_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelbin_en.md new file mode 100644 index 000000000000..0b6aeef4a55a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelbin_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_modelbin +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelbin` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_modelbin_en_5.2.0_3.0_1700010098432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_modelbin_en_5.2.0_3.0_1700010098432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelbin","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelbin","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_modelbin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/modelbin \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelf_01_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelf_01_en.md new file mode 100644 index 000000000000..28e536bda369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelf_01_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_modelf_01 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelF_01` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_modelf_01_en_5.2.0_3.0_1700012948741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_modelf_01_en_5.2.0_3.0_1700012948741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelf_01","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelf_01","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_modelf_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/modelF_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelonwhol_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelonwhol_tr.md new file mode 100644 index 000000000000..82730aaad887 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelonwhol_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Cased model (from Aybars) +author: John Snow Labs +name: bert_qa_modelonwhol +date: 2023-11-15 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ModelOnWhole` is a Turkish model originally trained by `Aybars`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_modelonwhol_tr_5.2.0_3.0_1700013291003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_modelonwhol_tr_5.2.0_3.0_1700013291003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_modelonwhol","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_modelonwhol","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +// One (question, context) tuple per row; toDF needs `import spark.implicits._` in scope +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_modelonwhol| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Aybars/ModelOnWhole \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelv2_en.md new file mode 100644 index 000000000000..4843431a8df0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_modelv2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from JAlexis) +author: John Snow Labs +name: bert_qa_modelv2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelv2` is an English model originally trained by `JAlexis`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_modelv2_en_5.2.0_3.0_1700013581942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_modelv2_en_5.2.0_3.0_1700013581942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelv2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_modelv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_modelv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/JAlexis/modelv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_multilingual_cased_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_multilingual_cased_finetuned_squad_xx.md new file mode 100644 index 000000000000..b1287133a770 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_multilingual_cased_finetuned_squad_xx.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering Base Cased model (from monakth) +author: John Snow Labs +name: bert_qa_monakth_base_multilingual_cased_finetuned_squad +date: 2023-11-15 +tags: [xx, open_source, bert, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-squad` is a Multilingual model originally trained by `monakth`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700015431790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700015431790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_multilingual_cased_finetuned_squad","xx")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_multilingual_cased_finetuned_squad","xx") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_monakth_base_multilingual_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monakth/bert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..f1d9be7b909a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_monakth_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from monakth) +author: John Snow Labs +name: bert_qa_monakth_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `monakth`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007526590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_monakth_base_uncased_finetuned_squad_en_5.2.0_3.0_1700007526590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_monakth_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_monakth_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monakth/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_cls_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_cls_en.md new file mode 100644 index 000000000000..3237dc65acbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_cls_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from xraychen) +author: John Snow Labs +name: bert_qa_mqa_cls +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mqa-cls` is an English model originally trained by `xraychen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_cls_en_5.2.0_3.0_1700010393249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_cls_en_5.2.0_3.0_1700010393249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mqa_cls","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mqa_cls","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.mqa_cls.bert.by_xraychen").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mqa_cls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/xraychen/mqa-cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_sim_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_sim_en.md new file mode 100644 index 000000000000..5c46c0a669c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_sim_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from xraychen) +author: John Snow Labs +name: bert_qa_mqa_sim +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mqa-sim` is an English model originally trained by `xraychen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_sim_en_5.2.0_3.0_1700063910561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_sim_en_5.2.0_3.0_1700063910561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mqa_sim","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mqa_sim","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.sim.by_xraychen").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mqa_sim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/xraychen/mqa-sim \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_unsupsim_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_unsupsim_en.md new file mode 100644 index 000000000000..3249cd1ff6c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mqa_unsupsim_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from xraychen) +author: John Snow Labs +name: bert_qa_mqa_unsupsim +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mqa-unsupsim` is an English model originally trained by `xraychen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_unsupsim_en_5.2.0_3.0_1700064165660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mqa_unsupsim_en_5.2.0_3.0_1700064165660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mqa_unsupsim","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mqa_unsupsim","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.unsupsim.by_xraychen").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mqa_unsupsim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/xraychen/mqa-unsupsim \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mrp_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mrp_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..18e46ce5c927 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mrp_bert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrp) +author: John Snow Labs +name: bert_qa_mrp_bert_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `mrp`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mrp_bert_finetuned_squad_en_5.2.0_3.0_1700007840846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mrp_bert_finetuned_squad_en_5.2.0_3.0_1700007840846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mrp_bert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_mrp_bert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_mrp").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mrp_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrp/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multi_uncased_trained_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multi_uncased_trained_squadv2_en.md new file mode 100644 index 000000000000..c8280a75e919 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multi_uncased_trained_squadv2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from roshnir) +author: John Snow Labs +name: bert_qa_multi_uncased_trained_squadv2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-multi-uncased-trained-squadv2` is an English model originally trained by `roshnir`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multi_uncased_trained_squadv2_en_5.2.0_3.0_1700010782611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multi_uncased_trained_squadv2_en_5.2.0_3.0_1700010782611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multi_uncased_trained_squadv2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multi_uncased_trained_squadv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.uncased_v2").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multi_uncased_trained_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/roshnir/bert-multi-uncased-trained-squadv2 +- https://aclanthology.org/2020.acl-main.421.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_base_cased_chines_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_base_cased_chines_zh.md new file mode 100644 index 000000000000..7caac2232692 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_base_cased_chines_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Base Cased model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_base_cased_chines +date: 2023-11-15 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-chinese` is a Chinese model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_base_cased_chines_zh_5.2.0_3.0_1700010354850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_base_cased_chines_zh_5.2.0_3.0_1700010354850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_base_cased_chines","zh") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_base_cased_chines","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.cased_multilingual_base").predict("""PUT YOUR QUESTION HERE|||"PUT YOUR CONTEXT HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_base_cased_chines| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_arabic_ar.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_arabic_ar.md new file mode 100644 index 000000000000..e05c78e29bf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_arabic_ar.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Arabic BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_arabic +date: 2023-11-15 +tags: [open_source, question_answering, bert, ar, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-arabic` is an Arabic model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_arabic_ar_5.2.0_3.0_1700011124489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_arabic_ar_5.2.0_3.0_1700011124489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_arabic","ar") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_arabic","ar") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.answer_question.bert.multilingual_arabic_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_arabic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-arabic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_german_de.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_german_de.md new file mode 100644 index 000000000000..fb31a29fa1e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_german_de.md @@ -0,0 +1,108 @@ +--- +layout: model +title: German BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_german +date: 2023-11-15 +tags: [open_source, question_answering, bert, de, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-german` is a German model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_german_de_5.2.0_3.0_1700010746503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_german_de_5.2.0_3.0_1700010746503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_german","de") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_german","de") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.answer_question.bert.multilingual_german_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_german| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|de| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-german \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_hindi_hi.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_hindi_hi.md new file mode 100644 index 000000000000..3686f6c5c138 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_hindi_hi.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Hindi BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_hindi +date: 2023-11-15 +tags: [open_source, question_answering, bert, hi, onnx] +task: Question Answering +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-hindi` is a Hindi model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_hindi_hi_5.2.0_3.0_1700064514465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_hindi_hi_5.2.0_3.0_1700064514465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_hindi","hi") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_hindi","hi") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("hi.answer_question.bert.multilingual_hindi_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_hindi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hi| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-hindi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_spanish_es.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_spanish_es.md new file mode 100644 index 000000000000..5050f6bbcd55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_spanish_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Castilian, Spanish BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_spanish +date: 2023-11-15 +tags: [open_source, question_answering, bert, es, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-spanish` is a Castilian, Spanish model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_spanish_es_5.2.0_3.0_1700008189037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_spanish_es_5.2.0_3.0_1700008189037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_spanish","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_spanish","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.bert.multilingual_spanish_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_vietnamese_vi.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_vietnamese_vi.md new file mode 100644 index 000000000000..410b9d34f94d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_multilingual_bert_base_cased_vietnamese_vi.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Vietnamese BertForQuestionAnswering model (from bhavikardeshna) +author: John Snow Labs +name: bert_qa_multilingual_bert_base_cased_vietnamese +date: 2023-11-15 +tags: [open_source, question_answering, bert, vi, onnx] +task: Question Answering +language: vi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multilingual-bert-base-cased-vietnamese` is a Vietnamese model originally trained by `bhavikardeshna`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_vietnamese_vi_5.2.0_3.0_1700008525364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_multilingual_bert_base_cased_vietnamese_vi_5.2.0_3.0_1700008525364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_multilingual_bert_base_cased_vietnamese","vi") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_multilingual_bert_base_cased_vietnamese","vi") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("vi.answer_question.bert.multilingual_vietnamese_tuned_base_cased.by_bhavikardeshna").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_multilingual_bert_base_cased_vietnamese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|vi| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bhavikardeshna/multilingual-bert-base-cased-vietnamese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_muril_large_squad2_hi.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_muril_large_squad2_hi.md new file mode 100644 index 000000000000..7c6ffd687dd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_muril_large_squad2_hi.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Hindi BertForQuestionAnswering model (from Sindhu) +author: John Snow Labs +name: bert_qa_muril_large_squad2 +date: 2023-11-15 +tags: [open_source, question_answering, bert, hi, onnx] +task: Question Answering +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muril-large-squad2` is a Hindi model originally trained by `Sindhu`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_muril_large_squad2_hi_5.2.0_3.0_1700011602780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_muril_large_squad2_hi_5.2.0_3.0_1700011602780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_muril_large_squad2","hi") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_muril_large_squad2","hi") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("hi.answer_question.squadv2.bert.large").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_muril_large_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hi| +|Size:|1.9 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Sindhu/muril-large-squad2 +- https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/ +- https://twitter.com/batw0man \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mymild_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mymild_finetuned_squad_en.md new file mode 100644 index 000000000000..d7de9c5613a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_mymild_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from MyMild) +author: John Snow Labs +name: bert_qa_mymild_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `MyMild`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_mymild_finetuned_squad_en_5.2.0_3.0_1700011622327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_mymild_finetuned_squad_en_5.2.0_3.0_1700011622327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mymild_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_mymild_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_MyMild").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_mymild_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MyMild/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neg_komrc_train_ko.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neg_komrc_train_ko.md new file mode 100644 index 000000000000..8aa111b9b97c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neg_komrc_train_ko.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Korean BertForQuestionAnswering Cased model (from Taekyoon) +author: John Snow Labs +name: bert_qa_neg_komrc_train +date: 2023-11-15 +tags: [ko, open_source, bert, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `neg_komrc_train` is a Korean model originally trained by `Taekyoon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_neg_komrc_train_ko_5.2.0_3.0_1700011887300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_neg_komrc_train_ko_5.2.0_3.0_1700011887300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_neg_komrc_train","ko") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_neg_komrc_train","ko") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_neg_komrc_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|406.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Taekyoon/neg_komrc_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_negfir_distilbert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_negfir_distilbert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..a5731ad9a7f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_negfir_distilbert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from negfir) +author: John Snow Labs +name: bert_qa_negfir_distilbert_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `negfir`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_negfir_distilbert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700064684620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_negfir_distilbert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700064684620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_negfir_distilbert_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_negfir_distilbert_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.distilled_uncased_base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_negfir_distilbert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|200.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/negfir/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ner_conll_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ner_conll_base_uncased_en.md new file mode 100644 index 000000000000..c67cdb22dfe5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ner_conll_base_uncased_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from dayyass) +author: John Snow Labs +name: bert_qa_ner_conll_base_uncased +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qaner-conll-bert-base-uncased` is an English model originally trained by `dayyass`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ner_conll_base_uncased_en_5.2.0_3.0_1700008964066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ner_conll_base_uncased_en_5.2.0_3.0_1700008964066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ner_conll_base_uncased","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ner_conll_base_uncased","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ner_conll_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/dayyass/qaner-conll-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neuralmagic_bert_squad_12layer_0sparse_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neuralmagic_bert_squad_12layer_0sparse_en.md new file mode 100644 index 000000000000..b332ca26efca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_neuralmagic_bert_squad_12layer_0sparse_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from spacemanidol) +author: John Snow Labs +name: bert_qa_neuralmagic_bert_squad_12layer_0sparse +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `neuralmagic-bert-squad-12layer-0sparse` is an English model originally trained by `spacemanidol`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_neuralmagic_bert_squad_12layer_0sparse_en_5.2.0_3.0_1700009310735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_neuralmagic_bert_squad_12layer_0sparse_en_5.2.0_3.0_1700009310735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_neuralmagic_bert_squad_12layer_0sparse","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_neuralmagic_bert_squad_12layer_0sparse","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_neuralmagic_bert_squad_12layer_0sparse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/spacemanidol/neuralmagic-bert-squad-12layer-0sparse \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa_en.md new file mode 100644 index 000000000000..b85e014e392a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa` is a English model originally trained by AnonymousSub. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700011832898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa_en_5.2.0_3.0_1700011832898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_news_pretrain_bert_ft_nepal_bhasa_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| + +## References + +https://huggingface.co/AnonymousSub/news_pretrain_bert_FT_new_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_newsqa_en.md new file mode 100644 index 000000000000..9db633063e5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_news_pretrain_bert_ft_newsqa_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_news_pretrain_bert_ft_newsqa BertForQuestionAnswering from AnonymousSub +author: John Snow Labs +name: bert_qa_news_pretrain_bert_ft_newsqa +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_news_pretrain_bert_ft_newsqa` is a English model originally trained by AnonymousSub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_news_pretrain_bert_ft_newsqa_en_5.2.0_3.0_1700012087045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_news_pretrain_bert_ft_newsqa_en_5.2.0_3.0_1700012087045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_news_pretrain_bert_ft_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_news_pretrain_bert_ft_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_news_pretrain_bert_ft_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/AnonymousSub/news_pretrain_bert_FT_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_nolog_scibert_v2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_nolog_scibert_v2_en.md new file mode 100644 index 000000000000..4d0e902ec0d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_nolog_scibert_v2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_nolog_scibert_v2 BertForQuestionAnswering from peggyhuang +author: John Snow Labs +name: bert_qa_nolog_scibert_v2 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_nolog_scibert_v2` is a English model originally trained by peggyhuang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_nolog_scibert_v2_en_5.2.0_3.0_1700009571562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_nolog_scibert_v2_en_5.2.0_3.0_1700009571562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_nolog_scibert_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_nolog_scibert_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_nolog_scibert_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/peggyhuang/nolog-SciBert-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_norwegian_need_tonga_tonga_islands_name_this_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_norwegian_need_tonga_tonga_islands_name_this_en.md new file mode 100644 index 000000000000..de0fb1074f7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_norwegian_need_tonga_tonga_islands_name_this_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_norwegian_need_tonga_tonga_islands_name_this BertForQuestionAnswering from LenaSchmidt +author: John Snow Labs +name: bert_qa_norwegian_need_tonga_tonga_islands_name_this +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_qa_norwegian_need_tonga_tonga_islands_name_this` is a English model originally trained by LenaSchmidt. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_norwegian_need_tonga_tonga_islands_name_this_en_5.2.0_3.0_1700012067858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_norwegian_need_tonga_tonga_islands_name_this_en_5.2.0_3.0_1700012067858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_norwegian_need_tonga_tonga_islands_name_this","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_norwegian_need_tonga_tonga_islands_name_this", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_norwegian_need_tonga_tonga_islands_name_this| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/LenaSchmidt/no_need_to_name_this \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_output_files_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_output_files_en.md new file mode 100644 index 000000000000..352808a41605 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_output_files_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from sunitha) +author: John Snow Labs +name: bert_qa_output_files +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output_files` is an English model originally trained by `sunitha`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_output_files_en_5.2.0_3.0_1700012511653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_output_files_en_5.2.0_3.0_1700012511653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_output_files","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_output_files","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.output_files.bert.by_sunitha").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_output_files| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sunitha/output_files \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_paranoidandroid_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_paranoidandroid_finetuned_squad_en.md new file mode 100644 index 000000000000..953e35642bd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_paranoidandroid_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from ParanoidAndroid) +author: John Snow Labs +name: bert_qa_paranoidandroid_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is a English model originally trained by `ParanoidAndroid`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_paranoidandroid_finetuned_squad_en_5.2.0_3.0_1700012780240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_paranoidandroid_finetuned_squad_en_5.2.0_3.0_1700012780240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_paranoidandroid_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_paranoidandroid_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_ParanoidAndroid").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_paranoidandroid_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ParanoidAndroid/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_fa.md new file mode 100644 index 000000000000..a096421a2621 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Cased model (from sepiosky) +author: John Snow Labs +name: bert_qa_pars +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ParsBERT_QA` is a Persian model originally trained by `sepiosky`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pars_fa_5.2.0_3.0_1700012639595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pars_fa_5.2.0_3.0_1700012639595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pars","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pars","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pars| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sepiosky/ParsBERT_QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_question_answering_pquad_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_question_answering_pquad_fa.md new file mode 100644 index 000000000000..22aa6c6450b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pars_question_answering_pquad_fa.md @@ -0,0 +1,98 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Cased model (from pedramyazdipoor) +author: John Snow Labs +name: bert_qa_pars_question_answering_pquad +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_question_answering_PQuAD` is a Persian model originally trained by `pedramyazdipoor`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pars_question_answering_pquad_fa_5.2.0_3.0_1700010005378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pars_question_answering_pquad_fa_5.2.0_3.0_1700010005378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pars_question_answering_pquad","fa")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pars_question_answering_pquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pars_question_answering_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/pedramyazdipoor/parsbert_question_answering_PQuAD +- https://github.com/pedramyazdipoor/ParsBert_QA_PQuAD +- https://arxiv.org/abs/2005.12515 +- https://arxiv.org/abs/2202.06219 +- https://www.linkedin.com/in/pedram-yazdipour/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_parsbert_finetuned_persianqa_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_parsbert_finetuned_persianqa_fa.md new file mode 100644 index 000000000000..5aa1265d171b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_parsbert_finetuned_persianqa_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Cased model (from marzinouri101) +author: John Snow Labs +name: bert_qa_parsbert_finetuned_persianqa +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert-finetuned-persianQA` is a Persian model originally trained by `marzinouri101`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_parsbert_finetuned_persianqa_fa_5.2.0_3.0_1700013045523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_parsbert_finetuned_persianqa_fa_5.2.0_3.0_1700013045523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_parsbert_finetuned_persianqa","fa") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_parsbert_finetuned_persianqa","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_parsbert_finetuned_persianqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|441.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/marzinouri101/parsbert-finetuned-persianQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_part_2_mbert_model_e1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_part_2_mbert_model_e1_en.md new file mode 100644 index 000000000000..df176bbadb16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_part_2_mbert_model_e1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from horsbug98) +author: John Snow Labs +name: bert_qa_part_2_mbert_model_e1 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Part_2_mBERT_Model_E1` is an English model originally trained by `horsbug98`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_mbert_model_e1_en_5.2.0_3.0_1700066351101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_part_2_mbert_model_e1_en_5.2.0_3.0_1700066351101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_2_mbert_model_e1","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_part_2_mbert_model_e1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.tydiqa.").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_part_2_mbert_model_e1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/horsbug98/Part_2_mBERT_Model_E1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pert_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pert_zh.md new file mode 100644 index 000000000000..d5ea1007c76d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pert_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Cased model (from cgt) +author: John Snow Labs +name: bert_qa_pert +date: 2023-11-15 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pert-qa` is a Chinese model originally trained by `cgt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pert_zh_5.2.0_3.0_1700006794991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pert_zh_5.2.0_3.0_1700006794991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pert","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_pert","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/cgt/pert-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..0f5b3df4da42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_accelerate_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peterhsu) +author: John Snow Labs +name: bert_qa_peterhsu_bert_finetuned_squad_accelerate +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `peterhsu`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_peterhsu_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700007359588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_peterhsu_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700007359588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_peterhsu_bert_finetuned_squad_accelerate","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_peterhsu_bert_finetuned_squad_accelerate","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.accelerate.by_peterhsu").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_peterhsu_bert_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peterhsu/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_en.md new file mode 100644 index 000000000000..9005693e5705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_peterhsu_bert_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from peterhsu) +author: John Snow Labs +name: bert_qa_peterhsu_bert_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `peterhsu`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_peterhsu_bert_finetuned_squad_en_5.2.0_3.0_1700007104039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_peterhsu_bert_finetuned_squad_en_5.2.0_3.0_1700007104039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_peterhsu_bert_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_peterhsu_bert_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.v2.by_peterhsu").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_peterhsu_bert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peterhsu/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_petros89_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_petros89_finetuned_squad_en.md new file mode 100644 index 000000000000..820e90fca5e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_petros89_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Petros89) +author: John Snow Labs +name: bert_qa_petros89_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Petros89`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_petros89_finetuned_squad_en_5.2.0_3.0_1700012871512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_petros89_finetuned_squad_en_5.2.0_3.0_1700012871512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_petros89_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_petros89_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_petros89_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Petros89/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pquad_fa.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pquad_fa.md new file mode 100644 index 000000000000..2ebe24afedbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pquad_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Cased model (from newsha) +author: John Snow Labs +name: bert_qa_pquad +date: 2023-11-15 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `PQuAD` is a Persian model originally trained by `newsha`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pquad_fa_5.2.0_3.0_1700013178126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pquad_fa_5.2.0_3.0_1700013178126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_pquad","fa") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_pquad","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/newsha/PQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pubmed_bert_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pubmed_bert_squadv2_en.md new file mode 100644 index 000000000000..d399819f545c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_pubmed_bert_squadv2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from franklu) +author: John Snow Labs +name: bert_qa_pubmed_bert_squadv2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmed_bert_squadv2` is an English model originally trained by `franklu`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_pubmed_bert_squadv2_en_5.2.0_3.0_1700007643318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_pubmed_bert_squadv2_en_5.2.0_3.0_1700007643318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_pubmed_bert_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_pubmed_bert_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_pubmed.bert.v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_pubmed_bert_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/franklu/pubmed_bert_squadv2 +- https://rajpurkar.github.io/SQuAD-explorer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qa_roberta_base_chinese_extractive_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qa_roberta_base_chinese_extractive_zh.md new file mode 100644 index 000000000000..f915d8bf408d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qa_roberta_base_chinese_extractive_zh.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from liam168) +author: John Snow Labs +name: bert_qa_qa_roberta_base_chinese_extractive +date: 2023-11-15 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa-roberta-base-chinese-extractive` is a Chinese model originally trained by `liam168`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_qa_roberta_base_chinese_extractive_zh_5.2.0_3.0_1700013334469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_qa_roberta_base_chinese_extractive_zh_5.2.0_3.0_1700013334469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_qa_roberta_base_chinese_extractive","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_qa_roberta_base_chinese_extractive","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.base.by_liam168").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_qa_roberta_base_chinese_extractive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/liam168/qa-roberta-base-chinese-extractive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2_en.md new file mode 100644 index 000000000000..e2609a82f9cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from Salesforce) +author: John Snow Labs +name: bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qaconv-bert-large-uncased-whole-word-masking-squad2` is an English model originally trained by `Salesforce`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1700014185929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2_en_5.2.0_3.0_1700014185929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_uncased.by_Salesforce").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_qaconv_bert_large_uncased_whole_word_masking_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Salesforce/qaconv-bert-large-uncased-whole-word-masking-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qgrantq_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qgrantq_finetuned_squad_en.md new file mode 100644 index 000000000000..7a00e175708a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_qgrantq_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from qgrantq) +author: John Snow Labs +name: bert_qa_qgrantq_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `qgrantq`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_qgrantq_finetuned_squad_en_5.2.0_3.0_1700007983639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_qgrantq_finetuned_squad_en_5.2.0_3.0_1700007983639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_qgrantq_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_qgrantq_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_qgrantq").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_qgrantq_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/qgrantq/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_cased_squadv2_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_cased_squadv2_tr.md new file mode 100644 index 000000000000..53f232977a6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_cased_squadv2_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Cased model (from enelpi) +author: John Snow Labs +name: bert_qa_question_answering_cased_squadv2 +date: 2023-11-15 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-question-answering-cased-squadv2_tr` is a Turkish model originally trained by `enelpi`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_cased_squadv2_tr_5.2.0_3.0_1700008230626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_cased_squadv2_tr_5.2.0_3.0_1700008230626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_cased_squadv2","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_cased_squadv2","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_question_answering_cased_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/enelpi/bert-question-answering-cased-squadv2_tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_voidful_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_voidful_zh.md new file mode 100644 index 000000000000..bc03919db0c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_voidful_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese bert_qa_question_answering_chinese_voidful BertForQuestionAnswering from voidful +author: John Snow Labs +name: bert_qa_question_answering_chinese_voidful +date: 2023-11-15 +tags: [bert, zh, open_source, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_question_answering_chinese_voidful` is a Chinese model originally trained by voidful. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_chinese_voidful_zh_5.2.0_3.0_1700068229745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_chinese_voidful_zh_5.2.0_3.0_1700068229745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_chinese_voidful","zh") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +# Illustrative input only; replace with your own question/context pairs +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_question_answering_chinese_voidful", "zh") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +// Illustrative input only; replace with your own question/context pairs +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_question_answering_chinese_voidful| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.0 MB| + +## References + +https://huggingface.co/voidful/question-answering-zh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_zh.md new file mode 100644 index 000000000000..5a9aacee8235 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_chinese_zh.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from yechen) +author: John Snow Labs +name: bert_qa_question_answering_chinese +date: 2023-11-15 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question-answering-chinese` is a Chinese model originally trained by `yechen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_chinese_zh_5.2.0_3.0_1700014729623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_chinese_zh_5.2.0_3.0_1700014729623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_chinese","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_question_answering_chinese","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_question_answering_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/yechen/question-answering-chinese \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_uncased_squadv2_tr.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_uncased_squadv2_tr.md new file mode 100644 index 000000000000..92b087353c70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_question_answering_uncased_squadv2_tr.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Uncased model (from enelpi) +author: John Snow Labs +name: bert_qa_question_answering_uncased_squadv2 +date: 2023-11-15 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-question-answering-uncased-squadv2_tr` is a Turkish model originally trained by `enelpi`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_uncased_squadv2_tr_5.2.0_3.0_1700008510603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_question_answering_uncased_squadv2_tr_5.2.0_3.0_1700008510603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_uncased_squadv2","tr") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_question_answering_uncased_squadv2","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_question_answering_uncased_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.5 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/enelpi/bert-question-answering-uncased-squadv2_tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_questionansweing_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_questionansweing_en.md new file mode 100644 index 000000000000..2f5dd17a5201 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_questionansweing_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from ponmari) +author: John Snow Labs +name: bert_qa_questionansweing +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `QuestionAnsweingBert` is an English model originally trained by `ponmari`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_questionansweing_en_5.2.0_3.0_1700069952307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_questionansweing_en_5.2.0_3.0_1700069952307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_questionansweing","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_questionansweing","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_ponmari").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_questionansweing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ponmari/QuestionAnsweingBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_quote_attribution_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_quote_attribution_en.md new file mode 100644 index 000000000000..fd4c21e37c61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_quote_attribution_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from helliun) +author: John Snow Labs +name: bert_qa_quote_attribution +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quote-attribution` is an English model originally trained by `helliun`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_quote_attribution_en_5.2.0_3.0_1700008758458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_quote_attribution_en_5.2.0_3.0_1700008758458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_quote_attribution","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_quote_attribution","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_quote_attribution| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/helliun/quote-attribution \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ramrajput_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ramrajput_finetuned_squad_en.md new file mode 100644 index 000000000000..1ccddc27182c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ramrajput_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from ramrajput) +author: John Snow Labs +name: bert_qa_ramrajput_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `ramrajput`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ramrajput_finetuned_squad_en_5.2.0_3.0_1700071709869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ramrajput_finetuned_squad_en_5.2.0_3.0_1700071709869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ramrajput_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ramrajput_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ramrajput_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ramrajput/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..26f934efee8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_bert-base-uncased_EASY_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3_en_5.2.0_3.0_1700073251310.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3_en_5.2.0_3.0_1700073251310.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_base_uncased_easy_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_bert-base-uncased_EASY_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..0fb1355d4554 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_bert-base-uncased_EASY_TIMESTEP_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700075008869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700075008869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_base_uncased_easy_timestep_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_bert-base-uncased_EASY_TIMESTEP_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..2971b2c6c646 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_bert-base-uncased_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3_en_5.2.0_3.0_1700015968478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3_en_5.2.0_3.0_1700015968478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_base_uncased_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_bert-base-uncased_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..153c2b9aa3c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_bert-base-uncased_TIMESTEP_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700064358012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700064358012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_base_uncased_timestep_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_bert-base-uncased_TIMESTEP_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..02a9ebeda288 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_recipe-bert-base-uncased_EASY_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3_en_5.2.0_3.0_1700064649328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3_en_5.2.0_3.0_1700064649328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_recipe_base_uncased_easy_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_recipe-bert-base-uncased_EASY_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..fa2ad3c91390 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_recipe-bert-base-uncased_EASY_TIMESTEP_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700076702862.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700076702862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_recipe_base_uncased_easy_timestep_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_recipe-bert-base-uncased_EASY_TIMESTEP_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..878a4f7a9f8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_recipe-bert-base-uncased_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3_en_5.2.0_3.0_1700078543892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3_en_5.2.0_3.0_1700078543892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_recipe_base_uncased_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_recipe-bert-base-uncased_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3_en.md new file mode 100644 index 000000000000..bb3085a0f074 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Uncased model (from AnonymousSub) +author: John Snow Labs +name: bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `recipe_triplet_recipe-bert-base-uncased_TIMESTEP_squadv2_epochs_3` is an English model originally trained by `AnonymousSub`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700080450858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3_en_5.2.0_3.0_1700080450858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_recipe_triplet_recipe_base_uncased_timestep_squadv2_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AnonymousSub/recipe_triplet_recipe-bert-base-uncased_TIMESTEP_squadv2_epochs_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_results_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_results_en.md new file mode 100644 index 000000000000..95fba0909849 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_results_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ericRosello) +author: John Snow Labs +name: bert_qa_results +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results` is an English model originally trained by `ericRosello`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_results_en_5.2.0_3.0_1700065867532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_results_en_5.2.0_3.0_1700065867532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_results","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_results","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_ericRosello").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_results| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ericRosello/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_reza_aditya_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_reza_aditya_finetuned_squad_en.md new file mode 100644 index 000000000000..37e4c80cce9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_reza_aditya_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from reza-aditya) +author: John Snow Labs +name: bert_qa_reza_aditya_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `reza-aditya`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_reza_aditya_finetuned_squad_en_5.2.0_3.0_1700081885655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_reza_aditya_finetuned_squad_en_5.2.0_3.0_1700081885655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_reza_aditya_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_reza_aditya_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_reza_aditya_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/reza-aditya/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_qa_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_qa_zh.md new file mode 100644 index 000000000000..749ca884f615 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_qa_zh.md @@ -0,0 +1,113 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from uer) +author: John Snow Labs +name: bert_qa_roberta_base_chinese_extractive_qa +date: 2023-11-15 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-chinese-extractive-qa` is a Chinese model originally trained by `uer`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_qa_zh_5.2.0_3.0_1700085567946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_qa_zh_5.2.0_3.0_1700085567946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_chinese_extractive_qa","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_roberta_base_chinese_extractive_qa","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.base.by_uer").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_base_chinese_extractive_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/uer/roberta-base-chinese-extractive-qa +- https://spaces.ac.cn/archives/4338 +- https://www.kesci.com/home/competition/5d142d8cbb14e6002c04e14a/content/0 +- https://github.com/dbiir/UER-py/ +- https://cloud.tencent.com/product/tione/ +- https://github.com/ymcui/cmrc2018 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_zh.md new file mode 100644 index 000000000000..5ced75aa0828 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_chinese_extractive_zh.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Base Cased model (from jackh1995) +author: John Snow Labs +name: bert_qa_roberta_base_chinese_extractive +date: 2023-11-15 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-chinese-extractive-qa` is a Chinese model originally trained by `jackh1995`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_zh_5.2.0_3.0_1700083617267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_zh_5.2.0_3.0_1700083617267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_chinese_extractive","zh") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_chinese_extractive","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR QUESTION HERE", "PUT YOUR CONTEXT HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.base_extractive").predict("""PUT YOUR QUESTION HERE|||PUT YOUR CONTEXT HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_base_chinese_extractive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|380.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jackh1995/roberta-base-chinese-extractive-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_squad2_en.md new file mode 100644 index 000000000000..8c58e97cca2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_base_squad2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from vvincentt) +author: John Snow Labs +name: bert_qa_roberta_base_squad2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-squad2` is an English model originally trained by `vvincentt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_squad2_en_5.2.0_3.0_1700067423627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_squad2_en_5.2.0_3.0_1700067423627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_squad2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_base_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vvincentt/roberta-base-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_test_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_test_en.md new file mode 100644 index 000000000000..e34ecb105d00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_test_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from vvincentt) +author: John Snow Labs +name: bert_qa_roberta_test +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_test` is an English model originally trained by `vvincentt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_test_en_5.2.0_3.0_1700089519451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_test_en_5.2.0_3.0_1700089519451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_test","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_test","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vvincentt/roberta_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_wwm_ext_larg_zh.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_wwm_ext_larg_zh.md new file mode 100644 index 000000000000..f052cd79a985 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_roberta_wwm_ext_larg_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Cased model (from wskhanh) +author: John Snow Labs +name: bert_qa_roberta_wwm_ext_larg +date: 2023-11-15 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Roberta-wwm-ext-larg` is a Chinese model originally trained by `wskhanh`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_wwm_ext_larg_zh_5.2.0_3.0_1700064001338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_wwm_ext_larg_zh_5.2.0_3.0_1700064001338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_wwm_ext_larg","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_roberta_wwm_ext_larg","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_wwm_ext_larg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/wskhanh/Roberta-wwm-ext-larg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full_ru.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full_ru.md new file mode 100644 index 000000000000..e2e2e1f4cb68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full BertForQuestionAnswering from ruselkomp +author: John Snow Labs +name: bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full +date: 2023-11-15 +tags: [bert, ru, open_source, question_answering, onnx] +task: Question Answering +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full` is a Russian model originally trained by ruselkomp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full_ru_5.2.0_3.0_1700010708407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full_ru_5.2.0_3.0_1700010708407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data` is assumed to be a DataFrame with "question" and "context" columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full","ru") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data` is assumed to be a DataFrame with "question" and "context" columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full", "ru") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ruselkomp_sbert_large_nlu_russian_finetuned_squad_full| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ru| +|Size:|1.6 GB| + +## References + +https://huggingface.co/ruselkomp/sbert_large_nlu_ru-finetuned-squad-full \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_salti_bert_base_multilingual_cased_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_salti_bert_base_multilingual_cased_finetuned_squad_xx.md new file mode 100644 index 000000000000..b4c4f0b76ffd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_salti_bert_base_multilingual_cased_finetuned_squad_xx.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from salti) +author: John Snow Labs +name: bert_qa_salti_bert_base_multilingual_cased_finetuned_squad +date: 2023-11-15 +tags: [open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-multilingual-cased-finetuned-squad` is a Multilingual model originally trained by `salti`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_salti_bert_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700009569501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_salti_bert_base_multilingual_cased_finetuned_squad_xx_5.2.0_3.0_1700009569501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_salti_bert_base_multilingual_cased_finetuned_squad","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_salti_bert_base_multilingual_cased_finetuned_squad","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.squad.bert.multilingual_base_cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_salti_bert_base_multilingual_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/salti/bert-base-multilingual-cased-finetuned-squad +- https://wandb.ai/salti/mBERT_QA/runs/wkqzhrp2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sangyongan30_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sangyongan30_finetuned_squad_en.md new file mode 100644 index 000000000000..0c0b076dc12c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sangyongan30_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from sangyongan30) +author: John Snow Labs +name: bert_qa_sangyongan30_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `sangyongan30`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sangyongan30_finetuned_squad_en_5.2.0_3.0_1700011197408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sangyongan30_finetuned_squad_en_5.2.0_3.0_1700011197408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sangyongan30_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sangyongan30_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sangyongan30_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sangyongan30/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sber_full_tes_ru.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sber_full_tes_ru.md new file mode 100644 index 000000000000..a2af090fecd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sber_full_tes_ru.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Russian BertForQuestionAnswering Cased model (from ruselkomp) +author: John Snow Labs +name: bert_qa_sber_full_tes +date: 2023-11-15 +tags: [ru, open_source, bert, question_answering, onnx] +task: Question Answering +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sber-full-test` is a Russian model originally trained by `ruselkomp`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sber_full_tes_ru_5.2.0_3.0_1700010222750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sber_full_tes_ru_5.2.0_3.0_1700010222750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sber_full_tes","ru") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Как меня зовут?", "Меня зовут Клара, и я живу в Беркли."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sber_full_tes","ru") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Как меня зовут?", "Меня зовут Клара, и я живу в Беркли.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sber_full_tes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ru| +|Size:|1.6 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ruselkomp/sber-full-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sbert_large_nlu_russian_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sbert_large_nlu_russian_finetuned_squad_en.md new file mode 100644 index 000000000000..a603e14f6d70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sbert_large_nlu_russian_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_sbert_large_nlu_russian_finetuned_squad BertForQuestionAnswering from ruselkomp +author: John Snow Labs +name: bert_qa_sbert_large_nlu_russian_finetuned_squad +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sbert_large_nlu_russian_finetuned_squad` is an English model originally trained by ruselkomp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sbert_large_nlu_russian_finetuned_squad_en_5.2.0_3.0_1700010689496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sbert_large_nlu_russian_finetuned_squad_en_5.2.0_3.0_1700010689496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data` is assumed to be a DataFrame with "question" and "context" columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sbert_large_nlu_russian_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data` is assumed to be a DataFrame with "question" and "context" columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sbert_large_nlu_russian_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sbert_large_nlu_russian_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.6 GB| + +## References + +https://huggingface.co/ruselkomp/sbert_large_nlu_ru-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sci_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sci_squadv2_en.md new file mode 100644 index 000000000000..9ecd673564a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sci_squadv2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from jbrat) +author: John Snow Labs +name: bert_qa_sci_squadv2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert-squadv2` is an English model originally trained by `jbrat`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sci_squadv2_en_5.2.0_3.0_1700011820213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sci_squadv2_en_5.2.0_3.0_1700011820213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sci_squadv2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sci_squadv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sci_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jbrat/scibert-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_coqa_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_coqa_en.md new file mode 100644 index 000000000000..9548415ee3bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_coqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from peggyhuang) +author: John Snow Labs +name: bert_qa_scibert_coqa +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SciBERT-CoQA` is an English model originally trained by `peggyhuang`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_coqa_en_5.2.0_3.0_1700068925491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_coqa_en_5.2.0_3.0_1700068925491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_scibert_coqa","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_scibert_coqa","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.scibert.scibert.").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_scibert_coqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/peggyhuang/SciBERT-CoQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_nli_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_nli_squad_en.md new file mode 100644 index 000000000000..9160bb9c688b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_nli_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from amoux) +author: John Snow Labs +name: bert_qa_scibert_nli_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_nli_squad` is an English model originally trained by `amoux`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_nli_squad_en_5.2.0_3.0_1700011023145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_nli_squad_en_5.2.0_3.0_1700011023145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_scibert_nli_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_scibert_nli_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.scibert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_scibert_nli_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/amoux/scibert_nli_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_en.md new file mode 100644 index 000000000000..08dc69192b8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from LoudlySoft) +author: John Snow Labs +name: bert_qa_scibert_scivocab_uncased_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_scivocab_uncased_squad` is an English model originally trained by `LoudlySoft`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_scivocab_uncased_squad_en_5.2.0_3.0_1700012083245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_scivocab_uncased_squad_en_5.2.0_3.0_1700012083245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_scibert_scivocab_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_scibert_scivocab_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.scibert.uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_scibert_scivocab_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/LoudlySoft/scibert_scivocab_uncased_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_v2_en.md new file mode 100644 index 000000000000..2a2cf6b8338a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_scibert_scivocab_uncased_squad_v2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ktrapeznikov) +author: John Snow Labs +name: bert_qa_scibert_scivocab_uncased_squad_v2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scibert_scivocab_uncased_squad_v2` is an English model originally trained by `ktrapeznikov`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_scivocab_uncased_squad_v2_en_5.2.0_3.0_1700012351870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_scibert_scivocab_uncased_squad_v2_en_5.2.0_3.0_1700012351870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_scibert_scivocab_uncased_squad_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_scibert_scivocab_uncased_squad_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.scibert.uncased_v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_scibert_scivocab_uncased_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|410.0 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ktrapeznikov/scibert_scivocab_uncased_squad_v2 +- https://rajpurkar.github.io/SQuAD-explorer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd1_en.md new file mode 100644 index 000000000000..d068feb6363e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd1 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd1` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd1_en_5.2.0_3.0_1700070839724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd1_en_5.2.0_3.0_1700070839724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd1","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.sd1.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_en.md new file mode 100644 index 000000000000..29591c995c8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd2` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_en_5.2.0_3.0_1700011295304.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_en_5.2.0_3.0_1700011295304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.sd2.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_lr_5e_5_bosnian_32_e_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_lr_5e_5_bosnian_32_e_3_en.md new file mode 100644 index 000000000000..18dd38c31c7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_lr_5e_5_bosnian_32_e_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_sd2_lr_5e_5_bosnian_32_e_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_sd2_lr_5e_5_bosnian_32_e_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sd2_lr_5e_5_bosnian_32_e_3` is an English model originally trained by motiondew. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_lr_5e_5_bosnian_32_e_3_en_5.2.0_3.0_1700072001043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_lr_5e_5_bosnian_32_e_3_en_5.2.0_3.0_1700072001043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd2_lr_5e_5_bosnian_32_e_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sd2_lr_5e_5_bosnian_32_e_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd2_lr_5e_5_bosnian_32_e_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-sd2-lr-5e-5-bs-32-e-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_small_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_small_en.md new file mode 100644 index 000000000000..75e3765e806d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd2_small_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd2_small +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd2-small` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_small_en_5.2.0_3.0_1700011602299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd2_small_en_5.2.0_3.0_1700011602299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd2_small","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd2_small","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.small.sd2_small.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd2_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd2-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_en.md new file mode 100644 index 000000000000..16e6d47144c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd3 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd3` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd3_en_5.2.0_3.0_1700064473704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd3_en_5.2.0_3.0_1700064473704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd3","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.sd3.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_small_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_small_en.md new file mode 100644 index 000000000000..8401bbddd095 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sd3_small_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd3_small +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd3-small` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd3_small_en_5.2.0_3.0_1700073781848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd3_small_en_5.2.0_3.0_1700073781848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd3_small","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd3_small","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.small.sd3_small.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd3_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd3-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebastians_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebastians_finetuned_squad_en.md new file mode 100644 index 000000000000..40c22175bccb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebastians_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from SebastianS) +author: John Snow Labs +name: bert_qa_sebastians_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `SebastianS`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sebastians_finetuned_squad_en_5.2.0_3.0_1700075206741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sebastians_finetuned_squad_en_5.2.0_3.0_1700075206741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sebastians_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sebastians_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned_squad.by_SebastianS").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sebastians_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SebastianS/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad_en.md new file mode 100644 index 000000000000..b603917c2957 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from SebOchs) +author: John Snow Labs +name: bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-squad` is an English model originally trained by `SebOchs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad_en_5.2.0_3.0_1700064675507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad_en_5.2.0_3.0_1700064675507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sebochs_xtremedistil_l6_h256_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SebOchs/xtremedistil-l6-h256-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_impartit_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_impartit_4_en.md new file mode 100644 index 000000000000..e9a182306be7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_impartit_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_set_date_1_impartit_4 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `set_date_1-impartit_4-bert` is an English model originally trained by `motiondew`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_impartit_4_en_5.2.0_3.0_1700076947046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_impartit_4_en_5.2.0_3.0_1700076947046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_1_impartit_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_1_impartit_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.set_date_1_impartit_4.by_motiondew").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_1_impartit_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/set_date_1-impartit_4-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..14ffbe79b96e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700065047343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700065047343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_1_lr_2e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_1-lr-2e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..07240bc7b9a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700011802544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700011802544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_1_lr_3e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_1-lr-3e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..1c238243e939 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700078543680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700078543680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_2-lr-2e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4_en.md new file mode 100644 index 000000000000..b7132d298203 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1700012028417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1700012028417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_2_lr_2e_5_bosnian_32_ep_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_2-lr-2e-5-bs-32-ep-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..97fd39ed127e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700080056206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700080056206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_2_lr_3e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_2-lr-3e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..70ea3212b78a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700012208110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700012208110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_3-lr-2e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4_en.md new file mode 100644 index 000000000000..869826b85f1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4 +date: 2023-11-15 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4` is an English model originally trained by motiondew. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1700081626662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4_en_5.2.0_3.0_1700081626662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_3_lr_2e_5_bosnian_32_ep_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_3-lr-2e-5-bs-32-ep-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shadowtwin41_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shadowtwin41_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..90d99fbba418 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shadowtwin41_finetuned_squad_accelerate_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from ShadowTwin41) +author: John Snow Labs +name: bert_qa_shadowtwin41_finetuned_squad_accelerate +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `ShadowTwin41`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shadowtwin41_finetuned_squad_accelerate_en_5.2.0_3.0_1700083476522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shadowtwin41_finetuned_squad_accelerate_en_5.2.0_3.0_1700083476522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shadowtwin41_finetuned_squad_accelerate","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shadowtwin41_finetuned_squad_accelerate","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shadowtwin41_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ShadowTwin41/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shanny_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shanny_finetuned_squad_en.md new file mode 100644 index 000000000000..2ae57f392f25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shanny_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Shanny) +author: John Snow Labs +name: bert_qa_shanny_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `Shanny`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shanny_finetuned_squad_en_5.2.0_3.0_1700085140357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shanny_finetuned_squad_en_5.2.0_3.0_1700085140357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shanny_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shanny_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shanny_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Shanny/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shash2409_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shash2409_finetuned_squad_en.md new file mode 100644 index 000000000000..88a8f51ad957 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shash2409_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from shash2409) +author: John Snow Labs +name: bert_qa_shash2409_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `shash2409`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shash2409_finetuned_squad_en_5.2.0_3.0_1700067204546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shash2409_finetuned_squad_en_5.2.0_3.0_1700067204546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shash2409_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shash2409_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shash2409_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/shash2409/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shawon100_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shawon100_finetuned_squad_en.md new file mode 100644 index 000000000000..4a833b226d03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shawon100_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from shawon100) +author: John Snow Labs +name: bert_qa_shawon100_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `shawon100`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shawon100_finetuned_squad_en_5.2.0_3.0_1700068925335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shawon100_finetuned_squad_en_5.2.0_3.0_1700068925335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shawon100_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shawon100_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shawon100_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/shawon100/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shed_e_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shed_e_finetuned_squad_en.md new file mode 100644 index 000000000000..f45c8a3127bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_shed_e_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from shed-e) +author: John Snow Labs +name: bert_qa_shed_e_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `shed-e`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shed_e_finetuned_squad_en_5.2.0_3.0_1700012461934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shed_e_finetuned_squad_en_5.2.0_3.0_1700012461934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shed_e_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shed_e_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shed_e_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/shed-e/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sirah_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sirah_finetuned_squad_en.md new file mode 100644 index 000000000000..7b3194c54e07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_sirah_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from SiraH) +author: John Snow Labs +name: bert_qa_sirah_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `SiraH`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sirah_finetuned_squad_en_5.2.0_3.0_1700012684291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sirah_finetuned_squad_en_5.2.0_3.0_1700012684291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sirah_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sirah_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sirah_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SiraH/bert-finetuned-squad +- https://paperswithcode.com/sota?task=Question+Answering&dataset=squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_en.md new file mode 100644 index 000000000000..3af4cc39fd35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from muhtasham) +author: John Snow Labs +name: bert_qa_small_finetuned_cuad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-cuad` is an English model originally trained by `muhtasham`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_en_5.2.0_3.0_1700086918408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_en_5.2.0_3.0_1700086918408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_small_finetuned_cuad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/muhtasham/bert-small-finetuned-cuad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_full_longer_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_full_longer_en.md new file mode 100644 index 000000000000..0dab11594609 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_full_longer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from muhtasham) +author: John Snow Labs +name: bert_qa_small_finetuned_cuad_full_longer +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-cuad-full-longer` is an English model originally trained by `muhtasham`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_full_longer_en_5.2.0_3.0_1700088314621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_full_longer_en_5.2.0_3.0_1700088314621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_full_longer","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_full_longer","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_small_finetuned_cuad_full_longer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/muhtasham/bert-small-finetuned-cuad-full-longer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_longer_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_longer_en.md new file mode 100644 index 000000000000..d7a545834d60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_small_finetuned_cuad_longer_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from muhtasham) +author: John Snow Labs +name: bert_qa_small_finetuned_cuad_longer +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-cuad-longer` is an English model originally trained by `muhtasham`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_longer_en_5.2.0_3.0_1700012855006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_longer_en_5.2.0_3.0_1700012855006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_longer","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_longer","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_small_finetuned_cuad_longer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/muhtasham/bert-small-finetuned-cuad-longer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_span_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_span_finetuned_squadv2_en.md new file mode 100644 index 000000000000..3e62227528d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_span_finetuned_squadv2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from vvincentt) +author: John Snow Labs +name: bert_qa_span_finetuned_squadv2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-finetuned-squadv2` is an English model originally trained by `vvincentt`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_span_finetuned_squadv2_en_5.2.0_3.0_1700013138777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_span_finetuned_squadv2_en_5.2.0_3.0_1700013138777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_span_finetuned_squadv2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_span_finetuned_squadv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_span_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|402.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vvincentt/spanbert-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..e89e15f3cc03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0_en_5.2.0_3.0_1700071314719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0_en_5.2.0_3.0_1700071314719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_0").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|389.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..b8e6af9e370c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10_en_5.2.0_3.0_1700013423047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10_en_5.2.0_3.0_1700013423047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|389.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..44a9f69fbc31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2_en_5.2.0_3.0_1700013676058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2_en_5.2.0_3.0_1700013676058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|389.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..5690a0ff9a4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42_en_5.2.0_3.0_1700090111484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42_en_5.2.0_3.0_1700090111484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_42").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|394.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..3ce5503e9fd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4_en_5.2.0_3.0_1700012629885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4_en_5.2.0_3.0_1700012629885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_4").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|390.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..b4d07993c2a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8_en_5.2.0_3.0_1700072907682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8_en_5.2.0_3.0_1700072907682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_8").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|390.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..c245c387c433 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0_en_5.2.0_3.0_1700012881424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0_en_5.2.0_3.0_1700012881424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_0_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..d0c47898e738 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10_en_5.2.0_3.0_1700061998255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10_en_5.2.0_3.0_1700061998255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_128d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..e17daa069a09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0_en_5.2.0_3.0_1700013150885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0_en_5.2.0_3.0_1700013150885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_0_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..69c4a1eddc98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1700013427254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2_en_5.2.0_3.0_1700013427254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_2_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..7d24bb9bc8a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6_en_5.2.0_3.0_1700062364295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6_en_5.2.0_3.0_1700062364295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_6_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..9447050c59f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2_en_5.2.0_3.0_1700076683284.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2_en_5.2.0_3.0_1700076683284.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_2_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..72eaa932f0c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6_en_5.2.0_3.0_1700078543838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6_en_5.2.0_3.0_1700078543838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_6_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..91c2e223cd54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8_en_5.2.0_3.0_1700062697002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8_en_5.2.0_3.0_1700062697002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_8_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..101322fb7ea9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10_en_5.2.0_3.0_1700062985400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10_en_5.2.0_3.0_1700062985400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_32d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..e43dd10e0eee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4_en_5.2.0_3.0_1700063261261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4_en_5.2.0_3.0_1700063261261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_4_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..20b41039741d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6_en_5.2.0_3.0_1700063566487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6_en_5.2.0_3.0_1700063566487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_32d_seed_6").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..f81ed18b47fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1700063835701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8_en_5.2.0_3.0_1700063835701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_8_base_32d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..2f13ec4e1a1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0_en_5.2.0_3.0_1700013682338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0_en_5.2.0_3.0_1700013682338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_512d_seed_0").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|386.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..969859c4ad93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2_en_5.2.0_3.0_1700082007492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2_en_5.2.0_3.0_1700082007492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_2_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|386.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..d905f075d078 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4_en_5.2.0_3.0_1700062977896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4_en_5.2.0_3.0_1700062977896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_4_base_512d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|386.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..7faf2d485c37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0_en_5.2.0_3.0_1700063265809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0_en_5.2.0_3.0_1700063265809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_64d_seed_0").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|378.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..3ff3cb4f931e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10_en_5.2.0_3.0_1700083820750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10_en_5.2.0_3.0_1700083820750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_64d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|378.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..b313cb6e4974 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2_en_5.2.0_3.0_1700085853659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2_en_5.2.0_3.0_1700085853659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document = new MultiDocumentAssembler() +  .setInputCols("question", "context") +  .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +  .pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2","en") +  .setInputCols(Array("document_question", "document_context")) +  .setOutputCol("answer") +  .setCaseSensitive(true) +  .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +  ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +  ("What's my name?", "My name is Clara and I live in Berkeley.")) +  .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_64d_seed_2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
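The NLU snippet above passes the question and its context to `predict` as one string, with `|||` between the two parts. A minimal sketch of a helper that builds that string follows; `build_qa_input` is a hypothetical name introduced here for illustration (it is not part of NLU), and the whitespace stripping and validation are assumptions rather than documented NLU behavior.

```python
def build_qa_input(question: str, context: str, sep: str = "|||") -> str:
    """Join a question and its context into the single-string form shown in the NLU example.

    Hypothetical helper: the separator default mirrors the example above; the
    validation rules are illustrative assumptions.
    """
    question, context = question.strip(), context.strip()
    if not question or not context:
        raise ValueError("both question and context must be non-empty")
    if sep in question or sep in context:
        raise ValueError(f"inputs must not contain the separator {sep!r}")
    return f"{question}{sep}{context}"


text = build_qa_input("What's my name?", "My name is Clara and I live in Berkeley.")
```

The resulting `text` can then be handed to `.predict(text)` on the model returned by `nlu.load(...)`.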
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|378.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_finetuned_squadv2_en.md new file mode 100644 index 000000000000..190c4f689a2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spanbert_finetuned_squadv2_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_spanbert_finetuned_squadv2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-finetuned-squadv2` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_finetuned_squadv2_en_5.2.0_3.0_1700089654562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_finetuned_squadv2_en_5.2.0_3.0_1700089654562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_finetuned_squadv2","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document = new MultiDocumentAssembler() +  .setInputCols("question", "context") +  .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +  .pretrained("bert_qa_spanbert_finetuned_squadv2","en") +  .setInputCols(Array("document_question", "document_context")) +  .setOutputCol("answer") +  .setCaseSensitive(true) +  .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +  ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +  ("What's my name?", "My name is Clara and I live in Berkeley.")) +  .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.span_bert.v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/spanbert-finetuned-squadv2 +- https://arxiv.org/abs/1907.10529 +- https://twitter.com/mrm8488 +- https://github.com/facebookresearch +- https://github.com/facebookresearch/SpanBERT +- https://github.com/facebookresearch/SpanBERT#pre-trained-models +- https://rajpurkar.github.io/SQuAD-explorer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spasis_finetuned_squad_accelera_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spasis_finetuned_squad_accelera_en.md new file mode 100644 index 000000000000..c1aa7bdf3478 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_spasis_finetuned_squad_accelera_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from spasis) +author: John Snow Labs +name: bert_qa_spasis_finetuned_squad_accelera +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `spasis`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spasis_finetuned_squad_accelera_en_5.2.0_3.0_1700064116229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spasis_finetuned_squad_accelera_en_5.2.0_3.0_1700064116229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +documentAssembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spasis_finetuned_squad_accelera","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +// Input columns must match the assembler's output columns. +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spasis_finetuned_squad_accelera","en") +    .setInputCols(Array("document_question", "document_context")) +    .setOutputCol("answer") +    .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +// One (question, context) tuple per row, so that toDF yields two columns. +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned_accelera.by_spasis").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spasis_finetuned_squad_accelera| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/spasis/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad1.1_1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad1.1_1_en.md new file mode 100644 index 000000000000..c0b4e0e815b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad1.1_1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from maroo93) +author: John Snow Labs +name: bert_qa_squad1.1_1 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad1.1_1` is an English model originally trained by `maroo93`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad1.1_1_en_5.2.0_3.0_1700063541554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad1.1_1_en_5.2.0_3.0_1700063541554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad1.1_1","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document = new MultiDocumentAssembler() +  .setInputCols("question", "context") +  .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +  .pretrained("bert_qa_squad1.1_1","en") +  .setInputCols(Array("document_question", "document_context")) +  .setOutputCol("answer") +  .setCaseSensitive(true) +  .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +  ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +  ("What's my name?", "My name is Clara and I live in Berkeley.")) +  .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.v1.1.by_maroo93").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad1.1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|406.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/maroo93/squad1.1_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad2.0_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad2.0_en.md new file mode 100644 index 000000000000..77707368717c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad2.0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from maroo93) +author: John Snow Labs +name: bert_qa_squad2.0 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad2.0` is an English model originally trained by `maroo93`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad2.0_en_5.2.0_3.0_1700063832186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad2.0_en_5.2.0_3.0_1700063832186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad2.0","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document = new MultiDocumentAssembler() +  .setInputCols("question", "context") +  .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +  .pretrained("bert_qa_squad2.0","en") +  .setInputCols(Array("document_question", "document_context")) +  .setOutputCol("answer") +  .setCaseSensitive(true) +  .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +  ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +  ("What's my name?", "My name is Clara and I live in Berkeley.")) +  .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad2.0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/maroo93/squad2.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_malay_bert_base_ms.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_malay_bert_base_ms.md new file mode 100644 index 000000000000..28df58e31692 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_malay_bert_base_ms.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Malay (macrolanguage) bert_qa_squad_malay_bert_base BertForQuestionAnswering from zhufy +author: John Snow Labs +name: bert_qa_squad_malay_bert_base +date: 2023-11-15 +tags: [bert, ms, open_source, question_answering, onnx] +task: Question Answering +language: ms +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_squad_malay_bert_base` is a Malay (macrolanguage) model originally trained by `zhufy`. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_malay_bert_base_ms_5.2.0_3.0_1700064386332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_malay_bert_base_ms_5.2.0_3.0_1700064386332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# MultiDocumentAssembler takes several columns, hence the plural setters. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_malay_bert_base","ms") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +# `data` is assumed to be a DataFrame with "question" and "context" string columns. +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// MultiDocumentAssembler takes several columns, hence the plural setters. +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("bert_qa_squad_malay_bert_base", "ms") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +// `data` is assumed to be a DataFrame with "question" and "context" string columns. +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_malay_bert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ms| +|Size:|412.1 MB| + +## References + +https://huggingface.co/zhufy/squad-ms-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_mbert_model_2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_mbert_model_2_en.md new file mode 100644 index 000000000000..0f65f28da5bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_mbert_model_2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ZYW) +author: John Snow Labs +name: bert_qa_squad_mbert_model_2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad-mbert-model_2` is an English model originally trained by `ZYW`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_mbert_model_2_en_5.2.0_3.0_1700064192848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_mbert_model_2_en_5.2.0_3.0_1700064192848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_mbert_model_2","en") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +    document_assembler, +    spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val document = new MultiDocumentAssembler() +  .setInputCols("question", "context") +  .setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +  .pretrained("bert_qa_squad_mbert_model_2","en") +  .setInputCols(Array("document_question", "document_context")) +  .setOutputCol("answer") +  .setCaseSensitive(true) +  .setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +  ("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +  ("What's my name?", "My name is Clara and I live in Berkeley.")) +  .toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.multi_lingual_bert.v2.by_ZYW").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_mbert_model_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|665.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ZYW/squad-mbert-model_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_with_greetings_v2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_with_greetings_v2_en.md new file mode 100644 index 000000000000..81bba63735d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_with_greetings_v2_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from moquanyi) +author: John Snow Labs +name: bert_qa_squad_with_greetings_v2 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_with_greetings-v2` is an English model originally trained by `moquanyi`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_with_greetings_v2_en_5.2.0_3.0_1700064472186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_with_greetings_v2_en_5.2.0_3.0_1700064472186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +Document_Assembler = MultiDocumentAssembler()\ +     .setInputCols(["question", "context"])\ +     .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_squad_with_greetings_v2","en")\ +     .setInputCols(["document_question", "document_context"])\ +     .setOutputCol("answer")\ +     .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val Document_Assembler = new MultiDocumentAssembler() +     .setInputCols(Array("question", "context")) +     .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_squad_with_greetings_v2","en") +     .setInputCols(Array("document_question", "document_context")) +     .setOutputCol("answer") +     .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row, so that toDF yields two columns. +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_with_greetings_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/moquanyi/squad_with_greetings-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_xxl_cased_hub1_it.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_xxl_cased_hub1_it.md new file mode 100644 index 000000000000..15698bbf859b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_squad_xxl_cased_hub1_it.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Italian BertForQuestionAnswering model (from luigisaetta) +author: John Snow Labs +name: bert_qa_squad_xxl_cased_hub1 +date: 2023-11-15 +tags: [it, open_source, bert, question_answering, onnx] +task: Question Answering +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_it_xxl_cased_hub1` is an Italian model originally trained by `luigisaetta`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_xxl_cased_hub1_it_5.2.0_3.0_1700064714351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_xxl_cased_hub1_it_5.2.0_3.0_1700064714351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +documentAssembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_xxl_cased_hub1","it") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Qual è il mio nome?", "Mi chiamo Clara e vivo a Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +// Input columns must match the assembler's output columns. +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_xxl_cased_hub1","it") +    .setInputCols(Array("document_question", "document_context")) +    .setOutputCol("answer") +    .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +// One (question, context) tuple per row, so that toDF yields two columns. +val data = Seq(("Qual è il mio nome?", "Mi chiamo Clara e vivo a Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("it.answer_question.squad.bert.xxl_cased").predict("""Qual è il mio nome?|||Mi chiamo Clara e vivo a Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_xxl_cased_hub1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|it| +|Size:|412.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/luigisaetta/squad_it_xxl_cased_hub1 +- https://github.com/luigisaetta/nlp-qa-italian/blob/main/train_squad_it_final1.ipynb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srcoc_es.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srcoc_es.md new file mode 100644 index 000000000000..ceddc0e1054f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srcoc_es.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Spanish BertForQuestionAnswering Cased model (from srcocotero) +author: John Snow Labs +name: bert_qa_srcoc +date: 2023-11-15 +tags: [es, open_source, bert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-qa-es` is a Spanish model originally trained by `srcocotero`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_srcoc_es_5.2.0_3.0_1700066156853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_srcoc_es_5.2.0_3.0_1700066156853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active Spark NLP session is available as `spark`. +Document_Assembler = MultiDocumentAssembler()\ +     .setInputCols(["question", "context"])\ +     .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_srcoc","es")\ +     .setInputCols(["document_question", "document_context"])\ +     .setOutputCol("answer")\ +     .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val Document_Assembler = new MultiDocumentAssembler() +     .setInputCols(Array("question", "context")) +     .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_srcoc","es") +     .setInputCols(Array("document_question", "document_context")) +     .setOutputCol("answer") +     .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +// One (question, context) tuple per row, so that toDF yields two columns. +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_srcoc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|409.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srcocotero/bert-qa-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srmukundb_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srmukundb_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..d229291c7202 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_srmukundb_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from srmukundb) +author: John Snow Labs +name: bert_qa_srmukundb_bert_base_uncased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `srmukundb`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_srmukundb_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700064675052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_srmukundb_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700064675052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_srmukundb_bert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_srmukundb_bert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_srmukundb").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_srmukundb_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srmukundb/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ss756_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ss756_base_cased_finetuned_squad_en.md new file mode 100644 index 000000000000..151d524175df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_ss756_base_cased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from ss756) +author: John Snow Labs +name: bert_qa_ss756_base_cased_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `ss756`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ss756_base_cased_finetuned_squad_en_5.2.0_3.0_1700065720292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ss756_base_cased_finetuned_squad_en_5.2.0_3.0_1700065720292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ss756_base_cased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ss756_base_cased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ss756_base_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ss756/bert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_susghosh_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_susghosh_finetuned_squad_en.md new file mode 100644 index 000000000000..aabf15320c04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_susghosh_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from susghosh) +author: John Snow Labs +name: bert_qa_susghosh_finetuned_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `susghosh`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_susghosh_finetuned_squad_en_5.2.0_3.0_1700068229977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_susghosh_finetuned_squad_en_5.2.0_3.0_1700068229977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_susghosh_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_susghosh_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_susghosh").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_susghosh_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/susghosh/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_telugu_bertu_tydiqa_xx.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_telugu_bertu_tydiqa_xx.md new file mode 100644 index 000000000000..3cda2c558217 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_telugu_bertu_tydiqa_xx.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Multilingual BertForQuestionAnswering model (from kuppuluri) +author: John Snow Labs +name: bert_qa_telugu_bertu_tydiqa +date: 2023-11-15 +tags: [te, en, open_source, question_answering, bert, xx, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `telugu_bertu_tydiqa` is a Multilingual model originally trained by `kuppuluri`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_telugu_bertu_tydiqa_xx_5.2.0_3.0_1700067667402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_telugu_bertu_tydiqa_xx_5.2.0_3.0_1700067667402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_telugu_bertu_tydiqa","xx") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_telugu_bertu_tydiqa","xx") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.answer_question.tydiqa.bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_telugu_bertu_tydiqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|412.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kuppuluri/telugu_bertu_tydiqa +- https://github.com/google-research-datasets/tydiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_test01_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_test01_en.md new file mode 100644 index 000000000000..4e9aed238da1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_test01_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Akert) +author: John Snow Labs +name: bert_qa_test01 +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test01` is an English model originally trained by `Akert`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_test01_en_5.2.0_3.0_1700071870324.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_test01_en_5.2.0_3.0_1700071870324.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_test01","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_test01","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_test01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Akert/test01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tiny_wrslb_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tiny_wrslb_finetuned_squadv1_en.md new file mode 100644 index 000000000000..eb2d623050fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tiny_wrslb_finetuned_squadv1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny model (from mrm8488) +author: John Snow Labs +name: bert_qa_tiny_wrslb_finetuned_squadv1 +date: 2023-11-15 +tags: [open_source, bert, question_answering, tiny, en, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BERT Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-wrslb-finetuned-squadv1` is an English model originally trained by `mrm8488`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_wrslb_finetuned_squadv1_en_5.2.0_3.0_1700072728247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_wrslb_finetuned_squadv1_en_5.2.0_3.0_1700072728247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tiny_wrslb_finetuned_squadv1","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR 'QUESTION' STRING HERE?", "PUT YOUR 'CONTEXT' STRING HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tiny_wrslb_finetuned_squadv1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR 'QUESTION' STRING HERE?", "PUT YOUR 'CONTEXT' STRING HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.tiny_finetuned").predict("""PUT YOUR 'QUESTION' STRING HERE?|||PUT YOUR 'CONTEXT' STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tiny_wrslb_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/bert-tiny-wrslb-finetuned-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_en.md new file mode 100644 index 000000000000..b6b5e5df90a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_en.md @@ -0,0 +1,119 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from deepset) +author: John Snow Labs +name: bert_qa_tinybert_6l_768d_squad2 +date: 2023-11-15 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-6l-768d-squad2` is an English model originally trained by `deepset`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_en_5.2.0_3.0_1700069667496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_en_5.2.0_3.0_1700069667496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_tinybert_6l_768d_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.tiny_768d").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tinybert_6l_768d_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|248.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/tinybert-6l-768d-squad2 +- https://github.com/deepset-ai/haystack/discussions +- https://deepset.ai +- https://twitter.com/deepset_ai +- http://www.deepset.ai/jobs +- https://haystack.deepset.ai/community/join +- https://github.com/deepset-ai/haystack/ +- https://deepset.ai/german-bert +- https://www.linkedin.com/company/deepset-ai/ +- https://arxiv.org/pdf/1909.10351.pdf +- https://github.com/deepset-ai/FARM +- https://deepset.ai/germanquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_large_teach_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_large_teach_en.md new file mode 100644 index 000000000000..41db7b2d0726 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_6l_768d_squad2_large_teach_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from MichelBartels) +author: John Snow Labs +name: bert_qa_tinybert_6l_768d_squad2_large_teach +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-6l-768d-squad2-large-teacher` is an English model originally trained by `MichelBartels`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teach_en_5.2.0_3.0_1700071106122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teach_en_5.2.0_3.0_1700071106122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teach","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teach","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.large_tiny_768d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tinybert_6l_768d_squad2_large_teach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|249.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MichelBartels/tinybert-6l-768d-squad2-large-teacher \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_general_4l_312d_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_general_4l_312d_squad_en.md new file mode 100644 index 000000000000..67b083d08a32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_tinybert_general_4l_312d_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from haritzpuerto) +author: John Snow Labs +name: bert_qa_tinybert_general_4l_312d_squad +date: 2023-11-15 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `TinyBERT_General_4L_312D-squad` is an English model originally trained by `haritzpuerto`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_general_4l_312d_squad_en_5.2.0_3.0_1700076181489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_general_4l_312d_squad_en_5.2.0_3.0_1700076181489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_general_4l_312d_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_general_4l_312d_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.tiny").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_tinybert_general_4l_312d_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|53.8 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/haritzpuerto/TinyBERT_General_4L_312D-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_en.md
new file mode 100644
index 000000000000..630075e8dc7d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Uncased model (from aodiniz)
+author: John Snow Labs
+name: bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2
+date: 2023-11-15
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-6_H-128_A-2_cord19-200616_squad2` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_en_5.2.0_3.0_1700072728267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_en_5.2.0_3.0_1700072728267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squadv2_cord19.uncased_6l_128d_a2a_128d").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|19.6 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/aodiniz/bert_uncased_L-6_H-128_A-2_cord19-200616_squad2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_vedants01_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_vedants01_finetuned_squad_en.md
new file mode 100644
index 000000000000..261a98ac738f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_vedants01_finetuned_squad_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English BertForQuestionAnswering Cased model (from VedantS01)
+author: John Snow Labs
+name: bert_qa_vedants01_finetuned_squad
+date: 2023-11-15
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `VedantS01`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_vedants01_finetuned_squad_en_5.2.0_3.0_1700076702685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_vedants01_finetuned_squad_en_5.2.0_3.0_1700076702685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_vedants01_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_vedants01_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_vedants01_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.7 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/VedantS01/bert-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_victorlee071200_base_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_victorlee071200_base_cased_finetuned_squad_en.md
new file mode 100644
index 000000000000..58f63c171710
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_victorlee071200_base_cased_finetuned_squad_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English BertForQuestionAnswering Base Cased model (from victorlee071200)
+author: John Snow Labs
+name: bert_qa_victorlee071200_base_cased_finetuned_squad
+date: 2023-11-15
+tags: [en, open_source, bert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-cased-finetuned-squad` is an English model originally trained by `victorlee071200`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_victorlee071200_base_cased_finetuned_squad_en_5.2.0_3.0_1700078403143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_victorlee071200_base_cased_finetuned_squad_en_5.2.0_3.0_1700078403143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_victorlee071200_base_cased_finetuned_squad","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_victorlee071200_base_cased_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.bert.squad.cased_base_finetuned.by_victorlee071200").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_victorlee071200_base_cased_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/victorlee071200/bert-base-cased-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xdistil_l12_h384_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xdistil_l12_h384_squad2_en.md
new file mode 100644
index 000000000000..dfb92631d346
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xdistil_l12_h384_squad2_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from nbroad)
+author: John Snow Labs
+name: bert_qa_xdistil_l12_h384_squad2
+date: 2023-11-15
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xdistil-l12-h384-squad2` is an English model originally trained by `nbroad`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xdistil_l12_h384_squad2_en_5.2.0_3.0_1700082106520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xdistil_l12_h384_squad2_en_5.2.0_3.0_1700082106520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xdistil_l12_h384_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_xdistil_l12_h384_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.distilled").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_xdistil_l12_h384_squad2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[sentence, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|123.8 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/nbroad/xdistil-l12-h384-squad2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3_en.md
new file mode 100644
index 000000000000..1101a45a3fd5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English BertForQuestionAnswering model (from husnu)
+author: John Snow Labs
+name: bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3
+date: 2023-11-15
+tags: [en, open_source, question_answering, bert, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-finetuned_lr-2e-05_epochs-3` is an English model originally trained by `husnu`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3_en_5.2.0_3.0_1700083536001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3_en_5.2.0_3.0_1700083536001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.xtremedistiled_uncased_lr_2e_05_epochs_3.by_husnu").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[sentence, token]|
+|Output Labels:|[embeddings]|
+|Language:|en|
+|Size:|47.4 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/husnu/xtremedistil-l6-h256-uncased-finetuned_lr-2e-05_epochs-3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-electra_qa_araelectra_discriminator_soqal_ar.md b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_araelectra_discriminator_soqal_ar.md
new file mode 100644
index 000000000000..fd6664d63b5b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_araelectra_discriminator_soqal_ar.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: Arabic electra_qa_araelectra_discriminator_soqal BertForQuestionAnswering from Damith
+author: John Snow Labs
+name: electra_qa_araelectra_discriminator_soqal
+date: 2023-11-15
+tags: [bert, ar, open_source, question_answering, onnx]
+task: Question Answering
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_araelectra_discriminator_soqal` is an Arabic model originally trained by Damith.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_discriminator_soqal_ar_5.2.0_3.0_1700089728517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_discriminator_soqal_ar_5.2.0_3.0_1700089728517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_araelectra_discriminator_soqal","ar") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("electra_qa_araelectra_discriminator_soqal", "ar")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|electra_qa_araelectra_discriminator_soqal|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|ar|
+|Size:|504.3 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/Damith/AraELECTRA-discriminator-SOQAL
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-electra_qa_base_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_base_chaii_en.md
new file mode 100644
index 000000000000..976afd85711b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_base_chaii_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English ElectraForQuestionAnswering model (from SauravMaheshkar)
+author: John Snow Labs
+name: electra_qa_base_chaii
+date: 2023-11-15
+tags: [en, open_source, electra, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-chaii` is an English model originally trained by `SauravMaheshkar`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_chaii_en_5.2.0_3.0_1700090287058.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_chaii_en_5.2.0_3.0_1700090287058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+.setInputCols(["question", "context"]) \
+.setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_chaii","en") \
+.setInputCols(["document_question", "document_context"]) \
+.setOutputCol("answer")\
+.setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+.setInputCols(Array("question", "context"))
+.setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_chaii","en")
+.setInputCols(Array("document_question", "document_context"))
+.setOutputCol("answer")
+.setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.chaii.electra.base").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|electra_qa_base_chaii|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|408.0 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/SauravMaheshkar/electra-base-chaii
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-15-electra_qa_biom_base_squad2_bioasq8b_en.md b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_biom_base_squad2_bioasq8b_en.md
new file mode 100644
index 000000000000..c828219745a3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-15-electra_qa_biom_base_squad2_bioasq8b_en.md
@@ -0,0 +1,95 @@
+---
+layout: model
+title: English electra_qa_biom_base_squad2_bioasq8b BertForQuestionAnswering from sultan
+author: John Snow Labs
+name: electra_qa_biom_base_squad2_bioasq8b
+date: 2023-11-15
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_biom_base_squad2_bioasq8b` is an English model originally trained by sultan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_biom_base_squad2_bioasq8b_en_5.2.0_3.0_1700082106995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_biom_base_squad2_bioasq8b_en_5.2.0_3.0_1700082106995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_biom_base_squad2_bioasq8b","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("electra_qa_biom_base_squad2_bioasq8b", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|electra_qa_biom_base_squad2_bioasq8b|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.2 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+https://huggingface.co/sultan/BioM-ELECTRA-Base-SQuAD2-BioASQ8B
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-admisi_indobert_qna_v2_id.md b/docs/_posts/ahmedlone127/2023-11-16-admisi_indobert_qna_v2_id.md
new file mode 100644
index 000000000000..8b289b33b124
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-admisi_indobert_qna_v2_id.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Indonesian admisi_indobert_qna_v2 BertForQuestionAnswering from emny
+author: John Snow Labs
+name: admisi_indobert_qna_v2
+date: 2023-11-16
+tags: [bert, id, open_source, question_answering, onnx]
+task: Question Answering
+language: id
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `admisi_indobert_qna_v2` is an Indonesian model originally trained by emny.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/admisi_indobert_qna_v2_id_5.2.0_3.0_1700163084339.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/admisi_indobert_qna_v2_id_5.2.0_3.0_1700163084339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("admisi_indobert_qna_v2","id") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("admisi_indobert_qna_v2", "id")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|admisi_indobert_qna_v2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|id|
+|Size:|411.7 MB|
+
+## References
+
+https://huggingface.co/emny/admisi-indobert-qna-v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-batterybert_cased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-batterybert_cased_finetuned_squad_en.md
new file mode 100644
index 000000000000..9c1765ca393a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-batterybert_cased_finetuned_squad_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English batterybert_cased_finetuned_squad BertForQuestionAnswering from HongyangLi
+author: John Snow Labs
+name: batterybert_cased_finetuned_squad
+date: 2023-11-16
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `batterybert_cased_finetuned_squad` is an English model originally trained by HongyangLi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/batterybert_cased_finetuned_squad_en_5.2.0_3.0_1700114589035.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/batterybert_cased_finetuned_squad_en_5.2.0_3.0_1700114589035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("batterybert_cased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("batterybert_cased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|batterybert_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/HongyangLi/batterybert-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_arabertv2_finetuned_arcd_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_arabertv2_finetuned_arcd_squad_en.md new file mode 100644 index 000000000000..1f3e2b79aba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_arabertv2_finetuned_arcd_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_arabertv2_finetuned_arcd_squad BertForQuestionAnswering from amnahhebrahim +author: John Snow Labs +name: bert_base_arabertv2_finetuned_arcd_squad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_arabertv2_finetuned_arcd_squad` is an English model originally trained by amnahhebrahim. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_arabertv2_finetuned_arcd_squad_en_5.2.0_3.0_1700106077088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_arabertv2_finetuned_arcd_squad_en_5.2.0_3.0_1700106077088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_arabertv2_finetuned_arcd_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_arabertv2_finetuned_arcd_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_arabertv2_finetuned_arcd_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|504.8 MB| + +## References + +https://huggingface.co/amnahhebrahim/bert-base-arabertv2-finetuned-arcd-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_chinese_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_chinese_finetuned_squad_en.md new file mode 100644 index 000000000000..b73ea0b151e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_chinese_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_squad BertForQuestionAnswering from sharkMeow +author: John Snow Labs +name: bert_base_chinese_finetuned_squad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_squad` is an English model originally trained by sharkMeow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_squad_en_5.2.0_3.0_1700107426041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_squad_en_5.2.0_3.0_1700107426041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_chinese_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_chinese_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/sharkMeow/bert-base-chinese-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_finetuned_klue_mrc_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_finetuned_klue_mrc_en.md new file mode 100644 index 000000000000..6947877cec34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_finetuned_klue_mrc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_finetuned_klue_mrc BertForQuestionAnswering from Forturne +author: John Snow Labs +name: bert_base_finetuned_klue_mrc +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finetuned_klue_mrc` is an English model originally trained by Forturne. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_klue_mrc_en_5.2.0_3.0_1700116517830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finetuned_klue_mrc_en_5.2.0_3.0_1700116517830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_finetuned_klue_mrc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_finetuned_klue_mrc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finetuned_klue_mrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.4 MB| + +## References + +https://huggingface.co/Forturne/bert-base-finetuned-klue-mrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_french_europeana_cased_squad_french_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_french_europeana_cased_squad_french_en.md new file mode 100644 index 000000000000..262c1b379ab8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_french_europeana_cased_squad_french_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_french_europeana_cased_squad_french BertForQuestionAnswering from Nadav +author: John Snow Labs +name: bert_base_french_europeana_cased_squad_french +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_french_europeana_cased_squad_french` is an English model originally trained by Nadav. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_french_europeana_cased_squad_french_en_5.2.0_3.0_1700119713022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_french_europeana_cased_squad_french_en_5.2.0_3.0_1700119713022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_french_europeana_cased_squad_french","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_french_europeana_cased_squad_french", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_french_europeana_cased_squad_french| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.3 MB| + +## References + +https://huggingface.co/Nadav/bert-base-french-europeana-cased-squad-fr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_cased_finetuned_squad_jensh_xx.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_cased_finetuned_squad_jensh_xx.md new file mode 100644 index 000000000000..b4f01ccba84b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_cased_finetuned_squad_jensh_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_squad_jensh BertForQuestionAnswering from JensH +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_squad_jensh +date: 2023-11-16 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_squad_jensh` is a Multilingual model originally trained by JensH. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_squad_jensh_xx_5.2.0_3.0_1700154777207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_squad_jensh_xx_5.2.0_3.0_1700154777207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_multilingual_cased_finetuned_squad_jensh","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_multilingual_cased_finetuned_squad_jensh", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_squad_jensh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/JensH/bert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_uncased_finetuned_squadv2_xx.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_uncased_finetuned_squadv2_xx.md new file mode 100644 index 000000000000..330c2786f812 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_multilingual_uncased_finetuned_squadv2_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_finetuned_squadv2 BertForQuestionAnswering from monakth +author: John Snow Labs +name: bert_base_multilingual_uncased_finetuned_squadv2 +date: 2023-11-16 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_finetuned_squadv2` is a Multilingual model originally trained by monakth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1700153197666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_finetuned_squadv2_xx_5.2.0_3.0_1700153197666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_multilingual_uncased_finetuned_squadv2","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_multilingual_uncased_finetuned_squadv2", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/monakth/bert-base-multilingual-uncased-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_spanish_wwm_cased_finetuned_qa_mlqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_spanish_wwm_cased_finetuned_qa_mlqa_en.md new file mode 100644 index 000000000000..08d38904f60b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_spanish_wwm_cased_finetuned_qa_mlqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_qa_mlqa BertForQuestionAnswering from dccuchile +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_qa_mlqa +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_qa_mlqa` is an English model originally trained by dccuchile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_qa_mlqa_en_5.2.0_3.0_1700171245676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_qa_mlqa_en_5.2.0_3.0_1700171245676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_spanish_wwm_cased_finetuned_qa_mlqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_spanish_wwm_cased_finetuned_qa_mlqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_nq_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_nq_finetuned_squad_en.md new file mode 100644 index 000000000000..068730d71fe7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_nq_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_nq_finetuned_squad BertForQuestionAnswering from leonardoschluter +author: John Snow Labs +name: bert_base_uncased_finetuned_nq_finetuned_squad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_nq_finetuned_squad` is an English model originally trained by leonardoschluter. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_nq_finetuned_squad_en_5.2.0_3.0_1700121045315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_nq_finetuned_squad_en_5.2.0_3.0_1700121045315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_nq_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_nq_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_nq_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/leonardoschluter/bert-base-uncased-finetuned-nq-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad2_iproject_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad2_iproject_10_en.md new file mode 100644 index 000000000000..03c4eebedd62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad2_iproject_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad2_iproject_10 BertForQuestionAnswering from IProject-10 +author: John Snow Labs +name: bert_base_uncased_finetuned_squad2_iproject_10 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad2_iproject_10` is an English model originally trained by IProject-10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad2_iproject_10_en_5.2.0_3.0_1700178697409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad2_iproject_10_en_5.2.0_3.0_1700178697409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad2_iproject_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad2_iproject_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad2_iproject_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/IProject-10/bert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad_aditya4521_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad_aditya4521_en.md new file mode 100644 index 000000000000..cb89c8875cd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_base_uncased_finetuned_squad_aditya4521_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_aditya4521 BertForQuestionAnswering from Aditya4521 +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_aditya4521 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_aditya4521` is an English model originally trained by Aditya4521. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_aditya4521_en_5.2.0_3.0_1700163084328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_aditya4521_en_5.2.0_3.0_1700163084328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_aditya4521","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_aditya4521", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_aditya4521| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Aditya4521/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_covid_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_covid_10_en.md new file mode 100644 index 000000000000..c0a77af1e025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_covid_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_covid_10 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_covid_10 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_covid_10` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_covid_10_en_5.2.0_3.0_1700179027577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_covid_10_en_5.2.0_3.0_1700179027577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_covid_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_covid_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_covid_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/hung200504/bert-covid-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_accelerate_quangb1910128_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_accelerate_quangb1910128_en.md new file mode 100644 index 000000000000..9e215166d37f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_accelerate_quangb1910128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_accelerate_quangb1910128 BertForQuestionAnswering from quangb1910128 +author: John Snow Labs +name: bert_finetuned_squad_accelerate_quangb1910128 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_accelerate_quangb1910128` is an English model originally trained by quangb1910128.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_accelerate_quangb1910128_en_5.2.0_3.0_1700170476755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_accelerate_quangb1910128_en_5.2.0_3.0_1700170476755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_accelerate_quangb1910128","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_accelerate_quangb1910128", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_accelerate_quangb1910128| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/quangb1910128/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_alaa1234_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_alaa1234_en.md new file mode 100644 index 000000000000..4af38d566273 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_alaa1234_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_alaa1234 BertForQuestionAnswering from alaa1234 +author: John Snow Labs +name: bert_finetuned_squad_alaa1234 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_alaa1234` is an English model originally trained by alaa1234. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alaa1234_en_5.2.0_3.0_1700175900118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alaa1234_en_5.2.0_3.0_1700175900118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_alaa1234","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_alaa1234", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_alaa1234| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/alaa1234/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_avecoder_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_avecoder_en.md new file mode 100644 index 000000000000..0ba03bb271f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_avecoder_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_avecoder BertForQuestionAnswering from avecoder +author: John Snow Labs +name: bert_finetuned_squad_avecoder +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_avecoder` is an English model originally trained by avecoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_avecoder_en_5.2.0_3.0_1700123346771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_avecoder_en_5.2.0_3.0_1700123346771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_avecoder","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_avecoder", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_avecoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/avecoder/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_quangb1910128_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_quangb1910128_en.md new file mode 100644 index 000000000000..f938a98677ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_quangb1910128_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_quangb1910128 BertForQuestionAnswering from quangb1910128 +author: John Snow Labs +name: bert_finetuned_squad_quangb1910128 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_quangb1910128` is an English model originally trained by quangb1910128. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_quangb1910128_en_5.2.0_3.0_1700109477106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_quangb1910128_en_5.2.0_3.0_1700109477106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_quangb1910128","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_quangb1910128", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_quangb1910128| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/quangb1910128/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_salmonai123_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_salmonai123_en.md new file mode 100644 index 000000000000..2fc81554205d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_salmonai123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_salmonai123 BertForQuestionAnswering from SalmonAI123 +author: John Snow Labs +name: bert_finetuned_squad_salmonai123 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_salmonai123` is an English model originally trained by SalmonAI123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_salmonai123_en_5.2.0_3.0_1700161797677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_salmonai123_en_5.2.0_3.0_1700161797677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_salmonai123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_salmonai123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_salmonai123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/SalmonAI123/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shafa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shafa_en.md new file mode 100644 index 000000000000..21201b1e57a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shafa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_shafa BertForQuestionAnswering from shafa +author: John Snow Labs +name: bert_finetuned_squad_shafa +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_shafa` is an English model originally trained by shafa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shafa_en_5.2.0_3.0_1700176862417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shafa_en_5.2.0_3.0_1700176862417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_shafa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_shafa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_shafa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/shafa/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shynbui_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shynbui_en.md new file mode 100644 index 000000000000..d16b04b16919 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_shynbui_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_shynbui BertForQuestionAnswering from ShynBui +author: John Snow Labs +name: bert_finetuned_squad_shynbui +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_shynbui` is an English model originally trained by ShynBui. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shynbui_en_5.2.0_3.0_1700111481730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shynbui_en_5.2.0_3.0_1700111481730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_shynbui","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_shynbui", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_shynbui| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ShynBui/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_technicalmorujiii_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_technicalmorujiii_en.md new file mode 100644 index 000000000000..b628a0e32536 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_technicalmorujiii_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_technicalmorujiii BertForQuestionAnswering from TechnicalMoruJiii +author: John Snow Labs +name: bert_finetuned_squad_technicalmorujiii +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_technicalmorujiii` is an English model originally trained by TechnicalMoruJiii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_technicalmorujiii_en_5.2.0_3.0_1700179171257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_technicalmorujiii_en_5.2.0_3.0_1700179171257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_technicalmorujiii","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_technicalmorujiii", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_technicalmorujiii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/TechnicalMoruJiii/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_v1_francesco_a_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_v1_francesco_a_en.md new file mode 100644 index 000000000000..7de4975b8b66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_finetuned_squad_v1_francesco_a_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_v1_francesco_a BertForQuestionAnswering from Francesco-A +author: John Snow Labs +name: bert_finetuned_squad_v1_francesco_a +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_v1_francesco_a` is an English model originally trained by Francesco-A. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_v1_francesco_a_en_5.2.0_3.0_1700173829042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_v1_francesco_a_en_5.2.0_3.0_1700173829042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_v1_francesco_a","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_v1_francesco_a", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_v1_francesco_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Francesco-A/bert-finetuned-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_italian_cased_question_answering_it.md b/docs/_posts/ahmedlone127/2023-11-16-bert_italian_cased_question_answering_it.md new file mode 100644 index 000000000000..41214d4bf8fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_italian_cased_question_answering_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian bert_italian_cased_question_answering BertForQuestionAnswering from osiria +author: John Snow Labs +name: bert_italian_cased_question_answering +date: 2023-11-16 +tags: [bert, it, open_source, question_answering, onnx] +task: Question Answering +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_italian_cased_question_answering` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_italian_cased_question_answering_it_5.2.0_3.0_1700160138104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_italian_cased_question_answering_it_5.2.0_3.0_1700160138104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_italian_cased_question_answering","it") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_italian_cased_question_answering", "it") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_italian_cased_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|it| +|Size:|409.0 MB| + +## References + +https://huggingface.co/osiria/bert-italian-cased-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_large_mpdocvqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_large_mpdocvqa_en.md new file mode 100644 index 000000000000..6e89062dad12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_large_mpdocvqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_mpdocvqa BertForQuestionAnswering from rubentito +author: John Snow Labs +name: bert_large_mpdocvqa +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_mpdocvqa` is an English model originally trained by rubentito. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_mpdocvqa_en_5.2.0_3.0_1700154120764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_mpdocvqa_en_5.2.0_3.0_1700154120764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_large_mpdocvqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_large_mpdocvqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_mpdocvqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/rubentito/bert-large-mpdocvqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_large_mrqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_large_mrqa_en.md new file mode 100644 index 000000000000..eae0ea3a0546 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_large_mrqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_mrqa BertForQuestionAnswering from VMware +author: John Snow Labs +name: bert_large_mrqa +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_mrqa` is an English model originally trained by VMware. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_mrqa_en_5.2.0_3.0_1700161818217.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_mrqa_en_5.2.0_3.0_1700161818217.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_large_mrqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_large_mrqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_mrqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/VMware/bert-large-mrqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_roberta_base_chinese_extractive_qa_scratch_zh.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_roberta_base_chinese_extractive_qa_scratch_zh.md new file mode 100644 index 000000000000..315d3b5a9584 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_roberta_base_chinese_extractive_qa_scratch_zh.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering model (from jackh1995) +author: John Snow Labs +name: bert_qa_roberta_base_chinese_extractive_qa_scratch +date: 2023-11-16 +tags: [zh, open_source, question_answering, bert, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-chinese-extractive-qa-scratch` is a Chinese model originally trained by `jackh1995`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_qa_scratch_zh_5.2.0_3.0_1700140746123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_roberta_base_chinese_extractive_qa_scratch_zh_5.2.0_3.0_1700140746123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_roberta_base_chinese_extractive_qa_scratch","zh") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_roberta_base_chinese_extractive_qa_scratch","zh") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lenon born?", "John Lenon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("zh.answer_question.bert.base.by_jackh1995").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_roberta_base_chinese_extractive_qa_scratch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|407.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jackh1995/roberta-base-chinese-extractive-qa-scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_rule_softmatching_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_rule_softmatching_en.md new file mode 100644 index 000000000000..9f4ea5cb14a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_rule_softmatching_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from enoriega) +author: John Snow Labs +name: bert_qa_rule_softmatching +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rule_softmatching` is an English model originally trained by `enoriega`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_rule_softmatching_en_5.2.0_3.0_1700141019762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_rule_softmatching_en_5.2.0_3.0_1700141019762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_rule_softmatching","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_rule_softmatching","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.by_enoriega").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_rule_softmatching| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/enoriega/rule_softmatching \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sagemaker_bioclinicalbert_adr_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sagemaker_bioclinicalbert_adr_en.md new file mode 100644 index 000000000000..73f3fef1cd66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sagemaker_bioclinicalbert_adr_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_sagemaker_bioclinicalbert_adr BertForQuestionAnswering from anindabitm +author: John Snow Labs +name: bert_qa_sagemaker_bioclinicalbert_adr +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_sagemaker_bioclinicalbert_adr` is an English model originally trained by anindabitm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sagemaker_bioclinicalbert_adr_en_5.2.0_3.0_1700141304291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sagemaker_bioclinicalbert_adr_en_5.2.0_3.0_1700141304291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sagemaker_bioclinicalbert_adr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_sagemaker_bioclinicalbert_adr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sagemaker_bioclinicalbert_adr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/anindabitm/sagemaker-BioclinicalBERT-ADR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sasuke_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sasuke_finetuned_squad_en.md new file mode 100644 index 000000000000..75e0fc873f47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sasuke_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from sasuke) +author: John Snow Labs +name: bert_qa_sasuke_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `sasuke`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sasuke_finetuned_squad_en_5.2.0_3.0_1700096158969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sasuke_finetuned_squad_en_5.2.0_3.0_1700096158969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sasuke_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_sasuke_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sasuke_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sasuke/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sd1_small_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sd1_small_en.md new file mode 100644 index 000000000000..36e6b1104a35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_sd1_small_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from motiondew) +author: John Snow Labs +name: bert_qa_sd1_small +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-sd1-small` is an English model originally trained by `motiondew`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_sd1_small_en_5.2.0_3.0_1700097709617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_sd1_small_en_5.2.0_3.0_1700097709617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd1_small","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_sd1_small","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.small").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_sd1_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/motiondew/bert-sd1-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3_en.md new file mode 100644 index 000000000000..20e6c7f548dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3 BertForQuestionAnswering from motiondew +author: John Snow Labs +name: bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3` is an English model originally trained by motiondew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700099070600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3_en_5.2.0_3.0_1700099070600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_set_date_3_lr_3e_5_bosnian_32_ep_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/motiondew/bert-set_date_3-lr-3e-5-bs-32-ep-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_shashank1303_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_shashank1303_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..f97da78402b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_shashank1303_finetuned_squad_accelerate_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from shashank1303) +author: John Snow Labs +name: bert_qa_shashank1303_finetuned_squad_accelerate +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `shashank1303`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_shashank1303_finetuned_squad_accelerate_en_5.2.0_3.0_1700100817571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_shashank1303_finetuned_squad_accelerate_en_5.2.0_3.0_1700100817571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shashank1303_finetuned_squad_accelerate","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_shashank1303_finetuned_squad_accelerate","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_shashank1303_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/shashank1303/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_small_finetuned_cuad_full_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_small_finetuned_cuad_full_en.md new file mode 100644 index 000000000000..6138dc190360 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_small_finetuned_cuad_full_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Small Cased model (from muhtasham) +author: John Snow Labs +name: bert_qa_small_finetuned_cuad_full +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-small-finetuned-cuad-full` is an English model originally trained by `muhtasham`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_full_en_5.2.0_3.0_1700102019793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_small_finetuned_cuad_full_en_5.2.0_3.0_1700102019793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_full","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_small_finetuned_cuad_full","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_small_finetuned_cuad_full| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|107.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/muhtasham/bert-small-finetuned-cuad-full \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..b5ddcf111599 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6_en_5.2.0_3.0_1700141625010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6_en_5.2.0_3.0_1700141625010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_1024d_seed_6").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_1024_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|390.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-1024-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..c092b915b24e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42_en_5.2.0_3.0_1700108220647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42_en_5.2.0_3.0_1700108220647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_42_base_128d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|384.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..d2bb5ec1d69b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4_en_5.2.0_3.0_1700106348085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4_en_5.2.0_3.0_1700106348085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_128d_seed_4").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..f66381b6a99c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6_en_5.2.0_3.0_1700140753164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6_en_5.2.0_3.0_1700140753164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_128d_seed_6").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..158339a3cf1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8_en_5.2.0_3.0_1700141048381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8_en_5.2.0_3.0_1700141048381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_128d_seed_8").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_128_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-128-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..bb7669d6e431 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10_en_5.2.0_3.0_1700096426899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10_en_5.2.0_3.0_1700096426899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_10_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..9f9936343e79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-42` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42_en_5.2.0_3.0_1700141920271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42_en_5.2.0_3.0_1700141920271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_seed_42").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..a35709e19c89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4_en_5.2.0_3.0_1700110051006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4_en_5.2.0_3.0_1700110051006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_4_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..9934969c0232 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8_en_5.2.0_3.0_1700140756897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8_en_5.2.0_3.0_1700140756897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_8_base_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_16_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|375.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-16-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..87d326765fa0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0_en_5.2.0_3.0_1700113931035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0_en_5.2.0_3.0_1700113931035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_0_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..06a10de7b0a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10_en_5.2.0_3.0_1700098256022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10_en_5.2.0_3.0_1700098256022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_256d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..dcf80385735c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4_en_5.2.0_3.0_1700099917692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4_en_5.2.0_3.0_1700099917692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") \ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_4_base_256d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_256_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|383.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-256-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0_en.md new file mode 100644 index 000000000000..c17c0dc9d351 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-0` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0_en_5.2.0_3.0_1700141052019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0_en_5.2.0_3.0_1700141052019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_32d_seed_0").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2_en.md new file mode 100644 index 000000000000..5e203a207a3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-2` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2_en_5.2.0_3.0_1700142255782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2_en_5.2.0_3.0_1700142255782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_32d_seed_2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_32_finetuned_squad_seed_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|376.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-32-finetuned-squad-seed-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10_en.md new file mode 100644 index 000000000000..93d3071181b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-10` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10_en_5.2.0_3.0_1700142645727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10_en_5.2.0_3.0_1700142645727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_512d_seed_10").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|386.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..11a605caf300 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8_en_5.2.0_3.0_1700122406393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8_en_5.2.0_3.0_1700122406393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_512d_seed_8").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_512_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|386.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-512-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4_en.md new file mode 100644 index 000000000000..4a851c7ad5fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-4` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1700141314824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4_en_5.2.0_3.0_1700141314824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_64d_seed_4").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|378.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6_en.md new file mode 100644 index 000000000000..db467a9ff38a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-6` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6_en_5.2.0_3.0_1700101758893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6_en_5.2.0_3.0_1700101758893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert.base_cased_64d_seed_6").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|378.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8_en.md new file mode 100644 index 000000000000..b339645f3b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-8` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8_en_5.2.0_3.0_1700124031626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8_en_5.2.0_3.0_1700124031626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.cased_seed_8_base_64d_finetuned_few_shot").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_cased_few_shot_k_64_finetuned_squad_seed_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|378.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-cased-few-shot-k-64-finetuned-squad-seed-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_finetuned_squad_r3f_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_finetuned_squad_r3f_en.md new file mode 100644 index 000000000000..fa35314be944 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_base_finetuned_squad_r3f_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Base Cased model (from anas-awadalla) +author: John Snow Labs +name: bert_qa_spanbert_base_finetuned_squad_r3f +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-base-finetuned-squad-r3f` is an English model originally trained by `anas-awadalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_finetuned_squad_r3f_en_5.2.0_3.0_1700125587369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_base_finetuned_squad_r3f_en_5.2.0_3.0_1700125587369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_finetuned_squad_r3f","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_base_finetuned_squad_r3f","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.squad.base_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_base_finetuned_squad_r3f| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|399.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anas-awadalla/spanbert-base-finetuned-squad-r3f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_finetuned_squadv1_en.md new file mode 100644 index 000000000000..cf2374846ae5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_finetuned_squadv1_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: bert_qa_spanbert_finetuned_squadv1 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-finetuned-squadv1` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_finetuned_squadv1_en_5.2.0_3.0_1700103584619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_finetuned_squadv1_en_5.2.0_3.0_1700103584619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.span_bert").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|402.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/spanbert-finetuned-squadv1 +- https://arxiv.org/abs/1907.10529 +- https://twitter.com/mrm8488 +- https://github.com/facebookresearch +- https://github.com/facebookresearch/SpanBERT +- https://github.com/facebookresearch/SpanBERT#pre-trained-models +- https://rajpurkar.github.io/SQuAD-explorer/ +- https://www.linkedin.com/in/manuel-romero-cs/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_large_recruit_qa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_large_recruit_qa_en.md new file mode 100644 index 000000000000..be0a31b5e872 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_large_recruit_qa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from manishiitg) +author: John Snow Labs +name: bert_qa_spanbert_large_recruit_qa +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-large-recruit-qa` is an English model originally trained by `manishiitg`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_large_recruit_qa_en_5.2.0_3.0_1700141860561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_large_recruit_qa_en_5.2.0_3.0_1700141860561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_large_recruit_qa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_large_recruit_qa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.large").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_large_recruit_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/manishiitg/spanbert-large-recruit-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_recruit_qa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_recruit_qa_en.md new file mode 100644 index 000000000000..38e9e252556f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_spanbert_recruit_qa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from manishiitg) +author: John Snow Labs +name: bert_qa_spanbert_recruit_qa +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert-recruit-qa` is an English model originally trained by `manishiitg`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_recruit_qa_en_5.2.0_3.0_1700141653812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_spanbert_recruit_qa_en_5.2.0_3.0_1700141653812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_spanbert_recruit_qa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_spanbert_recruit_qa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.span_bert.by_manishiitg").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_spanbert_recruit_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/manishiitg/spanbert-recruit-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad1.1_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad1.1_en.md new file mode 100644 index 000000000000..bac6fee993bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad1.1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from maroo93) +author: John Snow Labs +name: bert_qa_squad1.1 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad1.1` is an English model originally trained by `maroo93`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad1.1_en_5.2.0_3.0_1700107972423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad1.1_en_5.2.0_3.0_1700107972423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad1.1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_squad1.1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_maroo93").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad1.1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/maroo93/squad1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_baseline_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_baseline_en.md new file mode 100644 index 000000000000..bd139592541b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_baseline_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from xraychen) +author: John Snow Labs +name: bert_qa_squad_baseline +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad-baseline` is an English model originally trained by `xraychen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_baseline_en_5.2.0_3.0_1700109612502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_baseline_en_5.2.0_3.0_1700109612502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_baseline","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_squad_baseline","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base.by_xraychen").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_baseline| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/xraychen/squad-baseline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_english_bert_base_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_english_bert_base_en.md new file mode 100644 index 000000000000..ad4ab9213f75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_squad_english_bert_base_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_squad_english_bert_base BertForQuestionAnswering from zhufy +author: John Snow Labs +name: bert_qa_squad_english_bert_base +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_squad_english_bert_base` is an English model originally trained by zhufy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_squad_english_bert_base_en_5.2.0_3.0_1700110927573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_squad_english_bert_base_en_5.2.0_3.0_1700110927573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_squad_english_bert_base","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +# Example input (illustrative): one question/context pair +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_squad_english_bert_base", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +// Example input (illustrative); toDF on a Seq of tuples needs import spark.implicits._ +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_squad_english_bert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/zhufy/squad-en-bert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_swahili_question_answer_latest_cased_sw.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_swahili_question_answer_latest_cased_sw.md new file mode 100644 index 000000000000..6e8ac8653497 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_swahili_question_answer_latest_cased_sw.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Swahili BertForQuestionAnswering Cased model (from innocent-charles) +author: John Snow Labs +name: bert_qa_swahili_question_answer_latest_cased +date: 2023-11-16 +tags: [sw, open_source, bert, question_answering, onnx] +task: Question Answering +language: sw +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Swahili-question-answer-latest-cased` is a Swahili model originally trained by `innocent-charles`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_swahili_question_answer_latest_cased_sw_5.2.0_3.0_1700140782767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_swahili_question_answer_latest_cased_sw_5.2.0_3.0_1700140782767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_swahili_question_answer_latest_cased","sw")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_swahili_question_answer_latest_cased","sw") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_swahili_question_answer_latest_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|sw| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/innocent-charles/Swahili-question-answer-latest-cased +- https://github.com/Neurotech-HQ/Swahili-QA-dataset +- https://blog.neurotech.africa/building-swahili-question-and-answering-with-haystack/ +- https://github.com/deepset-ai/haystack/ +- https://haystack.deepset.ai +- https://www.linkedin.com/in/innocent-charles/ +- https://github.com/innocent-charles +- https://paperswithcode.com/sota?task=Question+Answering&dataset=kenyacorpus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_test02_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_test02_en.md new file mode 100644 index 000000000000..c3726256f1f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_test02_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from Akert) +author: John Snow Labs +name: bert_qa_test02 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test02` is an English model originally trained by `Akert`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_test02_en_5.2.0_3.0_1700153810019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_test02_en_5.2.0_3.0_1700153810019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_test02","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_test02","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_test02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Akert/test02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_testpersianqa_fa.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_testpersianqa_fa.md new file mode 100644 index 000000000000..f5ee7ac05528 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_testpersianqa_fa.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Persian BertForQuestionAnswering Cased model (from AlirezaBaneshi) +author: John Snow Labs +name: bert_qa_testpersianqa +date: 2023-11-16 +tags: [fa, open_source, bert, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testPersianQA` is a Persian model originally trained by `AlirezaBaneshi`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_testpersianqa_fa_5.2.0_3.0_1700141974295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_testpersianqa_fa_5.2.0_3.0_1700141974295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_testpersianqa","fa") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_testpersianqa","fa") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("اسم من چیست؟", "نام من کلارا است و من در برکلی زندگی می کنم.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_testpersianqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/AlirezaBaneshi/testPersianQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tests_finetuned_squad_test_bert_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tests_finetuned_squad_test_bert_en.md new file mode 100644 index 000000000000..025bc4be174c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tests_finetuned_squad_test_bert_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from ruselkomp) +author: John Snow Labs +name: bert_qa_tests_finetuned_squad_test_bert +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tests-finetuned-squad-test-bert` is an English model originally trained by `ruselkomp`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tests_finetuned_squad_test_bert_en_5.2.0_3.0_1700142647414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tests_finetuned_squad_test_bert_en_5.2.0_3.0_1700142647414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tests_finetuned_squad_test_bert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_tests_finetuned_squad_test_bert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.by_ruselkomp").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tests_finetuned_squad_test_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.6 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ruselkomp/tests-finetuned-squad-test-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad_th.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad_th.md new file mode 100644 index 000000000000..35c224af24a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad_th.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Thai BertForQuestionAnswering model (from wicharnkeisei) +author: John Snow Labs +name: bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad +date: 2023-11-16 +tags: [th, open_source, question_answering, bert, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thai-bert-multi-cased-finetuned-xquadv1-finetuned-squad` is a Thai model originally trained by `wicharnkeisei`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad_th_5.2.0_3.0_1700142277018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad_th_5.2.0_3.0_1700142277018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad","th") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad","th") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("th.answer_question.xquad_squad.bert.cased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_thai_bert_multi_cased_finetuned_xquadv1_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/wicharnkeisei/thai-bert-multi-cased-finetuned-xquadv1-finetuned-squad +- https://github.com/iapp-technology/iapp-wiki-qa-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_en.md new file mode 100644 index 000000000000..74f0a6eb0a2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from srcocotero) +author: John Snow Labs +name: bert_qa_tiny +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-bert-qa` is an English model originally trained by `srcocotero`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_en_5.2.0_3.0_1700094861529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_en_5.2.0_3.0_1700094861529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tiny","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tiny","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tiny| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/srcocotero/tiny-bert-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_cuad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_cuad_en.md new file mode 100644 index 000000000000..63a7fb54c765 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_cuad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from muhtasham) +author: John Snow Labs +name: bert_qa_tiny_finetuned_cuad +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-cuad` is an English model originally trained by `muhtasham`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_finetuned_cuad_en_5.2.0_3.0_1700120481223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_finetuned_cuad_en_5.2.0_3.0_1700120481223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tiny_finetuned_cuad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tiny_finetuned_cuad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tiny_finetuned_cuad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/muhtasham/bert-tiny-finetuned-cuad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_squadv2_en.md new file mode 100644 index 000000000000..ccd6a3beb073 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tiny_finetuned_squadv2_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from M-FAC) +author: John Snow Labs +name: bert_qa_tiny_finetuned_squadv2 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-tiny-finetuned-squadv2` is an English model originally trained by `M-FAC`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_finetuned_squadv2_en_5.2.0_3.0_1700154874575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tiny_finetuned_squadv2_en_5.2.0_3.0_1700154874575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tiny_finetuned_squadv2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tiny_finetuned_squadv2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.v2_tiny_finetuned").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tiny_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/M-FAC/bert-tiny-finetuned-squadv2 +- https://arxiv.org/pdf/2107.03356.pdf +- https://github.com/IST-DASLab/M-FAC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy_en.md new file mode 100644 index 000000000000..4635ddf462cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Tiny Cased model (from MichelBartels) +author: John Snow Labs +name: bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-6l-768d-squad2-large-teacher-dummy` is an English model originally trained by `MichelBartels`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy_en_5.2.0_3.0_1700156590757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy_en_5.2.0_3.0_1700156590757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.large_tiny_768d.by_MichelBartels").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tinybert_6l_768d_squad2_large_teacher_dummy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|248.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MichelBartels/tinybert-6l-768d-squad2-large-teacher-dummy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_en.md new file mode 100644 index 000000000000..fb08ab3e0ae2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MichelBartels) +author: John Snow Labs +name: bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-6l-768d-squad2-large-teacher-finetuned` is an English model originally trained by `MichelBartels`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_en_5.2.0_3.0_1700141008040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_en_5.2.0_3.0_1700141008040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_tiny_768d.by_MichelBartels").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|249.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MichelBartels/tinybert-6l-768d-squad2-large-teacher-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1_en.md new file mode 100644 index 000000000000..e22b265acb17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from MichelBartels) +author: John Snow Labs +name: bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tinybert-6l-768d-squad2-large-teacher-finetuned-step1` is an English model originally trained by `MichelBartels`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1_en_5.2.0_3.0_1700122406299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1_en_5.2.0_3.0_1700122406299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.bert.large_tiny_768d_v2.by_MichelBartels").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tinybert_6l_768d_squad2_large_teacher_finetuned_step1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|249.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MichelBartels/tinybert-6l-768d-squad2-large-teacher-finetuned-step1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tmgondal_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tmgondal_finetuned_squad_en.md new file mode 100644 index 000000000000..7096a557a965 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tmgondal_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from tmgondal) +author: John Snow Labs +name: bert_qa_tmgondal_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `tmgondal`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tmgondal_finetuned_squad_en_5.2.0_3.0_1700158435973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tmgondal_finetuned_squad_en_5.2.0_3.0_1700158435973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tmgondal_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tmgondal_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tmgondal_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tmgondal/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tquad_base_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tquad_base_turkish_tr.md new file mode 100644 index 000000000000..b010c953e79b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_tquad_base_turkish_tr.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from Izzet) +author: John Snow Labs +name: bert_qa_tquad_base_turkish +date: 2023-11-16 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_tquad_bert-base-turkish` is a Turkish model originally trained by `Izzet`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_tquad_base_turkish_tr_5.2.0_3.0_1700141353029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_tquad_base_turkish_tr_5.2.0_3.0_1700141353029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tquad_base_turkish","tr")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_tquad_base_turkish","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_tquad_base_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Izzet/qa_tquad_bert-base-turkish +- https://github.com/izzetkalic/botcuk-dataset-analyze/tree/main/datasets/qa-tquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..ad1269ffa7c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-2_H-128_A-2_cord19-200616_squad2_covid-qna` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700095892418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700095892418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2_covid_cord19.uncased_2l_128d_a2a_128d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-2_H-128_A-2_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_en.md new file mode 100644 index 000000000000..febda140d054 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-2_H-128_A-2_cord19-200616_squad2` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_en_5.2.0_3.0_1700161528646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2_en_5.2.0_3.0_1700161528646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2_cord19.uncased_2l_128d_a2a_128d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_2_h_128_a_2_cord19_200616_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-2_H-128_A-2_cord19-200616_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna_en.md new file mode 100644 index 000000000000..e26b351b192a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-2_H-128_A-2_squad2_covid-qna` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna_en_5.2.0_3.0_1700163669992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna_en_5.2.0_3.0_1700163669992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2_covid.uncased_2l_128d_a2a_128d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_2_h_128_a_2_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-2_H-128_A-2_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_en.md new file mode 100644 index 000000000000..4931b6a9da54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_128_a_2_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_2_h_128_a_2_squad2 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-2_H-128_A-2_squad2` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_squad2_en_5.2.0_3.0_1700162863747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_128_a_2_squad2_en_5.2.0_3.0_1700162863747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_squad2","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_128_a_2_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2.uncased_2l_128d_a2a_128d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_2_h_128_a_2_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|16.7 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-2_H-128_A-2_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..322a4ca305de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-2_H-512_A-8_cord19-200616_squad2_covid-qna` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700097405458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700097405458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2_covid_cord19.uncased_2l_512d_a8a_512d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_2_h_512_a_8_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|83.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-2_H-512_A-8_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna_en.md new file mode 100644 index 000000000000..d7700bc451d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from aodiniz) +author: John Snow Labs +name: bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_L-6_H-128_A-2_cord19-200616_squad2_covid-qna` is an English model originally trained by `aodiniz`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700123409113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna_en_5.2.0_3.0_1700123409113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squadv2_covid_cord19.uncased_6l_128d_a2a_128d").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_uncased_l_6_h_128_a_2_cord19_200616_squad2_covid_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|19.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aodiniz/bert_uncased_L-6_H-128_A-2_cord19-200616_squad2_covid-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_bert_base_uncased_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_bert_base_uncased_newsqa_en.md new file mode 100644 index 000000000000..3d64943754d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_bert_base_uncased_newsqa_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from tli8hf) +author: John Snow Labs +name: bert_qa_unqover_bert_base_uncased_newsqa +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unqover-bert-base-uncased-newsqa` is an English model originally trained by `tli8hf`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_unqover_bert_base_uncased_newsqa_en_5.2.0_3.0_1700125145133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_unqover_bert_base_uncased_newsqa_en_5.2.0_3.0_1700125145133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_unqover_bert_base_uncased_newsqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_unqover_bert_base_uncased_newsqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.news.bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_unqover_bert_base_uncased_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tli8hf/unqover-bert-base-uncased-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_large_uncased_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_large_uncased_newsqa_en.md new file mode 100644 index 000000000000..5bed1f37730d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_unqover_large_uncased_newsqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Large Uncased model (from tli8hf) +author: John Snow Labs +name: bert_qa_unqover_large_uncased_newsqa +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unqover-bert-large-uncased-newsqa` is an English model originally trained by `tli8hf`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_unqover_large_uncased_newsqa_en_5.2.0_3.0_1700141035813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_unqover_large_uncased_newsqa_en_5.2.0_3.0_1700141035813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_unqover_large_uncased_newsqa","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_unqover_large_uncased_newsqa","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.news_sqa.uncased_large").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_unqover_large_uncased_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tli8hf/unqover-bert-large-uncased-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_victoraavila_bert_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_victoraavila_bert_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..fe5f03260b8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_victoraavila_bert_base_uncased_finetuned_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from victoraavila) +author: John Snow Labs +name: bert_qa_victoraavila_bert_base_uncased_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-finetuned-squad` is an English model originally trained by `victoraavila`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_victoraavila_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700141658128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_victoraavila_bert_base_uncased_finetuned_squad_en_5.2.0_3.0_1700141658128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_victoraavila_bert_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_victoraavila_bert_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_victoraavila").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_victoraavila_bert_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/victoraavila/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_vuiseng9_bert_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_vuiseng9_bert_base_uncased_squad_en.md new file mode 100644 index 000000000000..3c559cf0aadb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_vuiseng9_bert_base_uncased_squad_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from vuiseng9) +author: John Snow Labs +name: bert_qa_vuiseng9_bert_base_uncased_squad +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-squad` is an English model originally trained by `vuiseng9`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_vuiseng9_bert_base_uncased_squad_en_5.2.0_3.0_1700099761022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_vuiseng9_bert_base_uncased_squad_en_5.2.0_3.0_1700099761022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_vuiseng9_bert_base_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_vuiseng9_bert_base_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.base_uncased.by_vuiseng9").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_vuiseng9_bert_base_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vuiseng9/bert-base-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wiselinjayajos_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wiselinjayajos_finetuned_squad_en.md new file mode 100644 index 000000000000..cb23f61c28da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wiselinjayajos_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from wiselinjayajos) +author: John Snow Labs +name: bert_qa_wiselinjayajos_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `wiselinjayajos`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_wiselinjayajos_finetuned_squad_en_5.2.0_3.0_1700101442486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_wiselinjayajos_finetuned_squad_en_5.2.0_3.0_1700101442486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_wiselinjayajos_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_wiselinjayajos_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_wiselinjayajos").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_wiselinjayajos_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/wiselinjayajos/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wskhanh_roberta_wwm_ext_large_zh.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wskhanh_roberta_wwm_ext_large_zh.md new file mode 100644 index 000000000000..8ec3d0465397 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_wskhanh_roberta_wwm_ext_large_zh.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Chinese BertForQuestionAnswering Large Cased model (from wskhanh) +author: John Snow Labs +name: bert_qa_wskhanh_roberta_wwm_ext_large +date: 2023-11-16 +tags: [zh, open_source, bert, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Roberta-wwm-ext-large-qa` is a Chinese model originally trained by `wskhanh`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_wskhanh_roberta_wwm_ext_large_zh_5.2.0_3.0_1700141587605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_wskhanh_roberta_wwm_ext_large_zh_5.2.0_3.0_1700141587605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_wskhanh_roberta_wwm_ext_large","zh")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_wskhanh_roberta_wwm_ext_large","zh") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_wskhanh_roberta_wwm_ext_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/wskhanh/Roberta-wwm-ext-large-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xquad_thai_mbert_base_th.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xquad_thai_mbert_base_th.md new file mode 100644 index 000000000000..cb967029b1e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xquad_thai_mbert_base_th.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Thai bert_qa_xquad_thai_mbert_base BertForQuestionAnswering from zhufy +author: John Snow Labs +name: bert_qa_xquad_thai_mbert_base +date: 2023-11-16 +tags: [bert, th, open_source, question_answering, onnx] +task: Question Answering +language: th +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_xquad_thai_mbert_base` is a Thai model originally trained by zhufy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xquad_thai_mbert_base_th_5.2.0_3.0_1700103412739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xquad_thai_mbert_base_th_5.2.0_3.0_1700103412739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xquad_thai_mbert_base","th") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_xquad_thai_mbert_base", "th") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xquad_thai_mbert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|th| +|Size:|665.0 MB| + +## References + +https://huggingface.co/zhufy/xquad-th-mbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l12_h384_uncased_natural_questions_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l12_h384_uncased_natural_questions_en.md new file mode 100644 index 000000000000..70d477ca39bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l12_h384_uncased_natural_questions_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from nyorain) +author: John Snow Labs +name: bert_qa_xtremedistil_l12_h384_uncased_natural_questions +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l12-h384-uncased-natural-questions` is an English model originally trained by `nyorain`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l12_h384_uncased_natural_questions_en_5.2.0_3.0_1700152150227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l12_h384_uncased_natural_questions_en_5.2.0_3.0_1700152150227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l12_h384_uncased_natural_questions","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l12_h384_uncased_natural_questions","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l12_h384_uncased_natural_questions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|124.1 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nyorain/xtremedistil-l12-h384-uncased-natural-questions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6_en.md new file mode 100644 index 000000000000..995999d0a174 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from husnu) +author: John Snow Labs +name: bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6 +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-finetuned_lr-2e-05_epochs-6` is an English model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6_en_5.2.0_3.0_1700153291634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6_en_5.2.0_3.0_1700153291634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.xtremedistiled_uncased_lr_2e_05_epochs_6.by_husnu").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_finetuned_lr_2e_05_epochs_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/xtremedistil-l6-h256-uncased-finetuned_lr-2e-05_epochs-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short_en.md new file mode 100644 index 000000000000..61b969172571 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from Be-Lo) +author: John Snow Labs +name: bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-natural-questions-short` is an English model originally trained by `Be-Lo`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short_en_5.2.0_3.0_1700172621685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short_en_5.2.0_3.0_1700172621685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_natural_questions_short| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.4 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Be-Lo/xtremedistil-l6-h256-uncased-natural-questions-short +- https://research.google/pubs/pub47761/ +- https://github.com/mrqa/MRQA-Shared-Task-2019 +- https://research.google/pubs/pub47761/ +- https://github.com/mrqa/MRQA-Shared-Task-2019 +- https://square.ukp-lab.de/qa +- https://www.informatik.tu-darmstadt.de/ukp/ukp_home/index.en.jsp +- https://github.com/dl4nlp-tuda/deep-learning-for-nlp-lectures +- https://www.trusthlt.org/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3_en.md new file mode 100644 index 000000000000..09792763ffb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from husnu) +author: John Snow Labs +name: bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and 
production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-TQUAD-finetuned_lr-2e-05_epochs-3` is an English model originally trained by `husnu`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3_en_5.2.0_3.0_1700154318451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3_en_5.2.0_3.0_1700154318451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.tquad.xtremedistiled_uncased_finetuned_epochs_3").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/xtremedistil-l6-h256-uncased-TQUAD-finetuned_lr-2e-05_epochs-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6_en.md new file mode 100644 index 000000000000..57069ccbca2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Uncased model (from husnu) +author: John Snow Labs +name: bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6 +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremedistil-l6-h256-uncased-TQUAD-finetuned_lr-2e-05_epochs-6` is an English model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6_en_5.2.0_3.0_1700104833487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6_en_5.2.0_3.0_1700104833487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.tquad.xtremedistiled_uncased_finetuned_epochs_6").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/xtremedistil-l6-h256-uncased-TQUAD-finetuned_lr-2e-05_epochs-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9_en.md new file mode 100644 index 000000000000..95deab0edacb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9 BertForQuestionAnswering from husnu +author: John Snow Labs +name: bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9` is an English model originally trained by husnu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9_en_5.2.0_3.0_1700171301368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9_en_5.2.0_3.0_1700171301368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_xtremedistil_l6_h256_uncased_tquad_finetuned_lr_2e_05_epochs_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|47.3 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/husnu/xtremedistil-l6-h256-uncased-TQUAD-finetuned_lr-2e-05_epochs-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ydshieh_tiny_random_forquestionanswering_ja.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ydshieh_tiny_random_forquestionanswering_ja.md new file mode 100644 index 000000000000..b072aee0c0cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ydshieh_tiny_random_forquestionanswering_ja.md @@ -0,0 +1,94 @@ +--- +layout: model +title: Japanese BertForQuestionAnswering Tiny Cased model (from ydshieh) +author: John Snow Labs +name: bert_qa_ydshieh_tiny_random_forquestionanswering +date: 2023-11-16 +tags: [ja, open_source, bert, question_answering, onnx] +task: Question Answering +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-random-BertForQuestionAnswering` is a Japanese model originally trained by `ydshieh`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ydshieh_tiny_random_forquestionanswering_ja_5.2.0_3.0_1700154938859.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ydshieh_tiny_random_forquestionanswering_ja_5.2.0_3.0_1700154938859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ydshieh_tiny_random_forquestionanswering","ja")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ydshieh_tiny_random_forquestionanswering","ja") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ydshieh_tiny_random_forquestionanswering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ja| +|Size:|346.5 KB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ydshieh/tiny-random-BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_yossra_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_yossra_finetuned_squad_en.md new file mode 100644 index 000000000000..cc04a3a6a65d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_yossra_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English BertForQuestionAnswering Cased model (from yossra) +author: John Snow Labs +name: bert_qa_yossra_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, bert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad` is an English model originally trained by `yossra`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_yossra_finetuned_squad_en_5.2.0_3.0_1700156597353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_yossra_finetuned_squad_en_5.2.0_3.0_1700156597353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_yossra_finetuned_squad","en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_yossra_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.squad.finetuned.by_yossra").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_yossra_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/yossra/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_youngjae_bert_finetuned_squad_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_youngjae_bert_finetuned_squad_accelerate_en.md new file mode 100644 index 000000000000..8b797adb88cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_youngjae_bert_finetuned_squad_accelerate_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from youngjae) +author: John Snow Labs +name: bert_qa_youngjae_bert_finetuned_squad_accelerate +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-finetuned-squad-accelerate` is an English model originally trained by `youngjae`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_youngjae_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700141918411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_youngjae_bert_finetuned_squad_accelerate_en_5.2.0_3.0_1700141918411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_youngjae_bert_finetuned_squad_accelerate","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_youngjae_bert_finetuned_squad_accelerate","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.bert.accelerate.by_youngjae").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_youngjae_bert_finetuned_squad_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/youngjae/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ytu_base_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ytu_base_turkish_tr.md new file mode 100644 index 000000000000..44e901f9ee70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_ytu_base_turkish_tr.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Turkish BertForQuestionAnswering Base Cased model (from Izzet) +author: John Snow Labs +name: bert_qa_ytu_base_turkish +date: 2023-11-16 +tags: [tr, open_source, bert, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_ytu_bert-base-turkish` is a Turkish model originally trained by `Izzet`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_ytu_base_turkish_tr_5.2.0_3.0_1700142270831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_ytu_base_turkish_tr_5.2.0_3.0_1700142270831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ytu_base_turkish","tr")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = BertForQuestionAnswering.pretrained("bert_qa_ytu_base_turkish","tr") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_ytu_base_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|688.9 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Izzet/qa_ytu_bert-base-turkish +- https://github.com/izzetkalic/botcuk-dataset-analyze/tree/main/datasets/qa-ytu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_qa_zero_shot_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_zero_shot_en.md new file mode 100644 index 000000000000..cf1ff755b936 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_qa_zero_shot_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English BertForQuestionAnswering model (from krinal214) +author: John Snow Labs +name: bert_qa_zero_shot +date: 2023-11-16 +tags: [en, open_source, question_answering, bert, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `zero_shot` is an English model originally trained by `krinal214`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_zero_shot_en_5.2.0_3.0_1700142728982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_zero_shot_en_5.2.0_3.0_1700142728982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_qa_zero_shot","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline().setStages([ +document_assembler, +spanClassifier +]) + +example = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(example).transform(example) +``` +```scala +val document = new MultiDocumentAssembler() +.setInputCols("question", "context") +.setOutputCols("document_question", "document_context") + +val spanClassifier = BertForQuestionAnswering +.pretrained("bert_qa_zero_shot","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) +.setMaxSentenceLength(512) + +val pipeline = new Pipeline().setStages(Array(document, spanClassifier)) + +val example = Seq( +("Where was John Lennon born?", "John Lennon was born in London and lived in Paris. My name is Sarah and I live in London."), +("What's my name?", "My name is Clara and I live in Berkeley.")) +.toDF("question", "context") + +val result = pipeline.fit(example).transform(example) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.bert.zero_shot.by_krinal214").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_zero_shot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/krinal214/zero_shot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_squad_covidqa_2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_squad_covidqa_2_en.md new file mode 100644 index 000000000000..c36a6337fd01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_squad_covidqa_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_squad_covidqa_2 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_squad_covidqa_2 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_squad_covidqa_2` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_squad_covidqa_2_en_5.2.0_3.0_1700157604047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_squad_covidqa_2_en_5.2.0_3.0_1700157604047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_squad_covidqa_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_squad_covidqa_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_squad_covidqa_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/hung200504/bert-squad-covidqa-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_squad_v2_en.md new file mode 100644 index 000000000000..4f7e0cf4b868 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_squad_v2 BertForQuestionAnswering from fahmiaziz +author: John Snow Labs +name: bert_squad_v2 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_squad_v2` is an English model originally trained by fahmiaziz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_squad_v2_en_5.2.0_3.0_1700163659003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_squad_v2_en_5.2.0_3.0_1700163659003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/fahmiaziz/bert-squad-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-bert_test_model_en.md b/docs/_posts/ahmedlone127/2023-11-16-bert_test_model_en.md new file mode 100644 index 000000000000..00700f747bac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-bert_test_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_test_model BertForQuestionAnswering from Jellevdl +author: John Snow Labs +name: bert_test_model +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_test_model` is an English model originally trained by Jellevdl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_test_model_en_5.2.0_3.0_1700156136455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_test_model_en_5.2.0_3.0_1700156136455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_test_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_test_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_test_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Jellevdl/Bert-test-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-biomedical_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-16-biomedical_question_answering_en.md new file mode 100644 index 000000000000..69ac65b6d01e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-biomedical_question_answering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_question_answering BertForQuestionAnswering from Shushant +author: John Snow Labs +name: biomedical_question_answering +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_question_answering` is an English model originally trained by Shushant. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_question_answering_en_5.2.0_3.0_1700172001885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_question_answering_en_5.2.0_3.0_1700172001885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("biomedical_question_answering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("biomedical_question_answering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/Shushant/biomedical_question_answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-braslab_bert_drcd_384_en.md b/docs/_posts/ahmedlone127/2023-11-16-braslab_bert_drcd_384_en.md new file mode 100644 index 000000000000..2d4b23d4826d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-braslab_bert_drcd_384_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English braslab_bert_drcd_384 BertForQuestionAnswering from nyust-eb210 +author: John Snow Labs +name: braslab_bert_drcd_384 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `braslab_bert_drcd_384` is an English model originally trained by nyust-eb210. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/braslab_bert_drcd_384_en_5.2.0_3.0_1700110050448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/braslab_bert_drcd_384_en_5.2.0_3.0_1700110050448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("braslab_bert_drcd_384","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("braslab_bert_drcd_384", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|braslab_bert_drcd_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/nyust-eb210/braslab-bert-drcd-384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-burmese_awesome_qa_model_sglasher_en.md b/docs/_posts/ahmedlone127/2023-11-16-burmese_awesome_qa_model_sglasher_en.md new file mode 100644 index 000000000000..947188ed13c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-burmese_awesome_qa_model_sglasher_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sglasher BertForQuestionAnswering from sglasher +author: John Snow Labs +name: burmese_awesome_qa_model_sglasher +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sglasher` is an English model originally trained by sglasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sglasher_en_5.2.0_3.0_1700118329547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sglasher_en_5.2.0_3.0_1700118329547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sglasher","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sglasher", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sglasher| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/sglasher/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-chatanswering_ptbr_pt.md b/docs/_posts/ahmedlone127/2023-11-16-chatanswering_ptbr_pt.md new file mode 100644 index 000000000000..e8f9686b60a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-chatanswering_ptbr_pt.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Portuguese chatanswering_ptbr BertForQuestionAnswering from JeanL-0 +author: John Snow Labs +name: chatanswering_ptbr +date: 2023-11-16 +tags: [bert, pt, open_source, question_answering, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatanswering_ptbr` is a Portuguese model originally trained by JeanL-0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatanswering_ptbr_pt_5.2.0_3.0_1700158279659.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatanswering_ptbr_pt_5.2.0_3.0_1700158279659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("chatanswering_ptbr","pt") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("chatanswering_ptbr", "pt") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatanswering_ptbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|665.0 MB| + +## References + +https://huggingface.co/JeanL-0/ChatAnswering-PTBR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-chinese_bert_wwm_ext_finetuned_qa_b8_10_en.md b/docs/_posts/ahmedlone127/2023-11-16-chinese_bert_wwm_ext_finetuned_qa_b8_10_en.md new file mode 100644 index 000000000000..4889a74dde46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-chinese_bert_wwm_ext_finetuned_qa_b8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chinese_bert_wwm_ext_finetuned_qa_b8_10 BertForQuestionAnswering from sharkMeow +author: John Snow Labs +name: chinese_bert_wwm_ext_finetuned_qa_b8_10 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_bert_wwm_ext_finetuned_qa_b8_10` is an English model originally trained by sharkMeow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_bert_wwm_ext_finetuned_qa_b8_10_en_5.2.0_3.0_1700171888784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_bert_wwm_ext_finetuned_qa_b8_10_en_5.2.0_3.0_1700171888784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("chinese_bert_wwm_ext_finetuned_qa_b8_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("chinese_bert_wwm_ext_finetuned_qa_b8_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_bert_wwm_ext_finetuned_qa_b8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.2 MB| + +## References + +https://huggingface.co/sharkMeow/chinese-bert-wwm-ext-finetuned-QA-b8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-chinese_question_answering_jeremyfeng_en.md b/docs/_posts/ahmedlone127/2023-11-16-chinese_question_answering_jeremyfeng_en.md new file mode 100644 index 000000000000..0d6b023a8461 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-chinese_question_answering_jeremyfeng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chinese_question_answering_jeremyfeng BertForQuestionAnswering from JeremyFeng +author: John Snow Labs +name: chinese_question_answering_jeremyfeng +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chinese_question_answering_jeremyfeng` is an English model originally trained by JeremyFeng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chinese_question_answering_jeremyfeng_en_5.2.0_3.0_1700123320158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chinese_question_answering_jeremyfeng_en_5.2.0_3.0_1700123320158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("chinese_question_answering_jeremyfeng","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("chinese_question_answering_jeremyfeng", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chinese_question_answering_jeremyfeng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/JeremyFeng/chinese-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-costa_rica_cased_qa_en.md b/docs/_posts/ahmedlone127/2023-11-16-costa_rica_cased_qa_en.md new file mode 100644 index 000000000000..f59c1c8f6614 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-costa_rica_cased_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English costa_rica_cased_qa BertForQuestionAnswering from nymiz +author: John Snow Labs +name: costa_rica_cased_qa +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `costa_rica_cased_qa` is an English model originally trained by nymiz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/costa_rica_cased_qa_en_5.2.0_3.0_1700156398909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/costa_rica_cased_qa_en_5.2.0_3.0_1700156398909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("costa_rica_cased_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("costa_rica_cased_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|costa_rica_cased_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| + +## References + +https://huggingface.co/nymiz/costa-rica_cased-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-distilbert_preguntas_respuestas_posgrados_en.md b/docs/_posts/ahmedlone127/2023-11-16-distilbert_preguntas_respuestas_posgrados_en.md new file mode 100644 index 000000000000..082a00ee2836 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-distilbert_preguntas_respuestas_posgrados_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_preguntas_respuestas_posgrados BertForQuestionAnswering from leo123 +author: John Snow Labs +name: distilbert_preguntas_respuestas_posgrados +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_preguntas_respuestas_posgrados` is an English model originally trained by leo123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_preguntas_respuestas_posgrados_en_5.2.0_3.0_1700115077403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_preguntas_respuestas_posgrados_en_5.2.0_3.0_1700115077403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("distilbert_preguntas_respuestas_posgrados","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("distilbert_preguntas_respuestas_posgrados", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_preguntas_respuestas_posgrados| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/leo123/DistilBERT-Preguntas-Respuestas-Posgrados \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_ara_base_artydiqa_ar.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_ara_base_artydiqa_ar.md new file mode 100644 index 000000000000..fe40495bdeb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_ara_base_artydiqa_ar.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Arabic ElectraForQuestionAnswering model (from wissamantoun) +author: John Snow Labs +name: electra_qa_ara_base_artydiqa +date: 2023-11-16 +tags: [ar, open_source, electra, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `araelectra-base-artydiqa` is an Arabic model originally trained by `wissamantoun`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_ara_base_artydiqa_ar_5.2.0_3.0_1700095615535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_ara_base_artydiqa_ar_5.2.0_3.0_1700095615535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_ara_base_artydiqa","ar") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_ara_base_artydiqa","ar") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("ما هو اسمي؟", "اسمي كلارا وأنا أعيش في بيركلي.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ar.answer_question.tydiqa.electra.base").predict("""ما هو اسمي؟|||اسمي كلارا وأنا أعيش في بيركلي.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_ara_base_artydiqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|504.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/wissamantoun/araelectra-base-artydiqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_base_finetuned_arcd_ar.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_base_finetuned_arcd_ar.md new file mode 100644 index 000000000000..60047e95b1ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_base_finetuned_arcd_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic electra_qa_araelectra_base_finetuned_arcd BertForQuestionAnswering from salti +author: John Snow Labs +name: electra_qa_araelectra_base_finetuned_arcd +date: 2023-11-16 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_araelectra_base_finetuned_arcd` is an Arabic model originally trained by salti. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_base_finetuned_arcd_ar_5.2.0_3.0_1700108532707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_base_finetuned_arcd_ar_5.2.0_3.0_1700108532707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_araelectra_base_finetuned_arcd","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_araelectra_base_finetuned_arcd", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_araelectra_base_finetuned_arcd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|504.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/salti/AraElectra-base-finetuned-ARCD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_squad_arcd_ar.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_squad_arcd_ar.md new file mode 100644 index 000000000000..86a6b222ec5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_araelectra_squad_arcd_ar.md @@ -0,0 +1,95 @@ +--- +layout: model +title: Arabic electra_qa_araelectra_squad_arcd BertForQuestionAnswering from aymanm419 +author: John Snow Labs +name: electra_qa_araelectra_squad_arcd +date: 2023-11-16 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_araelectra_squad_arcd` is an Arabic model originally trained by aymanm419. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_squad_arcd_ar_5.2.0_3.0_1700114108717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_araelectra_squad_arcd_ar_5.2.0_3.0_1700114108717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_araelectra_squad_arcd","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_araelectra_squad_arcd", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_araelectra_squad_arcd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|504.3 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/aymanm419/araElectra-SQUAD-ARCD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv1_en.md new file mode 100644 index 000000000000..443fa5d69b97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from valhalla) +author: John Snow Labs +name: electra_qa_base_discriminator_finetuned_squadv1 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-discriminator-finetuned_squadv1` is an English model originally trained by `valhalla`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_discriminator_finetuned_squadv1_en_5.2.0_3.0_1700118260213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_discriminator_finetuned_squadv1_en_5.2.0_3.0_1700118260213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_discriminator_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_discriminator_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.base.by_valhalla").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_discriminator_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/valhalla/electra-base-discriminator-finetuned_squadv1 +- https://github.com/patil-suraj/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv2_tr.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv2_tr.md new file mode 100644 index 000000000000..214d1d5b7b8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_discriminator_finetuned_squadv2_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish ElectraForQuestionAnswering model (from enelpi) Discriminator Version-2 +author: John Snow Labs +name: electra_qa_base_discriminator_finetuned_squadv2 +date: 2023-11-16 +tags: [tr, open_source, electra, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-discriminator-finetuned_squadv2_tr` is a Turkish model originally trained by `enelpi`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_discriminator_finetuned_squadv2_tr_5.2.0_3.0_1700175680958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_discriminator_finetuned_squadv2_tr_5.2.0_3.0_1700175680958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_discriminator_finetuned_squadv2","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_discriminator_finetuned_squadv2","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.squadv2.electra.base_v2").predict("""Benim adım ne?|||Benim adım Clara ve Berkeley'de yaşıyorum.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_discriminator_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/enelpi/electra-base-discriminator-finetuned_squadv2_tr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_finetuned_squadv1_en.md new file mode 100644 index 000000000000..fcb9a1bd61f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_finetuned_squadv1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from mrm8488) +author: John Snow Labs +name: electra_qa_base_finetuned_squadv1 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-finetuned-squadv1` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_finetuned_squadv1_en_5.2.0_3.0_1700097241442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_finetuned_squadv1_en_5.2.0_3.0_1700097241442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.base.by_mrm8488").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/electra-base-finetuned-squadv1 +- https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_covid_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_covid_deepset_en.md new file mode 100644 index 000000000000..ae2ce8116015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_covid_deepset_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: electra_qa_base_squad2_covid_deepset +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_squad2_covid_deepset_en_5.2.0_3.0_1700177599963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_squad2_covid_deepset_en_5.2.0_3.0_1700177599963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_squad2_covid_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_squad2_covid_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.electra.base").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_squad2_covid_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/electra-base-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_en.md new file mode 100644 index 000000000000..63fa816f7aeb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_squad2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from navteca) +author: John Snow Labs +name: electra_qa_base_squad2 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-base-squad2` is an English model originally trained by `navteca`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_squad2_en_5.2.0_3.0_1700098850923.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_squad2_en_5.2.0_3.0_1700098850923.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.electra.base.by_navteca").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/navteca/electra-base-squad2 +- https://rajpurkar.github.io/SQuAD-explorer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v2_finetuned_korquad_ko.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v2_finetuned_korquad_ko.md new file mode 100644 index 000000000000..943f1db69c0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v2_finetuned_korquad_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering model (from monologg) +author: John Snow Labs +name: electra_qa_base_v2_finetuned_korquad +date: 2023-11-16 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `koelectra-base-v2-finetuned-korquad` is a Korean model originally trained by `monologg`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_v2_finetuned_korquad_ko_5.2.0_3.0_1700094573912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_v2_finetuned_korquad_ko_5.2.0_3.0_1700094573912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v2_finetuned_korquad","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v2_finetuned_korquad","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.electra.base_v2.by_monologg").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_v2_finetuned_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|411.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monologg/koelectra-base-v2-finetuned-korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_discriminator_finetuned_klue_v4_ko.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_discriminator_finetuned_klue_v4_ko.md new file mode 100644 index 000000000000..c001a9eda4ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_discriminator_finetuned_klue_v4_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering model (from obokkkk) +author: John Snow Labs +name: electra_qa_base_v3_discriminator_finetuned_klue_v4 +date: 2023-11-16 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `koelectra-base-v3-discriminator-finetuned-klue-v4` is a Korean model originally trained by `obokkkk`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_v3_discriminator_finetuned_klue_v4_ko_5.2.0_3.0_1700096009029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_v3_discriminator_finetuned_klue_v4_ko_5.2.0_3.0_1700096009029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v3_discriminator_finetuned_klue_v4","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v3_discriminator_finetuned_klue_v4","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.electra.base.by_obokkkk").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_v3_discriminator_finetuned_klue_v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|419.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/obokkkk/koelectra-base-v3-discriminator-finetuned-klue-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_finetuned_korquad_ko.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_finetuned_korquad_ko.md new file mode 100644 index 000000000000..765dbeb11e12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_base_v3_finetuned_korquad_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering model (from monologg) Version-3 +author: John Snow Labs +name: electra_qa_base_v3_finetuned_korquad +date: 2023-11-16 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `koelectra-base-v3-finetuned-korquad` is a Korean model originally trained by `monologg`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_base_v3_finetuned_korquad_ko_5.2.0_3.0_1700097709677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_base_v3_finetuned_korquad_ko_5.2.0_3.0_1700097709677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v3_finetuned_korquad","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_base_v3_finetuned_korquad","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.electra.base").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_base_v3_finetuned_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|419.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monologg/koelectra-base-v3-finetuned-korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_base_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_base_squad2_en.md new file mode 100644 index 000000000000..14300f5c4c64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_base_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English electra_qa_biom_base_squad2 BertForQuestionAnswering from sultan +author: John Snow Labs +name: electra_qa_biom_base_squad2 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_biom_base_squad2` is an English model originally trained by sultan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_biom_base_squad2_en_5.2.0_3.0_1700110050778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_biom_base_squad2_en_5.2.0_3.0_1700110050778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_biom_base_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_biom_base_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_biom_base_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/sultan/BioM-ELECTRA-Base-SQuAD2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_bioasq8b_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_bioasq8b_en.md new file mode 100644 index 000000000000..016378d6813c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_bioasq8b_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English electra_qa_biom_large_squad2_bioasq8b BertForQuestionAnswering from sultan +author: John Snow Labs +name: electra_qa_biom_large_squad2_bioasq8b +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_biom_large_squad2_bioasq8b` is an English model originally trained by sultan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_biom_large_squad2_bioasq8b_en_5.2.0_3.0_1700143012944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_biom_large_squad2_bioasq8b_en_5.2.0_3.0_1700143012944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_biom_large_squad2_bioasq8b","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_biom_large_squad2_bioasq8b", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_biom_large_squad2_bioasq8b| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/sultan/BioM-ELECTRA-Large-SQuAD2-BioASQ8B \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_en.md new file mode 100644 index 000000000000..b89a68215a81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biom_large_squad2_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English electra_qa_biom_large_squad2 BertForQuestionAnswering from sultan +author: John Snow Labs +name: electra_qa_biom_large_squad2 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_biom_large_squad2` is an English model originally trained by sultan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_biom_large_squad2_en_5.2.0_3.0_1700142216874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_biom_large_squad2_en_5.2.0_3.0_1700142216874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_biom_large_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_biom_large_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_biom_large_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/sultan/BioM-ELECTRA-Large-SQuAD2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biomedtra_small_spanish_squad2_es.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biomedtra_small_spanish_squad2_es.md new file mode 100644 index 000000000000..2addac899e43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_biomedtra_small_spanish_squad2_es.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Castilian, Spanish electra_qa_biomedtra_small_spanish_squad2 BertForQuestionAnswering from hackathon-pln-es +author: John Snow Labs +name: electra_qa_biomedtra_small_spanish_squad2 +date: 2023-11-16 +tags: [bert, es, open_source, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_biomedtra_small_spanish_squad2` is a Castilian, Spanish model originally trained by hackathon-pln-es. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_biomedtra_small_spanish_squad2_es_5.2.0_3.0_1700170121774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_biomedtra_small_spanish_squad2_es_5.2.0_3.0_1700170121774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_biomedtra_small_spanish_squad2","es") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_biomedtra_small_spanish_squad2", "es") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_biomedtra_small_spanish_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|51.2 MB| + +## References + +https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_dspfirst_finetuning_4_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_dspfirst_finetuning_4_en.md new file mode 100644 index 000000000000..9e1a9a2facdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_dspfirst_finetuning_4_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English electra_qa_dspfirst_finetuning_4 BertForQuestionAnswering from ptran74 +author: John Snow Labs +name: electra_qa_dspfirst_finetuning_4 +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra_qa_dspfirst_finetuning_4` is an English model originally trained by ptran74. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_dspfirst_finetuning_4_en_5.2.0_3.0_1700143360634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_dspfirst_finetuning_4_en_5.2.0_3.0_1700143360634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_dspfirst_finetuning_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("electra_qa_dspfirst_finetuning_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_dspfirst_finetuning_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/ptran74/DSPFirst-Finetuning-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_elctrafp_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_elctrafp_en.md new file mode 100644 index 000000000000..6766d581315d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_elctrafp_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from carlosserquen) +author: John Snow Labs +name: electra_qa_elctrafp +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electrafp` is an English model originally trained by `carlosserquen`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_elctrafp_en_5.2.0_3.0_1700171578206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_elctrafp_en_5.2.0_3.0_1700171578206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_elctrafp","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_elctrafp","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.electra.by_carlosserquen").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_elctrafp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/carlosserquen/electrafp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_enelpi_squad_tr.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_enelpi_squad_tr.md new file mode 100644 index 000000000000..65e3fda917d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_enelpi_squad_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish ElectraForQuestionAnswering model (from enelpi) +author: John Snow Labs +name: electra_qa_enelpi_squad +date: 2023-11-16 +tags: [tr, open_source, electra, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-tr-enelpi-squad-qa` is a Turkish model originally trained by `enelpi`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_enelpi_squad_tr_5.2.0_3.0_1700120180540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_enelpi_squad_tr_5.2.0_3.0_1700120180540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_enelpi_squad","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_enelpi_squad","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.squad.electra").predict("""Benim adım ne?|||Benim adım Clara ve Berkeley'de yaşıyorum.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_enelpi_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/enelpi/electra-tr-enelpi-squad-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_de.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_de.md new file mode 100644 index 000000000000..7a5115cccea6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_de.md @@ -0,0 +1,105 @@ +--- +layout: model +title: German ElectraForQuestionAnswering model (from deepset) +author: John Snow Labs +name: electra_qa_g_base_germanquad +date: 2023-11-16 +tags: [de, open_source, electra, question_answering, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gelectra-base-germanquad` is a German model originally trained by `deepset`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_g_base_germanquad_de_5.2.0_3.0_1700099310383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_g_base_germanquad_de_5.2.0_3.0_1700099310383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_g_base_germanquad","de") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Was ist mein Name?", "Mein Name ist Clara und ich lebe in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_g_base_germanquad","de") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Was ist mein Name?", "Mein Name ist Clara und ich lebe in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.answer_question.electra.base").predict("""Was ist mein Name?|||Mein Name ist Clara und ich lebe in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_g_base_germanquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|de| +|Size:|410.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/gelectra-base-germanquad +- https://deepset.ai/germanquad +- https://deepset.ai/german-bert +- https://github.com/deepset-ai/FARM +- https://github.com/deepset-ai/haystack/ +- https://haystack.deepset.ai/community/join \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_distilled_de.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_distilled_de.md new file mode 100644 index 000000000000..62ae0aa98151 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_g_base_germanquad_distilled_de.md @@ -0,0 +1,105 @@ +--- +layout: model +title: German ElectraForQuestionAnswering Distilled model (from deepset) +author: John Snow Labs +name: electra_qa_g_base_germanquad_distilled +date: 2023-11-16 +tags: [de, open_source, electra, question_answering, onnx] +task: Question Answering +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gelectra-base-germanquad-distilled` is a German model originally trained by `deepset`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_g_base_germanquad_distilled_de_5.2.0_3.0_1700100818667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_g_base_germanquad_distilled_de_5.2.0_3.0_1700100818667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_g_base_germanquad_distilled","de") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Was ist mein Name?", "Mein Name ist Clara und ich lebe in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_g_base_germanquad_distilled","de") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Was ist mein Name?", "Mein Name ist Clara und ich lebe in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.answer_question.electra.distilled_base").predict("""Was ist mein Name?|||Mein Name ist Clara und ich lebe in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_g_base_germanquad_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|de| +|Size:|410.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepset/gelectra-base-germanquad-distilled +- https://deepset.ai/germanquad +- https://deepset.ai/german-bert +- https://github.com/deepset-ai/FARM +- https://github.com/deepset-ai/haystack/ +- https://haystack.deepset.ai/community/join \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_base_discriminator_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_base_discriminator_squad_en.md new file mode 100644 index 000000000000..317c0f541e96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_base_discriminator_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from Palak) +author: John Snow Labs +name: electra_qa_google_base_discriminator_squad +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `google_electra-base-discriminator_squad` is an English model originally trained by `Palak`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_google_base_discriminator_squad_en_5.2.0_3.0_1700123716727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_google_base_discriminator_squad_en_5.2.0_3.0_1700123716727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_google_base_discriminator_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_google_base_discriminator_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.base.by_Palak").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_google_base_discriminator_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Palak/google_electra-base-discriminator_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_small_discriminator_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_small_discriminator_squad_en.md new file mode 100644 index 000000000000..aac16aeb267f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_google_small_discriminator_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering Small model (from Palak) +author: John Snow Labs +name: electra_qa_google_small_discriminator_squad +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `google_electra-small-discriminator_squad` is an English model originally trained by `Palak`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_google_small_discriminator_squad_en_5.2.0_3.0_1700124965933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_google_small_discriminator_squad_en_5.2.0_3.0_1700124965933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_google_small_discriminator_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_google_small_discriminator_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.small.by_Palak").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_google_small_discriminator_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Palak/google_electra-small-discriminator_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_hankzhong_small_discriminator_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_hankzhong_small_discriminator_finetuned_squad_en.md new file mode 100644 index 000000000000..897c5fab4d1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_hankzhong_small_discriminator_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from hankzhong) +author: John Snow Labs +name: electra_qa_hankzhong_small_discriminator_finetuned_squad +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-discriminator-finetuned-squad` is an English model originally trained by `hankzhong`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_hankzhong_small_discriminator_finetuned_squad_en_5.2.0_3.0_1700152242588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_hankzhong_small_discriminator_finetuned_squad_en_5.2.0_3.0_1700152242588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_hankzhong_small_discriminator_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_hankzhong_small_discriminator_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.small.by_hankzhong").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_hankzhong_small_discriminator_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hankzhong/electra-small-discriminator-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_klue_mrc_base_ko.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_klue_mrc_base_ko.md new file mode 100644 index 000000000000..19991a104b67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_klue_mrc_base_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering model (from seongju) +author: John Snow Labs +name: electra_qa_klue_mrc_base +date: 2023-11-16 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `klue-mrc-koelectra-base` is a Korean model originally trained by `seongju`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_klue_mrc_base_ko_5.2.0_3.0_1700100936675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_klue_mrc_base_ko_5.2.0_3.0_1700100936675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_klue_mrc_base","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_klue_mrc_base","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.klue.electra.base").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_klue_mrc_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|419.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/seongju/klue-mrc-koelectra-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_discriminator_finetuned_squad_2_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_discriminator_finetuned_squad_2_en.md new file mode 100644 index 000000000000..7d044fab1fbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_discriminator_finetuned_squad_2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from bdickson) Version-2 +author: John Snow Labs +name: electra_qa_small_discriminator_finetuned_squad_2 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-discriminator-finetuned-squad-finetuned-squad` is an English model originally trained by `bdickson`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_discriminator_finetuned_squad_2_en_5.2.0_3.0_1700175351592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_discriminator_finetuned_squad_2_en_5.2.0_3.0_1700175351592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_discriminator_finetuned_squad_2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_discriminator_finetuned_squad_2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.small_v2.by_bdickson").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_discriminator_finetuned_squad_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bdickson/electra-small-discriminator-finetuned-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv1_en.md new file mode 100644 index 000000000000..361b4356a237 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv1_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ElectraForQuestionAnswering Small model (from mrm8488) +author: John Snow Labs +name: electra_qa_small_finetuned_squadv1 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-finetuned-squadv1` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_finetuned_squadv1_en_5.2.0_3.0_1700102000508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_finetuned_squadv1_en_5.2.0_3.0_1700102000508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_finetuned_squadv1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_finetuned_squadv1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.small").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/electra-small-finetuned-squadv1 +- https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv2_en.md new file mode 100644 index 000000000000..404f385ec0eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_finetuned_squadv2_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English ElectraForQuestionAnswering Small model (from mrm8488) Version-2 +author: John Snow Labs +name: electra_qa_small_finetuned_squadv2 +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-finetuned-squadv2` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_finetuned_squadv2_en_5.2.0_3.0_1700176800464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_finetuned_squadv2_en_5.2.0_3.0_1700176800464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_finetuned_squadv2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_finetuned_squadv2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.electra.small_v2").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/electra-small-finetuned-squadv2 +- https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_turkish_uncased_discriminator_finetuned_tr.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_turkish_uncased_discriminator_finetuned_tr.md new file mode 100644 index 000000000000..4bed1343e6a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_small_turkish_uncased_discriminator_finetuned_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish ElectraForQuestionAnswering model (from husnu) +author: John Snow Labs +name: electra_qa_small_turkish_uncased_discriminator_finetuned +date: 2023-11-16 +tags: [tr, open_source, electra, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-turkish-uncased-discriminator-finetuned_lr-2e-05_epochs-6` is a Turkish model originally trained by `husnu`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_turkish_uncased_discriminator_finetuned_tr_5.2.0_3.0_1700106346111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_turkish_uncased_discriminator_finetuned_tr_5.2.0_3.0_1700106346111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_turkish_uncased_discriminator_finetuned","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(False) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_turkish_uncased_discriminator_finetuned","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(false) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.electra.small_uncased").predict("""Benim adım ne?|||Benim adım Clara ve Berkeley'de yaşıyorum.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_turkish_uncased_discriminator_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|51.6 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/husnu/electra-small-turkish-uncased-discriminator-finetuned_lr-2e-05_epochs-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_squad_slp_en.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_squad_slp_en.md new file mode 100644 index 000000000000..929d7b39f188 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_squad_slp_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from rowan1224) Squad +author: John Snow Labs +name: electra_qa_squad_slp +date: 2023-11-16 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-squad-slp` is an English model originally trained by `rowan1224`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_squad_slp_en_5.2.0_3.0_1700108220754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_squad_slp_en_5.2.0_3.0_1700108220754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_squad_slp","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_squad_slp","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_squad_slp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rowan1224/electra-squad-slp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-electra_qa_turkish_tr.md b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_turkish_tr.md new file mode 100644 index 000000000000..fccd5caaabf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-electra_qa_turkish_tr.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Turkish ElectraForQuestionAnswering model (from kuzgunlar) +author: John Snow Labs +name: electra_qa_turkish +date: 2023-11-16 +tags: [tr, open_source, electra, question_answering, onnx] +task: Question Answering +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-turkish-qa` is a Turkish model originally trained by `kuzgunlar`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_turkish_tr_5.2.0_3.0_1700104330871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_turkish_tr_5.2.0_3.0_1700104330871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_turkish","tr") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer") \ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_turkish","tr") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("Benim adım ne?", "Benim adım Clara ve Berkeley'de yaşıyorum.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.answer_question.electra").predict("""Benim adım ne?|||Benim adım Clara ve Berkeley'de yaşıyorum.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|tr| +|Size:|412.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/kuzgunlar/electra-turkish-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-extractive_reader_nq_en.md b/docs/_posts/ahmedlone127/2023-11-16-extractive_reader_nq_en.md new file mode 100644 index 000000000000..f923acc622fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-extractive_reader_nq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English extractive_reader_nq BertForQuestionAnswering from ToluClassics +author: John Snow Labs +name: extractive_reader_nq +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extractive_reader_nq` is an English model originally trained by ToluClassics. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/extractive_reader_nq_en_5.2.0_3.0_1700175915496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/extractive_reader_nq_en_5.2.0_3.0_1700175915496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("extractive_reader_nq","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("extractive_reader_nq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|extractive_reader_nq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| + +## References + +https://huggingface.co/ToluClassics/extractive_reader_nq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-hubert_qa_milqa_hu.md b/docs/_posts/ahmedlone127/2023-11-16-hubert_qa_milqa_hu.md new file mode 100644 index 000000000000..32ac335ed78e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-hubert_qa_milqa_hu.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hungarian hubert_qa_milqa BertForQuestionAnswering from ZTamas +author: John Snow Labs +name: hubert_qa_milqa +date: 2023-11-16 +tags: [bert, hu, open_source, question_answering, onnx] +task: Question Answering +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hubert_qa_milqa` is a Hungarian model originally trained by ZTamas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hubert_qa_milqa_hu_5.2.0_3.0_1700113413446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hubert_qa_milqa_hu_5.2.0_3.0_1700113413446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("hubert_qa_milqa","hu") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("hubert_qa_milqa", "hu") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hubert_qa_milqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|hu| +|Size:|412.4 MB| + +## References + +https://huggingface.co/ZTamas/hubert-qa-milqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-hw1_span_selection_wwm_ext_en.md b/docs/_posts/ahmedlone127/2023-11-16-hw1_span_selection_wwm_ext_en.md new file mode 100644 index 000000000000..05394b3c3b09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-hw1_span_selection_wwm_ext_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_span_selection_wwm_ext BertForQuestionAnswering from kyle0518 +author: John Snow Labs +name: hw1_span_selection_wwm_ext +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_span_selection_wwm_ext` is an English model originally trained by kyle0518. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_span_selection_wwm_ext_en_5.2.0_3.0_1700122263353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_span_selection_wwm_ext_en_5.2.0_3.0_1700122263353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("hw1_span_selection_wwm_ext","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("hw1_span_selection_wwm_ext", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_span_selection_wwm_ext| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/kyle0518/HW1_span_selection_wwm_ext \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-ia_llama_en.md b/docs/_posts/ahmedlone127/2023-11-16-ia_llama_en.md new file mode 100644 index 000000000000..9ac7ace29b37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-ia_llama_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ia_llama BertForQuestionAnswering from nordGARA +author: John Snow Labs +name: ia_llama +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ia_llama` is an English model originally trained by nordGARA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ia_llama_en_5.2.0_3.0_1700177131200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ia_llama_en_5.2.0_3.0_1700177131200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ia_llama","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ia_llama", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ia_llama| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/nordGARA/IA-LLAMA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-indobert_examqa_id.md b/docs/_posts/ahmedlone127/2023-11-16-indobert_examqa_id.md new file mode 100644 index 000000000000..aa2c71bc5bac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-indobert_examqa_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian indobert_examqa BertForQuestionAnswering from sinu +author: John Snow Labs +name: indobert_examqa +date: 2023-11-16 +tags: [bert, id, open_source, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_examqa` is an Indonesian model originally trained by sinu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_examqa_id_5.2.0_3.0_1700124569567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_examqa_id_5.2.0_3.0_1700124569567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("indobert_examqa","id") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("indobert_examqa", "id") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_examqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|411.7 MB| + +## References + +https://huggingface.co/sinu/IndoBERT-ExamQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-indobert_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-indobert_squad_en.md new file mode 100644 index 000000000000..51b85e3c6a59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-indobert_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indobert_squad BertForQuestionAnswering from esakrissa +author: John Snow Labs +name: indobert_squad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_squad` is an English model originally trained by esakrissa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_squad_en_5.2.0_3.0_1700153650731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_squad_en_5.2.0_3.0_1700153650731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("indobert_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("indobert_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|411.7 MB| + +## References + +https://huggingface.co/esakrissa/IndoBERT-SQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-ixambert_finetuned_squad_basque_english_en.md b/docs/_posts/ahmedlone127/2023-11-16-ixambert_finetuned_squad_basque_english_en.md new file mode 100644 index 000000000000..8d49d33b5f49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-ixambert_finetuned_squad_basque_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ixambert_finetuned_squad_basque_english BertForQuestionAnswering from MarcBrun +author: John Snow Labs +name: ixambert_finetuned_squad_basque_english +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ixambert_finetuned_squad_basque_english` is an English model originally trained by MarcBrun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ixambert_finetuned_squad_basque_english_en_5.2.0_3.0_1700169065920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ixambert_finetuned_squad_basque_english_en_5.2.0_3.0_1700169065920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ixambert_finetuned_squad_basque_english","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ixambert_finetuned_squad_basque_english", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ixambert_finetuned_squad_basque_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|661.1 MB| + +## References + +https://huggingface.co/MarcBrun/ixambert-finetuned-squad-eu-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-kazakhbertmulti_squad_kaz_en.md b/docs/_posts/ahmedlone127/2023-11-16-kazakhbertmulti_squad_kaz_en.md new file mode 100644 index 000000000000..7cdf5f415d3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-kazakhbertmulti_squad_kaz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kazakhbertmulti_squad_kaz BertForQuestionAnswering from Kyrmasch +author: John Snow Labs +name: kazakhbertmulti_squad_kaz +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kazakhbertmulti_squad_kaz` is an English model originally trained by Kyrmasch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kazakhbertmulti_squad_kaz_en_5.2.0_3.0_1700124839732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kazakhbertmulti_squad_kaz_en_5.2.0_3.0_1700124839732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is a DataFrame with "question" and "context" string columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("kazakhbertmulti_squad_kaz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is a DataFrame with "question" and "context" string columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("kazakhbertmulti_squad_kaz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kazakhbertmulti_squad_kaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|609.9 MB| + +## References + +https://huggingface.co/Kyrmasch/KazakhBERTmulti-SQUAD-kaz \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-klue_finetuned_squad_kor_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-16-klue_finetuned_squad_kor_v1_ko.md new file mode 100644 index 000000000000..8cf59680fd81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-klue_finetuned_squad_kor_v1_ko.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Korean klue_finetuned_squad_kor_v1 BertForQuestionAnswering from Kdogs +author: John Snow Labs +name: klue_finetuned_squad_kor_v1 +date: 2023-11-16 +tags: [bert, ko, open_source, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`klue_finetuned_squad_kor_v1` is a Korean model originally trained by Kdogs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700111722409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700111722409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("klue_finetuned_squad_kor_v1","ko") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("klue_finetuned_squad_kor_v1", "ko") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_finetuned_squad_kor_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|412.4 MB| + +## References + +https://huggingface.co/Kdogs/klue-finetuned-squad_kor_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-legal_bert_base_cuad_en.md b/docs/_posts/ahmedlone127/2023-11-16-legal_bert_base_cuad_en.md new file mode 100644 index 000000000000..5a394cccbd67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-legal_bert_base_cuad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English legal_bert_base_cuad BertForQuestionAnswering from alex-apostolo +author: John Snow Labs +name: legal_bert_base_cuad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`legal_bert_base_cuad` is a English model originally trained by alex-apostolo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_bert_base_cuad_en_5.2.0_3.0_1700118130405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_bert_base_cuad_en_5.2.0_3.0_1700118130405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("legal_bert_base_cuad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("legal_bert_base_cuad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_bert_base_cuad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/alex-apostolo/legal-bert-base-cuad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-matbert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-matbert_finetuned_squad_en.md new file mode 100644 index 000000000000..1382b38ea23e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-matbert_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English matbert_finetuned_squad BertForQuestionAnswering from HongyangLi +author: John Snow Labs +name: matbert_finetuned_squad +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`matbert_finetuned_squad` is a English model originally trained by HongyangLi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/matbert_finetuned_squad_en_5.2.0_3.0_1700151674563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/matbert_finetuned_squad_en_5.2.0_3.0_1700151674563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("matbert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("matbert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|matbert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/HongyangLi/Matbert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-mbert_squad2_webis_id.md b/docs/_posts/ahmedlone127/2023-11-16-mbert_squad2_webis_id.md new file mode 100644 index 000000000000..a96752fb36ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-mbert_squad2_webis_id.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Indonesian mbert_squad2_webis BertForQuestionAnswering from intanm +author: John Snow Labs +name: mbert_squad2_webis +date: 2023-11-16 +tags: [bert, id, open_source, question_answering, onnx] +task: Question Answering +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbert_squad2_webis` is a Indonesian model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_squad2_webis_id_5.2.0_3.0_1700175351684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_squad2_webis_id_5.2.0_3.0_1700175351684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("mbert_squad2_webis","id") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("mbert_squad2_webis", "id") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_squad2_webis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|id| +|Size:|665.0 MB| + +## References + +https://huggingface.co/intanm/mbert-squad2-webis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-mbert_squad_en.md b/docs/_posts/ahmedlone127/2023-11-16-mbert_squad_en.md new file mode 100644 index 000000000000..9f188f786772 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-mbert_squad_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English mbert_squad BertEmbeddings from oceanpty +author: John Snow Labs +name: mbert_squad +date: 2023-11-16 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mbert_squad` is a English model originally trained by oceanpty. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_squad_en_5.2.0_3.0_1700158280562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_squad_en_5.2.0_3.0_1700158280562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("mbert_squad","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("mbert_squad", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +References + +https://huggingface.co/oceanpty/mbert-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_bert_en.md b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_bert_en.md new file mode 100644 index 000000000000..f6ab6624614d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ntu_adl_span_selection_bert BertForQuestionAnswering from xjlulu +author: John Snow Labs +name: ntu_adl_span_selection_bert +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ntu_adl_span_selection_bert` is a English model originally trained by xjlulu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_bert_en_5.2.0_3.0_1700161797686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_bert_en_5.2.0_3.0_1700161797686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ntu_adl_span_selection_bert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ntu_adl_span_selection_bert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ntu_adl_span_selection_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/xjlulu/ntu_adl_span_selection_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_macbert_en.md b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_macbert_en.md new file mode 100644 index 000000000000..2e9f16ddd931 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_macbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ntu_adl_span_selection_macbert BertForQuestionAnswering from xjlulu +author: John Snow Labs +name: ntu_adl_span_selection_macbert +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ntu_adl_span_selection_macbert` is a English model originally trained by xjlulu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_macbert_en_5.2.0_3.0_1700170463920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_macbert_en_5.2.0_3.0_1700170463920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ntu_adl_span_selection_macbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ntu_adl_span_selection_macbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ntu_adl_span_selection_macbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/xjlulu/ntu_adl_span_selection_macbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_roberta_en.md new file mode 100644 index 000000000000..1ff142ed2316 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-ntu_adl_span_selection_roberta_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ntu_adl_span_selection_roberta BertForQuestionAnswering from xjlulu +author: John Snow Labs +name: ntu_adl_span_selection_roberta +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ntu_adl_span_selection_roberta` is a English model originally trained by xjlulu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_roberta_en_5.2.0_3.0_1700112855562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_roberta_en_5.2.0_3.0_1700112855562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ntu_adl_span_selection_roberta","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ntu_adl_span_selection_roberta", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ntu_adl_span_selection_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/xjlulu/ntu_adl_span_selection_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-persian_qa_bert_v1_fa.md b/docs/_posts/ahmedlone127/2023-11-16-persian_qa_bert_v1_fa.md new file mode 100644 index 000000000000..2aa4d6bf14aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-persian_qa_bert_v1_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian persian_qa_bert_v1 BertForQuestionAnswering from SeyedAli +author: John Snow Labs +name: persian_qa_bert_v1 +date: 2023-11-16 +tags: [bert, fa, open_source, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`persian_qa_bert_v1` is a Persian model originally trained by SeyedAli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/persian_qa_bert_v1_fa_5.2.0_3.0_1700119804478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/persian_qa_bert_v1_fa_5.2.0_3.0_1700119804478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("persian_qa_bert_v1","fa") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("persian_qa_bert_v1", "fa") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|persian_qa_bert_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|606.5 MB| + +## References + +https://huggingface.co/SeyedAli/Persian-QA-Bert-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-qthang_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-16-qthang_finetuned_en.md new file mode 100644 index 000000000000..5d02f29121b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-qthang_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qthang_finetuned BertForQuestionAnswering from ThangDinh +author: John Snow Labs +name: qthang_finetuned +date: 2023-11-16 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`qthang_finetuned` is a English model originally trained by ThangDinh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qthang_finetuned_en_5.2.0_3.0_1700154874434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qthang_finetuned_en_5.2.0_3.0_1700154874434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("qthang_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("qthang_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qthang_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/ThangDinh/qthang-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-16-question_answering_arabert_xtreme_arabic_ar.md b/docs/_posts/ahmedlone127/2023-11-16-question_answering_arabert_xtreme_arabic_ar.md new file mode 100644 index 000000000000..aa9ad3eec4e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-16-question_answering_arabert_xtreme_arabic_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic question_answering_arabert_xtreme_arabic BertForQuestionAnswering from MMars +author: John Snow Labs +name: question_answering_arabert_xtreme_arabic +date: 2023-11-16 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`question_answering_arabert_xtreme_arabic` is a Arabic model originally trained by MMars. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_arabert_xtreme_arabic_ar_5.2.0_3.0_1700177599845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_arabert_xtreme_arabic_ar_5.2.0_3.0_1700177599845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("question_answering_arabert_xtreme_arabic","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("question_answering_arabert_xtreme_arabic", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|question_answering_arabert_xtreme_arabic|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|ar|
+|Size:|504.8 MB|
+
+## References
+
+https://huggingface.co/MMars/Question_Answering_AraBERT_xtreme_ar
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-rinna_arabert_qa_ar2_en.md b/docs/_posts/ahmedlone127/2023-11-16-rinna_arabert_qa_ar2_en.md
new file mode 100644
index 000000000000..8b3589adaf57
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-rinna_arabert_qa_ar2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English rinna_arabert_qa_ar2 BertForQuestionAnswering from Echiguerkh
+author: John Snow Labs
+name: rinna_arabert_qa_ar2
+date: 2023-11-16
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rinna_arabert_qa_ar2` is an English model originally trained by Echiguerkh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rinna_arabert_qa_ar2_en_5.2.0_3.0_1700169044586.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rinna_arabert_qa_ar2_en_5.2.0_3.0_1700169044586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("rinna_arabert_qa_ar2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("rinna_arabert_qa_ar2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|rinna_arabert_qa_ar2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|504.9 MB|
+
+## References
+
+https://huggingface.co/Echiguerkh/rinna-arabert-qa-ar2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-shivamqa_en.md b/docs/_posts/ahmedlone127/2023-11-16-shivamqa_en.md
new file mode 100644
index 000000000000..2cc4897429d4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-shivamqa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English shivamqa BertForQuestionAnswering from Shivam22182
+author: John Snow Labs
+name: shivamqa
+date: 2023-11-16
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shivamqa` is an English model originally trained by Shivam22182.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shivamqa_en_5.2.0_3.0_1700165128883.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shivamqa_en_5.2.0_3.0_1700165128883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("shivamqa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("shivamqa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|shivamqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/Shivam22182/ShivamQA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-sqoin_qa_model_first_en.md b/docs/_posts/ahmedlone127/2023-11-16-sqoin_qa_model_first_en.md
new file mode 100644
index 000000000000..bdf1075290b2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-sqoin_qa_model_first_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English sqoin_qa_model_first BertForQuestionAnswering from Ryan20
+author: John Snow Labs
+name: sqoin_qa_model_first
+date: 2023-11-16
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sqoin_qa_model_first` is an English model originally trained by Ryan20.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sqoin_qa_model_first_en_5.2.0_3.0_1700108770704.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sqoin_qa_model_first_en_5.2.0_3.0_1700108770704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("sqoin_qa_model_first","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("sqoin_qa_model_first", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sqoin_qa_model_first|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|405.0 MB|
+
+## References
+
+https://huggingface.co/Ryan20/sqoin_qa_model_first
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-stackoverflow_qa_en.md b/docs/_posts/ahmedlone127/2023-11-16-stackoverflow_qa_en.md
new file mode 100644
index 000000000000..612034c338be
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-stackoverflow_qa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English stackoverflow_qa BertForQuestionAnswering from pacovaldez
+author: John Snow Labs
+name: stackoverflow_qa
+date: 2023-11-16
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stackoverflow_qa` is an English model originally trained by pacovaldez.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stackoverflow_qa_en_5.2.0_3.0_1700172479295.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stackoverflow_qa_en_5.2.0_3.0_1700172479295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("stackoverflow_qa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("stackoverflow_qa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|stackoverflow_qa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|403.6 MB|
+
+## References
+
+https://huggingface.co/pacovaldez/stackoverflow-qa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-wspalign_mbert_base_xx.md b/docs/_posts/ahmedlone127/2023-11-16-wspalign_mbert_base_xx.md
new file mode 100644
index 000000000000..0ad8337dac86
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-wspalign_mbert_base_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual wspalign_mbert_base BertForQuestionAnswering from qiyuw
+author: John Snow Labs
+name: wspalign_mbert_base
+date: 2023-11-16
+tags: [bert, xx, open_source, question_answering, onnx]
+task: Question Answering
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wspalign_mbert_base` is a Multilingual model originally trained by qiyuw.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wspalign_mbert_base_xx_5.2.0_3.0_1700155632305.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wspalign_mbert_base_xx_5.2.0_3.0_1700155632305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("wspalign_mbert_base","xx") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("wspalign_mbert_base", "xx")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wspalign_mbert_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|xx|
+|Size:|665.0 MB|
+
+## References
+
+https://huggingface.co/qiyuw/WSPAlign-mbert-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-16-xtremeqa_arabic_ar.md b/docs/_posts/ahmedlone127/2023-11-16-xtremeqa_arabic_ar.md
new file mode 100644
index 000000000000..71dc9912ba9e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-16-xtremeqa_arabic_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic xtremeqa_arabic BertForQuestionAnswering from abdalrahmanshahrour
+author: John Snow Labs
+name: xtremeqa_arabic
+date: 2023-11-16
+tags: [bert, ar, open_source, question_answering, onnx]
+task: Question Answering
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `xtremeqa_arabic` is an Arabic model originally trained by abdalrahmanshahrour.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xtremeqa_arabic_ar_5.2.0_3.0_1700165065844.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xtremeqa_arabic_ar_5.2.0_3.0_1700165065844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("xtremeqa_arabic","ar") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("xtremeqa_arabic", "ar")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|xtremeqa_arabic|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|ar|
+|Size:|84.2 MB|
+
+## References
+
+https://huggingface.co/abdalrahmanshahrour/xtremeQA-ar
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-17-adl2023_hw1_span_selection_en.md b/docs/_posts/ahmedlone127/2023-11-17-adl2023_hw1_span_selection_en.md
new file mode 100644
index 000000000000..3df7e16584d4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-17-adl2023_hw1_span_selection_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English adl2023_hw1_span_selection BertForQuestionAnswering from dean22029
+author: John Snow Labs
+name: adl2023_hw1_span_selection
+date: 2023-11-17
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adl2023_hw1_span_selection` is an English model originally trained by dean22029.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adl2023_hw1_span_selection_en_5.2.0_3.0_1700201995540.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adl2023_hw1_span_selection_en_5.2.0_3.0_1700201995540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("adl2023_hw1_span_selection","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("adl2023_hw1_span_selection", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|adl2023_hw1_span_selection|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/dean22029/adl2023_hw1_span_selection
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-17-alephbertgimmel_finetuned_parashootandheq_he.md b/docs/_posts/ahmedlone127/2023-11-17-alephbertgimmel_finetuned_parashootandheq_he.md
new file mode 100644
index 000000000000..7ea745d3a52a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-17-alephbertgimmel_finetuned_parashootandheq_he.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Hebrew alephbertgimmel_finetuned_parashootandheq BertForQuestionAnswering from juniam
+author: John Snow Labs
+name: alephbertgimmel_finetuned_parashootandheq
+date: 2023-11-17
+tags: [bert, he, open_source, question_answering, onnx]
+task: Question Answering
+language: he
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `alephbertgimmel_finetuned_parashootandheq` is a Hebrew model originally trained by juniam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alephbertgimmel_finetuned_parashootandheq_he_5.2.0_3.0_1700216172752.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alephbertgimmel_finetuned_parashootandheq_he_5.2.0_3.0_1700216172752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("alephbertgimmel_finetuned_parashootandheq","he") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("alephbertgimmel_finetuned_parashootandheq", "he")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|alephbertgimmel_finetuned_parashootandheq|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|he|
+|Size:|690.4 MB|
+
+## References
+
+https://huggingface.co/juniam/alephbertgimmel-finetuned-parashootandHeQ
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_all_en.md b/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_all_en.md
new file mode 100644
index 000000000000..f117aba8ed57
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_all_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English all_minilm_l12_v2_qa_all BertForQuestionAnswering from LLukas22
+author: John Snow Labs
+name: all_minilm_l12_v2_qa_all
+date: 2023-11-17
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_minilm_l12_v2_qa_all` is an English model originally trained by LLukas22.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_minilm_l12_v2_qa_all_en_5.2.0_3.0_1700212952543.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_minilm_l12_v2_qa_all_en_5.2.0_3.0_1700212952543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("all_minilm_l12_v2_qa_all","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("all_minilm_l12_v2_qa_all", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|all_minilm_l12_v2_qa_all|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|124.1 MB|
+
+## References
+
+https://huggingface.co/LLukas22/all-MiniLM-L12-v2-qa-all
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_english_en.md b/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_english_en.md
new file mode 100644
index 000000000000..82fe1c5af082
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-17-all_minilm_l12_v2_qa_english_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English all_minilm_l12_v2_qa_english BertForQuestionAnswering from LLukas22
+author: John Snow Labs
+name: all_minilm_l12_v2_qa_english
+date: 2023-11-17
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_minilm_l12_v2_qa_english` is an English model originally trained by LLukas22.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_minilm_l12_v2_qa_english_en_5.2.0_3.0_1700191610001.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_minilm_l12_v2_qa_english_en_5.2.0_3.0_1700191610001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("all_minilm_l12_v2_qa_english","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("all_minilm_l12_v2_qa_english", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|all_minilm_l12_v2_qa_english|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|124.1 MB|
+
+## References
+
+https://huggingface.co/LLukas22/all-MiniLM-L12-v2-qa-en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-17-autotrain_postgres_relational_database_40784105522_en.md b/docs/_posts/ahmedlone127/2023-11-17-autotrain_postgres_relational_database_40784105522_en.md
new file mode 100644
index 000000000000..cbb01475f23c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-17-autotrain_postgres_relational_database_40784105522_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_postgres_relational_database_40784105522 BertForQuestionAnswering from Saumils
+author: John Snow Labs
+name: autotrain_postgres_relational_database_40784105522
+date: 2023-11-17
+tags: [bert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_postgres_relational_database_40784105522` is an English model originally trained by Saumils.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_postgres_relational_database_40784105522_en_5.2.0_3.0_1700220777544.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_postgres_relational_database_40784105522_en_5.2.0_3.0_1700220777544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = BertForQuestionAnswering.pretrained("autotrain_postgres_relational_database_40784105522","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering
+    .pretrained("autotrain_postgres_relational_database_40784105522", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_postgres_relational_database_40784105522| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Saumils/autotrain-postgres-relational-database-40784105522 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-autotrain_robertaqanda_99403147318_en.md b/docs/_posts/ahmedlone127/2023-11-17-autotrain_robertaqanda_99403147318_en.md new file mode 100644 index 000000000000..e785947bb65d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-autotrain_robertaqanda_99403147318_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_robertaqanda_99403147318 BertForQuestionAnswering from Samis922 +author: John Snow Labs +name: autotrain_robertaqanda_99403147318 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_robertaqanda_99403147318` is an English model originally trained by Samis922. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_robertaqanda_99403147318_en_5.2.0_3.0_1700203027511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_robertaqanda_99403147318_en_5.2.0_3.0_1700203027511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("autotrain_robertaqanda_99403147318","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("autotrain_robertaqanda_99403147318", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_robertaqanda_99403147318| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Samis922/autotrain-robertaqanda-99403147318 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_10_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_10_en.md new file mode 100644 index 000000000000..55ac32c1731f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_10 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_10 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_10` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_10_en_5.2.0_3.0_1700186903229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_10_en_5.2.0_3.0_1700186903229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_13_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_13_en.md new file mode 100644 index 000000000000..aaa6d529fba9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_13 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_13 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_13` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_13_en_5.2.0_3.0_1700217203515.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_13_en_5.2.0_3.0_1700217203515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_13","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_13", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_21_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_21_en.md new file mode 100644 index 000000000000..73c474eaf841 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_21_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_21 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_21 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_21` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_21_en_5.2.0_3.0_1700196602657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_21_en_5.2.0_3.0_1700196602657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_21","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_21", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-21 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_30_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_30_en.md new file mode 100644 index 000000000000..b6b42e493c61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_30 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_30 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_30` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_30_en_5.2.0_3.0_1700208805027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_30_en_5.2.0_3.0_1700208805027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_30","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_30", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-30 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_31_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_31_en.md new file mode 100644 index 000000000000..a6df8bc322ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_31_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_31 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_31 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_31` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_31_en_5.2.0_3.0_1700180456553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_31_en_5.2.0_3.0_1700180456553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_31","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_31", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_31| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-31 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_cased_healthdemomodel_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_cased_healthdemomodel_en.md new file mode 100644 index 000000000000..a62ba3ee4d31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_cased_healthdemomodel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_cased_healthdemomodel BertForQuestionAnswering from pythonist +author: John Snow Labs +name: bert_base_cased_healthdemomodel +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_cased_healthdemomodel` is an English model originally trained by pythonist. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_cased_healthdemomodel_en_5.2.0_3.0_1700180576373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_cased_healthdemomodel_en_5.2.0_3.0_1700180576373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_cased_healthdemomodel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_cased_healthdemomodel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_cased_healthdemomodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/pythonist/bert-base-cased-healthdemomodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_10_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_10_en.md new file mode 100644 index 000000000000..a825a5e290b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_qa_b8_10 BertForQuestionAnswering from sharkMeow +author: John Snow Labs +name: bert_base_chinese_finetuned_qa_b8_10 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_qa_b8_10` is an English model originally trained by sharkMeow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_qa_b8_10_en_5.2.0_3.0_1700204233166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_qa_b8_10_en_5.2.0_3.0_1700204233166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_chinese_finetuned_qa_b8_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_chinese_finetuned_qa_b8_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_qa_b8_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/sharkMeow/bert-base-chinese-finetuned-QA-b8-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_en.md new file mode 100644 index 000000000000..0f6cf3e8f236 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_chinese_finetuned_qa_b8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_chinese_finetuned_qa_b8 BertForQuestionAnswering from sharkMeow +author: John Snow Labs +name: bert_base_chinese_finetuned_qa_b8 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_chinese_finetuned_qa_b8` is an English model originally trained by sharkMeow. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_qa_b8_en_5.2.0_3.0_1700185572697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_chinese_finetuned_qa_b8_en_5.2.0_3.0_1700185572697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_chinese_finetuned_qa_b8","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_chinese_finetuned_qa_b8", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_chinese_finetuned_qa_b8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/sharkMeow/bert-base-chinese-finetuned-QA-b8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_finnish_cased_squad2_fi.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_finnish_cased_squad2_fi.md new file mode 100644 index 000000000000..e283cc510477 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_finnish_cased_squad2_fi.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Finnish bert_base_finnish_cased_squad2 BertForQuestionAnswering from TurkuNLP +author: John Snow Labs +name: bert_base_finnish_cased_squad2 +date: 2023-11-17 +tags: [bert, fi, open_source, question_answering, onnx] +task: Question Answering +language: fi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_finnish_cased_squad2` is a Finnish model originally trained by TurkuNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_finnish_cased_squad2_fi_5.2.0_3.0_1700182362083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_finnish_cased_squad2_fi_5.2.0_3.0_1700182362083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_finnish_cased_squad2","fi") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_finnish_cased_squad2", "fi") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_finnish_cased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fi| +|Size:|464.7 MB| + +## References + +https://huggingface.co/TurkuNLP/bert-base-finnish-cased-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_mrqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_mrqa_en.md new file mode 100644 index 000000000000..ea74797a3abe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_mrqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_mrqa BertForQuestionAnswering from VMware +author: John Snow Labs +name: bert_base_mrqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_mrqa` is an English model originally trained by VMware. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_mrqa_en_5.2.0_3.0_1700188428332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_mrqa_en_5.2.0_3.0_1700188428332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_mrqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_mrqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_mrqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/VMware/bert-base-mrqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_cased_finetuned_squad_finetuned_squad_xx.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_cased_finetuned_squad_finetuned_squad_xx.md new file mode 100644 index 000000000000..0215958a4974 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_cased_finetuned_squad_finetuned_squad_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_cased_finetuned_squad_finetuned_squad BertForQuestionAnswering from JensH +author: John Snow Labs +name: bert_base_multilingual_cased_finetuned_squad_finetuned_squad +date: 2023-11-17 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_cased_finetuned_squad_finetuned_squad` is a multilingual model originally trained by JensH.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_squad_finetuned_squad_xx_5.2.0_3.0_1700217207998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_cased_finetuned_squad_finetuned_squad_xx_5.2.0_3.0_1700217207998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_multilingual_cased_finetuned_squad_finetuned_squad","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_multilingual_cased_finetuned_squad_finetuned_squad", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_cased_finetuned_squad_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.0 MB| + +## References + +https://huggingface.co/JensH/bert-base-multilingual-cased-finetuned-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_uncased_finetuned_squad_alex31y_xx.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_uncased_finetuned_squad_alex31y_xx.md new file mode 100644 index 000000000000..1ae8211387c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_multilingual_uncased_finetuned_squad_alex31y_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_base_multilingual_uncased_finetuned_squad_alex31y BertForQuestionAnswering from Alex31y +author: John Snow Labs +name: bert_base_multilingual_uncased_finetuned_squad_alex31y +date: 2023-11-17 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_multilingual_uncased_finetuned_squad_alex31y` is a multilingual model originally trained by Alex31y.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_finetuned_squad_alex31y_xx_5.2.0_3.0_1700211897912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_multilingual_uncased_finetuned_squad_alex31y_xx_5.2.0_3.0_1700211897912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_multilingual_uncased_finetuned_squad_alex31y","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_multilingual_uncased_finetuned_squad_alex31y", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_multilingual_uncased_finetuned_squad_alex31y| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/Alex31y/bert-base-multilingual-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_squad_spanish_tfm_4_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_squad_spanish_tfm_4_question_answering_en.md new file mode 100644 index 000000000000..a3ea9c2f8480 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_squad_spanish_tfm_4_question_answering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_squad_spanish_tfm_4_question_answering BertForQuestionAnswering from JoelVIU +author: John Snow Labs +name: bert_base_spanish_squad_spanish_tfm_4_question_answering +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_squad_spanish_tfm_4_question_answering` is an English model originally trained by JoelVIU.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_squad_spanish_tfm_4_question_answering_en_5.2.0_3.0_1700182364908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_squad_spanish_tfm_4_question_answering_en_5.2.0_3.0_1700182364908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_spanish_squad_spanish_tfm_4_question_answering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_spanish_squad_spanish_tfm_4_question_answering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_squad_spanish_tfm_4_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/JoelVIU/bert-base-spanish_squad_es-TFM_4-Question-Answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_cased_finetuned_quales_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_cased_finetuned_quales_en.md new file mode 100644 index 000000000000..2aab60e6740c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_cased_finetuned_quales_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_cased_finetuned_quales BertForQuestionAnswering from luischir +author: John Snow Labs +name: bert_base_spanish_wwm_cased_finetuned_quales +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_cased_finetuned_quales` is an English model originally trained by luischir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_quales_en_5.2.0_3.0_1700188118106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_cased_finetuned_quales_en_5.2.0_3.0_1700188118106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_spanish_wwm_cased_finetuned_quales","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_spanish_wwm_cased_finetuned_quales", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_cased_finetuned_quales| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.5 MB| + +## References + +https://huggingface.co/luischir/bert-base-spanish-wwm-cased-finetuned-quales \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_en.md new file mode 100644 index 000000000000..beb8ff8aa97c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_spanish_wwm_uncased_finetuned_qa_mlqa BertForQuestionAnswering from dccuchile +author: John Snow Labs +name: bert_base_spanish_wwm_uncased_finetuned_qa_mlqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_spanish_wwm_uncased_finetuned_qa_mlqa` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_en_5.2.0_3.0_1700186827331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_spanish_wwm_uncased_finetuned_qa_mlqa_en_5.2.0_3.0_1700186827331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_spanish_wwm_uncased_finetuned_qa_mlqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_spanish_wwm_uncased_finetuned_qa_mlqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_spanish_wwm_uncased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.6 MB| + +## References + +https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_coqa_willheld_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_coqa_willheld_en.md new file mode 100644 index 000000000000..91a2f309ca68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_coqa_willheld_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_coqa_willheld BertForQuestionAnswering from WillHeld +author: John Snow Labs +name: bert_base_uncased_coqa_willheld +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_coqa_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_coqa_willheld_en_5.2.0_3.0_1700208569250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_coqa_willheld_en_5.2.0_3.0_1700208569250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_coqa_willheld","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_coqa_willheld", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_coqa_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/WillHeld/bert-base-uncased-coqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_en.md new file mode 100644 index 000000000000..da90423292bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0` is an English model originally trained by danielkty22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_en_5.2.0_3.0_1700190329192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0_en_5.2.0_3.0_1700190329192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_ep_10_0_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_true_fh_false_hs_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-ep-10.0-b-32-lr-8e-07-dp-0.5-ss-0-st-True-fh-False-hs-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666_en.md new file mode 100644 index 000000000000..db961ce012d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666` is an English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666_en_5.2.0_3.0_1700220997876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666_en_5.2.0_3.0_1700220997876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_1_0_lr_1e_05_wd_0_001_dp_0_2_swati_100_southern_sotho_false_fh_true_hs_666| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-1.0-lr-1e-05-wd-0.001-dp-0.2-ss-100-st-False-fh-True-hs-666 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md new file mode 100644 index 000000000000..5f7c0544bd8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666 BertForQuestionAnswering from danielkty22 +author: John Snow Labs +name: bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666` is an English model originally trained by danielkty22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en_5.2.0_3.0_1700203135082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666_en_5.2.0_3.0_1700203135082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetune_squad_ep_4_0_lr_1e_06_wd_0_001_dp_0_2_swati_9119_southern_sotho_false_fh_true_hs_666| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/danielkty22/bert-base-uncased-finetune-squad-ep-4.0-lr-1e-06-wd-0.001-dp-0.2-ss-9119-st-False-fh-True-hs-666 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_badokorach_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_badokorach_en.md new file mode 100644 index 000000000000..4f83dba218b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_badokorach_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_badokorach BertForQuestionAnswering from badokorach +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_badokorach +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_badokorach` is an English model originally trained by badokorach.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_badokorach_en_5.2.0_3.0_1700199417338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_badokorach_en_5.2.0_3.0_1700199417338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_badokorach","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_badokorach", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_badokorach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/badokorach/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_finetuned_nq_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_finetuned_nq_en.md new file mode 100644 index 000000000000..89ae9c50efca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_finetuned_nq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_finetuned_nq BertForQuestionAnswering from leonardoschluter +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_finetuned_nq +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_finetuned_nq` is an English model originally trained by leonardoschluter. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_finetuned_nq_en_5.2.0_3.0_1700182220379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_finetuned_nq_en_5.2.0_3.0_1700182220379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_finetuned_nq","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_finetuned_nq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_finetuned_nq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/leonardoschluter/bert-base-uncased-finetuned-squad-finetuned-nq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_harrynewcomb_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_harrynewcomb_en.md new file mode 100644 index 000000000000..9f88ae6bf936 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_harrynewcomb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_harrynewcomb BertForQuestionAnswering from HarryNewcomb +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_harrynewcomb +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_harrynewcomb` is an English model originally trained by HarryNewcomb. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_harrynewcomb_en_5.2.0_3.0_1700199098678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_harrynewcomb_en_5.2.0_3.0_1700199098678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_harrynewcomb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_harrynewcomb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_harrynewcomb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/HarryNewcomb/bert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_4_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_4_en.md new file mode 100644 index 000000000000..1a302297de73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_v2_4 BertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_v2_4 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_v2_4` is an English model originally trained by seviladiguzel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_v2_4_en_5.2.0_3.0_1700184118361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_v2_4_en_5.2.0_3.0_1700184118361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_v2_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_v2_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_v2_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/seviladiguzel/bert-base-uncased-finetuned-squad_v2_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md new file mode 100644 index 000000000000..218e2daa73aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_uncased_finetuned_squad_v2_seviladiguzel BertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: bert_base_uncased_finetuned_squad_v2_seviladiguzel +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_uncased_finetuned_squad_v2_seviladiguzel` is an English model originally trained by seviladiguzel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_v2_seviladiguzel_en_5.2.0_3.0_1700180576351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_uncased_finetuned_squad_v2_seviladiguzel_en_5.2.0_3.0_1700180576351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_base_uncased_finetuned_squad_v2_seviladiguzel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_base_uncased_finetuned_squad_v2_seviladiguzel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_uncased_finetuned_squad_v2_seviladiguzel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/seviladiguzel/bert-base-uncased-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_covid_21_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_covid_21_en.md new file mode 100644 index 000000000000..b7031b7ec64f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_covid_21_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_covid_21 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_covid_21 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_covid_21` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_covid_21_en_5.2.0_3.0_1700208806304.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_covid_21_en_5.2.0_3.0_1700208806304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_covid_21","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_covid_21", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_covid_21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/hung200504/bert-covid-21 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_covidqa_3_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_covidqa_3_en.md new file mode 100644 index 000000000000..66929a2d0d76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_covidqa_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_covidqa_3 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_covidqa_3 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_covidqa_3` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_covidqa_3_en_5.2.0_3.0_1700216169783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_covidqa_3_en_5.2.0_3.0_1700216169783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_covidqa_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_covidqa_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_covidqa_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/hung200504/bert-covidqa-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_cpgqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_cpgqa_en.md new file mode 100644 index 000000000000..df721f1a1331 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_cpgqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_cpgqa BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_finetuned_cpgqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_cpgqa` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_cpgqa_en_5.2.0_3.0_1700211897653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_cpgqa_en_5.2.0_3.0_1700211897653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_cpgqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_cpgqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_cpgqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/hung200504/bert-finetuned-cpgqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_on_nq_short_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_on_nq_short_en.md new file mode 100644 index 000000000000..4c2b7aef997c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_on_nq_short_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_on_nq_short BertForQuestionAnswering from eibakke +author: John Snow Labs +name: bert_finetuned_on_nq_short +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_on_nq_short` is an English model originally trained by eibakke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_on_nq_short_en_5.2.0_3.0_1700195868532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_on_nq_short_en_5.2.0_3.0_1700195868532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_on_nq_short","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_on_nq_short", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_on_nq_short| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/eibakke/bert-finetuned-on-nq-short \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_aaroosh_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_aaroosh_en.md new file mode 100644 index 000000000000..a7ff18a61815 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_aaroosh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_aaroosh BertForQuestionAnswering from Aaroosh +author: John Snow Labs +name: bert_finetuned_squad_aaroosh +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_aaroosh` is an English model originally trained by Aaroosh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aaroosh_en_5.2.0_3.0_1700193166034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_aaroosh_en_5.2.0_3.0_1700193166034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_aaroosh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_aaroosh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_aaroosh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Aaroosh/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_accelerate_zouhair1_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_accelerate_zouhair1_en.md new file mode 100644 index 000000000000..133312f8df42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_accelerate_zouhair1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_accelerate_zouhair1 BertForQuestionAnswering from Zouhair1 +author: John Snow Labs +name: bert_finetuned_squad_accelerate_zouhair1 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_accelerate_zouhair1` is an English model originally trained by Zouhair1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_accelerate_zouhair1_en_5.2.0_3.0_1700184122737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_accelerate_zouhair1_en_5.2.0_3.0_1700184122737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_accelerate_zouhair1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_accelerate_zouhair1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_accelerate_zouhair1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Zouhair1/bert-finetuned-squad-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alexperkin_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alexperkin_en.md new file mode 100644 index 000000000000..c991dbfeb02a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alexperkin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_alexperkin BertForQuestionAnswering from AlexPerkin +author: John Snow Labs +name: bert_finetuned_squad_alexperkin +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_alexperkin` is an English model originally trained by AlexPerkin. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alexperkin_en_5.2.0_3.0_1700209084264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alexperkin_en_5.2.0_3.0_1700209084264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_alexperkin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_alexperkin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_alexperkin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/AlexPerkin/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alvintu_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alvintu_en.md new file mode 100644 index 000000000000..09cd062d351e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_alvintu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_alvintu BertForQuestionAnswering from alvintu +author: John Snow Labs +name: bert_finetuned_squad_alvintu +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_alvintu` is an English model originally trained by alvintu. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alvintu_en_5.2.0_3.0_1700213134563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_alvintu_en_5.2.0_3.0_1700213134563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_alvintu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_alvintu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_alvintu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/alvintu/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ashutosh2109_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ashutosh2109_en.md new file mode 100644 index 000000000000..b103df6f8e8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ashutosh2109_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_ashutosh2109 BertForQuestionAnswering from ashutosh2109 +author: John Snow Labs +name: bert_finetuned_squad_ashutosh2109 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_ashutosh2109` is an English model originally trained by ashutosh2109. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ashutosh2109_en_5.2.0_3.0_1700203021936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ashutosh2109_en_5.2.0_3.0_1700203021936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_ashutosh2109","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_ashutosh2109", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_ashutosh2109| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/ashutosh2109/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_asmit_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_asmit_en.md new file mode 100644 index 000000000000..486a2ca5b683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_asmit_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_asmit BertForQuestionAnswering from Asmit +author: John Snow Labs +name: bert_finetuned_squad_asmit +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_asmit` is an English model originally trained by Asmit. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_asmit_en_5.2.0_3.0_1700212501027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_asmit_en_5.2.0_3.0_1700212501027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_asmit","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_asmit", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_asmit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Asmit/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_bbbbearczx_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_bbbbearczx_en.md new file mode 100644 index 000000000000..22b87a934ab8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_bbbbearczx_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_bbbbearczx BertForQuestionAnswering from bbbbearczx +author: John Snow Labs +name: bert_finetuned_squad_bbbbearczx +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_bbbbearczx` is an English model originally trained by bbbbearczx. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_bbbbearczx_en_5.2.0_3.0_1700215558206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_bbbbearczx_en_5.2.0_3.0_1700215558206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_bbbbearczx","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_bbbbearczx", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_bbbbearczx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/bbbbearczx/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_catlord_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_catlord_en.md new file mode 100644 index 000000000000..33ce553946cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_catlord_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_catlord BertForQuestionAnswering from catlord +author: John Snow Labs +name: bert_finetuned_squad_catlord +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_catlord` is an English model originally trained by catlord. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_catlord_en_5.2.0_3.0_1700193166013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_catlord_en_5.2.0_3.0_1700193166013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_catlord","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_catlord", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_catlord| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/catlord/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_courtneypham_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_courtneypham_en.md new file mode 100644 index 000000000000..54bb222acc62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_courtneypham_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_courtneypham BertForQuestionAnswering from courtneypham +author: John Snow Labs +name: bert_finetuned_squad_courtneypham +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_courtneypham` is an English model originally trained by courtneypham. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_courtneypham_en_5.2.0_3.0_1700211897624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_courtneypham_en_5.2.0_3.0_1700211897624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_courtneypham","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_courtneypham", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_courtneypham| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/courtneypham/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_covidqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_covidqa_en.md new file mode 100644 index 000000000000..dafc306596b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_covidqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_covidqa BertForQuestionAnswering from pkduongsu +author: John Snow Labs +name: bert_finetuned_squad_covidqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_covidqa` is an English model originally trained by pkduongsu. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_covidqa_en_5.2.0_3.0_1700214899571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_covidqa_en_5.2.0_3.0_1700214899571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_covidqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_covidqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_covidqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/pkduongsu/bert-finetuned-squad-covidqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ethanwtl_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ethanwtl_en.md new file mode 100644 index 000000000000..97e637374de3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ethanwtl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_ethanwtl BertForQuestionAnswering from EthanWTL +author: John Snow Labs +name: bert_finetuned_squad_ethanwtl +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_ethanwtl` is an English model originally trained by EthanWTL. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ethanwtl_en_5.2.0_3.0_1700224311663.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ethanwtl_en_5.2.0_3.0_1700224311663.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_ethanwtl","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_ethanwtl", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_ethanwtl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/EthanWTL/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_gallyamovi_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_gallyamovi_en.md new file mode 100644 index 000000000000..409723ffe380 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_gallyamovi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_gallyamovi BertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: bert_finetuned_squad_gallyamovi +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_gallyamovi` is an English model originally trained by gallyamovi. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_gallyamovi_en_5.2.0_3.0_1700201822787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_gallyamovi_en_5.2.0_3.0_1700201822787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_gallyamovi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_gallyamovi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_gallyamovi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/gallyamovi/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_golightly_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_golightly_en.md new file mode 100644 index 000000000000..6f168c8a5d55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_golightly_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_golightly BertForQuestionAnswering from golightly +author: John Snow Labs +name: bert_finetuned_squad_golightly +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_golightly` is an English model originally trained by golightly. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_golightly_en_5.2.0_3.0_1700205623429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_golightly_en_5.2.0_3.0_1700205623429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_golightly","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_golightly", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_golightly| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/golightly/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_heheshu_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_heheshu_en.md new file mode 100644 index 000000000000..599f39e871e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_heheshu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_heheshu BertForQuestionAnswering from heheshu +author: John Snow Labs +name: bert_finetuned_squad_heheshu +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_heheshu` is an English model originally trained by heheshu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_heheshu_en_5.2.0_3.0_1700195864033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_heheshu_en_5.2.0_3.0_1700195864033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_heheshu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_heheshu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_heheshu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/heheshu/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_iamannika_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_iamannika_en.md new file mode 100644 index 000000000000..8f1ba7594afa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_iamannika_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_iamannika BertForQuestionAnswering from iamannika +author: John Snow Labs +name: bert_finetuned_squad_iamannika +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_iamannika` is an English model originally trained by iamannika. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_iamannika_en_5.2.0_3.0_1700200195583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_iamannika_en_5.2.0_3.0_1700200195583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_iamannika","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_iamannika", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_iamannika| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/iamannika/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ihfaudsip_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ihfaudsip_en.md new file mode 100644 index 000000000000..256d8f2e2c8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ihfaudsip_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_ihfaudsip BertForQuestionAnswering from ihfaudsip +author: John Snow Labs +name: bert_finetuned_squad_ihfaudsip +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_ihfaudsip` is an English model originally trained by ihfaudsip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ihfaudsip_en_5.2.0_3.0_1700195630435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ihfaudsip_en_5.2.0_3.0_1700195630435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_ihfaudsip","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_ihfaudsip", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_ihfaudsip| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/ihfaudsip/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jchhabra_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jchhabra_en.md new file mode 100644 index 000000000000..31e96e7c42d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jchhabra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_jchhabra BertForQuestionAnswering from jchhabra +author: John Snow Labs +name: bert_finetuned_squad_jchhabra +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_jchhabra` is an English model originally trained by jchhabra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jchhabra_en_5.2.0_3.0_1700204370272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jchhabra_en_5.2.0_3.0_1700204370272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_jchhabra","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_jchhabra", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_jchhabra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/jchhabra/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jfarmerphd_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jfarmerphd_en.md new file mode 100644 index 000000000000..99a3445ed5a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jfarmerphd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_jfarmerphd BertForQuestionAnswering from jfarmerphd +author: John Snow Labs +name: bert_finetuned_squad_jfarmerphd +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_jfarmerphd` is an English model originally trained by jfarmerphd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jfarmerphd_en_5.2.0_3.0_1700194348728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jfarmerphd_en_5.2.0_3.0_1700194348728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_jfarmerphd","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_jfarmerphd", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_jfarmerphd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/jfarmerphd/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jmoraes_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jmoraes_en.md new file mode 100644 index 000000000000..bdff33164ccb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_jmoraes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_jmoraes BertForQuestionAnswering from jmoraes +author: John Snow Labs +name: bert_finetuned_squad_jmoraes +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_jmoraes` is an English model originally trained by jmoraes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jmoraes_en_5.2.0_3.0_1700202304817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_jmoraes_en_5.2.0_3.0_1700202304817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_jmoraes","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_jmoraes", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_jmoraes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/jmoraes/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_johnxiaxj_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_johnxiaxj_en.md new file mode 100644 index 000000000000..13de5a58a0bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_johnxiaxj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_johnxiaxj BertForQuestionAnswering from JohnXiaXJ +author: John Snow Labs +name: bert_finetuned_squad_johnxiaxj +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_johnxiaxj` is an English model originally trained by JohnXiaXJ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_johnxiaxj_en_5.2.0_3.0_1700190329198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_johnxiaxj_en_5.2.0_3.0_1700190329198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_johnxiaxj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_johnxiaxj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_johnxiaxj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/JohnXiaXJ/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_kellyxuanlin_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_kellyxuanlin_en.md new file mode 100644 index 000000000000..18bad0216d36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_kellyxuanlin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_kellyxuanlin BertForQuestionAnswering from kellyxuanlin +author: John Snow Labs +name: bert_finetuned_squad_kellyxuanlin +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_kellyxuanlin` is an English model originally trained by kellyxuanlin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_kellyxuanlin_en_5.2.0_3.0_1700204374413.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_kellyxuanlin_en_5.2.0_3.0_1700204374413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_kellyxuanlin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_kellyxuanlin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_kellyxuanlin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/kellyxuanlin/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_krolis_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_krolis_en.md new file mode 100644 index 000000000000..6d347ecffab5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_krolis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_krolis BertForQuestionAnswering from krolis +author: John Snow Labs +name: bert_finetuned_squad_krolis +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_krolis` is an English model originally trained by krolis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_krolis_en_5.2.0_3.0_1700197403017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_krolis_en_5.2.0_3.0_1700197403017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_krolis","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_krolis", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_krolis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/krolis/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_legalbert_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_legalbert_en.md new file mode 100644 index 000000000000..4383799f2ed5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_legalbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_legalbert BertForQuestionAnswering from Jasu +author: John Snow Labs +name: bert_finetuned_squad_legalbert +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_legalbert` is an English model originally trained by Jasu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_legalbert_en_5.2.0_3.0_1700206944958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_legalbert_en_5.2.0_3.0_1700206944958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_legalbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_legalbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_legalbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.3 MB| + +## References + +https://huggingface.co/Jasu/bert-finetuned-squad-legalbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_leinadh_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_leinadh_en.md new file mode 100644 index 000000000000..38a0894c1a08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_leinadh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_leinadh BertForQuestionAnswering from Leinadh +author: John Snow Labs +name: bert_finetuned_squad_leinadh +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_leinadh` is an English model originally trained by Leinadh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_leinadh_en_5.2.0_3.0_1700201639839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_leinadh_en_5.2.0_3.0_1700201639839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_leinadh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_leinadh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_leinadh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Leinadh/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lexie79_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lexie79_en.md new file mode 100644 index 000000000000..75c4f16a241a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lexie79_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_lexie79 BertForQuestionAnswering from Lexie79 +author: John Snow Labs +name: bert_finetuned_squad_lexie79 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_lexie79` is an English model originally trained by Lexie79. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_lexie79_en_5.2.0_3.0_1700193166028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_lexie79_en_5.2.0_3.0_1700193166028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_lexie79","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_lexie79", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_lexie79| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Lexie79/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lukezekes_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lukezekes_en.md new file mode 100644 index 000000000000..4d4457a7defc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_lukezekes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_lukezekes BertForQuestionAnswering from LukeZekes +author: John Snow Labs +name: bert_finetuned_squad_lukezekes +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_lukezekes` is an English model originally trained by LukeZekes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_lukezekes_en_5.2.0_3.0_1700215440668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_lukezekes_en_5.2.0_3.0_1700215440668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_lukezekes","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_lukezekes", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_lukezekes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/LukeZekes/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcowong02_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcowong02_en.md new file mode 100644 index 000000000000..fa488014fb5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcowong02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_marcowong02 BertForQuestionAnswering from marcowong02 +author: John Snow Labs +name: bert_finetuned_squad_marcowong02 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_marcowong02` is an English model originally trained by marcowong02. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_marcowong02_en_5.2.0_3.0_1700197403032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_marcowong02_en_5.2.0_3.0_1700197403032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_marcowong02","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_marcowong02", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_marcowong02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/marcowong02/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcuslee_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcuslee_en.md new file mode 100644 index 000000000000..f50f533ebc36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_marcuslee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_marcuslee BertForQuestionAnswering from MarcusLee +author: John Snow Labs +name: bert_finetuned_squad_marcuslee +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_marcuslee` is an English model originally trained by MarcusLee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_marcuslee_en_5.2.0_3.0_1700194348722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_marcuslee_en_5.2.0_3.0_1700194348722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_marcuslee","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_marcuslee", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_marcuslee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/MarcusLee/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mongdiutindei_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mongdiutindei_en.md new file mode 100644 index 000000000000..21283627b6aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mongdiutindei_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_mongdiutindei BertForQuestionAnswering from mongdiutindei +author: John Snow Labs +name: bert_finetuned_squad_mongdiutindei +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_mongdiutindei` is an English model originally trained by mongdiutindei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_mongdiutindei_en_5.2.0_3.0_1700206946653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_mongdiutindei_en_5.2.0_3.0_1700206946653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_mongdiutindei","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_mongdiutindei", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_mongdiutindei| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|400.6 MB| + +## References + +https://huggingface.co/mongdiutindei/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mxalmeida_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mxalmeida_en.md new file mode 100644 index 000000000000..16166a1720b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_mxalmeida_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_mxalmeida BertForQuestionAnswering from mxalmeida +author: John Snow Labs +name: bert_finetuned_squad_mxalmeida +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_mxalmeida` is an English model originally trained by mxalmeida. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_mxalmeida_en_5.2.0_3.0_1700214456888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_mxalmeida_en_5.2.0_3.0_1700214456888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_mxalmeida","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_mxalmeida", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_mxalmeida| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/mxalmeida/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_nightlighttw_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_nightlighttw_en.md new file mode 100644 index 000000000000..b86815906328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_nightlighttw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_nightlighttw BertForQuestionAnswering from nightlighttw +author: John Snow Labs +name: bert_finetuned_squad_nightlighttw +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_nightlighttw` is an English model originally trained by nightlighttw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_nightlighttw_en_5.2.0_3.0_1700198902920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_nightlighttw_en_5.2.0_3.0_1700198902920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_nightlighttw","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_nightlighttw", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_nightlighttw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/nightlighttw/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_question_generation_25_percent_cased_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_question_generation_25_percent_cased_en.md new file mode 100644 index 000000000000..e3166a32a524 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_question_generation_25_percent_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_question_generation_25_percent_cased BertForQuestionAnswering from mohilp1998 +author: John Snow Labs +name: bert_finetuned_squad_question_generation_25_percent_cased +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_question_generation_25_percent_cased` is an English model originally trained by mohilp1998.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_question_generation_25_percent_cased_en_5.2.0_3.0_1700214475000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_question_generation_25_percent_cased_en_5.2.0_3.0_1700214475000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_question_generation_25_percent_cased","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_question_generation_25_percent_cased", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_question_generation_25_percent_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/mohilp1998/bert-finetuned-squad-question-generation-25_percent-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raghan_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raghan_en.md new file mode 100644 index 000000000000..da87221fbb89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raghan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_raghan BertForQuestionAnswering from Raghan +author: John Snow Labs +name: bert_finetuned_squad_raghan +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_raghan` is an English model originally trained by Raghan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_raghan_en_5.2.0_3.0_1700214187252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_raghan_en_5.2.0_3.0_1700214187252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_raghan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_raghan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_raghan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Raghan/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rainanabul_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rainanabul_en.md new file mode 100644 index 000000000000..0ec82e828874 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rainanabul_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_rainanabul BertForQuestionAnswering from rainanabul +author: John Snow Labs +name: bert_finetuned_squad_rainanabul +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_rainanabul` is an English model originally trained by rainanabul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_rainanabul_en_5.2.0_3.0_1700213134191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_rainanabul_en_5.2.0_3.0_1700213134191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_rainanabul","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_rainanabul", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_rainanabul| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.7 MB| + +## References + +https://huggingface.co/rainanabul/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raychang7_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raychang7_en.md new file mode 100644 index 000000000000..d521cd8723e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_raychang7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_raychang7 BertForQuestionAnswering from raychang7 +author: John Snow Labs +name: bert_finetuned_squad_raychang7 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_raychang7` is an English model originally trained by raychang7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_raychang7_en_5.2.0_3.0_1700209877967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_raychang7_en_5.2.0_3.0_1700209877967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_raychang7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_raychang7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_raychang7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/raychang7/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rghosh8_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rghosh8_en.md new file mode 100644 index 000000000000..78fa37f65d60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_rghosh8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_rghosh8 BertForQuestionAnswering from rghosh8 +author: John Snow Labs +name: bert_finetuned_squad_rghosh8 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_rghosh8` is an English model originally trained by rghosh8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_rghosh8_en_5.2.0_3.0_1700220992874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_rghosh8_en_5.2.0_3.0_1700220992874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_rghosh8","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_rghosh8", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_rghosh8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/rghosh8/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sandeepmbm_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sandeepmbm_en.md new file mode 100644 index 000000000000..a7af70637ef6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sandeepmbm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_sandeepmbm BertForQuestionAnswering from sandeepmbm +author: John Snow Labs +name: bert_finetuned_squad_sandeepmbm +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_sandeepmbm` is an English model originally trained by sandeepmbm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_sandeepmbm_en_5.2.0_3.0_1700185572765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_sandeepmbm_en_5.2.0_3.0_1700185572765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_sandeepmbm","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_sandeepmbm", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_sandeepmbm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/sandeepmbm/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_shabdansh01_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_shabdansh01_en.md new file mode 100644 index 000000000000..134ecd11622f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_shabdansh01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_shabdansh01 BertForQuestionAnswering from Shabdansh01 +author: John Snow Labs +name: bert_finetuned_squad_shabdansh01 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_shabdansh01` is an English model originally trained by Shabdansh01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shabdansh01_en_5.2.0_3.0_1700191612352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_shabdansh01_en_5.2.0_3.0_1700191612352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_shabdansh01","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_shabdansh01", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_shabdansh01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Shabdansh01/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sneh1th_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sneh1th_en.md new file mode 100644 index 000000000000..f754283824a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_sneh1th_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_sneh1th BertForQuestionAnswering from sneh1th +author: John Snow Labs +name: bert_finetuned_squad_sneh1th +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_sneh1th` is an English model originally trained by sneh1th. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_sneh1th_en_5.2.0_3.0_1700182220425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_sneh1th_en_5.2.0_3.0_1700182220425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_sneh1th","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_sneh1th", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_sneh1th| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/sneh1th/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ssv93venkat_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ssv93venkat_en.md new file mode 100644 index 000000000000..0e3947b06e97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_ssv93venkat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_ssv93venkat BertForQuestionAnswering from ssv93venkat +author: John Snow Labs +name: bert_finetuned_squad_ssv93venkat +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_ssv93venkat` is an English model originally trained by ssv93venkat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ssv93venkat_en_5.2.0_3.0_1700213667838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_ssv93venkat_en_5.2.0_3.0_1700213667838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_ssv93venkat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_ssv93venkat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_ssv93venkat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/ssv93venkat/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_strongwar_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_strongwar_en.md new file mode 100644 index 000000000000..a794694264f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_strongwar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_strongwar BertForQuestionAnswering from strongwar +author: John Snow Labs +name: bert_finetuned_squad_strongwar +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_strongwar` is an English model originally trained by strongwar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_strongwar_en_5.2.0_3.0_1700205623457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_strongwar_en_5.2.0_3.0_1700205623457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_strongwar","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_strongwar", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_strongwar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/strongwar/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_tmatup_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_tmatup_en.md new file mode 100644 index 000000000000..bdd3895bd147 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_tmatup_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_tmatup BertForQuestionAnswering from tmatup +author: John Snow Labs +name: bert_finetuned_squad_tmatup +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_tmatup` is an English model originally trained by tmatup. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tmatup_en_5.2.0_3.0_1700186812640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tmatup_en_5.2.0_3.0_1700186812640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_tmatup","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_tmatup", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_tmatup| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/tmatup/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_v1_sooolee_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_v1_sooolee_en.md new file mode 100644 index 000000000000..388ddff70224 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_v1_sooolee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_v1_sooolee BertForQuestionAnswering from sooolee +author: John Snow Labs +name: bert_finetuned_squad_v1_sooolee +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_v1_sooolee` is an English model originally trained by sooolee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_v1_sooolee_en_5.2.0_3.0_1700210428957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_v1_sooolee_en_5.2.0_3.0_1700210428957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_v1_sooolee","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_v1_sooolee", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_v1_sooolee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/sooolee/bert-finetuned-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_wrobinw_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_wrobinw_en.md new file mode 100644 index 000000000000..4f46060988c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_wrobinw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_wrobinw BertForQuestionAnswering from WRobinW +author: John Snow Labs +name: bert_finetuned_squad_wrobinw +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_wrobinw` is an English model originally trained by WRobinW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_wrobinw_en_5.2.0_3.0_1700200195204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_wrobinw_en_5.2.0_3.0_1700200195204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_wrobinw","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_wrobinw", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_wrobinw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/WRobinW/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_yifanpan_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_yifanpan_en.md new file mode 100644 index 000000000000..842fae1c5bf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_yifanpan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_yifanpan BertForQuestionAnswering from YifanPan +author: John Snow Labs +name: bert_finetuned_squad_yifanpan +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_yifanpan` is an English model originally trained by YifanPan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_yifanpan_en_5.2.0_3.0_1700190500124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_yifanpan_en_5.2.0_3.0_1700190500124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_yifanpan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_yifanpan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_yifanpan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/YifanPan/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zhangh795_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zhangh795_en.md new file mode 100644 index 000000000000..aaaec8da8f09 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zhangh795_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_zhangh795 BertForQuestionAnswering from ZhangH795 +author: John Snow Labs +name: bert_finetuned_squad_zhangh795 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_zhangh795` is an English model originally trained by ZhangH795. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_zhangh795_en_5.2.0_3.0_1700189186725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_zhangh795_en_5.2.0_3.0_1700189186725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_zhangh795","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_zhangh795", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_zhangh795| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/ZhangH795/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zohaib99k_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zohaib99k_en.md new file mode 100644 index 000000000000..8a26f154e071 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_squad_zohaib99k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_zohaib99k BertForQuestionAnswering from zohaib99k +author: John Snow Labs +name: bert_finetuned_squad_zohaib99k +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_zohaib99k` is an English model originally trained by zohaib99k. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_zohaib99k_en_5.2.0_3.0_1700224393955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_zohaib99k_en_5.2.0_3.0_1700224393955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_squad_zohaib99k","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_squad_zohaib99k", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_zohaib99k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/zohaib99k/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_uncased_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_uncased_squad_v2_en.md new file mode 100644 index 000000000000..50871fca17c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_finetuned_uncased_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_uncased_squad_v2 BertForQuestionAnswering from aai520-group6 +author: John Snow Labs +name: bert_finetuned_uncased_squad_v2 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_uncased_squad_v2` is an English model originally trained by aai520-group6. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_uncased_squad_v2_en_5.2.0_3.0_1700203141990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_uncased_squad_v2_en_5.2.0_3.0_1700203141990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_finetuned_uncased_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_finetuned_uncased_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_uncased_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/aai520-group6/bert-finetuned-uncased-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_large_question_answering_finetuned_legal_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_large_question_answering_finetuned_legal_en.md new file mode 100644 index 000000000000..402d4dd6d331 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_large_question_answering_finetuned_legal_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_large_question_answering_finetuned_legal BertForQuestionAnswering from atharvamundada99 +author: John Snow Labs +name: bert_large_question_answering_finetuned_legal +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_large_question_answering_finetuned_legal` is an English model originally trained by atharvamundada99. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_large_question_answering_finetuned_legal_en_5.2.0_3.0_1700180647015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_large_question_answering_finetuned_legal_en_5.2.0_3.0_1700180647015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_large_question_answering_finetuned_legal","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_large_question_answering_finetuned_legal", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_large_question_answering_finetuned_legal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/atharvamundada99/bert-large-question-answering-finetuned-legal \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_squad_chatbot_aai_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_squad_chatbot_aai_en.md new file mode 100644 index 000000000000..5a2ddb307328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_squad_chatbot_aai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_squad_chatbot_aai BertForQuestionAnswering from tmcgirr +author: John Snow Labs +name: bert_squad_chatbot_aai +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_squad_chatbot_aai` is an English model originally trained by tmcgirr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_squad_chatbot_aai_en_5.2.0_3.0_1700194433336.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_squad_chatbot_aai_en_5.2.0_3.0_1700194433336.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_squad_chatbot_aai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_squad_chatbot_aai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_squad_chatbot_aai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/tmcgirr/BERT-squad-chatbot-AAI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_turkce_soru_cevaplama_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_turkce_soru_cevaplama_en.md new file mode 100644 index 000000000000..e135b67793f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_turkce_soru_cevaplama_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_turkce_soru_cevaplama BertForQuestionAnswering from daddycik +author: John Snow Labs +name: bert_turkce_soru_cevaplama +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_turkce_soru_cevaplama` is an English model originally trained by daddycik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_turkce_soru_cevaplama_en_5.2.0_3.0_1700212499584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_turkce_soru_cevaplama_en_5.2.0_3.0_1700212499584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_turkce_soru_cevaplama","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_turkce_soru_cevaplama", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_turkce_soru_cevaplama| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.3 MB| + +## References + +https://huggingface.co/daddycik/bert-turkce-soru-cevaplama \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bert_uncased_finetuned_cpgqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-bert_uncased_finetuned_cpgqa_en.md new file mode 100644 index 000000000000..411803faf58b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bert_uncased_finetuned_cpgqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_uncased_finetuned_cpgqa BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: bert_uncased_finetuned_cpgqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_uncased_finetuned_cpgqa` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_uncased_finetuned_cpgqa_en_5.2.0_3.0_1700222865358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_uncased_finetuned_cpgqa_en_5.2.0_3.0_1700222865358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bert_uncased_finetuned_cpgqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bert_uncased_finetuned_cpgqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_uncased_finetuned_cpgqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/hung200504/bert-uncased-finetuned-cpgqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-bioformer_8l_squad1_en.md b/docs/_posts/ahmedlone127/2023-11-17-bioformer_8l_squad1_en.md new file mode 100644 index 000000000000..861174300d0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-bioformer_8l_squad1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bioformer_8l_squad1 BertForQuestionAnswering from bioformers +author: John Snow Labs +name: bioformer_8l_squad1 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bioformer_8l_squad1` is an English model originally trained by bioformers. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bioformer_8l_squad1_en_5.2.0_3.0_1700209877808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bioformer_8l_squad1_en_5.2.0_3.0_1700209877808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("bioformer_8l_squad1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("bioformer_8l_squad1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bioformer_8l_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|158.5 MB| + +## References + +https://huggingface.co/bioformers/bioformer-8L-squad1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-clibert_20_en.md b/docs/_posts/ahmedlone127/2023-11-17-clibert_20_en.md new file mode 100644 index 000000000000..8dae42f47aa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-clibert_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clibert_20 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: clibert_20 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clibert_20` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clibert_20_en_5.2.0_3.0_1700189186705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clibert_20_en_5.2.0_3.0_1700189186705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("clibert_20","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("clibert_20", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clibert_20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.3 MB| + +## References + +https://huggingface.co/hung200504/CliBert-20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-close_book_2_en.md b/docs/_posts/ahmedlone127/2023-11-17-close_book_2_en.md new file mode 100644 index 000000000000..86985549e407 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-close_book_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English close_book_2 BertForQuestionAnswering from Ahmed007 +author: John Snow Labs +name: close_book_2 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `close_book_2` is an English model originally trained by Ahmed007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/close_book_2_en_5.2.0_3.0_1700203204243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/close_book_2_en_5.2.0_3.0_1700203204243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("close_book_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("close_book_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|close_book_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/Ahmed007/Close_book_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-ct_cos_mbert_idkmrc_en.md b/docs/_posts/ahmedlone127/2023-11-17-ct_cos_mbert_idkmrc_en.md new file mode 100644 index 000000000000..ca328967a7b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-ct_cos_mbert_idkmrc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ct_cos_mbert_idkmrc BertForQuestionAnswering from intanm +author: John Snow Labs +name: ct_cos_mbert_idkmrc +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ct_cos_mbert_idkmrc` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ct_cos_mbert_idkmrc_en_5.2.0_3.0_1700217311367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ct_cos_mbert_idkmrc_en_5.2.0_3.0_1700217311367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ct_cos_mbert_idkmrc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ct_cos_mbert_idkmrc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ct_cos_mbert_idkmrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/intanm/ct-cos-mbert-idkmrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-ct_kld_mbert_idkmrc_en.md b/docs/_posts/ahmedlone127/2023-11-17-ct_kld_mbert_idkmrc_en.md new file mode 100644 index 000000000000..b88316cbf23a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-ct_kld_mbert_idkmrc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ct_kld_mbert_idkmrc BertForQuestionAnswering from intanm +author: John Snow Labs +name: ct_kld_mbert_idkmrc +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ct_kld_mbert_idkmrc` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ct_kld_mbert_idkmrc_en_5.2.0_3.0_1700219451910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ct_kld_mbert_idkmrc_en_5.2.0_3.0_1700219451910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ct_kld_mbert_idkmrc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ct_kld_mbert_idkmrc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ct_kld_mbert_idkmrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/intanm/ct-kld-mbert-idkmrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-ct_mse_mbert_idkmrc_en.md b/docs/_posts/ahmedlone127/2023-11-17-ct_mse_mbert_idkmrc_en.md new file mode 100644 index 000000000000..5c6a52d6d7a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-ct_mse_mbert_idkmrc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ct_mse_mbert_idkmrc BertForQuestionAnswering from intanm +author: John Snow Labs +name: ct_mse_mbert_idkmrc +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ct_mse_mbert_idkmrc` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ct_mse_mbert_idkmrc_en_5.2.0_3.0_1700224592139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ct_mse_mbert_idkmrc_en_5.2.0_3.0_1700224592139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ct_mse_mbert_idkmrc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ct_mse_mbert_idkmrc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ct_mse_mbert_idkmrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/intanm/ct-mse-mbert-idkmrc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-darijabert_finetuned_arabic_squad_ar.md b/docs/_posts/ahmedlone127/2023-11-17-darijabert_finetuned_arabic_squad_ar.md new file mode 100644 index 000000000000..e669094bacae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-darijabert_finetuned_arabic_squad_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic darijabert_finetuned_arabic_squad BertForQuestionAnswering from JasperV13 +author: John Snow Labs +name: darijabert_finetuned_arabic_squad +date: 2023-11-17 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `darijabert_finetuned_arabic_squad` is an Arabic model originally trained by JasperV13. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/darijabert_finetuned_arabic_squad_ar_5.2.0_3.0_1700191737908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/darijabert_finetuned_arabic_squad_ar_5.2.0_3.0_1700191737908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("darijabert_finetuned_arabic_squad","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("darijabert_finetuned_arabic_squad", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|darijabert_finetuned_arabic_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|551.5 MB| + +## References + +https://huggingface.co/JasperV13/DarijaBERT-finetuned-Arabic-SQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-deepset_bert_base_uncased_squad2_trained_en.md b/docs/_posts/ahmedlone127/2023-11-17-deepset_bert_base_uncased_squad2_trained_en.md new file mode 100644 index 000000000000..2a1c1eedcccc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-deepset_bert_base_uncased_squad2_trained_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English deepset_bert_base_uncased_squad2_trained BertForQuestionAnswering from moska +author: John Snow Labs +name: deepset_bert_base_uncased_squad2_trained +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepset_bert_base_uncased_squad2_trained` is an English model originally trained by moska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deepset_bert_base_uncased_squad2_trained_en_5.2.0_3.0_1700184122622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deepset_bert_base_uncased_squad2_trained_en_5.2.0_3.0_1700184122622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("deepset_bert_base_uncased_squad2_trained","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("deepset_bert_base_uncased_squad2_trained", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deepset_bert_base_uncased_squad2_trained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/moska/deepset_bert-base-uncased-squad2_trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-dictabert_heq_he.md b/docs/_posts/ahmedlone127/2023-11-17-dictabert_heq_he.md new file mode 100644 index 000000000000..20e7fe11625f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-dictabert_heq_he.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Hebrew dictabert_heq BertForQuestionAnswering from dicta-il +author: John Snow Labs +name: dictabert_heq +date: 2023-11-17 +tags: [bert, he, open_source, question_answering, onnx] +task: Question Answering +language: he +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dictabert_heq` is a Hebrew model originally trained by dicta-il. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dictabert_heq_he_5.2.0_3.0_1700193363237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dictabert_heq_he_5.2.0_3.0_1700193363237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("dictabert_heq","he") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("dictabert_heq", "he") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dictabert_heq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|he| +|Size:|645.0 MB| + +## References + +https://huggingface.co/dicta-il/dictabert-heq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-ekattorbert_multilingual_finetuned_squad_v2_xx.md b/docs/_posts/ahmedlone127/2023-11-17-ekattorbert_multilingual_finetuned_squad_v2_xx.md new file mode 100644 index 000000000000..3351b705271a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-ekattorbert_multilingual_finetuned_squad_v2_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual ekattorbert_multilingual_finetuned_squad_v2 BertForQuestionAnswering from shawmoon +author: John Snow Labs +name: ekattorbert_multilingual_finetuned_squad_v2 +date: 2023-11-17 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ekattorbert_multilingual_finetuned_squad_v2` is a Multilingual model originally trained by shawmoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ekattorbert_multilingual_finetuned_squad_v2_xx_5.2.0_3.0_1700188428485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ekattorbert_multilingual_finetuned_squad_v2_xx_5.2.0_3.0_1700188428485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ekattorbert_multilingual_finetuned_squad_v2","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ekattorbert_multilingual_finetuned_squad_v2", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ekattorbert_multilingual_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|625.5 MB| + +## References + +https://huggingface.co/shawmoon/EkattorBert-multilingual-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-electra_qa_electricidad_small_finetuned_squadv1_es.md b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_electricidad_small_finetuned_squadv1_es.md new file mode 100644 index 000000000000..67f289bded91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_electricidad_small_finetuned_squadv1_es.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Spanish ElectraForQuestionAnswering Small model (from mrm8488) +author: John Snow Labs +name: electra_qa_electricidad_small_finetuned_squadv1 +date: 2023-11-17 +tags: [es, open_source, electra, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electricidad-small-finetuned-squadv1-es` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_electricidad_small_finetuned_squadv1_es_5.2.0_3.0_1700180656189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_electricidad_small_finetuned_squadv1_es_5.2.0_3.0_1700180656189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_electricidad_small_finetuned_squadv1","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_electricidad_small_finetuned_squadv1","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.squad.electra.small").predict("""¿Cuál es mi nombre?|||Mi nombre es Clara y vivo en Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_electricidad_small_finetuned_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|50.7 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mrm8488/electricidad-small-finetuned-squadv1-es +- https://github.com/ccasimiro88/TranslateAlignRetrieve/tree/master/SQuAD-es-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-electra_qa_long_ko.md b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_long_ko.md new file mode 100644 index 000000000000..a4801e86c816 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_long_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering model (from sehandev) +author: John Snow Labs +name: electra_qa_long +date: 2023-11-17 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `koelectra-long-qa` is a Korean model originally trained by `sehandev`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_long_ko_5.2.0_3.0_1700186722924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_long_ko_5.2.0_3.0_1700186722924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_long","ko") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_long","ko") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.electra").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_long| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|419.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sehandev/koelectra-long-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-electra_qa_slp_en.md b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_slp_en.md new file mode 100644 index 000000000000..7fb1257bdac3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_slp_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from rowan1224) +author: John Snow Labs +name: electra_qa_slp +date: 2023-11-17 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-slp` is an English model originally trained by `rowan1224`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_slp_en_5.2.0_3.0_1700188524329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_slp_en_5.2.0_3.0_1700188524329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_slp","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_slp","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.electra.by_rowan1224").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_slp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.0 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rowan1224/electra-slp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_discriminator_finetuned_squad_1_en.md b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_discriminator_finetuned_squad_1_en.md new file mode 100644 index 000000000000..d1478e3f608f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_discriminator_finetuned_squad_1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English ElectraForQuestionAnswering model (from bdickson) Version-1 +author: John Snow Labs +name: electra_qa_small_discriminator_finetuned_squad_1 +date: 2023-11-17 +tags: [en, open_source, electra, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `electra-small-discriminator-finetuned-squad` is an English model originally trained by `bdickson`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_discriminator_finetuned_squad_1_en_5.2.0_3.0_1700189577951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_discriminator_finetuned_squad_1_en_5.2.0_3.0_1700189577951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_discriminator_finetuned_squad_1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_discriminator_finetuned_squad_1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.electra.small.by_bdickson").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_discriminator_finetuned_squad_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|50.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bdickson/electra-small-discriminator-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_v3_finetuned_korquad_ko.md b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_v3_finetuned_korquad_ko.md new file mode 100644 index 000000000000..70e7b2e5e9b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-electra_qa_small_v3_finetuned_korquad_ko.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Korean ElectraForQuestionAnswering Small model (from monologg) Version-3 +author: John Snow Labs +name: electra_qa_small_v3_finetuned_korquad +date: 2023-11-17 +tags: [ko, open_source, electra, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `koelectra-small-v3-finetuned-korquad` is a Korean model originally trained by `monologg`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/electra_qa_small_v3_finetuned_korquad_ko_5.2.0_3.0_1700190429493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/electra_qa_small_v3_finetuned_korquad_ko_5.2.0_3.0_1700190429493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_v3_finetuned_korquad","ko") \ +    .setInputCols(["document_question", "document_context"]) \ +    .setOutputCol("answer") \ +    .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) +# Sample input in Korean: "What is my name?" / "My name is Clara and I live in Berkeley." +data = spark.createDataFrame([["내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering.pretrained("electra_qa_small_v3_finetuned_korquad","ko") +    .setInputCols(Array("document_question", "document_context")) +    .setOutputCol("answer") +    .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) +// One row with two columns: the tuple maps to (question, context) +val data = Seq(("내 이름은 무엇입니까?", "제 이름은 클라라이고 저는 버클리에 살고 있습니다.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.answer_question.korquad.electra.small").predict("""내 이름은 무엇입니까?|||제 이름은 클라라이고 저는 버클리에 살고 있습니다.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|electra_qa_small_v3_finetuned_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|53.1 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/monologg/koelectra-small-v3-finetuned-korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-energybert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-17-energybert_finetuned_squad_en.md new file mode 100644 index 000000000000..125dae564626 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-energybert_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English energybert_finetuned_squad BertForQuestionAnswering from HongyangLi +author: John Snow Labs +name: energybert_finetuned_squad +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `energybert_finetuned_squad` is an English model originally trained by HongyangLi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/energybert_finetuned_squad_en_5.2.0_3.0_1700189569320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/energybert_finetuned_squad_en_5.2.0_3.0_1700189569320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
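+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("energybert_finetuned_squad","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("energybert_finetuned_squad", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +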
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|energybert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/HongyangLi/energybert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_arabertv2_en.md b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_arabertv2_en.md new file mode 100644 index 000000000000..4b360350a16b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_arabertv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_bert_base_arabertv2 BertForQuestionAnswering from kaarelkaarelson +author: John Snow Labs +name: finetuned_bert_base_arabertv2 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_arabertv2` is an English model originally trained by kaarelkaarelson. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_arabertv2_en_5.2.0_3.0_1700207170853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_arabertv2_en_5.2.0_3.0_1700207170853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
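+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("finetuned_bert_base_arabertv2","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("finetuned_bert_base_arabertv2", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +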
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_arabertv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|504.8 MB| + +## References + +https://huggingface.co/kaarelkaarelson/finetuned-bert-base-arabertv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_doerig_xx.md b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_doerig_xx.md new file mode 100644 index 000000000000..e986814ff4f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_doerig_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual finetuned_bert_base_multilingual_cased_doerig BertForQuestionAnswering from doerig +author: John Snow Labs +name: finetuned_bert_base_multilingual_cased_doerig +date: 2023-11-17 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_multilingual_cased_doerig` is a multilingual model originally trained by doerig.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_multilingual_cased_doerig_xx_5.2.0_3.0_1700214036867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_multilingual_cased_doerig_xx_5.2.0_3.0_1700214036867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
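+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("finetuned_bert_base_multilingual_cased_doerig","xx") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("finetuned_bert_base_multilingual_cased_doerig", "xx") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +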
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_multilingual_cased_doerig| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/doerig/finetuned_bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_kaarelkaarelson_xx.md b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_kaarelkaarelson_xx.md new file mode 100644 index 000000000000..21d2427adf42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-finetuned_bert_base_multilingual_cased_kaarelkaarelson_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual finetuned_bert_base_multilingual_cased_kaarelkaarelson BertForQuestionAnswering from kaarelkaarelson +author: John Snow Labs +name: finetuned_bert_base_multilingual_cased_kaarelkaarelson +date: 2023-11-17 +tags: [bert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_bert_base_multilingual_cased_kaarelkaarelson` is a multilingual model originally trained by kaarelkaarelson.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_multilingual_cased_kaarelkaarelson_xx_5.2.0_3.0_1700199417568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_bert_base_multilingual_cased_kaarelkaarelson_xx_5.2.0_3.0_1700199417568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("finetuned_bert_base_multilingual_cased_kaarelkaarelson","xx") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("finetuned_bert_base_multilingual_cased_kaarelkaarelson", "xx") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_bert_base_multilingual_cased_kaarelkaarelson| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|665.1 MB| + +## References + +https://huggingface.co/kaarelkaarelson/finetuned-bert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-heq_en.md b/docs/_posts/ahmedlone127/2023-11-17-heq_en.md new file mode 100644 index 000000000000..c9e2586ec4f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-heq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English heq BertForQuestionAnswering from amirdnc +author: John Snow Labs +name: heq +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `heq` is an English model originally trained by amirdnc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/heq_en_5.2.0_3.0_1700185821460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/heq_en_5.2.0_3.0_1700185821460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
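+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("heq","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("heq", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +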
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|heq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/amirdnc/HeQ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-hotel_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-17-hotel_qa_model_en.md new file mode 100644 index 000000000000..8ce8cb243f0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-hotel_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hotel_qa_model BertForQuestionAnswering from nova-sqoin +author: John Snow Labs +name: hotel_qa_model +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hotel_qa_model` is an English model originally trained by nova-sqoin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hotel_qa_model_en_5.2.0_3.0_1700210132942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hotel_qa_model_en_5.2.0_3.0_1700210132942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
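+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("hotel_qa_model","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("hotel_qa_model", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +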
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hotel_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|405.0 MB| + +## References + +https://huggingface.co/nova-sqoin/hotel_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3_en.md b/docs/_posts/ahmedlone127/2023-11-17-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3_en.md new file mode 100644 index 000000000000..ef6eef9c9456 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3 BertForQuestionAnswering from Tural +author: John Snow Labs +name: how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3` is an English model originally trained by Tural.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3_en_5.2.0_3.0_1700198227946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3_en_5.2.0_3.0_1700198227946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/Tural/How_to_fine-tune_a_model_for_common_downstream_tasks_V3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver12_en.md b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver12_en.md new file mode 100644 index 000000000000..b2e463dc4a89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_part2_ver12 BertForQuestionAnswering from weiiiii0622 +author: John Snow Labs +name: hw1_part2_ver12 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_part2_ver12` is an English model originally trained by weiiiii0622. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_part2_ver12_en_5.2.0_3.0_1700210427798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_part2_ver12_en_5.2.0_3.0_1700210427798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
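+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("hw1_part2_ver12","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("hw1_part2_ver12", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +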
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_part2_ver12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/weiiiii0622/HW1_Part2_ver12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver26_en.md b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver26_en.md new file mode 100644 index 000000000000..33f443df6df2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver26_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_part2_ver26 BertForQuestionAnswering from weiiiii0622 +author: John Snow Labs +name: hw1_part2_ver26 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_part2_ver26` is an English model originally trained by weiiiii0622. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_part2_ver26_en_5.2.0_3.0_1700204457155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_part2_ver26_en_5.2.0_3.0_1700204457155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
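+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("hw1_part2_ver26","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("hw1_part2_ver26", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +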
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_part2_ver26| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/weiiiii0622/HW1_Part2_ver26 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver27_en.md b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver27_en.md new file mode 100644 index 000000000000..35e6dbeb7df1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-hw1_part2_ver27_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_part2_ver27 BertForQuestionAnswering from weiiiii0622 +author: John Snow Labs +name: hw1_part2_ver27 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_part2_ver27` is an English model originally trained by weiiiii0622. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_part2_ver27_en_5.2.0_3.0_1700200967071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_part2_ver27_en_5.2.0_3.0_1700200967071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
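+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + +spanClassifier = BertForQuestionAnswering.pretrained("hw1_part2_ver27","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative input; assumes an active `spark` session + +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // enables .toDF; assumes an active `spark` session + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering +    .pretrained("hw1_part2_ver27", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative input +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) +``` +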
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_part2_ver27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/weiiiii0622/HW1_Part2_ver27 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-hw1_span_selection_en.md b/docs/_posts/ahmedlone127/2023-11-17-hw1_span_selection_en.md new file mode 100644 index 000000000000..9ff843e3c2fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-hw1_span_selection_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_span_selection BertForQuestionAnswering from kyle0518 +author: John Snow Labs +name: hw1_span_selection +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_span_selection` is an English model originally trained by kyle0518. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_span_selection_en_5.2.0_3.0_1700200784771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_span_selection_en_5.2.0_3.0_1700200784771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("hw1_span_selection","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("hw1_span_selection", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_span_selection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.0 MB| + +## References + +https://huggingface.co/kyle0518/HW1_span_selection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-inlegalbert_cbp_lkg_qa_triples_w_context_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-17-inlegalbert_cbp_lkg_qa_triples_w_context_finetuned_en.md new file mode 100644 index 000000000000..2536f036400d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-inlegalbert_cbp_lkg_qa_triples_w_context_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English inlegalbert_cbp_lkg_qa_triples_w_context_finetuned BertForQuestionAnswering from kinshuk-h +author: John Snow Labs +name: inlegalbert_cbp_lkg_qa_triples_w_context_finetuned +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `inlegalbert_cbp_lkg_qa_triples_w_context_finetuned` is an English model originally trained by kinshuk-h. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/inlegalbert_cbp_lkg_qa_triples_w_context_finetuned_en_5.2.0_3.0_1700219140911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/inlegalbert_cbp_lkg_qa_triples_w_context_finetuned_en_5.2.0_3.0_1700219140911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("inlegalbert_cbp_lkg_qa_triples_w_context_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("inlegalbert_cbp_lkg_qa_triples_w_context_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|inlegalbert_cbp_lkg_qa_triples_w_context_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/kinshuk-h/InLegalBERT-cbp-lkg-qa-triples-w-context-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-kanuri_bert_qa_en.md b/docs/_posts/ahmedlone127/2023-11-17-kanuri_bert_qa_en.md new file mode 100644 index 000000000000..bbd68b9f74aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-kanuri_bert_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kanuri_bert_qa BertForQuestionAnswering from J1won7 +author: John Snow Labs +name: kanuri_bert_qa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kanuri_bert_qa` is an English model originally trained by J1won7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kanuri_bert_qa_en_5.2.0_3.0_1700191610338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kanuri_bert_qa_en_5.2.0_3.0_1700191610338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("kanuri_bert_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("kanuri_bert_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kanuri_bert_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|367.3 MB| + +## References + +https://huggingface.co/J1won7/kr-bert-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-klue_bert_base_finetuned_squad_kor_v1_ko.md b/docs/_posts/ahmedlone127/2023-11-17-klue_bert_base_finetuned_squad_kor_v1_ko.md new file mode 100644 index 000000000000..0d038c76d010 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-klue_bert_base_finetuned_squad_kor_v1_ko.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Korean klue_bert_base_finetuned_squad_kor_v1 BertForQuestionAnswering from yjgwak +author: John Snow Labs +name: klue_bert_base_finetuned_squad_kor_v1 +date: 2023-11-17 +tags: [bert, ko, open_source, question_answering, onnx] +task: Question Answering +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`klue_bert_base_finetuned_squad_kor_v1` is a Korean model originally trained by yjgwak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/klue_bert_base_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700205694463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/klue_bert_base_finetuned_squad_kor_v1_ko_5.2.0_3.0_1700205694463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("klue_bert_base_finetuned_squad_kor_v1","ko") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("klue_bert_base_finetuned_squad_kor_v1", "ko") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|klue_bert_base_finetuned_squad_kor_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ko| +|Size:|412.4 MB| + +## References + +https://huggingface.co/yjgwak/klue-bert-base-finetuned-squad-kor-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-legal_document_question_answering_zh.md b/docs/_posts/ahmedlone127/2023-11-17-legal_document_question_answering_zh.md new file mode 100644 index 000000000000..34608015d4a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-legal_document_question_answering_zh.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Chinese legal_document_question_answering BertForQuestionAnswering from NchuNLP +author: John Snow Labs +name: legal_document_question_answering +date: 2023-11-17 +tags: [bert, zh, open_source, question_answering, onnx] +task: Question Answering +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`legal_document_question_answering` is a Chinese model originally trained by NchuNLP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_document_question_answering_zh_5.2.0_3.0_1700185572714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_document_question_answering_zh_5.2.0_3.0_1700185572714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("legal_document_question_answering","zh") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("legal_document_question_answering", "zh") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_document_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|zh| +|Size:|381.0 MB| + +## References + +https://huggingface.co/NchuNLP/Legal-Document-Question-Answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-matscibert_qa_en.md b/docs/_posts/ahmedlone127/2023-11-17-matscibert_qa_en.md new file mode 100644 index 000000000000..2bca4bde55aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-matscibert_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English matscibert_qa BertForQuestionAnswering from vrx2 +author: John Snow Labs +name: matscibert_qa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `matscibert_qa` is an English model originally trained by vrx2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/matscibert_qa_en_5.2.0_3.0_1700194352540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/matscibert_qa_en_5.2.0_3.0_1700194352540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("matscibert_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("matscibert_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|matscibert_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|409.9 MB| + +## References + +https://huggingface.co/vrx2/matscibert-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-mbert_quoref_en.md b/docs/_posts/ahmedlone127/2023-11-17-mbert_quoref_en.md new file mode 100644 index 000000000000..9ca80005383f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-mbert_quoref_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_quoref BertForQuestionAnswering from intanm +author: John Snow Labs +name: mbert_quoref +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_quoref` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_quoref_en_5.2.0_3.0_1700211289902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_quoref_en_5.2.0_3.0_1700211289902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("mbert_quoref","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("mbert_quoref", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_quoref| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/intanm/mbert-quoref \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-mbert_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-17-mbert_squadv2_en.md new file mode 100644 index 000000000000..7276576161a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-mbert_squadv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mbert_squadv2 BertForQuestionAnswering from intanm +author: John Snow Labs +name: mbert_squadv2 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbert_squadv2` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbert_squadv2_en_5.2.0_3.0_1700193166276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbert_squadv2_en_5.2.0_3.0_1700193166276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("mbert_squadv2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("mbert_squadv2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbert_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.1 MB| + +## References + +https://huggingface.co/intanm/mbert-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-nlp_spring23_hw4_question_answering_g13_en.md b/docs/_posts/ahmedlone127/2023-11-17-nlp_spring23_hw4_question_answering_g13_en.md new file mode 100644 index 000000000000..24bb25f5abc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-nlp_spring23_hw4_question_answering_g13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp_spring23_hw4_question_answering_g13 BertForQuestionAnswering from parsi-ai-nlpclass +author: John Snow Labs +name: nlp_spring23_hw4_question_answering_g13 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_spring23_hw4_question_answering_g13` is an English model originally trained by parsi-ai-nlpclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_spring23_hw4_question_answering_g13_en_5.2.0_3.0_1700219141074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_spring23_hw4_question_answering_g13_en_5.2.0_3.0_1700219141074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("nlp_spring23_hw4_question_answering_g13","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("nlp_spring23_hw4_question_answering_g13", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_spring23_hw4_question_answering_g13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|606.5 MB| + +## References + +https://huggingface.co/parsi-ai-nlpclass/NLP_Spring23_HW4_Question_Answering_G13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-ntu_adl_span_selection_cluecorpussmall_en.md b/docs/_posts/ahmedlone127/2023-11-17-ntu_adl_span_selection_cluecorpussmall_en.md new file mode 100644 index 000000000000..46c37305aa0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-ntu_adl_span_selection_cluecorpussmall_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ntu_adl_span_selection_cluecorpussmall BertForQuestionAnswering from xjlulu +author: John Snow Labs +name: ntu_adl_span_selection_cluecorpussmall +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ntu_adl_span_selection_cluecorpussmall` is an English model originally trained by xjlulu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_cluecorpussmall_en_5.2.0_3.0_1700205472025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ntu_adl_span_selection_cluecorpussmall_en_5.2.0_3.0_1700205472025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("ntu_adl_span_selection_cluecorpussmall","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("ntu_adl_span_selection_cluecorpussmall", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ntu_adl_span_selection_cluecorpussmall| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|381.1 MB| + +## References + +https://huggingface.co/xjlulu/ntu_adl_span_selection_cluecorpussmall \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-parsbert_finetuned_persianqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-parsbert_finetuned_persianqa_en.md new file mode 100644 index 000000000000..ee527052b3c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-parsbert_finetuned_persianqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English parsbert_finetuned_persianqa BertForQuestionAnswering from marzinouri +author: John Snow Labs +name: parsbert_finetuned_persianqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parsbert_finetuned_persianqa` is an English model originally trained by marzinouri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parsbert_finetuned_persianqa_en_5.2.0_3.0_1700188118419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parsbert_finetuned_persianqa_en_5.2.0_3.0_1700188118419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("parsbert_finetuned_persianqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("parsbert_finetuned_persianqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parsbert_finetuned_persianqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|441.6 MB| + +## References + +https://huggingface.co/marzinouri/parsbert-finetuned-persianQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-polaris_bert_0000_en.md b/docs/_posts/ahmedlone127/2023-11-17-polaris_bert_0000_en.md new file mode 100644 index 000000000000..dd4f53c4e365 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-polaris_bert_0000_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English polaris_bert_0000 BertForQuestionAnswering from logoyazilim +author: John Snow Labs +name: polaris_bert_0000 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `polaris_bert_0000` is an English model originally trained by logoyazilim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/polaris_bert_0000_en_5.2.0_3.0_1700196764592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/polaris_bert_0000_en_5.2.0_3.0_1700196764592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("polaris_bert_0000","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("polaris_bert_0000", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|polaris_bert_0000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.2 MB| + +## References + +https://huggingface.co/logoyazilim/polaris_bert_0000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-pubmed_bert_squad_covidqa_en.md b/docs/_posts/ahmedlone127/2023-11-17-pubmed_bert_squad_covidqa_en.md new file mode 100644 index 000000000000..1c9304db414f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-pubmed_bert_squad_covidqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pubmed_bert_squad_covidqa BertForQuestionAnswering from Sarmila +author: John Snow Labs +name: pubmed_bert_squad_covidqa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pubmed_bert_squad_covidqa` is an English model originally trained by Sarmila. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pubmed_bert_squad_covidqa_en_5.2.0_3.0_1700204374667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pubmed_bert_squad_covidqa_en_5.2.0_3.0_1700204374667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("pubmed_bert_squad_covidqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("pubmed_bert_squad_covidqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pubmed_bert_squad_covidqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.1 MB| + +## References + +https://huggingface.co/Sarmila/pubmed-bert-squad-covidqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-qa_history_saudi_ar.md b/docs/_posts/ahmedlone127/2023-11-17-qa_history_saudi_ar.md new file mode 100644 index 000000000000..0beacb9a323d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-qa_history_saudi_ar.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Arabic qa_history_saudi BertForQuestionAnswering from IAMNawaf +author: John Snow Labs +name: qa_history_saudi +date: 2023-11-17 +tags: [bert, ar, open_source, question_answering, onnx] +task: Question Answering +language: ar +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_history_saudi` is an Arabic model originally trained by IAMNawaf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_history_saudi_ar_5.2.0_3.0_1700215568855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_history_saudi_ar_5.2.0_3.0_1700215568855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("qa_history_saudi","ar") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("qa_history_saudi", "ar") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_history_saudi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ar| +|Size:|505.1 MB| + +## References + +https://huggingface.co/IAMNawaf/QA-History-Saudi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-qa_persian_complete_en.md b/docs/_posts/ahmedlone127/2023-11-17-qa_persian_complete_en.md new file mode 100644 index 000000000000..9a6373c8174c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-qa_persian_complete_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_persian_complete BertForQuestionAnswering from AliBagherz +author: John Snow Labs +name: qa_persian_complete +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_persian_complete` is an English model originally trained by AliBagherz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_persian_complete_en_5.2.0_3.0_1700187055678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_persian_complete_en_5.2.0_3.0_1700187055678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("qa_persian_complete","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("qa_persian_complete", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_persian_complete| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|606.5 MB| + +## References + +https://huggingface.co/AliBagherz/qa-persian-complete \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-qamodelforpathopubmed_en.md b/docs/_posts/ahmedlone127/2023-11-17-qamodelforpathopubmed_en.md new file mode 100644 index 000000000000..3614a3d17cf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-qamodelforpathopubmed_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qamodelforpathopubmed BertForQuestionAnswering from Galahad3x +author: John Snow Labs +name: qamodelforpathopubmed +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qamodelforpathopubmed` is an English model originally trained by Galahad3x. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qamodelforpathopubmed_en_5.2.0_3.0_1700224592020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qamodelforpathopubmed_en_5.2.0_3.0_1700224592020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("qamodelforpathopubmed","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("qamodelforpathopubmed", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qamodelforpathopubmed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|408.2 MB| + +## References + +https://huggingface.co/Galahad3x/QAModelForPathoPubMed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-question_answering_based_on_bert_en.md b/docs/_posts/ahmedlone127/2023-11-17-question_answering_based_on_bert_en.md new file mode 100644 index 000000000000..aa4d9ddac485 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-question_answering_based_on_bert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_based_on_bert BertForQuestionAnswering from notoookay +author: John Snow Labs +name: question_answering_based_on_bert +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_based_on_bert` is an English model originally trained by notoookay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_based_on_bert_en_5.2.0_3.0_1700200011423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_based_on_bert_en_5.2.0_3.0_1700200011423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("question_answering_based_on_bert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("question_answering_based_on_bert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_based_on_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/notoookay/question-answering-based-on-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-question_answering_bert_base_cased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-17-question_answering_bert_base_cased_squad2_en.md new file mode 100644 index 000000000000..729eb058bdc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-question_answering_bert_base_cased_squad2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_bert_base_cased_squad2 BertForQuestionAnswering from TunahanGokcimen +author: John Snow Labs +name: question_answering_bert_base_cased_squad2 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_bert_base_cased_squad2` is an English model originally trained by TunahanGokcimen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_bert_base_cased_squad2_en_5.2.0_3.0_1700194348732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_bert_base_cased_squad2_en_5.2.0_3.0_1700194348732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("question_answering_bert_base_cased_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("question_answering_bert_base_cased_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_bert_base_cased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|403.6 MB| + +## References + +https://huggingface.co/TunahanGokcimen/Question-Answering-Bert-base-cased-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-question_answering_ican_en.md b/docs/_posts/ahmedlone127/2023-11-17-question_answering_ican_en.md new file mode 100644 index 000000000000..ce4c05ddf0c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-question_answering_ican_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_ican BertForQuestionAnswering from LDY +author: John Snow Labs +name: question_answering_ican +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_ican` is an English model originally trained by LDY. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_ican_en_5.2.0_3.0_1700222980427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_ican_en_5.2.0_3.0_1700222980427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("question_answering_ican","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("question_answering_ican", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_ican| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|380.8 MB| + +## References + +https://huggingface.co/LDY/Question-Answering-Ican \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-rubert_base_sberquad_ai_f_en.md b/docs/_posts/ahmedlone127/2023-11-17-rubert_base_sberquad_ai_f_en.md new file mode 100644 index 000000000000..5b7cfc571133 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-rubert_base_sberquad_ai_f_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_base_sberquad_ai_f BertForQuestionAnswering from Mathnub +author: John Snow Labs +name: rubert_base_sberquad_ai_f +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_base_sberquad_ai_f` is an English model originally trained by Mathnub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_base_sberquad_ai_f_en_5.2.0_3.0_1700224592247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_base_sberquad_ai_f_en_5.2.0_3.0_1700224592247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("rubert_base_sberquad_ai_f","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("rubert_base_sberquad_ai_f", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_base_sberquad_ai_f| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|667.0 MB| + +## References + +https://huggingface.co/Mathnub/rubert-base-sberquad_ai-f \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-rubert_dpbase_sberquad_en.md b/docs/_posts/ahmedlone127/2023-11-17-rubert_dpbase_sberquad_en.md new file mode 100644 index 000000000000..c8e81ad8bbfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-rubert_dpbase_sberquad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_dpbase_sberquad BertForQuestionAnswering from Mathnub +author: John Snow Labs +name: rubert_dpbase_sberquad +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_dpbase_sberquad` is an English model originally trained by Mathnub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_dpbase_sberquad_en_5.2.0_3.0_1700221331320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_dpbase_sberquad_en_5.2.0_3.0_1700221331320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("rubert_dpbase_sberquad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("rubert_dpbase_sberquad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_dpbase_sberquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Mathnub/rubert-DPbase-sberquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-rubert_tiny2_qa_en.md b/docs/_posts/ahmedlone127/2023-11-17-rubert_tiny2_qa_en.md new file mode 100644 index 000000000000..1234ace77089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-rubert_tiny2_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rubert_tiny2_qa BertForQuestionAnswering from Stacy123 +author: John Snow Labs +name: rubert_tiny2_qa +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rubert_tiny2_qa` is an English model originally trained by Stacy123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rubert_tiny2_qa_en_5.2.0_3.0_1700217532163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rubert_tiny2_qa_en_5.2.0_3.0_1700217532163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("rubert_tiny2_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("rubert_tiny2_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rubert_tiny2_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|664.3 MB| + +## References + +https://huggingface.co/Stacy123/rubert_tiny2_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-spanbert_squad_finetuned_qa_tf32_en.md b/docs/_posts/ahmedlone127/2023-11-17-spanbert_squad_finetuned_qa_tf32_en.md new file mode 100644 index 000000000000..88d004891ca5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-spanbert_squad_finetuned_qa_tf32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English spanbert_squad_finetuned_qa_tf32 BertForQuestionAnswering from botcon +author: John Snow Labs +name: spanbert_squad_finetuned_qa_tf32 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanbert_squad_finetuned_qa_tf32` is an English model originally trained by botcon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanbert_squad_finetuned_qa_tf32_en_5.2.0_3.0_1700222835835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanbert_squad_finetuned_qa_tf32_en_5.2.0_3.0_1700222835835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("spanbert_squad_finetuned_qa_tf32","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("spanbert_squad_finetuned_qa_tf32", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanbert_squad_finetuned_qa_tf32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|402.5 MB| + +## References + +https://huggingface.co/botcon/SpanBERT_squad_finetuned_qa_tf32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-spec_seq_lab_bengali_en.md b/docs/_posts/ahmedlone127/2023-11-17-spec_seq_lab_bengali_en.md new file mode 100644 index 000000000000..4dcda7277368 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-spec_seq_lab_bengali_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English spec_seq_lab_bengali BertForQuestionAnswering from mathildeparlo +author: John Snow Labs +name: spec_seq_lab_bengali +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spec_seq_lab_bengali` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spec_seq_lab_bengali_en_5.2.0_3.0_1700207170871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spec_seq_lab_bengali_en_5.2.0_3.0_1700207170871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("spec_seq_lab_bengali","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("spec_seq_lab_bengali", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spec_seq_lab_bengali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|625.5 MB| + +## References + +https://huggingface.co/mathildeparlo/spec_seq_lab_bengali \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-sports_klue_finetuned_korquad_en.md b/docs/_posts/ahmedlone127/2023-11-17-sports_klue_finetuned_korquad_en.md new file mode 100644 index 000000000000..8ffc1cedfeea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-sports_klue_finetuned_korquad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sports_klue_finetuned_korquad BertForQuestionAnswering from Kdogs +author: John Snow Labs +name: sports_klue_finetuned_korquad +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sports_klue_finetuned_korquad` is an English model originally trained by Kdogs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sports_klue_finetuned_korquad_en_5.2.0_3.0_1700205629288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sports_klue_finetuned_korquad_en_5.2.0_3.0_1700205629288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("sports_klue_finetuned_korquad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("sports_klue_finetuned_korquad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sports_klue_finetuned_korquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|412.5 MB| + +## References + +https://huggingface.co/Kdogs/sports_klue_finetuned_korquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-squad_bert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-17-squad_bert_base_uncased_en.md new file mode 100644 index 000000000000..99aaa6146002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-squad_bert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English squad_bert_base_uncased BertForQuestionAnswering from diffuserconfuser +author: John Snow Labs +name: squad_bert_base_uncased +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_bert_base_uncased` is an English model originally trained by diffuserconfuser. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_bert_base_uncased_en_5.2.0_3.0_1700217494264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_bert_base_uncased_en_5.2.0_3.0_1700217494264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("squad_bert_base_uncased","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("squad_bert_base_uncased", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_bert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/diffuserconfuser/squad-bert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-test_bert_3_en.md b/docs/_posts/ahmedlone127/2023-11-17-test_bert_3_en.md new file mode 100644 index 000000000000..e7d99a11b1ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-test_bert_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_bert_3 BertForQuestionAnswering from hung200504 +author: John Snow Labs +name: test_bert_3 +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_bert_3` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_bert_3_en_5.2.0_3.0_1700195866022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_bert_3_en_5.2.0_3.0_1700195866022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("test_bert_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("test_bert_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_bert_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|407.2 MB| + +## References + +https://huggingface.co/hung200504/test-bert-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-tiny_random_bertforquestionanswering_en.md b/docs/_posts/ahmedlone127/2023-11-17-tiny_random_bertforquestionanswering_en.md new file mode 100644 index 000000000000..02cecd80efd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-tiny_random_bertforquestionanswering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_random_bertforquestionanswering BertForQuestionAnswering from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_bertforquestionanswering +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_bertforquestionanswering` is an English model originally trained by hf-tiny-model-private. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_bertforquestionanswering_en_5.2.0_3.0_1700191644425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_bertforquestionanswering_en_5.2.0_3.0_1700191644425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("tiny_random_bertforquestionanswering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("tiny_random_bertforquestionanswering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_bertforquestionanswering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|346.4 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-BertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-17-wspalign_ft_kftt_en.md b/docs/_posts/ahmedlone127/2023-11-17-wspalign_ft_kftt_en.md new file mode 100644 index 000000000000..bc795cde5705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-17-wspalign_ft_kftt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wspalign_ft_kftt BertForQuestionAnswering from qiyuw +author: John Snow Labs +name: wspalign_ft_kftt +date: 2023-11-17 +tags: [bert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: BertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wspalign_ft_kftt` is an English model originally trained by qiyuw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wspalign_ft_kftt_en_5.2.0_3.0_1700190715563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wspalign_ft_kftt_en_5.2.0_3.0_1700190715563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = BertForQuestionAnswering.pretrained("wspalign_ft_kftt","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = BertForQuestionAnswering + .pretrained("wspalign_ft_kftt", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wspalign_ft_kftt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|665.0 MB| + +## References + +https://huggingface.co/qiyuw/WSPAlign-ft-kftt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-03_model_sales_en.md b/docs/_posts/ahmedlone127/2023-11-18-03_model_sales_en.md new file mode 100644 index 000000000000..788be9abec35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-03_model_sales_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 03_model_sales DistilBertForSequenceClassification from hannoh +author: John Snow Labs +name: 03_model_sales +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `03_model_sales` is an English model originally trained by hannoh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/03_model_sales_en_5.2.0_3.0_1700339455179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/03_model_sales_en_5.2.0_3.0_1700339455179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("03_model_sales","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("03_model_sales","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|03_model_sales| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hannoh/03_model_sales \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-1023_en.md b/docs/_posts/ahmedlone127/2023-11-18-1023_en.md new file mode 100644 index 000000000000..4fc8363ee79a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-1023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 1023 DistilBertForSequenceClassification from tingchih +author: John Snow Labs +name: 1023 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1023` is an English model originally trained by tingchih. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1023_en_5.2.0_3.0_1700345642807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1023_en_5.2.0_3.0_1700345642807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("1023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("1023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|1023| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tingchih/1023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-action_decisions_en.md b/docs/_posts/ahmedlone127/2023-11-18-action_decisions_en.md new file mode 100644 index 000000000000..681b2edafae2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-action_decisions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English action_decisions DistilBertForSequenceClassification from knkarthick +author: John Snow Labs +name: action_decisions +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `action_decisions` is an English model originally trained by knkarthick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/action_decisions_en_5.2.0_3.0_1700347033471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/action_decisions_en_5.2.0_3.0_1700347033471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("action_decisions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("action_decisions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|action_decisions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/knkarthick/Action_Decisions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-action_items_en.md b/docs/_posts/ahmedlone127/2023-11-18-action_items_en.md new file mode 100644 index 000000000000..828543d2a4af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-action_items_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English action_items DistilBertForSequenceClassification from knkarthick +author: John Snow Labs +name: action_items +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `action_items` is an English model originally trained by knkarthick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/action_items_en_5.2.0_3.0_1700341844870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/action_items_en_5.2.0_3.0_1700341844870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("action_items","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("action_items","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|action_items| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/knkarthick/Action_Items \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-aes_ielts_en.md b/docs/_posts/ahmedlone127/2023-11-18-aes_ielts_en.md new file mode 100644 index 000000000000..882d3d0fedd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-aes_ielts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English aes_ielts DistilBertForSequenceClassification from tkharisov7 +author: John Snow Labs +name: aes_ielts +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aes_ielts` is an English model originally trained by tkharisov7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aes_ielts_en_5.2.0_3.0_1700347010917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aes_ielts_en_5.2.0_3.0_1700347010917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("aes_ielts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("aes_ielts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aes_ielts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/tkharisov7/aes-ielts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-agent_customer_cls_en.md b/docs/_posts/ahmedlone127/2023-11-18-agent_customer_cls_en.md new file mode 100644 index 000000000000..f6823ab46953 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-agent_customer_cls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English agent_customer_cls DistilBertForSequenceClassification from utterworks +author: John Snow Labs +name: agent_customer_cls +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `agent_customer_cls` is an English model originally trained by utterworks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/agent_customer_cls_en_5.2.0_3.0_1700344286073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/agent_customer_cls_en_5.2.0_3.0_1700344286073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("agent_customer_cls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("agent_customer_cls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|agent_customer_cls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/utterworks/agent-customer-cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-ai_voice_assistant_en.md b/docs/_posts/ahmedlone127/2023-11-18-ai_voice_assistant_en.md new file mode 100644 index 000000000000..973bddc09bde --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-ai_voice_assistant_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ai_voice_assistant DistilBertForSequenceClassification from qnlbnsl +author: John Snow Labs +name: ai_voice_assistant +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ai_voice_assistant` is an English model originally trained by qnlbnsl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_voice_assistant_en_5.2.0_3.0_1700350403569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_voice_assistant_en_5.2.0_3.0_1700350403569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ai_voice_assistant","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ai_voice_assistant","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_voice_assistant| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/qnlbnsl/ai_voice_assistant \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-amazon_query_product_ranking_en.md b/docs/_posts/ahmedlone127/2023-11-18-amazon_query_product_ranking_en.md new file mode 100644 index 000000000000..cec0f6173285 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-amazon_query_product_ranking_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English amazon_query_product_ranking DistilBertForSequenceClassification from LiYuan +author: John Snow Labs +name: amazon_query_product_ranking +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_query_product_ranking` is an English model originally trained by LiYuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_query_product_ranking_en_5.2.0_3.0_1700339958064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_query_product_ranking_en_5.2.0_3.0_1700339958064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("amazon_query_product_ranking","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("amazon_query_product_ranking","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_query_product_ranking| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LiYuan/amazon-query-product-ranking \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-angry_birds_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-angry_birds_classifier_en.md new file mode 100644 index 000000000000..537c64567f06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-angry_birds_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English angry_birds_classifier DistilBertForSequenceClassification from wesleyacheng +author: John Snow Labs +name: angry_birds_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `angry_birds_classifier` is an English model originally trained by wesleyacheng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/angry_birds_classifier_en_5.2.0_3.0_1700345891167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/angry_birds_classifier_en_5.2.0_3.0_1700345891167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("angry_birds_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("angry_birds_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|angry_birds_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/wesleyacheng/angry-birds-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-anomaly_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-anomaly_detection_en.md new file mode 100644 index 000000000000..ec1bfe290ddf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-anomaly_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English anomaly_detection DistilBertForSequenceClassification from SrimathiE21ALR044 +author: John Snow Labs +name: anomaly_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `anomaly_detection` is an English model originally trained by SrimathiE21ALR044. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/anomaly_detection_en_5.2.0_3.0_1700350809132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/anomaly_detection_en_5.2.0_3.0_1700350809132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("anomaly_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("anomaly_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|anomaly_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SrimathiE21ALR044/Anomaly-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-arabic_base_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-arabic_base_model_en.md new file mode 100644 index 000000000000..88de90ab5991 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-arabic_base_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English arabic_base_model DistilBertForSequenceClassification from mathildeparlo +author: John Snow Labs +name: arabic_base_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic_base_model` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabic_base_model_en_5.2.0_3.0_1700342186291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabic_base_model_en_5.2.0_3.0_1700342186291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("arabic_base_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("arabic_base_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabic_base_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mathildeparlo/ar_base_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-arxiv_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-18-arxiv_distilbert_base_cased_en.md new file mode 100644 index 000000000000..ee0d120c9f6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-arxiv_distilbert_base_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English arxiv_distilbert_base_cased DistilBertForSequenceClassification from Wi +author: John Snow Labs +name: arxiv_distilbert_base_cased +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arxiv_distilbert_base_cased` is an English model originally trained by Wi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arxiv_distilbert_base_cased_en_5.2.0_3.0_1700343368898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arxiv_distilbert_base_cased_en_5.2.0_3.0_1700343368898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("arxiv_distilbert_base_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("arxiv_distilbert_base_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arxiv_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/Wi/arxiv-distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-autonlp_imdb_sentiment_analysis2_7121569_en.md b/docs/_posts/ahmedlone127/2023-11-18-autonlp_imdb_sentiment_analysis2_7121569_en.md new file mode 100644 index 000000000000..43089aff69d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-autonlp_imdb_sentiment_analysis2_7121569_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autonlp_imdb_sentiment_analysis2_7121569 DistilBertForSequenceClassification from MICADEE +author: John Snow Labs +name: autonlp_imdb_sentiment_analysis2_7121569 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_imdb_sentiment_analysis2_7121569` is an English model originally trained by MICADEE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_imdb_sentiment_analysis2_7121569_en_5.2.0_3.0_1700337740291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_imdb_sentiment_analysis2_7121569_en_5.2.0_3.0_1700337740291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("autonlp_imdb_sentiment_analysis2_7121569","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("autonlp_imdb_sentiment_analysis2_7121569","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_imdb_sentiment_analysis2_7121569| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/MICADEE/autonlp-imdb-sentiment-analysis2-7121569 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-banglabert_with_tfmodel_en.md b/docs/_posts/ahmedlone127/2023-11-18-banglabert_with_tfmodel_en.md new file mode 100644 index 000000000000..f748bb14daad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-banglabert_with_tfmodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English banglabert_with_tfmodel DistilBertForSequenceClassification from SarwarShafee +author: John Snow Labs +name: banglabert_with_tfmodel +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `banglabert_with_tfmodel` is an English model originally trained by SarwarShafee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/banglabert_with_tfmodel_en_5.2.0_3.0_1700339465311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/banglabert_with_tfmodel_en_5.2.0_3.0_1700339465311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("banglabert_with_tfmodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("banglabert_with_tfmodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|banglabert_with_tfmodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|413.3 MB| + +## References + +https://huggingface.co/SarwarShafee/BanglaBert_with_TFModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-banking_intent_distilbert_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-banking_intent_distilbert_classifier_en.md new file mode 100644 index 000000000000..f54d6893074f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-banking_intent_distilbert_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English banking_intent_distilbert_classifier DistilBertForSequenceClassification from lxyuan +author: John Snow Labs +name: banking_intent_distilbert_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `banking_intent_distilbert_classifier` is an English model originally trained by lxyuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/banking_intent_distilbert_classifier_en_5.2.0_3.0_1700342805121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/banking_intent_distilbert_classifier_en_5.2.0_3.0_1700342805121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("banking_intent_distilbert_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("banking_intent_distilbert_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|banking_intent_distilbert_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/lxyuan/banking-intent-distilbert-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-baseline_subtaska_en.md b/docs/_posts/ahmedlone127/2023-11-18-baseline_subtaska_en.md new file mode 100644 index 000000000000..e3cbc70d10c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-baseline_subtaska_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English baseline_subtaska DistilBertForSequenceClassification from robertotraba +author: John Snow Labs +name: baseline_subtaska +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baseline_subtaska` is an English model originally trained by robertotraba. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baseline_subtaska_en_5.2.0_3.0_1700345189146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baseline_subtaska_en_5.2.0_3.0_1700345189146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("baseline_subtaska","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("baseline_subtaska","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baseline_subtaska| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/robertotraba/baseline_SubTaskA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bert_alekseykorshuk_en.md b/docs/_posts/ahmedlone127/2023-11-18-bert_alekseykorshuk_en.md new file mode 100644 index 000000000000..fcbf2e8256f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bert_alekseykorshuk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_alekseykorshuk DistilBertForSequenceClassification from AlekseyKorshuk +author: John Snow Labs +name: bert_alekseykorshuk +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_alekseykorshuk` is an English model originally trained by AlekseyKorshuk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_alekseykorshuk_en_5.2.0_3.0_1700344717864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_alekseykorshuk_en_5.2.0_3.0_1700344717864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_alekseykorshuk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_alekseykorshuk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_alekseykorshuk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AlekseyKorshuk/bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bert_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-bert_classification_model_en.md new file mode 100644 index 000000000000..9005d0fd05d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bert_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_classification_model DistilBertForSequenceClassification from maurosm +author: John Snow Labs +name: bert_classification_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_classification_model` is an English model originally trained by maurosm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_classification_model_en_5.2.0_3.0_1700342627429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_classification_model_en_5.2.0_3.0_1700342627429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/maurosm/bert_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bert_network_packet_flow_header_payload_en.md b/docs/_posts/ahmedlone127/2023-11-18-bert_network_packet_flow_header_payload_en.md new file mode 100644 index 000000000000..76e0082834e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bert_network_packet_flow_header_payload_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_network_packet_flow_header_payload DistilBertForSequenceClassification from rdpahalavan +author: John Snow Labs +name: bert_network_packet_flow_header_payload +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_network_packet_flow_header_payload` is an English model originally trained by rdpahalavan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_network_packet_flow_header_payload_en_5.2.0_3.0_1700346824536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_network_packet_flow_header_payload_en_5.2.0_3.0_1700346824536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_network_packet_flow_header_payload","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_network_packet_flow_header_payload","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_network_packet_flow_header_payload| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/rdpahalavan/bert-network-packet-flow-header-payload \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bert_news_en.md b/docs/_posts/ahmedlone127/2023-11-18-bert_news_en.md new file mode 100644 index 000000000000..b38baf6b6dac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bert_news_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English bert_news DistilBertEmbeddings from harvinder676 +author: John Snow Labs +name: bert_news +date: 2023-11-18 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_news` is an English model originally trained by harvinder676. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_news_en_5.2.0_3.0_1700349260738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_news_en_5.2.0_3.0_1700349260738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +embeddings = DistilBertEmbeddings.pretrained("bert_news","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val embeddings = DistilBertEmbeddings + .pretrained("bert_news", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/harvinder676/bert-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bert_sentimental_en.md b/docs/_posts/ahmedlone127/2023-11-18-bert_sentimental_en.md new file mode 100644 index 000000000000..c933c02b02c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bert_sentimental_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_sentimental DistilBertForSequenceClassification from bowipawan +author: John Snow Labs +name: bert_sentimental +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_sentimental` is an English model originally trained by bowipawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_sentimental_en_5.2.0_3.0_1700341326416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_sentimental_en_5.2.0_3.0_1700341326416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_sentimental","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_sentimental","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_sentimental| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bowipawan/bert-sentimental \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-binary_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-binary_classification_en.md new file mode 100644 index 000000000000..8e054c021a2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-binary_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English binary_classification DistilBertForSequenceClassification from autoevaluate +author: John Snow Labs +name: binary_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_classification` is an English model originally trained by autoevaluate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_classification_en_5.2.0_3.0_1700347036052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_classification_en_5.2.0_3.0_1700347036052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/autoevaluate/binary-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-binary_skills_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-binary_skills_classifier_en.md new file mode 100644 index 000000000000..c228f8cd916a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-binary_skills_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English binary_skills_classifier DistilBertForSequenceClassification from tkuye +author: John Snow Labs +name: binary_skills_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_skills_classifier` is an English model originally trained by tkuye. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_skills_classifier_en_5.2.0_3.0_1700342114060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_skills_classifier_en_5.2.0_3.0_1700342114060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_skills_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_skills_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_skills_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/tkuye/binary-skills-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-bl_books_genre_xx.md b/docs/_posts/ahmedlone127/2023-11-18-bl_books_genre_xx.md new file mode 100644 index 000000000000..eda2267f2a06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-bl_books_genre_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual bl_books_genre DistilBertForSequenceClassification from TheBritishLibrary +author: John Snow Labs +name: bl_books_genre +date: 2023-11-18 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bl_books_genre` is a Multilingual model originally trained by TheBritishLibrary. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bl_books_genre_xx_5.2.0_3.0_1700342452109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bl_books_genre_xx_5.2.0_3.0_1700342452109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bl_books_genre","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bl_books_genre","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bl_books_genre| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|246.0 MB| + +## References + +https://huggingface.co/TheBritishLibrary/bl-books-genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_chjun_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_chjun_en.md new file mode 100644 index 000000000000..6b008ba74f18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_chjun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_chjun DistilBertForSequenceClassification from chjun +author: John Snow Labs +name: burmese_awesome_model_chjun +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_chjun` is an English model originally trained by chjun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_chjun_en_5.2.0_3.0_1700347787273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_chjun_en_5.2.0_3.0_1700347787273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_chjun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_chjun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_chjun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/chjun/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_shengqin_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_shengqin_en.md new file mode 100644 index 000000000000..1408fa5e02e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_shengqin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_shengqin DistilBertForSequenceClassification from shengqin +author: John Snow Labs +name: burmese_awesome_model_shengqin +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_shengqin` is an English model originally trained by shengqin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_shengqin_en_5.2.0_3.0_1700348330193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_shengqin_en_5.2.0_3.0_1700348330193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_shengqin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_shengqin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_shengqin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/shengqin/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_stevhliu_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_stevhliu_en.md new file mode 100644 index 000000000000..dcdb76044ba5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_stevhliu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_stevhliu DistilBertForSequenceClassification from stevhliu +author: John Snow Labs +name: burmese_awesome_model_stevhliu +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_stevhliu` is an English model originally trained by stevhliu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_stevhliu_en_5.2.0_3.0_1700337299373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_stevhliu_en_5.2.0_3.0_1700337299373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_stevhliu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_stevhliu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_stevhliu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/stevhliu/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_v2_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_v2_en.md new file mode 100644 index 000000000000..2a5d57dd5630 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_awesome_model_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_v2 DistilBertForSequenceClassification from rarisenpai +author: John Snow Labs +name: burmese_awesome_model_v2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_v2` is an English model originally trained by rarisenpai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_v2_en_5.2.0_3.0_1700339112273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_v2_en_5.2.0_3.0_1700339112273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rarisenpai/my-awesome-model_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_chensyii_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_chensyii_en.md new file mode 100644 index 000000000000..14653edfdf9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_chensyii_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_bert_model_chensyii DistilBertForSequenceClassification from chensyii +author: John Snow Labs +name: burmese_bert_model_chensyii +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_bert_model_chensyii` is an English model originally trained by chensyii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_bert_model_chensyii_en_5.2.0_3.0_1700342985544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_bert_model_chensyii_en_5.2.0_3.0_1700342985544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_bert_model_chensyii","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_bert_model_chensyii","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_bert_model_chensyii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/chensyii/my_bert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_v2_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_v2_en.md new file mode 100644 index 000000000000..98cb3e30cb4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_bert_model_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_bert_model_v2 DistilBertForSequenceClassification from chensyii +author: John Snow Labs +name: burmese_bert_model_v2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_bert_model_v2` is an English model originally trained by chensyii. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_bert_model_v2_en_5.2.0_3.0_1700349258244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_bert_model_v2_en_5.2.0_3.0_1700349258244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_bert_model_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_bert_model_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_bert_model_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chensyii/my_bert_model_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_distilbert_model_tirendaz_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_distilbert_model_tirendaz_en.md new file mode 100644 index 000000000000..f2241aaeadc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_distilbert_model_tirendaz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_distilbert_model_tirendaz DistilBertForSequenceClassification from Tirendaz +author: John Snow Labs +name: burmese_distilbert_model_tirendaz +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_distilbert_model_tirendaz` is an English model originally trained by Tirendaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_distilbert_model_tirendaz_en_5.2.0_3.0_1700343934246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_distilbert_model_tirendaz_en_5.2.0_3.0_1700343934246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_distilbert_model_tirendaz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_distilbert_model_tirendaz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_distilbert_model_tirendaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Tirendaz/my_distilbert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_hospital_reco_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_hospital_reco_en.md new file mode 100644 index 000000000000..51d10965a4d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_hospital_reco_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_hospital_reco DistilBertForSequenceClassification from itsriya +author: John Snow Labs +name: burmese_hospital_reco +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_hospital_reco` is an English model originally trained by itsriya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_hospital_reco_en_5.2.0_3.0_1700347470228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_hospital_reco_en_5.2.0_3.0_1700347470228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_hospital_reco","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_hospital_reco","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_hospital_reco| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/itsriya/my_hospital_reco \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-burmese_sequenceclassification_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-burmese_sequenceclassification_model_en.md new file mode 100644 index 000000000000..4e343c035b49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-burmese_sequenceclassification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_sequenceclassification_model DistilBertForSequenceClassification from TiffanyTiffany +author: John Snow Labs +name: burmese_sequenceclassification_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_sequenceclassification_model` is an English model originally trained by TiffanyTiffany. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_sequenceclassification_model_en_5.2.0_3.0_1700351048904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_sequenceclassification_model_en_5.2.0_3.0_1700351048904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_sequenceclassification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_sequenceclassification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_sequenceclassification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/TiffanyTiffany/my_sequenceClassification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-byt_malurl_db_bu_en.md b/docs/_posts/ahmedlone127/2023-11-18-byt_malurl_db_bu_en.md new file mode 100644 index 000000000000..963355eab5b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-byt_malurl_db_bu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English byt_malurl_db_bu DistilBertForSequenceClassification from bgspaditya +author: John Snow Labs +name: byt_malurl_db_bu +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `byt_malurl_db_bu` is an English model originally trained by bgspaditya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/byt_malurl_db_bu_en_5.2.0_3.0_1700350091321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/byt_malurl_db_bu_en_5.2.0_3.0_1700350091321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("byt_malurl_db_bu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("byt_malurl_db_bu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|byt_malurl_db_bu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/bgspaditya/byt-malurl-db-bu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-cadd_nsfw_sfw_en.md b/docs/_posts/ahmedlone127/2023-11-18-cadd_nsfw_sfw_en.md new file mode 100644 index 000000000000..aa98b0534580 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-cadd_nsfw_sfw_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cadd_nsfw_sfw DistilBertForSequenceClassification from feruskas +author: John Snow Labs +name: cadd_nsfw_sfw +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cadd_nsfw_sfw` is an English model originally trained by feruskas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cadd_nsfw_sfw_en_5.2.0_3.0_1700351462923.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cadd_nsfw_sfw_en_5.2.0_3.0_1700351462923.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cadd_nsfw_sfw","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cadd_nsfw_sfw","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cadd_nsfw_sfw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/feruskas/CADD-NSFW-SFW \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-categorizer_en.md b/docs/_posts/ahmedlone127/2023-11-18-categorizer_en.md new file mode 100644 index 000000000000..69a7d56be19c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-categorizer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English categorizer DistilBertForSequenceClassification from passionMan +author: John Snow Labs +name: categorizer +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `categorizer` is an English model originally trained by passionMan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/categorizer_en_5.2.0_3.0_1700338648650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/categorizer_en_5.2.0_3.0_1700338648650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("categorizer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("categorizer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|categorizer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/passionMan/categorizer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-chatgpt_eli5_text_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-chatgpt_eli5_text_classifier_en.md new file mode 100644 index 000000000000..720278451a1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-chatgpt_eli5_text_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chatgpt_eli5_text_classifier DistilBertForSequenceClassification from RafiBrent +author: John Snow Labs +name: chatgpt_eli5_text_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_eli5_text_classifier` is an English model originally trained by RafiBrent. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_eli5_text_classifier_en_5.2.0_3.0_1700349959934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_eli5_text_classifier_en_5.2.0_3.0_1700349959934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt_eli5_text_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt_eli5_text_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt_eli5_text_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/RafiBrent/chatgpt_eli5_text_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-chatgpt_en.md b/docs/_posts/ahmedlone127/2023-11-18-chatgpt_en.md new file mode 100644 index 000000000000..145a6d4f7683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-chatgpt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chatgpt DistilBertForSequenceClassification from lewtun +author: John Snow Labs +name: chatgpt +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt` is an English model originally trained by lewtun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_en_5.2.0_3.0_1700346276509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_en_5.2.0_3.0_1700346276509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("chatgpt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/lewtun/chatgpt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-chucklewhizrater_en.md b/docs/_posts/ahmedlone127/2023-11-18-chucklewhizrater_en.md new file mode 100644 index 000000000000..e503d14d937a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-chucklewhizrater_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chucklewhizrater DistilBertForSequenceClassification from botbrain +author: John Snow Labs +name: chucklewhizrater +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chucklewhizrater` is an English model originally trained by botbrain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chucklewhizrater_en_5.2.0_3.0_1700340381257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chucklewhizrater_en_5.2.0_3.0_1700340381257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("chucklewhizrater","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("chucklewhizrater","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chucklewhizrater| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/botbrain/ChuckleWhizRater \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-claim3a_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-18-claim3a_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..2e850c6bf547 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-claim3a_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English claim3a_distilbert_base_uncased DistilBertForSequenceClassification from SCORE +author: John Snow Labs +name: claim3a_distilbert_base_uncased +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claim3a_distilbert_base_uncased` is an English model originally trained by SCORE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claim3a_distilbert_base_uncased_en_5.2.0_3.0_1700349670484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claim3a_distilbert_base_uncased_en_5.2.0_3.0_1700349670484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim3a_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim3a_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claim3a_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SCORE/claim3a-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-clasificador_news_en.md b/docs/_posts/ahmedlone127/2023-11-18-clasificador_news_en.md new file mode 100644 index 000000000000..9e74ccc125a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-clasificador_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English clasificador_news DistilBertForSequenceClassification from Alesteba +author: John Snow Labs +name: clasificador_news +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clasificador_news` is an English model originally trained by Alesteba. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clasificador_news_en_5.2.0_3.0_1700347962424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clasificador_news_en_5.2.0_3.0_1700347962424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificador_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificador_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clasificador_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Alesteba/clasificador-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-code_vs_dutch_en.md b/docs/_posts/ahmedlone127/2023-11-18-code_vs_dutch_en.md new file mode 100644 index 000000000000..4334b76708eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-code_vs_dutch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English code_vs_dutch DistilBertForSequenceClassification from usvsnsp +author: John Snow Labs +name: code_vs_dutch +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `code_vs_dutch` is an English model originally trained by usvsnsp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/code_vs_dutch_en_5.2.0_3.0_1700341977071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/code_vs_dutch_en_5.2.0_3.0_1700341977071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("code_vs_dutch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("code_vs_dutch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|code_vs_dutch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/usvsnsp/code-vs-nl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k_en.md b/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k_en.md new file mode 100644 index 000000000000..e19ac3ca103e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k DistilBertForSequenceClassification from vocab-transformers +author: John Snow Labs +name: cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k` is an English model originally trained by vocab-transformers.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k_en_5.2.0_3.0_1700345223017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k_en_5.2.0_3.0_1700345223017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_msmarco_distilbert_word2vec256k_mlm_400k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|887.9 MB| + +## References + +https://huggingface.co/vocab-transformers/cross_encoder-msmarco-distilbert-word2vec256k-MLM_400k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated_en.md b/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated_en.md new file mode 100644 index 000000000000..a7f1c98cdc41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated DistilBertForSequenceClassification from vocab-transformers +author: John Snow Labs +name: cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated` is an English model originally trained by vocab-transformers.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated_en_5.2.0_3.0_1700346735708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated_en_5.2.0_3.0_1700346735708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_msmarco_distilbert_word2vec256k_mlm_785k_emb_updated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|908.2 MB| + +## References + +https://huggingface.co/vocab-transformers/cross_encoder-msmarco-distilbert-word2vec256k-MLM_785k_emb_updated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-cs4375a1_class_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-cs4375a1_class_model_en.md new file mode 100644 index 000000000000..1824e1b613af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-cs4375a1_class_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cs4375a1_class_model DistilBertForSequenceClassification from suvelmuttreja +author: John Snow Labs +name: cs4375a1_class_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cs4375a1_class_model` is an English model originally trained by suvelmuttreja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cs4375a1_class_model_en_5.2.0_3.0_1700347787969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cs4375a1_class_model_en_5.2.0_3.0_1700347787969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cs4375a1_class_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cs4375a1_class_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cs4375a1_class_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/suvelmuttreja/cs4375a1_class_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-cyberbullying_sentiment_dsce_2023_en.md b/docs/_posts/ahmedlone127/2023-11-18-cyberbullying_sentiment_dsce_2023_en.md new file mode 100644 index 000000000000..e51c96d6fe4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-cyberbullying_sentiment_dsce_2023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cyberbullying_sentiment_dsce_2023 DistilBertForSequenceClassification from sreeniketh +author: John Snow Labs +name: cyberbullying_sentiment_dsce_2023 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cyberbullying_sentiment_dsce_2023` is an English model originally trained by sreeniketh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cyberbullying_sentiment_dsce_2023_en_5.2.0_3.0_1700346628196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cyberbullying_sentiment_dsce_2023_en_5.2.0_3.0_1700346628196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cyberbullying_sentiment_dsce_2023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cyberbullying_sentiment_dsce_2023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cyberbullying_sentiment_dsce_2023| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sreeniketh/cyberbullying_sentiment_dsce_2023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_monolith_xed_english_en.md b/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_monolith_xed_english_en.md new file mode 100644 index 000000000000..d008f8341cd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_monolith_xed_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dear_jarvis_monolith_xed_english DistilBertForSequenceClassification from JuliusAlphonso +author: John Snow Labs +name: dear_jarvis_monolith_xed_english +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dear_jarvis_monolith_xed_english` is an English model originally trained by JuliusAlphonso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dear_jarvis_monolith_xed_english_en_5.2.0_3.0_1700344575211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dear_jarvis_monolith_xed_english_en_5.2.0_3.0_1700344575211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dear_jarvis_monolith_xed_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dear_jarvis_monolith_xed_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dear_jarvis_monolith_xed_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/JuliusAlphonso/dear-jarvis-monolith-xed-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_v5_en.md b/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_v5_en.md new file mode 100644 index 000000000000..d2b0b8117221 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-dear_jarvis_v5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dear_jarvis_v5 DistilBertForSequenceClassification from JuliusAlphonso +author: John Snow Labs +name: dear_jarvis_v5 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dear_jarvis_v5` is an English model originally trained by JuliusAlphonso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dear_jarvis_v5_en_5.2.0_3.0_1700346446400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dear_jarvis_v5_en_5.2.0_3.0_1700346446400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dear_jarvis_v5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dear_jarvis_v5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dear_jarvis_v5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/JuliusAlphonso/dear-jarvis-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-depression_detection_model_v2_en.md b/docs/_posts/ahmedlone127/2023-11-18-depression_detection_model_v2_en.md new file mode 100644 index 000000000000..48f456234d08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-depression_detection_model_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English depression_detection_model_v2 DistilBertForSequenceClassification from DoryDing +author: John Snow Labs +name: depression_detection_model_v2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_detection_model_v2` is an English model originally trained by DoryDing. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_detection_model_v2_en_5.2.0_3.0_1700341998506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_detection_model_v2_en_5.2.0_3.0_1700341998506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("depression_detection_model_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("depression_detection_model_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|depression_detection_model_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DoryDing/Depression_Detection_Model_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-depression_model_tiemnd_en.md b/docs/_posts/ahmedlone127/2023-11-18-depression_model_tiemnd_en.md new file mode 100644 index 000000000000..e792e2fe1f06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-depression_model_tiemnd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English depression_model_tiemnd DistilBertForSequenceClassification from tiemnd +author: John Snow Labs +name: depression_model_tiemnd +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_model_tiemnd` is an English model originally trained by tiemnd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_model_tiemnd_en_5.2.0_3.0_1700350614064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_model_tiemnd_en_5.2.0_3.0_1700350614064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("depression_model_tiemnd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("depression_model_tiemnd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|depression_model_tiemnd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tiemnd/depression_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-depressionanalysis_en.md b/docs/_posts/ahmedlone127/2023-11-18-depressionanalysis_en.md new file mode 100644 index 000000000000..6d20888b8860 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-depressionanalysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English depressionanalysis DistilBertForSequenceClassification from sanskar +author: John Snow Labs +name: depressionanalysis +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depressionanalysis` is an English model originally trained by sanskar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depressionanalysis_en_5.2.0_3.0_1700339456466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depressionanalysis_en_5.2.0_3.0_1700339456466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("depressionanalysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("depressionanalysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|depressionanalysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sanskar/DepressionAnalysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-destilbert_uncased_fever_nli_en.md b/docs/_posts/ahmedlone127/2023-11-18-destilbert_uncased_fever_nli_en.md new file mode 100644 index 000000000000..8f40d9462301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-destilbert_uncased_fever_nli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English destilbert_uncased_fever_nli DistilBertForSequenceClassification from ernlavr +author: John Snow Labs +name: destilbert_uncased_fever_nli +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `destilbert_uncased_fever_nli` is an English model originally trained by ernlavr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/destilbert_uncased_fever_nli_en_5.2.0_3.0_1700346443749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/destilbert_uncased_fever_nli_en_5.2.0_3.0_1700346443749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("destilbert_uncased_fever_nli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("destilbert_uncased_fever_nli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|destilbert_uncased_fever_nli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ernlavr/destilbert_uncased_fever_nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_1_en.md b/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_1_en.md new file mode 100644 index 000000000000..4b123f8f3a16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English detect_ai_text_1 DistilBertForSequenceClassification from akshayvkt +author: John Snow Labs +name: detect_ai_text_1 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `detect_ai_text_1` is an English model originally trained by akshayvkt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detect_ai_text_1_en_5.2.0_3.0_1700349503930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/detect_ai_text_1_en_5.2.0_3.0_1700349503930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("detect_ai_text_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("detect_ai_text_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|detect_ai_text_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/akshayvkt/detect-ai-text-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_en.md b/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_en.md new file mode 100644 index 000000000000..9731cc74efe8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-detect_ai_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English detect_ai_text DistilBertForSequenceClassification from akshayvkt +author: John Snow Labs +name: detect_ai_text +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `detect_ai_text` is an English model originally trained by akshayvkt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detect_ai_text_en_5.2.0_3.0_1700351621688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/detect_ai_text_en_5.2.0_3.0_1700351621688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("detect_ai_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("detect_ai_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|detect_ai_text| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/akshayvkt/detect-ai-text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distil_bert_uncased_finetuned_relations_en.md b/docs/_posts/ahmedlone127/2023-11-18-distil_bert_uncased_finetuned_relations_en.md new file mode 100644 index 000000000000..75f4b245b611 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distil_bert_uncased_finetuned_relations_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distil_bert_uncased_finetuned_relations DistilBertForSequenceClassification from nikolamilosevic +author: John Snow Labs +name: distil_bert_uncased_finetuned_relations +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_uncased_finetuned_relations` is an English model originally trained by nikolamilosevic. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_uncased_finetuned_relations_en_5.2.0_3.0_1700349241337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_uncased_finetuned_relations_en_5.2.0_3.0_1700349241337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_uncased_finetuned_relations","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_uncased_finetuned_relations","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_uncased_finetuned_relations| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nikolamilosevic/distil_bert_uncased-finetuned-relations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbart_mnli_github_issues_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbart_mnli_github_issues_en.md new file mode 100644 index 000000000000..8a1a533f2955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbart_mnli_github_issues_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbart_mnli_github_issues DistilBertForSequenceClassification from AntoineMC +author: John Snow Labs +name: distilbart_mnli_github_issues +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbart_mnli_github_issues` is an English model originally trained by AntoineMC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbart_mnli_github_issues_en_5.2.0_3.0_1700340083472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbart_mnli_github_issues_en_5.2.0_3.0_1700340083472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbart_mnli_github_issues","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbart_mnli_github_issues","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbart_mnli_github_issues| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AntoineMC/distilbart-mnli-github-issues \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_accounting_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_accounting_en.md new file mode 100644 index 000000000000..61101f17ad0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_accounting_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_accounting DistilBertForSequenceClassification from TomeeSK +author: John Snow Labs +name: distilbert_accounting +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_accounting` is an English model originally trained by TomeeSK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_accounting_en_5.2.0_3.0_1700347526041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_accounting_en_5.2.0_3.0_1700347526041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_accounting","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_accounting","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_accounting|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|507.5 MB|
+
+## References
+
+https://huggingface.co/TomeeSK/distilbert-accounting
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_ag_cnn_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_ag_cnn_en.md
new file mode 100644
index 000000000000..a3f886660f97
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_ag_cnn_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_ag_cnn DistilBertForSequenceClassification from AyoubChLin
+author: John Snow Labs
+name: distilbert_ag_cnn
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_ag_cnn` is an English model originally trained by AyoubChLin.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ag_cnn_en_5.2.0_3.0_1700349529251.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ag_cnn_en_5.2.0_3.0_1700349529251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_ag_cnn","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_ag_cnn","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_ag_cnn|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/AyoubChLin/distilbert_ag_cnn
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_allsides_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_allsides_en.md
new file mode 100644
index 000000000000..cc22d957e743
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_allsides_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_allsides DistilBertForSequenceClassification from valurank
+author: John Snow Labs
+name: distilbert_allsides
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_allsides` is an English model originally trained by valurank.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_allsides_en_5.2.0_3.0_1700339289478.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_allsides_en_5.2.0_3.0_1700339289478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_allsides","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_allsides","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_allsides|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/valurank/distilbert-allsides
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_amazon_shoe_reviews_juliensimon_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_amazon_shoe_reviews_juliensimon_en.md
new file mode 100644
index 000000000000..a8883f8827d7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_amazon_shoe_reviews_juliensimon_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_amazon_shoe_reviews_juliensimon DistilBertForSequenceClassification from juliensimon
+author: John Snow Labs
+name: distilbert_amazon_shoe_reviews_juliensimon
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_amazon_shoe_reviews_juliensimon` is an English model originally trained by juliensimon.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_juliensimon_en_5.2.0_3.0_1700341330035.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_juliensimon_en_5.2.0_3.0_1700341330035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_juliensimon","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_juliensimon","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_amazon_shoe_reviews_juliensimon|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/juliensimon/distilbert-amazon-shoe-reviews
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_cola_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_cola_en.md
new file mode 100644
index 000000000000..228e85c3d8a7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_cola_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_cola DistilBertForSequenceClassification from textattack
+author: John Snow Labs
+name: distilbert_base_cased_cola
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_cola` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_cola_en_5.2.0_3.0_1700338034913.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_cola_en_5.2.0_3.0_1700338034913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_cola","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_cola","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_cola|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|246.0 MB|
+
+## References
+
+https://huggingface.co/textattack/distilbert-base-cased-CoLA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_finetuned_fake_and_real_news_dataset_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_finetuned_fake_and_real_news_dataset_en.md
new file mode 100644
index 000000000000..aeb2618b785a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_finetuned_fake_and_real_news_dataset_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_finetuned_fake_and_real_news_dataset DistilBertForSequenceClassification from Giyaseddin
+author: John Snow Labs
+name: distilbert_base_cased_finetuned_fake_and_real_news_dataset
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_fake_and_real_news_dataset` is an English model originally trained by Giyaseddin.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_fake_and_real_news_dataset_en_5.2.0_3.0_1700347645240.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_fake_and_real_news_dataset_en_5.2.0_3.0_1700347645240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_fake_and_real_news_dataset","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_fake_and_real_news_dataset","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_finetuned_fake_and_real_news_dataset|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|246.0 MB|
+
+## References
+
+https://huggingface.co/Giyaseddin/distilbert-base-cased-finetuned-fake-and-real-news-dataset
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_qqp_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_qqp_en.md
new file mode 100644
index 000000000000..81dcb995b569
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_qqp_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_qqp DistilBertForSequenceClassification from textattack
+author: John Snow Labs
+name: distilbert_base_cased_qqp
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_qqp` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_qqp_en_5.2.0_3.0_1700344547025.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_qqp_en_5.2.0_3.0_1700344547025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_qqp","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_qqp","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_qqp|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|246.0 MB|
+
+## References
+
+https://huggingface.co/textattack/distilbert-base-cased-QQP
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_sst_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_sst_2_en.md
new file mode 100644
index 000000000000..2c795f85d170
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_cased_sst_2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_sst_2 DistilBertForSequenceClassification from textattack
+author: John Snow Labs
+name: distilbert_base_cased_sst_2
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_sst_2` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_sst_2_en_5.2.0_3.0_1700342040464.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_sst_2_en_5.2.0_3.0_1700342040464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_sst_2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_sst_2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_sst_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|246.0 MB|
+
+## References
+
+https://huggingface.co/textattack/distilbert-base-cased-SST-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_fallacy_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_fallacy_classification_en.md
new file mode 100644
index 000000000000..017753a980b2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_fallacy_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_fallacy_classification DistilBertForSequenceClassification from q3fer
+author: John Snow Labs
+name: distilbert_base_fallacy_classification
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_fallacy_classification` is an English model originally trained by q3fer.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_fallacy_classification_en_5.2.0_3.0_1700344394413.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_fallacy_classification_en_5.2.0_3.0_1700344394413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_fallacy_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_fallacy_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_fallacy_classification|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/q3fer/distilbert-base-fallacy-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_financial_relation_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_financial_relation_extraction_en.md
new file mode 100644
index 000000000000..97ac9666d9e5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_financial_relation_extraction_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_financial_relation_extraction DistilBertForSequenceClassification from yseop
+author: John Snow Labs
+name: distilbert_base_financial_relation_extraction
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_financial_relation_extraction` is an English model originally trained by yseop.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_financial_relation_extraction_en_5.2.0_3.0_1700341824650.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_financial_relation_extraction_en_5.2.0_3.0_1700341824650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_financial_relation_extraction","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_financial_relation_extraction","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_financial_relation_extraction|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/yseop/distilbert-base-financial-relation-extraction
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multi_french_sexism_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multi_french_sexism_en.md
new file mode 100644
index 000000000000..4a7dd49eca08
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multi_french_sexism_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_multi_french_sexism DistilBertForSequenceClassification from lidiapierre
+author: John Snow Labs
+name: distilbert_base_multi_french_sexism
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multi_french_sexism` is an English model originally trained by lidiapierre.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multi_french_sexism_en_5.2.0_3.0_1700349435272.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multi_french_sexism_en_5.2.0_3.0_1700349435272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multi_french_sexism","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multi_french_sexism","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multi_french_sexism|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/lidiapierre/distilbert-base-multi-fr-sexism
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_amazon_chinese_20000_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_amazon_chinese_20000_xx.md
new file mode 100644
index 000000000000..b30c8741553c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_amazon_chinese_20000_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_amazon_chinese_20000 DistilBertForSequenceClassification from ASCCCCCCCC
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_amazon_chinese_20000
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_amazon_chinese_20000` is a multilingual model originally trained by ASCCCCCCCC.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_amazon_chinese_20000_xx_5.2.0_3.0_1700349277085.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_amazon_chinese_20000_xx_5.2.0_3.0_1700349277085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_amazon_chinese_20000","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_amazon_chinese_20000","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_amazon_chinese_20000|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/ASCCCCCCCC/distilbert-base-multilingual-cased-amazon_zh_20000
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_email_spam_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_email_spam_xx.md
new file mode 100644
index 000000000000..1eb4ba378175
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_email_spam_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_finetuned_email_spam DistilBertForSequenceClassification from 1aurent
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_finetuned_email_spam
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_email_spam` is a multilingual model originally trained by 1aurent.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_email_spam_xx_5.2.0_3.0_1700343876105.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_email_spam_xx_5.2.0_3.0_1700343876105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_email_spam","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_email_spam","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_finetuned_email_spam|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/1aurent/distilbert-base-multilingual-cased-finetuned-email-spam
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md
new file mode 100644
index 000000000000..b23516681e0c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_emotion_toshifumi_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_finetuned_emotion_toshifumi DistilBertForSequenceClassification from Toshifumi
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_finetuned_emotion_toshifumi
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_emotion_toshifumi` is a multilingual model originally trained by Toshifumi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_emotion_toshifumi_xx_5.2.0_3.0_1700341886418.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_emotion_toshifumi_xx_5.2.0_3.0_1700341886418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_emotion_toshifumi","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_emotion_toshifumi","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_finetuned_emotion_toshifumi|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/Toshifumi/distilbert-base-multilingual-cased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_sentiment_albanian_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_sentiment_albanian_xx.md
new file mode 100644
index 000000000000..971c6be050bd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_finetuned_sentiment_albanian_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_finetuned_sentiment_albanian DistilBertForSequenceClassification from Gerti
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_finetuned_sentiment_albanian
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_sentiment_albanian` is a multilingual model originally trained by Gerti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_sentiment_albanian_xx_5.2.0_3.0_1700344402352.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_sentiment_albanian_xx_5.2.0_3.0_1700344402352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_sentiment_albanian","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_sentiment_albanian","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_finetuned_sentiment_albanian|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/Gerti/distilbert-base-multilingual-cased-finetuned-sentiment-albanian
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_language_detection_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_language_detection_xx.md
new file mode 100644
index 000000000000..86523be26f0a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_language_detection_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_language_detection DistilBertForSequenceClassification from DunnBC22
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_language_detection
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_language_detection` is a multilingual model originally trained by DunnBC22.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_xx_5.2.0_3.0_1700340723044.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_xx_5.2.0_3.0_1700340723044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_language_detection|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/DunnBC22/distilbert-base-multilingual-cased-language_detection
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sarcasmo_esp_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sarcasmo_esp_xx.md
new file mode 100644
index 000000000000..83d3086b080d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sarcasmo_esp_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_sarcasmo_esp DistilBertForSequenceClassification from rogelioplatt
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_sarcasmo_esp
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_sarcasmo_esp` is a multilingual model originally trained by rogelioplatt.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sarcasmo_esp_xx_5.2.0_3.0_1700345503140.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sarcasmo_esp_xx_5.2.0_3.0_1700345503140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sarcasmo_esp","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sarcasmo_esp","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_sarcasmo_esp|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/rogelioplatt/distilbert-base-multilingual-cased-Sarcasmo_Esp
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_2_philschmid_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_2_philschmid_xx.md
new file mode 100644
index 000000000000..adc732937862
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_2_philschmid_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_sentiment_2_philschmid DistilBertForSequenceClassification from philschmid
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_sentiment_2_philschmid
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_sentiment_2_philschmid` is a multilingual model originally trained by philschmid.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_2_philschmid_xx_5.2.0_3.0_1700338511492.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_2_philschmid_xx_5.2.0_3.0_1700338511492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment_2_philschmid","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment_2_philschmid","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_sentiment_2_philschmid|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.5 MB|
+
+## References
+
+https://huggingface.co/philschmid/distilbert-base-multilingual-cased-sentiment-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_xx.md
new file mode 100644
index 000000000000..79935dc6f76d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiment_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_sentiment DistilBertForSequenceClassification from philschmid
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_sentiment
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_sentiment` is a multilingual model originally trained by philschmid.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_xx_5.2.0_3.0_1700339912546.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_xx_5.2.0_3.0_1700339912546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_sentiment|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/philschmid/distilbert-base-multilingual-cased-sentiment
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiments_student_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiments_student_xx.md
new file mode 100644
index 000000000000..e389ec565688
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_sentiments_student_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_sentiments_student DistilBertForSequenceClassification from lxyuan
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_sentiments_student
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_sentiments_student` is a multilingual model originally trained by lxyuan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiments_student_xx_5.2.0_3.0_1700338771201.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiments_student_xx_5.2.0_3.0_1700338771201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiments_student","xx")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiments_student","xx")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_multilingual_cased_sentiments_student|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_toxicity_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_toxicity_xx.md
new file mode 100644
index 000000000000..32edbe687e8d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_toxicity_xx.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Multilingual distilbert_base_multilingual_cased_toxicity DistilBertForSequenceClassification from citizenlab
+author: John Snow Labs
+name: distilbert_base_multilingual_cased_toxicity
+date: 2023-11-18
+tags: [bert, xx, open_source, sequence_classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_toxicity` is a multilingual model originally trained by citizenlab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_toxicity_xx_5.2.0_3.0_1700340331622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_toxicity_xx_5.2.0_3.0_1700340331622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_toxicity","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_toxicity","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/citizenlab/distilbert-base-multilingual-cased-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_vietnamese_topicifier_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_vietnamese_topicifier_xx.md new file mode 100644 index 000000000000..5b98789b632f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_multilingual_cased_vietnamese_topicifier_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_vietnamese_topicifier DistilBertForSequenceClassification from lamhieu +author: John Snow Labs +name: distilbert_base_multilingual_cased_vietnamese_topicifier +date: 2023-11-18 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_vietnamese_topicifier` is a Multilingual model originally trained by lamhieu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_vietnamese_topicifier_xx_5.2.0_3.0_1700341525253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_vietnamese_topicifier_xx_5.2.0_3.0_1700341525253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_vietnamese_topicifier","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_vietnamese_topicifier","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_vietnamese_topicifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|515.5 MB| + +## References + +https://huggingface.co/lamhieu/distilbert-base-multilingual-cased-vietnamese-topicifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_pii_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_pii_en.md new file mode 100644 index 000000000000..0a4502a60fad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_pii_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_pii DistilBertForSequenceClassification from bburnworth +author: John Snow Labs +name: distilbert_base_pii +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_pii` is an English model originally trained by bburnworth. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_pii_en_5.2.0_3.0_1700342631770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_pii_en_5.2.0_3.0_1700342631770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_pii","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_pii","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_pii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bburnworth/distilbert-base-pii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_mldoc_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_mldoc_en.md new file mode 100644 index 000000000000..a598584cdf8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_mldoc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_mldoc DistilBertForSequenceClassification from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_mldoc +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_mldoc` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_mldoc_en_5.2.0_3.0_1700344423595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_mldoc_en_5.2.0_3.0_1700344423595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_mldoc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_mldoc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_mldoc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-mldoc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_xnli_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_xnli_en.md new file mode 100644 index 000000000000..2e4c7778bcea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_spanish_uncased_finetuned_xnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_xnli DistilBertForSequenceClassification from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_xnli +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_xnli` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_xnli_en_5.2.0_3.0_1700349531748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_xnli_en_5.2.0_3.0_1700349531748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_xnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_xnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_xnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-xnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_turkish_cased_offensive_tr.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_turkish_cased_offensive_tr.md new file mode 100644 index 000000000000..669a3dca6bcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_turkish_cased_offensive_tr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Turkish distilbert_base_turkish_cased_offensive DistilBertForSequenceClassification from Overfit-GM +author: John Snow Labs +name: distilbert_base_turkish_cased_offensive +date: 2023-11-18 +tags: [bert, tr, open_source, sequence_classification, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_turkish_cased_offensive` is a Turkish model originally trained by Overfit-GM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_offensive_tr_5.2.0_3.0_1700345747674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_offensive_tr_5.2.0_3.0_1700345747674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_offensive","tr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_offensive","tr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_offensive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|254.1 MB| + +## References + +https://huggingface.co/Overfit-GM/distilbert-base-turkish-cased-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__enron_spam__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__enron_spam__all_train_en.md new file mode 100644 index 000000000000..9f092b8b149d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__enron_spam__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__enron_spam__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__enron_spam__all_train +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__enron_spam__all_train` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__enron_spam__all_train_en_5.2.0_3.0_1700343164036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__enron_spam__all_train_en_5.2.0_3.0_1700343164036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__enron_spam__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__enron_spam__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__enron_spam__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__enron_spam__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_0_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_0_en.md new file mode 100644 index 000000000000..89f95edf5ca3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_0 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_0` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_0_en_5.2.0_3.0_1700351544640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_0_en_5.2.0_3.0_1700351544640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_2_en.md new file mode 100644 index 000000000000..0b274431c00d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_2` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_2_en_5.2.0_3.0_1700350805521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_2_en_5.2.0_3.0_1700350805521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_7_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_7_en.md new file mode 100644 index 000000000000..5e2f1a71ae5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_16_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_7 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_7` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_7_en_5.2.0_3.0_1700350982840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_7_en_5.2.0_3.0_1700350982840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_3_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_3_en.md new file mode 100644 index 000000000000..1ea3392d69a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_3 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_3` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_3_en_5.2.0_3.0_1700351823680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_3_en_5.2.0_3.0_1700351823680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_5_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_5_en.md new file mode 100644 index 000000000000..a45991ba5d06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_32_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_5 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_5` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_5_en_5.2.0_3.0_1700348813610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_5_en_5.2.0_3.0_1700348813610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_4_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_4_en.md new file mode 100644 index 000000000000..bd2213f3af75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_4 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_4` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_4_en_5.2.0_3.0_1700349816116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_4_en_5.2.0_3.0_1700349816116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_5_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_5_en.md new file mode 100644 index 000000000000..a23621e4332b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__hate_speech_offensive__train_8_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_5 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_5` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_5_en_5.2.0_3.0_1700348542850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_5_en_5.2.0_3.0_1700348542850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst2__train_8_5_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst2__train_8_5_en.md new file mode 100644 index 000000000000..f9baf43ce17b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst2__train_8_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_5 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_5` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_5_en_5.2.0_3.0_1700351134723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_5_en_5.2.0_3.0_1700351134723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst5__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst5__all_train_en.md new file mode 100644 index 000000000000..5c71c54b2478 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__sst5__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst5__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst5__all_train +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst5__all_train` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst5__all_train_en_5.2.0_3.0_1700348903568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst5__all_train_en_5.2.0_3.0_1700348903568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst5__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst5__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst5__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst5__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__subj__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__subj__all_train_en.md new file mode 100644 index 000000000000..709b3b7f2de2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased__subj__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__all_train +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__all_train` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__all_train_en_5.2.0_3.0_1700351546871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__all_train_en_5.2.0_3.0_1700351546871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_ag_news_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_ag_news_en.md new file mode 100644 index 000000000000..922b0f261e7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_ag_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_ag_news DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_ag_news +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ag_news` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ag_news_en_5.2.0_3.0_1700338334317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ag_news_en_5.2.0_3.0_1700338334317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_ag_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_ag_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ag_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_airlines_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_airlines_en.md new file mode 100644 index 000000000000..87f434a30e74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_airlines_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_airlines DistilBertForSequenceClassification from tasosk +author: John Snow Labs +name: distilbert_base_uncased_airlines +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_airlines` is an English model originally trained by tasosk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_airlines_en_5.2.0_3.0_1700345026499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_airlines_en_5.2.0_3.0_1700345026499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_airlines","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_airlines","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_airlines| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tasosk/distilbert-base-uncased-airlines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_career_path_prediction_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_career_path_prediction_en.md new file mode 100644 index 000000000000..f506e8a73553 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_career_path_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_career_path_prediction DistilBertForSequenceClassification from fazni +author: John Snow Labs +name: distilbert_base_uncased_career_path_prediction +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_career_path_prediction` is an English model originally trained by fazni.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_career_path_prediction_en_5.2.0_3.0_1700351684734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_career_path_prediction_en_5.2.0_3.0_1700351684734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_career_path_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_career_path_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_career_path_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fazni/distilbert-base-uncased-career-path-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_cola_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_cola_en.md new file mode 100644 index 000000000000..783c4eec2bb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_cola DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_cola +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_cola` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_cola_en_5.2.0_3.0_1700338086761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_cola_en_5.2.0_3.0_1700338086761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-CoLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_dangerrat_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_dangerrat_en.md new file mode 100644 index 000000000000..1c0b31c38a15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_dangerrat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_dangerrat DistilBertForSequenceClassification from DangerRat +author: John Snow Labs +name: distilbert_base_uncased_dangerrat +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_dangerrat` is an English model originally trained by DangerRat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_dangerrat_en_5.2.0_3.0_1700348334815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_dangerrat_en_5.2.0_3.0_1700348334815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_dangerrat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_dangerrat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_dangerrat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DangerRat/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilbert_model_enyonam_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilbert_model_enyonam_en.md new file mode 100644 index 000000000000..8e2bb7812c19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilbert_model_enyonam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilbert_model_enyonam DistilBertForSequenceClassification from Enyonam +author: John Snow Labs +name: distilbert_base_uncased_distilbert_model_enyonam +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilbert_model_enyonam` is an English model originally trained by Enyonam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilbert_model_enyonam_en_5.2.0_3.0_1700345029042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilbert_model_enyonam_en_5.2.0_3.0_1700345029042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilbert_model_enyonam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilbert_model_enyonam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilbert_model_enyonam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Enyonam/distilbert-base-uncased-Distilbert-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_haesun_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_haesun_en.md new file mode 100644 index 000000000000..f0ab49076543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_haesun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_haesun DistilBertForSequenceClassification from haesun +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_haesun +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_haesun` is an English model originally trained by haesun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_haesun_en_5.2.0_3.0_1700349082463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_haesun_en_5.2.0_3.0_1700349082463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_haesun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_haesun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_haesun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/haesun/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_mhf_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_mhf_en.md new file mode 100644 index 000000000000..905d69e0f02e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_mhf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_mhf DistilBertForSequenceClassification from MhF +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_mhf +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_mhf` is an English model originally trained by MhF.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_mhf_en_5.2.0_3.0_1700348898334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_mhf_en_5.2.0_3.0_1700348898334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_mhf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_mhf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_mhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/MhF/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_transformersbook_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_transformersbook_en.md new file mode 100644 index 000000000000..0b44c516b0ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_distilled_clinc_transformersbook_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_transformersbook DistilBertForSequenceClassification from transformersbook +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_transformersbook +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_transformersbook` is an English model originally trained by transformersbook.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_transformersbook_en_5.2.0_3.0_1700341178720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_transformersbook_en_5.2.0_3.0_1700341178720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_transformersbook","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_transformersbook","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_transformersbook| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/transformersbook/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empathetic_dialogues_context_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empathetic_dialogues_context_en.md new file mode 100644 index 000000000000..24ded02364f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empathetic_dialogues_context_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_empathetic_dialogues_context DistilBertForSequenceClassification from bdotloh +author: John Snow Labs +name: distilbert_base_uncased_empathetic_dialogues_context +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_empathetic_dialogues_context` is an English model originally trained by bdotloh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_empathetic_dialogues_context_en_5.2.0_3.0_1700338025327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_empathetic_dialogues_context_en_5.2.0_3.0_1700338025327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_empathetic_dialogues_context","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_empathetic_dialogues_context","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_empathetic_dialogues_context| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bdotloh/distilbert-base-uncased-empathetic-dialogues-context \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empatheticdialogues_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empatheticdialogues_sentiment_classifier_en.md new file mode 100644 index 000000000000..2e861c7072b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_empatheticdialogues_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_empatheticdialogues_sentiment_classifier DistilBertForSequenceClassification from benjaminbeilharz +author: John Snow Labs +name: distilbert_base_uncased_empatheticdialogues_sentiment_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_empatheticdialogues_sentiment_classifier` is an English model originally trained by benjaminbeilharz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_empatheticdialogues_sentiment_classifier_en_5.2.0_3.0_1700345270285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_empatheticdialogues_sentiment_classifier_en_5.2.0_3.0_1700345270285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_empatheticdialogues_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_empatheticdialogues_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_empatheticdialogues_sentiment_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/benjaminbeilharz/distilbert-base-uncased-empatheticdialogues-sentiment-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_few_shot_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_few_shot_sentiment_model_en.md new file mode 100644 index 000000000000..07cee2750b88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_few_shot_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_few_shot_sentiment_model DistilBertForSequenceClassification from okho0653 +author: John Snow Labs +name: distilbert_base_uncased_few_shot_sentiment_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_few_shot_sentiment_model` is an English model originally trained by okho0653.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_few_shot_sentiment_model_en_5.2.0_3.0_1700345398386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_few_shot_sentiment_model_en_5.2.0_3.0_1700345398386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_few_shot_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_few_shot_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_few_shot_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/okho0653/distilbert-base-uncased-few-shot-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_financial_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_financial_sentiment_analysis_en.md new file mode 100644 index 000000000000..b62c6055db67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_financial_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_financial_sentiment_analysis DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_financial_sentiment_analysis +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_financial_sentiment_analysis` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_financial_sentiment_analysis_en_5.2.0_3.0_1700343536764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_financial_sentiment_analysis_en_5.2.0_3.0_1700343536764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_financial_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_financial_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_financial_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Financial_Sentiment_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_9th_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_9th_en.md new file mode 100644 index 000000000000..7dcb19ec340b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_9th_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_9th DistilBertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: distilbert_base_uncased_finetuned_9th +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_9th` is an English model originally trained by Katsiaryna. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_9th_en_5.2.0_3.0_1700350978867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_9th_en_5.2.0_3.0_1700350978867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_9th","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_9th","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_9th| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Katsiaryna/distilbert-base-uncased-finetuned_9th \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_amazon_chinese_20000_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_amazon_chinese_20000_en.md new file mode 100644 index 000000000000..9cc0d8ee032b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_amazon_chinese_20000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_amazon_chinese_20000 DistilBertForSequenceClassification from ASCCCCCCCC +author: John Snow Labs +name: distilbert_base_uncased_finetuned_amazon_chinese_20000 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_amazon_chinese_20000` is an English model originally trained by ASCCCCCCCC. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_chinese_20000_en_5.2.0_3.0_1700351141632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_chinese_20000_en_5.2.0_3.0_1700351141632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_chinese_20000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_chinese_20000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_amazon_chinese_20000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ASCCCCCCCC/distilbert-base-uncased-finetuned-amazon_zh_20000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_assamese_sentences_fewshot_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_assamese_sentences_fewshot_en.md new file mode 100644 index 000000000000..17114d5ceb37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_assamese_sentences_fewshot_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_assamese_sentences_fewshot DistilBertForSequenceClassification from sarahflan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_assamese_sentences_fewshot +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_assamese_sentences_fewshot` is an English model originally trained by sarahflan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_assamese_sentences_fewshot_en_5.2.0_3.0_1700344711202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_assamese_sentences_fewshot_en_5.2.0_3.0_1700344711202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_assamese_sentences_fewshot","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_assamese_sentences_fewshot","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_assamese_sentences_fewshot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/sarahflan/distilbert-base-uncased-finetuned-as_sentences_fewshot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_banking77_optimum_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_banking77_optimum_en.md new file mode 100644 index 000000000000..14b3ba4a12f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_banking77_optimum_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_banking77_optimum DistilBertForSequenceClassification from optimum +author: John Snow Labs +name: distilbert_base_uncased_finetuned_banking77_optimum +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_banking77_optimum` is an English model originally trained by optimum. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_banking77_optimum_en_5.2.0_3.0_1700343166305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_banking77_optimum_en_5.2.0_3.0_1700343166305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_banking77_optimum","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_banking77_optimum","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_banking77_optimum| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/optimum/distilbert-base-uncased-finetuned-banking77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_classification_piecake_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_classification_piecake_en.md new file mode 100644 index 000000000000..5bfcc52be4ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_classification_piecake_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_classification_piecake DistilBertForSequenceClassification from piecake +author: John Snow Labs +name: distilbert_base_uncased_finetuned_classification_piecake +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_classification_piecake` is an English model originally trained by piecake. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_classification_piecake_en_5.2.0_3.0_1700351369784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_classification_piecake_en_5.2.0_3.0_1700351369784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_classification_piecake","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_classification_piecake","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_classification_piecake| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/piecake/distilbert-base-uncased-finetuned-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_ascccccccc_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_ascccccccc_en.md new file mode 100644 index 000000000000..40a4701e8665 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_ascccccccc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_ascccccccc DistilBertForSequenceClassification from ASCCCCCCCC +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_ascccccccc +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_ascccccccc` is an English model originally trained by ASCCCCCCCC. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_ascccccccc_en_5.2.0_3.0_1700346037680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_ascccccccc_en_5.2.0_3.0_1700346037680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_ascccccccc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_ascccccccc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_ascccccccc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/ASCCCCCCCC/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_mhf_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_mhf_en.md new file mode 100644 index 000000000000..f97399fbc1c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_mhf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_mhf DistilBertForSequenceClassification from MhF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_mhf +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_mhf` is an English model originally trained by MhF. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_mhf_en_5.2.0_3.0_1700348957742.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_mhf_en_5.2.0_3.0_1700348957742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_mhf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_mhf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_mhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/MhF/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_transformersbook_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_transformersbook_en.md new file mode 100644 index 000000000000..a394af205bc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_clinc_transformersbook_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_transformersbook DistilBertForSequenceClassification from transformersbook +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_transformersbook +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_transformersbook` is an English model originally trained by transformersbook. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_transformersbook_en_5.2.0_3.0_1700344316628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_transformersbook_en_5.2.0_3.0_1700344316628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_transformersbook","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_transformersbook","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_transformersbook| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/transformersbook/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_code_snippet_quality_scoring_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_code_snippet_quality_scoring_en.md new file mode 100644 index 000000000000..3f80ef2b20d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_code_snippet_quality_scoring_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_code_snippet_quality_scoring DistilBertForSequenceClassification from Johannes +author: John Snow Labs +name: distilbert_base_uncased_finetuned_code_snippet_quality_scoring +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_code_snippet_quality_scoring` is an English model originally trained by Johannes. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_code_snippet_quality_scoring_en_5.2.0_3.0_1700351369768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_code_snippet_quality_scoring_en_5.2.0_3.0_1700351369768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_code_snippet_quality_scoring","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_code_snippet_quality_scoring","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_code_snippet_quality_scoring| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Johannes/distilbert-base-uncased-finetuned-code-snippet-quality-scoring \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_09panesara_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_09panesara_en.md new file mode 100644 index 000000000000..87473ce5b5fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_09panesara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_09panesara DistilBertForSequenceClassification from 09panesara +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_09panesara +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_09panesara` is an English model originally trained by 09panesara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_09panesara_en_5.2.0_3.0_1700343536716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_09panesara_en_5.2.0_3.0_1700343536716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_09panesara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_09panesara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_09panesara| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/09panesara/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_123abhialflkfo_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_123abhialflkfo_en.md new file mode 100644 index 000000000000..2c2f452d2946 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_123abhialflkfo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_123abhialflkfo DistilBertForSequenceClassification from 123abhiALFLKFO +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_123abhialflkfo +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_123abhialflkfo` is an English model originally trained by 123abhiALFLKFO.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_123abhialflkfo_en_5.2.0_3.0_1700346628597.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_123abhialflkfo_en_5.2.0_3.0_1700346628597.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_123abhialflkfo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_123abhialflkfo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_123abhialflkfo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/123abhiALFLKFO/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_2umm3r_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_2umm3r_en.md new file mode 100644 index 000000000000..f9a7aa934c02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_2umm3r_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_2umm3r DistilBertForSequenceClassification from 2umm3r +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_2umm3r +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_2umm3r` is an English model originally trained by 2umm3r.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_2umm3r_en_5.2.0_3.0_1700345476102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_2umm3r_en_5.2.0_3.0_1700345476102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_2umm3r","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_2umm3r","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_2umm3r| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/2umm3r/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_ahren09_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_ahren09_en.md new file mode 100644 index 000000000000..90d6890746c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_ahren09_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_ahren09 DistilBertForSequenceClassification from Ahren09 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_ahren09 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_ahren09` is an English model originally trained by Ahren09.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_ahren09_en_5.2.0_3.0_1700348322845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_ahren09_en_5.2.0_3.0_1700348322845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_ahren09","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_ahren09","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_ahren09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Ahren09/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_akash7897_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_akash7897_en.md new file mode 100644 index 000000000000..1a7de5fcea51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_akash7897_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_akash7897 DistilBertForSequenceClassification from Akash7897 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_akash7897 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_akash7897` is an English model originally trained by Akash7897.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_akash7897_en_5.2.0_3.0_1700350494978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_akash7897_en_5.2.0_3.0_1700350494978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_akash7897","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_akash7897","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_akash7897| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Akash7897/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_alstractor_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_alstractor_en.md new file mode 100644 index 000000000000..1762f96764ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_alstractor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_alstractor DistilBertForSequenceClassification from Alstractor +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_alstractor +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_alstractor` is an English model originally trained by Alstractor.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_alstractor_en_5.2.0_3.0_1700351668472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_alstractor_en_5.2.0_3.0_1700351668472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_alstractor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_alstractor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_alstractor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Alstractor/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bahija_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bahija_en.md new file mode 100644 index 000000000000..6701fc1b2fa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bahija_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_bahija DistilBertForSequenceClassification from BAHIJA +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_bahija +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_bahija` is an English model originally trained by BAHIJA.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bahija_en_5.2.0_3.0_1700348759185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bahija_en_5.2.0_3.0_1700348759185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bahija","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bahija","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_bahija| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/BAHIJA/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bearthreat_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bearthreat_en.md new file mode 100644 index 000000000000..415c00370ab5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_bearthreat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_bearthreat DistilBertForSequenceClassification from BearThreat +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_bearthreat +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_bearthreat` is an English model originally trained by BearThreat.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bearthreat_en_5.2.0_3.0_1700349099839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bearthreat_en_5.2.0_3.0_1700349099839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bearthreat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bearthreat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_bearthreat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/BearThreat/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_jungwoo_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_jungwoo_en.md new file mode 100644 index 000000000000..f2525a8ed527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_jungwoo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_jungwoo DistilBertForSequenceClassification from Jungwoo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_jungwoo +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_jungwoo` is an English model originally trained by Jungwoo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jungwoo_en_5.2.0_3.0_1700349659464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jungwoo_en_5.2.0_3.0_1700349659464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jungwoo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jungwoo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_jungwoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Jungwoo/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kien_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kien_en.md new file mode 100644 index 000000000000..267017681924 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kien_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_kien DistilBertForSequenceClassification from Kien +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_kien +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_kien` is an English model originally trained by Kien.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kien_en_5.2.0_3.0_1700351928680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kien_en_5.2.0_3.0_1700351928680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kien","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kien","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_kien| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Kien/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kieran_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kieran_en.md new file mode 100644 index 000000000000..ea87b28cfcc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kieran_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_kieran DistilBertForSequenceClassification from Kieran +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_kieran +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_kieran` is an English model originally trained by Kieran.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kieran_en_5.2.0_3.0_1700351137197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kieran_en_5.2.0_3.0_1700351137197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kieran","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kieran","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_kieran| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kieran/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kumicho_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kumicho_en.md new file mode 100644 index 000000000000..cc9d4705f667 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_kumicho_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_kumicho DistilBertForSequenceClassification from Kumicho +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_kumicho +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_kumicho` is an English model originally trained by Kumicho.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kumicho_en_5.2.0_3.0_1700348661601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kumicho_en_5.2.0_3.0_1700348661601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kumicho","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kumicho","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_kumicho| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kumicho/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_melissatessa_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_melissatessa_en.md new file mode 100644 index 000000000000..c1f96c713529 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_melissatessa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_melissatessa DistilBertForSequenceClassification from MelissaTESSA +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_melissatessa +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_melissatessa` is an English model originally trained by MelissaTESSA.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_melissatessa_en_5.2.0_3.0_1700349646643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_melissatessa_en_5.2.0_3.0_1700349646643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_melissatessa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_melissatessa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_melissatessa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MelissaTESSA/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_minyoung_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_minyoung_en.md new file mode 100644 index 000000000000..d9def6f47d26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_minyoung_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_minyoung DistilBertForSequenceClassification from MINYOUNG +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_minyoung +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_minyoung` is an English model originally trained by MINYOUNG.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_minyoung_en_5.2.0_3.0_1700349492351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_minyoung_en_5.2.0_3.0_1700349492351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_minyoung","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_minyoung","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_minyoung| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/MINYOUNG/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_misbahf_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_misbahf_en.md new file mode 100644 index 000000000000..ef343e9223a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_cola_misbahf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_misbahf DistilBertForSequenceClassification from MisbaHF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_misbahf +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_misbahf` is an English model originally trained by MisbaHF.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_misbahf_en_5.2.0_3.0_1700349958984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_misbahf_en_5.2.0_3.0_1700349958984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_misbahf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_misbahf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_misbahf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MisbaHF/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_customer_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_customer_reviews_en.md new file mode 100644 index 000000000000..21ce7e74312e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_customer_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_customer_reviews DistilBertForSequenceClassification from alexiskirke +author: John Snow Labs +name: distilbert_base_uncased_finetuned_customer_reviews +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_customer_reviews` is an English model originally trained by alexiskirke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_customer_reviews_en_5.2.0_3.0_1700347878994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_customer_reviews_en_5.2.0_3.0_1700347878994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_customer_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_customer_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_customer_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexiskirke/distilbert-base-uncased-finetuned-customer-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dbpedia_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dbpedia_en.md new file mode 100644 index 000000000000..e3ac073cdfbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dbpedia_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_dbpedia DistilBertForSequenceClassification from Danni +author: John Snow Labs +name: distilbert_base_uncased_finetuned_dbpedia +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_dbpedia` is an English model originally trained by Danni.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_dbpedia_en_5.2.0_3.0_1700347194021.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_dbpedia_en_5.2.0_3.0_1700347194021.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_dbpedia","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_dbpedia","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_dbpedia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Danni/distilbert-base-uncased-finetuned-dbpedia \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dialog_acts_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dialog_acts_en.md new file mode 100644 index 000000000000..21565feef93f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_dialog_acts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_dialog_acts DistilBertForSequenceClassification from JIHussain +author: John Snow Labs +name: distilbert_base_uncased_finetuned_dialog_acts +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_dialog_acts` is an English model originally trained by JIHussain.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_dialog_acts_en_5.2.0_3.0_1700348759153.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_dialog_acts_en_5.2.0_3.0_1700348759153.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_dialog_acts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_dialog_acts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_dialog_acts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JIHussain/distilbert-base-uncased-finetuned-dialog-acts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_activationai_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_activationai_en.md new file mode 100644 index 000000000000..981e7af7dcfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_activationai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_activationai DistilBertForSequenceClassification from ActivationAI +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_activationai +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_activationai` is an English model originally trained by ActivationAI.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_activationai_en_5.2.0_3.0_1700344167827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_activationai_en_5.2.0_3.0_1700344167827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_activationai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_activationai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_activationai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ActivationAI/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_aron_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_aron_en.md new file mode 100644 index 000000000000..1fee24b5d389 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_aron_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_aron DistilBertForSequenceClassification from Aron +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_aron +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_aron` is an English model originally trained by Aron.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_aron_en_5.2.0_3.0_1700345614884.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_aron_en_5.2.0_3.0_1700345614884.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_aron","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_aron","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_aron| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Aron/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_crives_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_crives_en.md new file mode 100644 index 000000000000..136476e47077 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_crives_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_crives DistilBertForSequenceClassification from Crives +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_crives +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_crives` is an English model originally trained by Crives.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_crives_en_5.2.0_3.0_1700346824523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_crives_en_5.2.0_3.0_1700346824523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_crives","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_crives","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_crives| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Crives/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_esuriddick_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_esuriddick_en.md new file mode 100644 index 000000000000..540f5e0053cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_esuriddick_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_esuriddick DistilBertForSequenceClassification from esuriddick +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_esuriddick +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_esuriddick` is an English model originally trained by esuriddick.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_esuriddick_en_5.2.0_3.0_1700339148384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_esuriddick_en_5.2.0_3.0_1700339148384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_esuriddick","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_esuriddick","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_esuriddick| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/esuriddick/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_fabiodatageek_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_fabiodatageek_en.md new file mode 100644 index 000000000000..1d21295ea751 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_fabiodatageek_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_fabiodatageek DistilBertForSequenceClassification from FabioDataGeek +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_fabiodatageek +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_fabiodatageek` is an English model originally trained by FabioDataGeek.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_fabiodatageek_en_5.2.0_3.0_1700350009940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_fabiodatageek_en_5.2.0_3.0_1700350009940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_fabiodatageek","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_fabiodatageek","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_fabiodatageek| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/FabioDataGeek/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_florianehmann_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_florianehmann_en.md new file mode 100644 index 000000000000..be2f37d7a3e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_florianehmann_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_florianehmann DistilBertForSequenceClassification from florianehmann +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_florianehmann +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_florianehmann` is an English model originally trained by florianehmann.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_florianehmann_en_5.2.0_3.0_1700349079246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_florianehmann_en_5.2.0_3.0_1700349079246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_florianehmann","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_florianehmann","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_florianehmann| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/florianehmann/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hafaa_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hafaa_en.md new file mode 100644 index 000000000000..df9152c3ec29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hafaa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_hafaa DistilBertForSequenceClassification from Hafaa +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_hafaa +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_hafaa` is an English model originally trained by Hafaa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hafaa_en_5.2.0_3.0_1700347047032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hafaa_en_5.2.0_3.0_1700347047032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hafaa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hafaa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_hafaa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Hafaa/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hatemnoaman_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hatemnoaman_en.md new file mode 100644 index 000000000000..3d9eca3a7120 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_hatemnoaman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_hatemnoaman DistilBertForSequenceClassification from hatemnoaman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_hatemnoaman +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_hatemnoaman` is an English model originally trained by hatemnoaman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hatemnoaman_en_5.2.0_3.0_1700346454430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hatemnoaman_en_5.2.0_3.0_1700346454430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hatemnoaman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hatemnoaman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_hatemnoaman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hatemnoaman/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_jpaulhunter_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_jpaulhunter_en.md new file mode 100644 index 000000000000..5254f211eaa2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_jpaulhunter_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jpaulhunter DistilBertForSequenceClassification from jpaulhunter +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jpaulhunter +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jpaulhunter` is an English model originally trained by jpaulhunter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jpaulhunter_en_5.2.0_3.0_1700343668389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jpaulhunter_en_5.2.0_3.0_1700343668389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jpaulhunter","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jpaulhunter","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jpaulhunter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jpaulhunter/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_mhf_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_mhf_en.md new file mode 100644 index 000000000000..3cd27bbadb00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_mhf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_mhf DistilBertForSequenceClassification from MhF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_mhf +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_mhf` is an English model originally trained by MhF.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_mhf_en_5.2.0_3.0_1700349795133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_mhf_en_5.2.0_3.0_1700349795133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_mhf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_mhf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_mhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MhF/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_sheldonsides_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_sheldonsides_en.md new file mode 100644 index 000000000000..30feac287412 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_sheldonsides_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sheldonsides DistilBertForSequenceClassification from SheldonSides +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sheldonsides +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sheldonsides` is an English model originally trained by SheldonSides.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sheldonsides_en_5.2.0_3.0_1700341848566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sheldonsides_en_5.2.0_3.0_1700341848566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sheldonsides","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sheldonsides","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
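The download archives linked on these cards appear to follow the pattern `<model name>_<language>_<Spark NLP version>_<Spark version>_<number>.zip`. The sketch below splits such a file name into its parts. The helper name `parse_archive_name` and the regular expression are hypothetical, written only for illustration, and reading the trailing number as a Unix timestamp in milliseconds is an assumption based on it matching the card's `date` field, not documented behavior.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern inferred from the archive names on these cards:
# <name>_<lang>_<Spark NLP version>_<Spark version>_<number>.zip
ARCHIVE_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<version>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<stamp>\d+)\.zip$"
)


def parse_archive_name(url):
    """Split a model archive URL (or bare file name) into its components."""
    file_name = url.rsplit("/", 1)[-1]
    match = ARCHIVE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unrecognized archive name: {file_name}")
    parts = match.groupdict()
    # Assumption: the trailing number is a Unix timestamp in milliseconds.
    stamp_ms = int(parts.pop("stamp"))
    parts["uploaded"] = datetime.fromtimestamp(stamp_ms / 1000, tz=timezone.utc)
    return parts


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "distilbert_base_uncased_finetuned_emotion_sheldonsides_en_5.2.0_3.0_1700341848566.zip"
)
print(info["name"], info["lang"], info["version"], info["spark"])
```

The `name` and `lang` fields correspond to the two arguments passed to `pretrained(...)` in the pipeline example above.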
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sheldonsides| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SheldonSides/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_tensor_trek_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_tensor_trek_en.md new file mode 100644 index 000000000000..d39fdf3bb6d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_tensor_trek_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_tensor_trek DistilBertForSequenceClassification from tensor-trek +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_tensor_trek +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_tensor_trek` is an English model originally trained by tensor-trek.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_tensor_trek_en_5.2.0_3.0_1700350616465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_tensor_trek_en_5.2.0_3.0_1700350616465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_tensor_trek","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_tensor_trek","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_tensor_trek| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tensor-trek/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_transformersbook_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_transformersbook_en.md new file mode 100644 index 000000000000..368861699461 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotion_transformersbook_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_transformersbook DistilBertForSequenceClassification from transformersbook +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_transformersbook +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_transformersbook` is an English model originally trained by transformersbook.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_transformersbook_en_5.2.0_3.0_1700339294483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_transformersbook_en_5.2.0_3.0_1700339294483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_transformersbook","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_transformersbook","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_transformersbook| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/transformersbook/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotional_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotional_en.md new file mode 100644 index 000000000000..745dc6430224 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotional_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotional DistilBertForSequenceClassification from TieIncred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotional +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotional` is an English model originally trained by TieIncred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotional_en_5.2.0_3.0_1700348509921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotional_en_5.2.0_3.0_1700348509921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotional","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotional","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotional| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/TieIncred/distilbert-base-uncased-finetuned-emotional \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotions_ysharma_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotions_ysharma_en.md new file mode 100644 index 000000000000..a96b33c8b7cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_emotions_ysharma_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotions_ysharma DistilBertForSequenceClassification from ysharma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotions_ysharma +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotions_ysharma` is an English model originally trained by ysharma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotions_ysharma_en_5.2.0_3.0_1700346328033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotions_ysharma_en_5.2.0_3.0_1700346328033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotions_ysharma","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotions_ysharma","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotions_ysharma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ysharma/distilbert-base-uncased-finetuned-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_eoir_privacy_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_eoir_privacy_en.md new file mode 100644 index 000000000000..2f95c727701c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_eoir_privacy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_eoir_privacy DistilBertForSequenceClassification from pile-of-law +author: John Snow Labs +name: distilbert_base_uncased_finetuned_eoir_privacy +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_eoir_privacy` is an English model originally trained by pile-of-law.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_eoir_privacy_en_5.2.0_3.0_1700351801285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_eoir_privacy_en_5.2.0_3.0_1700351801285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_eoir_privacy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_eoir_privacy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_eoir_privacy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/pile-of-law/distilbert-base-uncased-finetuned-eoir_privacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_for_tweet_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_for_tweet_sentiment_en.md new file mode 100644 index 000000000000..8ce1dad2b545 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_for_tweet_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_for_tweet_sentiment DistilBertForSequenceClassification from caffsean +author: John Snow Labs +name: distilbert_base_uncased_finetuned_for_tweet_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_for_tweet_sentiment` is an English model originally trained by caffsean.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_for_tweet_sentiment_en_5.2.0_3.0_1700344572841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_for_tweet_sentiment_en_5.2.0_3.0_1700344572841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_for_tweet_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_for_tweet_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_for_tweet_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/caffsean/distilbert-base-uncased-finetuned-for-tweet-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_gender_classification_padmajabfrl_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_gender_classification_padmajabfrl_en.md new file mode 100644 index 000000000000..961343a156cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_gender_classification_padmajabfrl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_gender_classification_padmajabfrl DistilBertForSequenceClassification from padmajabfrl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_gender_classification_padmajabfrl +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_gender_classification_padmajabfrl` is an English model originally trained by padmajabfrl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_gender_classification_padmajabfrl_en_5.2.0_3.0_1700339960573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_gender_classification_padmajabfrl_en_5.2.0_3.0_1700339960573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_gender_classification_padmajabfrl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_gender_classification_padmajabfrl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_gender_classification_padmajabfrl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/padmajabfrl/distilbert-base-uncased-finetuned_gender_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_imdb_kurianbenoy_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_imdb_kurianbenoy_en.md new file mode 100644 index 000000000000..8c59cd9c3536 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_imdb_kurianbenoy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_kurianbenoy DistilBertForSequenceClassification from kurianbenoy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_kurianbenoy +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_kurianbenoy` is an English model originally trained by kurianbenoy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_kurianbenoy_en_5.2.0_3.0_1700346196111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_kurianbenoy_en_5.2.0_3.0_1700346196111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_kurianbenoy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_kurianbenoy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_kurianbenoy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/kurianbenoy/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_massive_intent_detection_english_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_massive_intent_detection_english_en.md new file mode 100644 index 000000000000..4959840c2be3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_massive_intent_detection_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_massive_intent_detection_english DistilBertForSequenceClassification from joaobarroca +author: John Snow Labs +name: distilbert_base_uncased_finetuned_massive_intent_detection_english +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_massive_intent_detection_english` is an English model originally trained by joaobarroca.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_massive_intent_detection_english_en_5.2.0_3.0_1700345304049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_massive_intent_detection_english_en_5.2.0_3.0_1700345304049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_massive_intent_detection_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_massive_intent_detection_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_massive_intent_detection_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/joaobarroca/distilbert-base-uncased-finetuned-massive-intent-detection-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mathqa_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mathqa_en.md new file mode 100644 index 000000000000..2b17b27c7764 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mathqa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_mathqa DistilBertForSequenceClassification from rootacess +author: John Snow Labs +name: distilbert_base_uncased_finetuned_mathqa +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_mathqa` is an English model originally trained by rootacess.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mathqa_en_5.2.0_3.0_1700348012994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mathqa_en_5.2.0_3.0_1700348012994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mathqa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mathqa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_mathqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rootacess/distilbert-base-uncased-finetuned-mathQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mnli_samarth_kulkarni_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mnli_samarth_kulkarni_en.md new file mode 100644 index 000000000000..1ab6631414f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_mnli_samarth_kulkarni_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_mnli_samarth_kulkarni DistilBertForSequenceClassification from samarth-kulkarni +author: John Snow Labs +name: distilbert_base_uncased_finetuned_mnli_samarth_kulkarni +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_mnli_samarth_kulkarni` is an English model originally trained by samarth-kulkarni.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_samarth_kulkarni_en_5.2.0_3.0_1700345565554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_samarth_kulkarni_en_5.2.0_3.0_1700345565554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_samarth_kulkarni","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_samarth_kulkarni","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_mnli_samarth_kulkarni| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/samarth-kulkarni/distilbert-base-uncased-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_movie_genre_langfab_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_movie_genre_langfab_en.md new file mode 100644 index 000000000000..f2177671d4ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_movie_genre_langfab_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_movie_genre_langfab DistilBertForSequenceClassification from langfab +author: John Snow Labs +name: distilbert_base_uncased_finetuned_movie_genre_langfab +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_movie_genre_langfab` is an English model originally trained by langfab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_movie_genre_langfab_en_5.2.0_3.0_1700346824484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_movie_genre_langfab_en_5.2.0_3.0_1700346824484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_movie_genre_langfab","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_movie_genre_langfab","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_movie_genre_langfab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/langfab/distilbert-base-uncased-finetuned-movie-genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_news_aimesoft_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_news_aimesoft_en.md new file mode 100644 index 000000000000..279e513375f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_news_aimesoft_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_news_aimesoft DistilBertForSequenceClassification from hnhoangdz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_news_aimesoft +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_news_aimesoft` is an English model originally trained by hnhoangdz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_aimesoft_en_5.2.0_3.0_1700344567775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_aimesoft_en_5.2.0_3.0_1700344567775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_aimesoft","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_aimesoft","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_news_aimesoft| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hnhoangdz/distilbert-base-uncased-finetuned-news-aimesoft \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_non_augmented_binary_emotions_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_non_augmented_binary_emotions_en.md new file mode 100644 index 000000000000..1fa42ef9e5d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_non_augmented_binary_emotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_non_augmented_binary_emotions DistilBertForSequenceClassification from raul-af7 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_non_augmented_binary_emotions +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_non_augmented_binary_emotions` is an English model originally trained by raul-af7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_non_augmented_binary_emotions_en_5.2.0_3.0_1700344877313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_non_augmented_binary_emotions_en_5.2.0_3.0_1700344877313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_non_augmented_binary_emotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_non_augmented_binary_emotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_non_augmented_binary_emotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/raul-af7/distilbert-base-uncased-finetuned-non-augmented-binary-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_objectivity_rotten_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_objectivity_rotten_en.md new file mode 100644 index 000000000000..d2c94c354a39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_objectivity_rotten_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_objectivity_rotten DistilBertForSequenceClassification from marcosfp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_objectivity_rotten +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_objectivity_rotten` is an English model originally trained by marcosfp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_objectivity_rotten_en_5.2.0_3.0_1700344930991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_objectivity_rotten_en_5.2.0_3.0_1700344930991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_objectivity_rotten","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_objectivity_rotten","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_objectivity_rotten| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/marcosfp/distilbert-base-uncased-finetuned-objectivity-rotten \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qnli_jensthyregod_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qnli_jensthyregod_en.md new file mode 100644 index 000000000000..b8ad8b15fc01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qnli_jensthyregod_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_qnli_jensthyregod DistilBertForSequenceClassification from JensThyregod +author: John Snow Labs +name: distilbert_base_uncased_finetuned_qnli_jensthyregod +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_qnli_jensthyregod` is an English model originally trained by JensThyregod.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qnli_jensthyregod_en_5.2.0_3.0_1700349399154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qnli_jensthyregod_en_5.2.0_3.0_1700349399154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qnli_jensthyregod","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qnli_jensthyregod","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_qnli_jensthyregod| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JensThyregod/distilbert-base-uncased-finetuned-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qqp_0xb1_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qqp_0xb1_en.md new file mode 100644 index 000000000000..8a0af9707f63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_qqp_0xb1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_qqp_0xb1 DistilBertForSequenceClassification from 0xb1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_qqp_0xb1 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_qqp_0xb1` is an English model originally trained by 0xb1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qqp_0xb1_en_5.2.0_3.0_1700345737381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qqp_0xb1_en_5.2.0_3.0_1700345737381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qqp_0xb1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qqp_0xb1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_qqp_0xb1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/0xb1/distilbert-base-uncased-finetuned-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_quora_insincere_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_quora_insincere_en.md new file mode 100644 index 000000000000..47c92cb70db0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_quora_insincere_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_quora_insincere DistilBertForSequenceClassification from Asif555355 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_quora_insincere +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_quora_insincere` is an English model originally trained by Asif555355.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_quora_insincere_en_5.2.0_3.0_1700341349208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_quora_insincere_en_5.2.0_3.0_1700341349208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_quora_insincere","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_quora_insincere","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_quora_insincere| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Asif555355/distilbert-base-uncased-finetuned-quora-insincere \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_radiology_txt_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_radiology_txt_en.md new file mode 100644 index 000000000000..a7a98c0ae290 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_radiology_txt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_radiology_txt DistilBertForSequenceClassification from josejames00 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_radiology_txt +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_radiology_txt` is an English model originally trained by josejames00.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_radiology_txt_en_5.2.0_3.0_1700351311455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_radiology_txt_en_5.2.0_3.0_1700351311455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_radiology_txt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_radiology_txt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_radiology_txt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/josejames00/distilbert-base-uncased-finetuned-radiology-txt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_en.md new file mode 100644 index 000000000000..3dc159ea2e67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_requirement_classification DistilBertForSequenceClassification from dadi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_requirement_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_requirement_classification` is an English model originally trained by dadi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_requirement_classification_en_5.2.0_3.0_1700344104351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_requirement_classification_en_5.2.0_3.0_1700344104351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_requirement_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_requirement_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_requirement_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dadi/distilbert-base-uncased-finetuned-requirement-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_isoko_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_isoko_en.md new file mode 100644 index 000000000000..64281bf23b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_requirement_classification_isoko_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_requirement_classification_isoko DistilBertForSequenceClassification from dadi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_requirement_classification_isoko +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_requirement_classification_isoko` is an English model originally trained by dadi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_requirement_classification_isoko_en_5.2.0_3.0_1700346883685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_requirement_classification_isoko_en_5.2.0_3.0_1700346883685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_requirement_classification_isoko","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_requirement_classification_isoko","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_requirement_classification_isoko| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dadi/distilbert-base-uncased-finetuned-requirement-classification-iso \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_rte_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_rte_anirudh21_en.md new file mode 100644 index 000000000000..e25a1b522a46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_rte_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_rte_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_rte_anirudh21 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_rte_anirudh21` is an English model originally trained by anirudh21. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_rte_anirudh21_en_5.2.0_3.0_1700348142986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_rte_anirudh21_en_5.2.0_3.0_1700348142986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_rte_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_rte_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_rte_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sentiment_amazon_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sentiment_amazon_en.md new file mode 100644 index 000000000000..13358d5ab389 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sentiment_amazon_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sentiment_amazon DistilBertForSequenceClassification from AdamCodd +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sentiment_amazon +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sentiment_amazon` is an English model originally trained by AdamCodd. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sentiment_amazon_en_5.2.0_3.0_1700338170740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sentiment_amazon_en_5.2.0_3.0_1700338170740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sentiment_amazon","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sentiment_amazon","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sentiment_amazon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AdamCodd/distilbert-base-uncased-finetuned-sentiment-amazon \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sms_spam_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sms_spam_detection_en.md new file mode 100644 index 000000000000..39b178038302 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sms_spam_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sms_spam_detection DistilBertForSequenceClassification from mariagrandury +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sms_spam_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sms_spam_detection` is an English model originally trained by mariagrandury. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sms_spam_detection_en_5.2.0_3.0_1700342329102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sms_spam_detection_en_5.2.0_3.0_1700342329102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sms_spam_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sms_spam_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sms_spam_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mariagrandury/distilbert-base-uncased-finetuned-sms-spam-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_akash7897_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_akash7897_en.md new file mode 100644 index 000000000000..c084fbea978b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_akash7897_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_akash7897 DistilBertForSequenceClassification from Akash7897 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_akash7897 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_akash7897` is an English model originally trained by Akash7897. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_akash7897_en_5.2.0_3.0_1700349816251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_akash7897_en_5.2.0_3.0_1700349816251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_akash7897","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_akash7897","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_akash7897| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Akash7897/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_portuguese_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_portuguese_en.md new file mode 100644 index 000000000000..25cc28dce418 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst2_portuguese_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_portuguese DistilBertForSequenceClassification from rwang5688 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_portuguese +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_portuguese` is an English model originally trained by rwang5688. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_portuguese_en_5.2.0_3.0_1700347654043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_portuguese_en_5.2.0_3.0_1700347654043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_portuguese","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_portuguese","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_portuguese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rwang5688/distilbert-base-uncased-finetuned-sst2-pt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model_en.md new file mode 100644 index 000000000000..56b5e228d1be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model DistilBertForSequenceClassification from okho0653 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model` is an English model originally trained by okho0653. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model_en_5.2.0_3.0_1700345417369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model_en_5.2.0_3.0_1700345417369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst_2_english_zero_shot_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/okho0653/distilbert-base-uncased-finetuned-sst-2-english-zero-shot-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_topic_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_topic_model_en.md new file mode 100644 index 000000000000..2936935f40db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_topic_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_topic_model DistilBertForSequenceClassification from nebo333 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_topic_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_topic_model` is an English model originally trained by nebo333. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_topic_model_en_5.2.0_3.0_1700345902899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_topic_model_en_5.2.0_3.0_1700345902899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_topic_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_topic_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_topic_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nebo333/distilbert-base-uncased-finetuned-topic-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_tweets_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_tweets_sentiment_en.md new file mode 100644 index 000000000000..094f14ca5b17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_tweets_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_tweets_sentiment DistilBertForSequenceClassification from austinmw +author: John Snow Labs +name: distilbert_base_uncased_finetuned_tweets_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_tweets_sentiment` is an English model originally trained by austinmw. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tweets_sentiment_en_5.2.0_3.0_1700343337776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tweets_sentiment_en_5.2.0_3.0_1700343337776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_tweets_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_tweets_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_tweets_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/austinmw/distilbert-base-uncased-finetuned-tweets-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_en.md new file mode 100644 index 000000000000..46a098c6551d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_with_spanish_tweets_clf DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_with_spanish_tweets_clf +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_with_spanish_tweets_clf` is an English model originally trained by francisco-perez-sorrosal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_with_spanish_tweets_clf_en_5.2.0_3.0_1700349396167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_with_spanish_tweets_clf_en_5.2.0_3.0_1700349396167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_with_spanish_tweets_clf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_with_spanish_tweets_clf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_with_spanish_tweets_clf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/distilbert-base-uncased-finetuned-with-spanish-tweets-clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_yelp_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_yelp_reviews_en.md new file mode 100644 index 000000000000..4b0e9bfa4f17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_finetuned_yelp_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_yelp_reviews DistilBertForSequenceClassification from Ramamurthi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_yelp_reviews +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_yelp_reviews` is an English model originally trained by Ramamurthi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_yelp_reviews_en_5.2.0_3.0_1700340240093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_yelp_reviews_en_5.2.0_3.0_1700340240093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_yelp_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_yelp_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_yelp_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Ramamurthi/distilbert-base-uncased-finetuned-yelp-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_fintuned_emotion_maitake0115_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_fintuned_emotion_maitake0115_en.md new file mode 100644 index 000000000000..8c348766adeb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_fintuned_emotion_maitake0115_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fintuned_emotion_maitake0115 DistilBertForSequenceClassification from maitake0115 +author: John Snow Labs +name: distilbert_base_uncased_fintuned_emotion_maitake0115 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fintuned_emotion_maitake0115` is an English model originally trained by maitake0115. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fintuned_emotion_maitake0115_en_5.2.0_3.0_1700348327277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fintuned_emotion_maitake0115_en_5.2.0_3.0_1700348327277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fintuned_emotion_maitake0115","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fintuned_emotion_maitake0115","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fintuned_emotion_maitake0115| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/maitake0115/distilbert-base-uncased-fintuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_helpful_amazon_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_helpful_amazon_en.md new file mode 100644 index 000000000000..7f03a3724e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_helpful_amazon_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_helpful_amazon DistilBertForSequenceClassification from banjtheman +author: John Snow Labs +name: distilbert_base_uncased_helpful_amazon +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_helpful_amazon` is an English model originally trained by banjtheman. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_helpful_amazon_en_5.2.0_3.0_1700344692065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_helpful_amazon_en_5.2.0_3.0_1700344692065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_helpful_amazon","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_helpful_amazon","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_helpful_amazon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/banjtheman/distilbert-base-uncased-helpful-amazon \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_high_risk_tickets_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_high_risk_tickets_en.md new file mode 100644 index 000000000000..c94dff6913c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_high_risk_tickets_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_high_risk_tickets DistilBertForSequenceClassification from MFrazz +author: John Snow Labs +name: distilbert_base_uncased_high_risk_tickets +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_high_risk_tickets` is an English model originally trained by MFrazz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_high_risk_tickets_en_5.2.0_3.0_1700338771131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_high_risk_tickets_en_5.2.0_3.0_1700338771131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_high_risk_tickets","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_high_risk_tickets","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_high_risk_tickets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/MFrazz/distilbert-base-uncased-High-Risk-Tickets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_imdb_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_imdb_textattack_en.md new file mode 100644 index 000000000000..9b7c8b8bc0f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_imdb_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_textattack DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_imdb_textattack +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_textattack` is an English model originally trained by textattack. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_textattack_en_5.2.0_3.0_1700338874981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_textattack_en_5.2.0_3.0_1700338874981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_textattack| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_kaggle_readability_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_kaggle_readability_en.md new file mode 100644 index 000000000000..c8f74912c2cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_kaggle_readability_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_kaggle_readability DistilBertForSequenceClassification from Tymoteusz +author: John Snow Labs +name: distilbert_base_uncased_kaggle_readability +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_kaggle_readability` is an English model originally trained by Tymoteusz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kaggle_readability_en_5.2.0_3.0_1700348958626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_kaggle_readability_en_5.2.0_3.0_1700348958626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_kaggle_readability","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_kaggle_readability","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_kaggle_readability| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Tymoteusz/distilbert-base-uncased-kaggle-readability \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_md_gender_bias_saved_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_md_gender_bias_saved_en.md new file mode 100644 index 000000000000..7da600180874 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_md_gender_bias_saved_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_md_gender_bias_saved DistilBertForSequenceClassification from thaile +author: John Snow Labs +name: distilbert_base_uncased_md_gender_bias_saved +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_md_gender_bias_saved` is an English model originally trained by thaile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_md_gender_bias_saved_en_5.2.0_3.0_1700347961075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_md_gender_bias_saved_en_5.2.0_3.0_1700347961075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_md_gender_bias_saved","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_md_gender_bias_saved","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_md_gender_bias_saved| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/thaile/distilbert-base-uncased-md_gender_bias-saved \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mnli_en.md new file mode 100644 index 000000000000..3011a8d4e1cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_mnli DistilBertForSequenceClassification from ishan +author: John Snow Labs +name: distilbert_base_uncased_mnli +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mnli` is an English model originally trained by ishan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mnli_en_5.2.0_3.0_1700342815352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mnli_en_5.2.0_3.0_1700342815352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_mnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ishan/distilbert-base-uncased-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mrpc_en.md new file mode 100644 index 000000000000..6c29c1772310 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_mrpc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_mrpc DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_mrpc +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mrpc` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mrpc_en_5.2.0_3.0_1700339623978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mrpc_en_5.2.0_3.0_1700339623978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_mrpc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_mrpc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_mrpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_news_trained_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_news_trained_en.md new file mode 100644 index 000000000000..e91f1b701cac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_news_trained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_news_trained DistilBertForSequenceClassification from bfriederich +author: John Snow Labs +name: distilbert_base_uncased_news_trained +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_news_trained` is an English model originally trained by bfriederich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_news_trained_en_5.2.0_3.0_1700350304680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_news_trained_en_5.2.0_3.0_1700350304680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_news_trained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_news_trained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_news_trained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bfriederich/distilbert-base-uncased-news-trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_nonsense_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_nonsense_en.md new file mode 100644 index 000000000000..bd5cee762392 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_nonsense_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_nonsense DistilBertForSequenceClassification from hubtype +author: John Snow Labs +name: distilbert_base_uncased_nonsense +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_nonsense` is an English model originally trained by hubtype. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nonsense_en_5.2.0_3.0_1700338436341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nonsense_en_5.2.0_3.0_1700338436341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_nonsense","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_nonsense","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_nonsense| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hubtype/distilbert-base-uncased-nonsense \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_qqp_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_qqp_en.md new file mode 100644 index 000000000000..a01cc9fa064d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_qqp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_qqp DistilBertForSequenceClassification from assemblyai +author: John Snow Labs +name: distilbert_base_uncased_qqp +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_qqp` is an English model originally trained by assemblyai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qqp_en_5.2.0_3.0_1700342655579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qqp_en_5.2.0_3.0_1700342655579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_qqp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_qqp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_qqp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/assemblyai/distilbert-base-uncased-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_rotten_tomatoes_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_rotten_tomatoes_textattack_en.md new file mode 100644 index 000000000000..1e8eb412079a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_rotten_tomatoes_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_rotten_tomatoes_textattack DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_rotten_tomatoes_textattack +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_rotten_tomatoes_textattack` is an English model originally trained by textattack. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_rotten_tomatoes_textattack_en_5.2.0_3.0_1700341198394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_rotten_tomatoes_textattack_en_5.2.0_3.0_1700341198394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_rotten_tomatoes_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_rotten_tomatoes_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_rotten_tomatoes_textattack| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-rotten-tomatoes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_analysis_movie_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_analysis_movie_reviews_en.md new file mode 100644 index 000000000000..2a01b4098d7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_analysis_movie_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sentiment_analysis_movie_reviews DistilBertForSequenceClassification from DataMonke +author: John Snow Labs +name: distilbert_base_uncased_sentiment_analysis_movie_reviews +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sentiment_analysis_movie_reviews` is an English model originally trained by DataMonke. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_analysis_movie_reviews_en_5.2.0_3.0_1700340576095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_analysis_movie_reviews_en_5.2.0_3.0_1700340576095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_analysis_movie_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_analysis_movie_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sentiment_analysis_movie_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DataMonke/distilbert-base-uncased-sentiment-analysis-movie-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_20epoch_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_20epoch_en.md new file mode 100644 index 000000000000..35392ed83235 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_20epoch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sentiment_finetuned_memes_20epoch DistilBertForSequenceClassification from jayanta +author: John Snow Labs +name: distilbert_base_uncased_sentiment_finetuned_memes_20epoch +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sentiment_finetuned_memes_20epoch` is an English model originally trained by jayanta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_20epoch_en_5.2.0_3.0_1700344795052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_20epoch_en_5.2.0_3.0_1700344795052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes_20epoch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes_20epoch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sentiment_finetuned_memes_20epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayanta/distilbert-base-uncased-sentiment-finetuned-memes-20epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_30epochs_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_30epochs_en.md new file mode 100644 index 000000000000..898822d7ddf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_30epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sentiment_finetuned_memes_30epochs DistilBertForSequenceClassification from jayanta +author: John Snow Labs +name: distilbert_base_uncased_sentiment_finetuned_memes_30epochs +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sentiment_finetuned_memes_30epochs` is an English model originally trained by jayanta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_30epochs_en_5.2.0_3.0_1700344320371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_30epochs_en_5.2.0_3.0_1700344320371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes_30epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes_30epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sentiment_finetuned_memes_30epochs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayanta/distilbert-base-uncased-sentiment-finetuned-memes-30epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_en.md new file mode 100644 index 000000000000..707d0d1b882d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_finetuned_memes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sentiment_finetuned_memes DistilBertForSequenceClassification from jayanta +author: John Snow Labs +name: distilbert_base_uncased_sentiment_finetuned_memes +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sentiment_finetuned_memes` is an English model originally trained by jayanta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_en_5.2.0_3.0_1700346440791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_finetuned_memes_en_5.2.0_3.0_1700346440791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_finetuned_memes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sentiment_finetuned_memes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayanta/distilbert-base-uncased-sentiment-finetuned-memes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_reddit_crypto_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_reddit_crypto_en.md new file mode 100644 index 000000000000..34ced51a15d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sentiment_reddit_crypto_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sentiment_reddit_crypto DistilBertForSequenceClassification from mwkby +author: John Snow Labs +name: distilbert_base_uncased_sentiment_reddit_crypto +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sentiment_reddit_crypto` is an English model originally trained by mwkby. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_reddit_crypto_en_5.2.0_3.0_1700340121929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sentiment_reddit_crypto_en_5.2.0_3.0_1700340121929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_reddit_crypto","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sentiment_reddit_crypto","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sentiment_reddit_crypto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mwkby/distilbert-base-uncased-sentiment-reddit-crypto \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst2_assemblyai_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst2_assemblyai_en.md new file mode 100644 index 000000000000..e70b2ea62db9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst2_assemblyai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sst2_assemblyai DistilBertForSequenceClassification from assemblyai +author: John Snow Labs +name: distilbert_base_uncased_sst2_assemblyai +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sst2_assemblyai` is an English model originally trained by assemblyai. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sst2_assemblyai_en_5.2.0_3.0_1700340928344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sst2_assemblyai_en_5.2.0_3.0_1700340928344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sst2_assemblyai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sst2_assemblyai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sst2_assemblyai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/assemblyai/distilbert-base-uncased-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst_2_en.md new file mode 100644 index 000000000000..712ee5c51c3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_sst_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sst_2 DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_sst_2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sst_2` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sst_2_en_5.2.0_3.0_1700343168943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sst_2_en_5.2.0_3.0_1700343168943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sst_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sst_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sst_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-SST-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_tweet_about_disaster_or_not_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_tweet_about_disaster_or_not_en.md new file mode 100644 index 000000000000..92575c008cdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_tweet_about_disaster_or_not_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_tweet_about_disaster_or_not DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_tweet_about_disaster_or_not +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_tweet_about_disaster_or_not` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_tweet_about_disaster_or_not_en_5.2.0_3.0_1700344855565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_tweet_about_disaster_or_not_en_5.2.0_3.0_1700344855565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_tweet_about_disaster_or_not","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_tweet_about_disaster_or_not","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_tweet_about_disaster_or_not| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Tweet_About_Disaster_Or_Not \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_zero_shot_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_zero_shot_sentiment_model_en.md new file mode 100644 index 000000000000..b9ecbc938fc6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncased_zero_shot_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_zero_shot_sentiment_model DistilBertForSequenceClassification from okho0653 +author: John Snow Labs +name: distilbert_base_uncased_zero_shot_sentiment_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_zero_shot_sentiment_model` is an English model originally trained by okho0653. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_zero_shot_sentiment_model_en_5.2.0_3.0_1700339285482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_zero_shot_sentiment_model_en_5.2.0_3.0_1700339285482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_zero_shot_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_zero_shot_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_zero_shot_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/okho0653/distilbert-base-uncased-zero-shot-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncasedv1_finetuned_twitter_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncasedv1_finetuned_twitter_sentiment_en.md new file mode 100644 index 000000000000..62593b2d5de0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_base_uncasedv1_finetuned_twitter_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncasedv1_finetuned_twitter_sentiment DistilBertForSequenceClassification from macildur +author: John Snow Labs +name: distilbert_base_uncasedv1_finetuned_twitter_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncasedv1_finetuned_twitter_sentiment` is an English model originally trained by macildur. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncasedv1_finetuned_twitter_sentiment_en_5.2.0_3.0_1700342623481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncasedv1_finetuned_twitter_sentiment_en_5.2.0_3.0_1700342623481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncasedv1_finetuned_twitter_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncasedv1_finetuned_twitter_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncasedv1_finetuned_twitter_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/macildur/distilbert-base-uncasedv1-finetuned-twitter-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_bbc_news_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_bbc_news_classification_en.md new file mode 100644 index 000000000000..8df46a6b900f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_bbc_news_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_bbc_news_classification DistilBertForSequenceClassification from Umesh +author: John Snow Labs +name: distilbert_bbc_news_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_bbc_news_classification` is an English model originally trained by Umesh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_bbc_news_classification_en_5.2.0_3.0_1700351311648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_bbc_news_classification_en_5.2.0_3.0_1700351311648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bbc_news_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bbc_news_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_bbc_news_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Umesh/distilbert-bbc-news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_cased_antisemitic_tweets_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_cased_antisemitic_tweets_en.md new file mode 100644 index 000000000000..a6d08d11cbe9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_cased_antisemitic_tweets_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_cased_antisemitic_tweets DistilBertForSequenceClassification from astarostap +author: John Snow Labs +name: distilbert_cased_antisemitic_tweets +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cased_antisemitic_tweets` is an English model originally trained by astarostap. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cased_antisemitic_tweets_en_5.2.0_3.0_1700343928975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cased_antisemitic_tweets_en_5.2.0_3.0_1700343928975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cased_antisemitic_tweets","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cased_antisemitic_tweets","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cased_antisemitic_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/astarostap/distilbert-cased-antisemitic-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_class_heaps_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_class_heaps_en.md new file mode 100644 index 000000000000..0e9e5ae83aa2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_class_heaps_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_class_heaps DistilBertForSequenceClassification from johannes-garstenauer +author: John Snow Labs +name: distilbert_class_heaps +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_class_heaps` is an English model originally trained by johannes-garstenauer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_class_heaps_en_5.2.0_3.0_1700347465541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_class_heaps_en_5.2.0_3.0_1700347465541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_class_heaps","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_class_heaps","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_class_heaps| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.8 MB| + +## References + +https://huggingface.co/johannes-garstenauer/distilbert_class_heaps \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_classifier_base_uncased_newspop_student_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_classifier_base_uncased_newspop_student_en.md new file mode 100644 index 000000000000..4c5856fc61ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_classifier_base_uncased_newspop_student_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from mrm8488) +author: John Snow Labs +name: distilbert_classifier_base_uncased_newspop_student +date: 2023-11-18 +tags: [open_source, distilbert, sequence_classifier, classification, newspop, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBERT Classification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-newspop-student` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + +`palestine`, `obama`, `microsoft`, `economy` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_classifier_base_uncased_newspop_student_en_5.2.0_3.0_1700335653559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_classifier_base_uncased_newspop_student_en_5.2.0_3.0_1700335653559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq = DistilBertForSequenceClassification.pretrained("distilbert_classifier_base_uncased_newspop_student","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE."]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val seq = DistilBertForSequenceClassification.pretrained("distilbert_classifier_base_uncased_newspop_student","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq)) + +val data = Seq("PUT YOUR STRING HERE.").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.news.uncased_base").predict("""PUT YOUR STRING HERE.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_classifier_base_uncased_newspop_student| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + + + +https://huggingface.co/mrm8488/distilbert-base-uncased-newspop-student \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_cloths_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_cloths_sentiment_en.md new file mode 100644 index 000000000000..d429ce4a654e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_cloths_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_cloths_sentiment DistilBertForSequenceClassification from ongaunjie +author: John Snow Labs +name: distilbert_cloths_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cloths_sentiment` is an English model originally trained by ongaunjie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cloths_sentiment_en_5.2.0_3.0_1700341176551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cloths_sentiment_en_5.2.0_3.0_1700341176551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cloths_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cloths_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cloths_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ongaunjie/distilbert-cloths-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_complaints_product_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_complaints_product_en.md new file mode 100644 index 000000000000..5160440060a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_complaints_product_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_complaints_product DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilbert_complaints_product +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_complaints_product` is an English model originally trained by Kayvane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_complaints_product_en_5.2.0_3.0_1700342632296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_complaints_product_en_5.2.0_3.0_1700342632296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_complaints_product","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_complaints_product","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_complaints_product| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kayvane/distilbert-complaints-product \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_customer_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_customer_classifier_en.md new file mode 100644 index 000000000000..62f683839f15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_customer_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_customer_classifier DistilBertForSequenceClassification from OllieStanley +author: John Snow Labs +name: distilbert_customer_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_customer_classifier` is an English model originally trained by OllieStanley. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_customer_classifier_en_5.2.0_3.0_1700343531524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_customer_classifier_en_5.2.0_3.0_1700343531524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_customer_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_customer_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_customer_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/OllieStanley/distilbert-customer-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_dummy_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_dummy_sentiment_en.md new file mode 100644 index 000000000000..1753d7943b53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_dummy_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_dummy_sentiment DistilBertForSequenceClassification from dhpollack +author: John Snow Labs +name: distilbert_dummy_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_dummy_sentiment` is an English model originally trained by dhpollack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_dummy_sentiment_en_5.2.0_3.0_1700342223587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_dummy_sentiment_en_5.2.0_3.0_1700342223587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_dummy_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_dummy_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_dummy_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|15.6 KB| + +## References + +https://huggingface.co/dhpollack/distilbert-dummy-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_richardchai_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_richardchai_en.md new file mode 100644 index 000000000000..8be6f29e2b36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_richardchai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_emotion_richardchai DistilBertForSequenceClassification from richardchai +author: John Snow Labs +name: distilbert_emotion_richardchai +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_emotion_richardchai` is an English model originally trained by richardchai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_emotion_richardchai_en_5.2.0_3.0_1700338643039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_emotion_richardchai_en_5.2.0_3.0_1700338643039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_richardchai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_richardchai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_emotion_richardchai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/richardchai/distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_tirendaz_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_tirendaz_en.md new file mode 100644 index 000000000000..8611ca558781 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_emotion_tirendaz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_emotion_tirendaz DistilBertForSequenceClassification from Tirendaz +author: John Snow Labs +name: distilbert_emotion_tirendaz +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_emotion_tirendaz` is an English model originally trained by Tirendaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_emotion_tirendaz_en_5.2.0_3.0_1700347194048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_emotion_tirendaz_en_5.2.0_3.0_1700347194048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_tirendaz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_tirendaz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_emotion_tirendaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tirendaz/distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_feature_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_feature_classifier_en.md new file mode 100644 index 000000000000..c05e031b69f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_feature_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_feature_classifier DistilBertForSequenceClassification from Peterard +author: John Snow Labs +name: distilbert_feature_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_feature_classifier` is an English model originally trained by Peterard. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_feature_classifier_en_5.2.0_3.0_1700348145438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_feature_classifier_en_5.2.0_3.0_1700348145438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_feature_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_feature_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_feature_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.4 MB| + +## References + +https://huggingface.co/Peterard/distilbert_feature_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_ag_news_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_ag_news_en.md new file mode 100644 index 000000000000..49d36dd854d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_ag_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_ag_news DistilBertForSequenceClassification from MuntasirHossain +author: John Snow Labs +name: distilbert_finetuned_ag_news +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ag_news` is an English model originally trained by MuntasirHossain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ag_news_en_5.2.0_3.0_1700340530318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ag_news_en_5.2.0_3.0_1700340530318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_ag_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_ag_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ag_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MuntasirHossain/distilbert-finetuned-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_en.md new file mode 100644 index 000000000000..beb8734dc6b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_headings DistilBertForSequenceClassification from OrbitalWitness +author: John Snow Labs +name: distilbert_finetuned_headings +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_headings` is an English model originally trained by OrbitalWitness. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_headings_en_5.2.0_3.0_1700338328657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_headings_en_5.2.0_3.0_1700338328657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_headings","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_headings","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_headings| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/OrbitalWitness/distilbert-finetuned-headings \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_v2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_v2_en.md new file mode 100644 index 000000000000..3c36a998984f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_headings_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_headings_v2 DistilBertForSequenceClassification from OrbitalWitness +author: John Snow Labs +name: distilbert_finetuned_headings_v2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_headings_v2` is an English model originally trained by OrbitalWitness. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_headings_v2_en_5.2.0_3.0_1700339633727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_headings_v2_en_5.2.0_3.0_1700339633727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_headings_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_headings_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_headings_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/OrbitalWitness/distilbert-finetuned-headings-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_spanish_offensive_language_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_spanish_offensive_language_en.md new file mode 100644 index 000000000000..8b7061c8870c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_finetuned_spanish_offensive_language_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_spanish_offensive_language DistilBertForSequenceClassification from Brandon-h +author: John Snow Labs +name: distilbert_finetuned_spanish_offensive_language +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_spanish_offensive_language` is an English model originally trained by Brandon-h. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_spanish_offensive_language_en_5.2.0_3.0_1700342815355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_spanish_offensive_language_en_5.2.0_3.0_1700342815355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_spanish_offensive_language","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_spanish_offensive_language","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_spanish_offensive_language| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/Brandon-h/distilbert-finetuned-spanish-offensive-language \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_german_text_complexity_de.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_german_text_complexity_de.md new file mode 100644 index 000000000000..57650905b64f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_german_text_complexity_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German distilbert_german_text_complexity DistilBertForSequenceClassification from MiriUll +author: John Snow Labs +name: distilbert_german_text_complexity +date: 2023-11-18 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_german_text_complexity` is a German model originally trained by MiriUll. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_german_text_complexity_de_5.2.0_3.0_1700350267511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_german_text_complexity_de_5.2.0_3.0_1700350267511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_german_text_complexity","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_german_text_complexity","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_german_text_complexity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|252.5 MB| + +## References + +https://huggingface.co/MiriUll/distilbert-german-text-complexity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_go_emotions_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_go_emotions_en.md new file mode 100644 index 000000000000..6ac5e6a45c0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_go_emotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_go_emotions DistilBertForSequenceClassification from tasinhoque +author: John Snow Labs +name: distilbert_go_emotions +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_go_emotions` is an English model originally trained by tasinhoque. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_go_emotions_en_5.2.0_3.0_1700341675703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_go_emotions_en_5.2.0_3.0_1700341675703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_go_emotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_go_emotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_go_emotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tasinhoque/distilbert-go-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_goodreads_wandb_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_goodreads_wandb_en.md new file mode 100644 index 000000000000..a5ba1aabb0d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_goodreads_wandb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_goodreads_wandb DistilBertForSequenceClassification from dhmeltzer +author: John Snow Labs +name: distilbert_goodreads_wandb +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_goodreads_wandb` is an English model originally trained by dhmeltzer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_goodreads_wandb_en_5.2.0_3.0_1700350113152.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_goodreads_wandb_en_5.2.0_3.0_1700350113152.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_goodreads_wandb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_goodreads_wandb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_goodreads_wandb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dhmeltzer/distilbert-goodreads-wandb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_helpdesk_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_helpdesk_sentiment_en.md new file mode 100644 index 000000000000..462b02b4cdae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_helpdesk_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_helpdesk_sentiment DistilBertForSequenceClassification from Venkatesh4342 +author: John Snow Labs +name: distilbert_helpdesk_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_helpdesk_sentiment` is an English model originally trained by Venkatesh4342. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_helpdesk_sentiment_en_5.2.0_3.0_1700338019614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_helpdesk_sentiment_en_5.2.0_3.0_1700338019614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_helpdesk_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_helpdesk_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_helpdesk_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Venkatesh4342/distilbert-helpdesk-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_dhlee347_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_dhlee347_en.md new file mode 100644 index 000000000000..3bc554accc71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_dhlee347_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_dhlee347 DistilBertForSequenceClassification from dhlee347 +author: John Snow Labs +name: distilbert_imdb_dhlee347 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_imdb_dhlee347` is an English model originally trained by dhlee347. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_dhlee347_en_5.2.0_3.0_1700348152689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_dhlee347_en_5.2.0_3.0_1700348152689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_dhlee347","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_dhlee347","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_dhlee347| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/dhlee347/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_lvwerra_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_lvwerra_en.md new file mode 100644 index 000000000000..0d606fd6f79a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_imdb_lvwerra_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_lvwerra DistilBertForSequenceClassification from lvwerra +author: John Snow Labs +name: distilbert_imdb_lvwerra +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_imdb_lvwerra` is an English model originally trained by lvwerra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_lvwerra_en_5.2.0_3.0_1700337444546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_lvwerra_en_5.2.0_3.0_1700337444546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_lvwerra","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_lvwerra","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_lvwerra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/lvwerra/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_jobcategory_410k_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_jobcategory_410k_en.md new file mode 100644 index 000000000000..17debbd3dfb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_jobcategory_410k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_jobcategory_410k DistilBertForSequenceClassification from serbog +author: John Snow Labs +name: distilbert_jobcategory_410k +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_jobcategory_410k` is an English model originally trained by serbog. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_jobcategory_410k_en_5.2.0_3.0_1700345934763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_jobcategory_410k_en_5.2.0_3.0_1700345934763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_jobcategory_410k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_jobcategory_410k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_jobcategory_410k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/serbog/distilbert-jobCategory_410k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_large_sms_spam_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_large_sms_spam_en.md new file mode 100644 index 000000000000..7a20a15c3dba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_large_sms_spam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_large_sms_spam DistilBertForSequenceClassification from sureshs +author: John Snow Labs +name: distilbert_large_sms_spam +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_large_sms_spam` is an English model originally trained by sureshs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_large_sms_spam_en_5.2.0_3.0_1700346319231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_large_sms_spam_en_5.2.0_3.0_1700346319231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_large_sms_spam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_large_sms_spam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_large_sms_spam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sureshs/distilbert-large-sms-spam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_lazylearner_hatespeech_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_lazylearner_hatespeech_detection_en.md new file mode 100644 index 000000000000..48e833302738 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_lazylearner_hatespeech_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_lazylearner_hatespeech_detection DistilBertForSequenceClassification from Sakil +author: John Snow Labs +name: distilbert_lazylearner_hatespeech_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_lazylearner_hatespeech_detection` is an English model originally trained by Sakil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_lazylearner_hatespeech_detection_en_5.2.0_3.0_1700342988449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_lazylearner_hatespeech_detection_en_5.2.0_3.0_1700342988449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_lazylearner_hatespeech_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_lazylearner_hatespeech_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_lazylearner_hatespeech_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Sakil/distilbert_lazylearner_hatespeech_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_mrpc_en.md new file mode 100644 index 000000000000..26b7f523f58a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_mrpc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_mrpc DistilBertForSequenceClassification from mattchurgin +author: John Snow Labs +name: distilbert_mrpc +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_mrpc` is an English model originally trained by mattchurgin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_mrpc_en_5.2.0_3.0_1700343954281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_mrpc_en_5.2.0_3.0_1700343954281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_mrpc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_mrpc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_mrpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mattchurgin/distilbert-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1_xx.md new file mode 100644 index 000000000000..8d726d681586 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1 DistilBertForSequenceClassification from somm +author: John Snow Labs +name: distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1 +date: 2023-11-18 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1` is a Multilingual model originally trained by somm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1_xx_5.2.0_3.0_1700341687991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1_xx_5.2.0_3.0_1700341687991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_uncased_english_german_french_nov_24_epoch_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/somm/distilbert-multilingual-uncased-en-de-fr-nov-24-epoch-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_oct_8_xx.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_oct_8_xx.md new file mode 100644 index 000000000000..f5026a1926f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_multilingual_uncased_oct_8_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_uncased_oct_8 DistilBertForSequenceClassification from SmilestheSad +author: John Snow Labs +name: distilbert_multilingual_uncased_oct_8 +date: 2023-11-18 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_uncased_oct_8` is a Multilingual model originally trained by SmilestheSad. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_oct_8_xx_5.2.0_3.0_1700351828728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_oct_8_xx_5.2.0_3.0_1700351828728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_oct_8","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_oct_8","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_uncased_oct_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/SmilestheSad/distilbert-multilingual-uncased-oct-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_negation_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_negation_en.md new file mode 100644 index 000000000000..b0c1c131f9a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_negation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_negation DistilBertForSequenceClassification from chitra +author: John Snow Labs +name: distilbert_negation +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_negation` is an English model originally trained by chitra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_negation_en_5.2.0_3.0_1700340105410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_negation_en_5.2.0_3.0_1700340105410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_negation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_negation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_negation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chitra/distilbert-negation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_organization_matching_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_organization_matching_en.md new file mode 100644 index 000000000000..5a25d53c2b8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_organization_matching_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_organization_matching DistilBertForSequenceClassification from thedavidhackett +author: John Snow Labs +name: distilbert_organization_matching +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_organization_matching` is an English model originally trained by thedavidhackett. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_organization_matching_en_5.2.0_3.0_1700348902720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_organization_matching_en_5.2.0_3.0_1700348902720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_organization_matching","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_organization_matching","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_organization_matching| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/thedavidhackett/distilbert-organization-matching \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_pizza_intent_sachin19566_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_pizza_intent_sachin19566_en.md new file mode 100644 index 000000000000..b286cc2ff26e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_pizza_intent_sachin19566_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_pizza_intent_sachin19566 DistilBertForSequenceClassification from sachin19566 +author: John Snow Labs +name: distilbert_pizza_intent_sachin19566 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_pizza_intent_sachin19566` is an English model originally trained by sachin19566. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_pizza_intent_sachin19566_en_5.2.0_3.0_1700342460084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_pizza_intent_sachin19566_en_5.2.0_3.0_1700342460084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_pizza_intent_sachin19566","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_pizza_intent_sachin19566","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_pizza_intent_sachin19566| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sachin19566/distilbert_Pizza_Intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_plutchik_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_plutchik_en.md new file mode 100644 index 000000000000..5fd82a331033 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_plutchik_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_plutchik DistilBertForSequenceClassification from JuliusAlphonso +author: John Snow Labs +name: distilbert_plutchik +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_plutchik` is an English model originally trained by JuliusAlphonso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_plutchik_en_5.2.0_3.0_1700338499076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_plutchik_en_5.2.0_3.0_1700338499076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_plutchik","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_plutchik","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_plutchik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/JuliusAlphonso/distilbert-plutchik \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_prompt_injection_fmops_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_prompt_injection_fmops_en.md new file mode 100644 index 000000000000..248c92819ba0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_prompt_injection_fmops_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_prompt_injection_fmops DistilBertForSequenceClassification from fmops +author: John Snow Labs +name: distilbert_prompt_injection_fmops +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_prompt_injection_fmops` is an English model originally trained by fmops. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_prompt_injection_fmops_en_5.2.0_3.0_1700337599573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_prompt_injection_fmops_en_5.2.0_3.0_1700337599573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_prompt_injection_fmops","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_prompt_injection_fmops","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_prompt_injection_fmops| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/fmops/distilbert-prompt-injection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_resume_parts_classify_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_resume_parts_classify_en.md new file mode 100644 index 000000000000..ad14ea252af0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_resume_parts_classify_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_resume_parts_classify DistilBertForSequenceClassification from manishiitg +author: John Snow Labs +name: distilbert_resume_parts_classify +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_resume_parts_classify` is an English model originally trained by manishiitg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_resume_parts_classify_en_5.2.0_3.0_1700338628267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_resume_parts_classify_en_5.2.0_3.0_1700338628267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_resume_parts_classify","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_resume_parts_classify","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_resume_parts_classify| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/manishiitg/distilbert-resume-parts-classify \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_reviews_with_language_drift_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_reviews_with_language_drift_en.md new file mode 100644 index 000000000000..c3e9503d76a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_reviews_with_language_drift_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_reviews_with_language_drift DistilBertForSequenceClassification from arize-ai +author: John Snow Labs +name: distilbert_reviews_with_language_drift +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_reviews_with_language_drift` is an English model originally trained by arize-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_reviews_with_language_drift_en_5.2.0_3.0_1700344570501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_reviews_with_language_drift_en_5.2.0_3.0_1700344570501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_reviews_with_language_drift","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_reviews_with_language_drift","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_reviews_with_language_drift| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/arize-ai/distilbert_reviews_with_language_drift \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_scp_class_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_scp_class_classification_en.md new file mode 100644 index 000000000000..07f51580cf17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_scp_class_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_scp_class_classification DistilBertForSequenceClassification from Azaghast +author: John Snow Labs +name: distilbert_scp_class_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_scp_class_classification` is an English model originally trained by Azaghast. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_scp_class_classification_en_5.2.0_3.0_1700349679198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_scp_class_classification_en_5.2.0_3.0_1700349679198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_scp_class_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_scp_class_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_scp_class_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Azaghast/DistilBERT-SCP-Class-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentico_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentico_finetuned_emotion_en.md new file mode 100644 index 000000000000..10663a6b4514 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentico_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sentico_finetuned_emotion DistilBertForSequenceClassification from RaeesTahir +author: John Snow Labs +name: distilbert_sentico_finetuned_emotion +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentico_finetuned_emotion` is an English model originally trained by RaeesTahir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentico_finetuned_emotion_en_5.2.0_3.0_1700339119737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentico_finetuned_emotion_en_5.2.0_3.0_1700339119737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentico_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentico_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentico_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/RaeesTahir/DistilBERT-Sentico-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentiment_en.md new file mode 100644 index 000000000000..709c7ec8c66b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sentiment DistilBertForSequenceClassification from AbeerAlbashiti +author: John Snow Labs +name: distilbert_sentiment +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentiment` is an English model originally trained by AbeerAlbashiti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_en_5.2.0_3.0_1700350805412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_en_5.2.0_3.0_1700350805412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AbeerAlbashiti/distilbert-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased_en.md new file mode 100644 index 000000000000..7987f2666433 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Cased model (from Wi) +author: John Snow Labs +name: distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arxiv-topics-distilbert-base-cased` is an English model originally trained by `Wi`. 
+ +## Predicted Entities + +`Other`, `Statistics`, `Astrophysics`, `Quantum Physics`, `Nonlinear Sciences`, `Electrical Engineering and Systems Science`, `High Energy Physics - Lattice`, `Quantitative Biology`, `High Energy Physics - Theory`, `Nuclear Theory`, `High Energy Physics - Experiment`, `Condensed Matter`, `Nuclear Experiment`, `High Energy Physics - Phenomenology`, `Mathematics`, `Physics`, `Quantitative Finance`, `Mathematical Physics`, `Economics`, `General Relativity and Quantum Cosmology`, `Computer Science` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased_en_5.2.0_3.0_1700335648926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased_en_5.2.0_3.0_1700335648926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.cased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_arxiv_topics_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Wi/arxiv-topics-distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_doctor_de_24595544_de.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_doctor_de_24595544_de.md new file mode 100644 index 000000000000..ec418a829057 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_doctor_de_24595544_de.md @@ -0,0 +1,99 @@ +--- +layout: model +title: German distilbert_sequence_classifier_autonlp_doctor_de_24595544 DistilBertForSequenceClassification from muhtasham +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_doctor_de_24595544 +date: 2023-11-18 +tags: [bert, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autonlp_doctor_de_24595544` is a German model originally trained by muhtasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_doctor_de_24595544_de_5.2.0_3.0_1700336111326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_doctor_de_24595544_de_5.2.0_3.0_1700336111326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_doctor_de_24595544","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_doctor_de_24595544","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_doctor_de_24595544| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|252.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/muhtasham/autonlp-Doctor_DE-24595544 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_gibberish_detector_492513457_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_gibberish_detector_492513457_en.md new file mode 100644 index 000000000000..86f4361ab854 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_gibberish_detector_492513457_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_autonlp_gibberish_detector_492513457 DistilBertForSequenceClassification from madhurjindal +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_gibberish_detector_492513457 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autonlp_gibberish_detector_492513457` is an English model originally trained by madhurjindal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_gibberish_detector_492513457_en_5.2.0_3.0_1700336292429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_gibberish_detector_492513457_en_5.2.0_3.0_1700336292429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_gibberish_detector_492513457","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_gibberish_detector_492513457","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_gibberish_detector_492513457| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_kaggledays_625717992_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_kaggledays_625717992_en.md new file mode 100644 index 000000000000..5c33a8f32361 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_kaggledays_625717992_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from Someshfengde) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_kaggledays_625717992 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-kaggledays-625717992` is an English model originally trained by `Someshfengde`. 
+ +## Predicted Entities + +`disagreement`, `unbiased`, `association` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_kaggledays_625717992_en_5.2.0_3.0_1700335907555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_kaggledays_625717992_en_5.2.0_3.0_1700335907555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_kaggledays_625717992","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_kaggledays_625717992","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_someshfengde").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_kaggledays_625717992| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Someshfengde/autonlp-kaggledays-625717992 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_ks_530615016_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_ks_530615016_en.md new file mode 100644 index 000000000000..4bf4faa73a75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_ks_530615016_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from bitmorse) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_ks_530615016 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-ks-530615016` is an English model originally trained by `bitmorse`. 
+ +## Predicted Entities + +`live`, `canceled`, `failed`, `successful` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_ks_530615016_en_5.2.0_3.0_1700335865626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_ks_530615016_en_5.2.0_3.0_1700335865626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_ks_530615016","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_ks_530615016","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_bitmorse").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_ks_530615016| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/bitmorse/autonlp-ks-530615016 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_mono_625317956_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_mono_625317956_en.md new file mode 100644 index 000000000000..78f6c314d76a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_mono_625317956_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from Chijioke) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_mono_625317956 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-mono-625317956` is an English model originally trained by `Chijioke`. 
+ +## Predicted Entities + +`food`, `stamp_duties_charges`, `investment`, `bank_charges`, `loan_repayment`, `mature_loan_instalment`, `atm_withdrawal_charges`, `phone_and_internet`, `card_request_commission`, `health`, `salary`, `atm_withdrawal`, `others`, `transfer`, `bills_or_fees`, `miscellaneous`, `vat`, `offline_transactions`, `reversals`, `transportation`, `online_transactions`, `cash_deposit`, `Rent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_mono_625317956_en_5.2.0_3.0_1700335871835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_mono_625317956_en_5.2.0_3.0_1700335871835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_mono_625317956","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_mono_625317956","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_chijioke").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_mono_625317956| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Chijioke/autonlp-mono-625317956 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_song_lyrics_18753423_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_song_lyrics_18753423_en.md new file mode 100644 index 000000000000..6b91232f8adc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_song_lyrics_18753423_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from juliensimon) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_song_lyrics_18753423 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-song-lyrics-18753423` is an English model originally trained by `juliensimon`. 
+ +## Predicted Entities + +`Heavy Metal`, `Dance`, `Rock`, `Pop`, `Indie`, `Hip Hop` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_song_lyrics_18753423_en_5.2.0_3.0_1700336096502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_song_lyrics_18753423_en_5.2.0_3.0_1700336096502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_song_lyrics_18753423","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_song_lyrics_18753423","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_juliensimon").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_song_lyrics_18753423| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/juliensimon/autonlp-song-lyrics-18753423 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_test3_2101787_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_test3_2101787_en.md new file mode 100644 index 000000000000..0e3fb24c86ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_test3_2101787_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from clem) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_test3_2101787 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-test3-2101787` is an English model originally trained by `clem`. 
+ +## Predicted Entities + +`urgent`, `not_urgent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_test3_2101787_en_5.2.0_3.0_1700335868525.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_test3_2101787_en_5.2.0_3.0_1700335868525.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_test3_2101787","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_test3_2101787","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_clem").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_test3_2101787| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/clem/autonlp-test3-2101787 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963_en.md new file mode 100644 index 000000000000..27331e40d993 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963 DistilBertForSequenceClassification from abhishek +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963` is an English model originally trained by abhishek. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963_en_5.2.0_3.0_1700336441739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963_en_5.2.0_3.0_1700336441739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_toxic_nepal_bhasa_30516963| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abhishek/autonlp-toxic-new-30516963 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061_en.md new file mode 100644 index 000000000000..e8db1f175a9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061 DistilBertForSequenceClassification from amansolanki +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061` is an English model originally trained by amansolanki. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061_en_5.2.0_3.0_1700335643291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061_en_5.2.0_3.0_1700335643291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_tweet_sentiment_extraction_20114061| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/amansolanki/autonlp-Tweet-Sentiment-Extraction-20114061 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweets_classification_23044997_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweets_classification_23044997_en.md new file mode 100644 index 000000000000..06d56872f215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autonlp_tweets_classification_23044997_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from Monsia) +author: John Snow Labs +name: distilbert_sequence_classifier_autonlp_tweets_classification_23044997 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-tweets-classification-23044997` is an English model originally trained by `Monsia`. 
+ +## Predicted Entities + +`economic_violence`, `Physical_violence`, `Harmful_Traditional_practice`, `sexual_violence`, `emotional_violence` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_tweets_classification_23044997_en_5.2.0_3.0_1700336295054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autonlp_tweets_classification_23044997_en_5.2.0_3.0_1700336295054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_tweets_classification_23044997","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autonlp_tweets_classification_23044997","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.tweet.by_monsia").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autonlp_tweets_classification_23044997| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Monsia/autonlp-tweets-classification-23044997 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535_en.md new file mode 100644 index 000000000000..eecfc40ba02b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from deepesh0x) +author: John Snow Labs +name: distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-finetunedmodelbert-1034335535` is an English model originally trained by `deepesh0x`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535_en_5.2.0_3.0_1700336111282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535_en_5.2.0_3.0_1700336111282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autotrain_finetunedmodelbert_1034335535| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/deepesh0x/autotrain-finetunedmodelbert-1034335535 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_imdb_1166543171_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_imdb_1166543171_en.md new file mode 100644 index 000000000000..790857a98165 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_imdb_1166543171_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from ameerazam08) +author: John Snow Labs +name: distilbert_sequence_classifier_autotrain_imdb_1166543171 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-imdb-1166543171` is an English model originally trained by `ameerazam08`. 
+ +## Predicted Entities + +`neg`, `pos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_imdb_1166543171_en_5.2.0_3.0_1700336297951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_imdb_1166543171_en_5.2.0_3.0_1700336297951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_imdb_1166543171","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_imdb_1166543171","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.imdb.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autotrain_imdb_1166543171| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ameerazam08/autotrain-imdb-1166543171 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296_en.md new file mode 100644 index 000000000000..bfa29b59870b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from Danitg95) +author: John Snow Labs +name: distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-kaggle-effective-arguments-1086739296` is an English model originally trained by `Danitg95`. 
+ +## Predicted Entities + +`Ineffective`, `Adequate`, `Effective` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296_en_5.2.0_3.0_1700336513539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296_en_5.2.0_3.0_1700336513539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_danitg95").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autotrain_kaggle_effective_arguments_1086739296| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Danitg95/autotrain-kaggle-effective-arguments-1086739296 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_mbtinlp_798824628_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_mbtinlp_798824628_en.md new file mode 100644 index 000000000000..afedadc9808e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_mbtinlp_798824628_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_autotrain_mbtinlp_798824628 DistilBertForSequenceClassification from Sathira +author: John Snow Labs +name: distilbert_sequence_classifier_autotrain_mbtinlp_798824628 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autotrain_mbtinlp_798824628` is an English model originally trained by Sathira. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_mbtinlp_798824628_en_5.2.0_3.0_1700336135789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_mbtinlp_798824628_en_5.2.0_3.0_1700336135789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_mbtinlp_798824628","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_mbtinlp_798824628","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autotrain_mbtinlp_798824628| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/Sathira/autotrain-mbtiNlp-798824628 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_online_orders_755323156_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_online_orders_755323156_en.md new file mode 100644 index 000000000000..ec018331fcf0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_autotrain_online_orders_755323156_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_autotrain_online_orders_755323156 DistilBertForSequenceClassification from xInsignia +author: John Snow Labs +name: distilbert_sequence_classifier_autotrain_online_orders_755323156 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_autotrain_online_orders_755323156` is an English model originally trained by xInsignia. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_online_orders_755323156_en_5.2.0_3.0_1700336106277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_autotrain_online_orders_755323156_en_5.2.0_3.0_1700336106277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_online_orders_755323156","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_autotrain_online_orders_755323156","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_autotrain_online_orders_755323156| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/xInsignia/autotrain-Online_orders-755323156 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bert_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bert_base_uncased_emotion_en.md new file mode 100644 index 000000000000..edb55c4ab7dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bert_base_uncased_emotion_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from vamossyd) +author: John Snow Labs +name: distilbert_sequence_classifier_bert_base_uncased_emotion +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert-base-uncased-emotion` is an English model originally trained by `vamossyd`. 
+ +## Predicted Entities + +`Anger`, `Sad`, `Happy`, `Disgust`, `Surprise`, `Neutral`, `Fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_bert_base_uncased_emotion_en_5.2.0_3.0_1700336551833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_bert_base_uncased_emotion_en_5.2.0_3.0_1700336551833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_bert_base_uncased_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_bert_base_uncased_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.emotion.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_bert_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|263.4 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/vamossyd/bert-base-uncased-emotion +- https://github.com/sarnthil/unify-emotion-datasets +- https://github.com/dvamossy/EmTract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion_en.md new file mode 100644 index 000000000000..13606c407c66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from bhadresh-savani) +author: John Snow Labs +name: distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-emotion` is an English model originally trained by `bhadresh-savani`. 
+ +## Predicted Entities + +`sadness`, `anger`, `love`, `surprise`, `joy`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700336301439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700336301439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.emotion.uncased_base.by_bhadresh_savani").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_bhadresh_savani_distilbert_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/bhadresh-savani/distilbert-base-uncased-emotion +- https://arxiv.org/abs/1910.01108 +- https://github.com/bhadreshpsavani/ExploringSentimentalAnalysis/blob/main/SentimentalAnalysisWithDistilbert.ipynb +- https://learning.oreilly.com/library/view/natural-language-processing/9781098103231/ +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased_zh.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased_zh.md new file mode 100644 index 000000000000..63d94aa8f066 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased DistilBertForSequenceClassification from liam168 +author: John Snow Labs +name: distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased +date: 2023-11-18 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and 
production-readiness using Spark NLP. `distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased` is a Chinese model originally trained by liam168. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased_zh_5.2.0_3.0_1700336694550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased_zh_5.2.0_3.0_1700336694550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_c4_chinese_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|249.5 MB| + +## References + +https://huggingface.co/liam168/c4-zh-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici_it.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici_it.md new file mode 100644 index 000000000000..bf95c788655d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici_it.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Italian distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici DistilBertForSequenceClassification from efederici +author: John Snow Labs +name: distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici +date: 2023-11-18 +tags: [bert, it, open_source, sequence_classification, onnx] +task: Text Classification +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici` is an Italian model originally trained by efederici. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici_it_5.2.0_3.0_1700336662100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici_it_5.2.0_3.0_1700336662100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici","it")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici","it") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_cross_encoder_distilbert_italian_efederici| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|it| +|Size:|236.4 MB| + +## References + +https://huggingface.co/efederici/cross-encoder-distilbert-it \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base_de.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base_de.md new file mode 100644 index 000000000000..9e391a7bc497 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base_de.md @@ -0,0 +1,108 @@ +--- +layout: model +title: German DistilBertForSequenceClassification Base Cased model (from ml6team) +author: John Snow Labs +name: distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, de, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross-encoder-mmarco-german-distilbert-base` is a German model originally trained by `ml6team`. 
+ +## Predicted Entities + +`LABEL_0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base_de_5.2.0_3.0_1700336786891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base_de_5.2.0_3.0_1700336786891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.distil_bert.base.by_ml6team").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_cross_encoder_mmarco_german_distilbert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|507.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/cross-encoder-mmarco-german-distilbert-base +- https://www.sbert.net/ +- https://www.sbert.net/examples/training/cross-encoder/README.html \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_cased_trec_coarse_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_cased_trec_coarse_en.md new file mode 100644 index 000000000000..4d1b18812703 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_cased_trec_coarse_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Cased model (from aychang) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_cased_trec_coarse +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-trec-coarse` is an English model originally trained by `aychang`. 
+ +## Predicted Entities + +`ABBR`, `HUM`, `NUM`, `ENTY`, `LOC`, `DESC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_cased_trec_coarse_en_5.2.0_3.0_1700337065225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_cased_trec_coarse_en_5.2.0_3.0_1700337065225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_cased_trec_coarse","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_cased_trec_coarse","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.cased_base.by_aychang").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_cased_trec_coarse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/aychang/distilbert-base-cased-trec-coarse \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments_nl.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments_nl.md new file mode 100644 index 000000000000..b39569c2e151 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch DistilBertForSequenceClassification Base Cased model (from ml6team) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, nl, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-dutch-cased-toxic-comments` is a Dutch model originally trained by `ml6team`. 
+ +## Predicted Entities + +`toxic`, `non-toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments_nl_5.2.0_3.0_1700336518467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments_nl_5.2.0_3.0_1700336518467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.distil_bert.cased_base").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_dutch_cased_toxic_comments| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|507.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/distilbert-base-dutch-cased-toxic-comments +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments_de.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments_de.md new file mode 100644 index 000000000000..9bc12e92e25f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments_de.md @@ -0,0 +1,112 @@ +--- +layout: model +title: German DistilBertForSequenceClassification Base Cased model (from ml6team) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, de, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-german-cased-toxic-comments` is a German model originally trained by `ml6team`. 
+ +## Predicted Entities + +`toxic`, `non_toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments_de_5.2.0_3.0_1700336728905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments_de_5.2.0_3.0_1700336728905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments","de") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Ich liebe Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments","de") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Ich liebe Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("de.classify.distil_bert.cased_base").predict("""Ich liebe Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_german_cased_toxic_comments| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|252.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/distilbert-base-german-cased-toxic-comments +- http://ub-web.de/research/ +- https://github.com/uds-lsv/GermEval-2018-Data +- https://arxiv.org/pdf/1701.08118.pdf +- https://github.com/UCSM-DUE/IWG_hatespeech_public +- https://hasocfire.github.io/hasoc/2019/index.html +- https://github.com/germeval2021toxic/SharedTask/tree/main/Data%20Sets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion_tr.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion_tr.md new file mode 100644 index 000000000000..e6f374026eb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion_tr.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Turkish DistilBertForSequenceClassification Base Cased model (from zafercavdar) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, tr, onnx] +task: Text Classification +language: tr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`distilbert-base-turkish-cased-emotion` is a Turkish model originally trained by `zafercavdar`. + +## Predicted Entities + +`sadness`, `anger`, `love`, `surprise`, `joy`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion_tr_5.2.0_3.0_1700336924403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion_tr_5.2.0_3.0_1700336924403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion","tr") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["Spark NLP'yi seviyorum"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion","tr") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("Spark NLP'yi seviyorum").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("tr.classify.distil_bert.cased_base").predict("""Spark NLP'yi seviyorum""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_turkish_cased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tr| +|Size:|254.1 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/zafercavdar/distilbert-base-turkish-cased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_emotion_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_emotion_2_en.md new file mode 100644 index 000000000000..69dc3b1fe976 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_emotion_2_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from nanopass) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_emotion_2 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-emotion-2` is an English model originally trained by `nanopass`. 
+ +## Predicted Entities + +`sadness`, `anger`, `love`, `surprise`, `joy`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_emotion_2_en_5.2.0_3.0_1700337114064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_emotion_2_en_5.2.0_3.0_1700337114064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_emotion_2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_emotion_2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.emotion.uncased_base.by_nanopass").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_emotion_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/nanopass/distilbert-base-uncased-emotion-2 +- https://arxiv.org/abs/1910.01108 +- https://github.com/bhadreshpsavani/ExploringSentimentalAnalysis/blob/main/SentimentalAnalysisWithDistilbert.ipynb +- https://learning.oreilly.com/library/view/natural-language-processing/9781098103231/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news_en.md new file mode 100644 index 000000000000..41bad2ddc197 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from Datasaur) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. 
`distilbert-base-uncased-finetuned-ag-news` is an English model originally trained by `Datasaur`. + +## Predicted Entities + +`World`, `Sci/Tech`, `Sports`, `Business` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news_en_5.2.0_3.0_1700336977028.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news_en_5.2.0_3.0_1700336977028.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.news.uncased_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_finetuned_ag_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Datasaur/distilbert-base-uncased-finetuned-ag-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app_en.md new file mode 100644 index 000000000000..889f758d2621 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from nsi319) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-app` is an English model originally trained by `nsi319`. 
+ +## Predicted Entities + +`News & Magazines`, `Entertainment`, `Productivity`, `Photography`, `Sports`, `Education` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app_en_5.2.0_3.0_1700337465363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app_en_5.2.0_3.0_1700337465363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.uncased_base_finetuned.by_nsi319").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_finetuned_app| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/nsi319/distilbert-base-uncased-finetuned-app +- https://play.google.com/store/apps \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets_en.md new file mode 100644 index 000000000000..dea4025631ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets DistilBertForSequenceClassification from suvrobaner +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets` is an English model originally trained by suvrobaner. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets_en_5.2.0_3.0_1700337098715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets_en_5.2.0_3.0_1700337098715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_finetuned_emotion_english_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/suvrobaner/distilbert-base-uncased-finetuned-emotion-en-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student_en.md new file mode 100644 index 000000000000..269e701b7773 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from joeddav) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-go-emotions-student` is an English model originally trained by `joeddav`. 
+ +## Predicted Entities + +`sadness`, `nervousness`, `disapproval`, `love`, `fear`, `grief`, `admiration`, `caring`, `curiosity`, `realization`, `anger`, `gratitude`, `optimism`, `approval`, `surprise`, `joy`, `embarrassment`, `remorse`, `pride`, `disappointment`, `relief`, `desire`, `amusement`, `annoyance`, `confusion`, `disgust`, `excitement`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student_en_5.2.0_3.0_1700337659806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student_en_5.2.0_3.0_1700337659806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.go_emotions.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_go_emotions_student| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/joeddav/distilbert-base-uncased-go-emotions-student \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_if_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_if_en.md new file mode 100644 index 000000000000..4802ed21bf18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_if_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from Aureliano) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_if +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-if` is an English model originally trained by `Aureliano`. 
+ +## Predicted Entities + +`charge.v.17`, `kill.v.01`, `put.v.01`, `switch_off.v.01`, `ask.v.01`, `dig.v.01`, `search.v.04`, `repeat.v.01`, `wear.v.02`, `play.v.03`, `ask.v.02`, `wait.v.01`, `smash.v.02`, `clean.v.01`, `drink.v.01`, `inventory.v.01`, `climb.v.01`, `close.v.01`, `set.v.05`, `hit.v.03`, `remove.v.01`, `hit.v.02`, `sit_down.v.01`, `memorize.v.01`, `stand.v.03`, `write.v.07`, `insert.v.01`, `light_up.v.05`, `show.v.01`, `travel.v.01`, `listen.v.01`, `sequence.n.02`, `brandish.v.01`, `take_off.v.06`, `wake_up.v.02`, `connect.v.01`, `say.v.08`, `burn.v.01`, `talk.v.02`, `turn.v.09`, `smell.v.01`, `pull.v.04`, `move.v.02`, `shoot.v.01`, `press.v.01`, `exit.v.01`, `take.v.04`, `examine.v.02`, `read.v.01`, `follow.v.01`, `jump.v.01`, `rub.v.01`, `throw.v.01`, `answer.v.01`, `shake.v.01`, `drive.v.01`, `buy.v.01`, `eat.v.01`, `open.v.01`, `break.v.05`, `note.v.04`, `sleep.v.01`, `drop.v.01`, `blow.v.01`, `fill.v.01`, `choose.v.01`, `enter.v.01`, `pray.v.01`, `skid.v.04`, `lower.v.01`, `lie_down.v.01`, `cut.v.01`, `look.v.01`, `unlock.v.01`, `give.v.03`, `tell.v.03`, `unknown`, `switch_on.v.01`, `consult.v.02`, `raise.v.02`, `insert.v.02`, `pour.v.01`, `touch.v.01`, `push.v.01` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_if_en_5.2.0_3.0_1700336513918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_if_en_5.2.0_3.0_1700336513918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_if","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_if","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_if| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Aureliano/distilbert-base-uncased-if +- https://rasa.com/docs/rasa/components#languagemodelfeaturizer +- https://github.com/aporporato/jericho-corpora \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews_en.md new file mode 100644 index 000000000000..6cdbbc216bee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from andi611) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-ner-agnews` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + +`World`, `Sci/Tech`, `Sports`, `Business` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews_en_5.2.0_3.0_1700337076999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews_en_5.2.0_3.0_1700337076999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.news.uncased_base.by_andi611").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_ner_agnews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-ner-agnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq_en.md new file mode 100644 index 000000000000..212884d07c6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from andi611) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-qa-boolq` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + +`False`, `True` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq_en_5.2.0_3.0_1700337311749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq_en_5.2.0_3.0_1700337311749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.uncased_base.by_andi611").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_qa_boolq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-qa-boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2_en.md new file mode 100644 index 000000000000..67c2ecf9205f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from bhadresh-savani) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2 +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-sentiment-sst2` is an English model originally trained by `bhadresh-savani`.
+ +## Predicted Entities + +`POSITIVE`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2_en_5.2.0_3.0_1700336730260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2_en_5.2.0_3.0_1700336730260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.sentiment.uncased_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_base_uncased_sentiment_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/bhadresh-savani/distilbert-base-uncased-sentiment-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_multiclass_textclassification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_multiclass_textclassification_en.md new file mode 100644 index 000000000000..6839aa01b851 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_multiclass_textclassification_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_distilbert_multiclass_textclassification DistilBertForSequenceClassification from palakagl +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_multiclass_textclassification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_distilbert_multiclass_textclassification` is an English model originally trained by palakagl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_multiclass_textclassification_en_5.2.0_3.0_1700336872203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_multiclass_textclassification_en_5.2.0_3.0_1700336872203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_multiclass_textclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_multiclass_textclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_multiclass_textclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/palakagl/distilbert_MultiClass_TextClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_political_tweets_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_political_tweets_en.md new file mode 100644 index 000000000000..7b0a2d72a5c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_political_tweets_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from m-newhauser) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_political_tweets +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-political-tweets` is an English model originally trained by `m-newhauser`.
+ +## Predicted Entities + +`Republican`, `Democrat` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_political_tweets_en_5.2.0_3.0_1700337280861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_political_tweets_en_5.2.0_3.0_1700337280861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_political_tweets","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_political_tweets","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.tweet.by_m_newhauser").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_political_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/m-newhauser/distilbert-political-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_quality_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_quality_en.md new file mode 100644 index 000000000000..56a032a41493 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_quality_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_quality +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-quality` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`medium`, `bad`, `good` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_quality_en_5.2.0_3.0_1700337472329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_quality_en_5.2.0_3.0_1700337472329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_quality","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_quality","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_quality| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/valurank/distilbert-quality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_tweet_eval_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_tweet_eval_emotion_en.md new file mode 100644 index 000000000000..c99416b3704c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_distilbert_tweet_eval_emotion_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_distilbert_tweet_eval_emotion DistilBertForSequenceClassification from philschmid +author: John Snow Labs +name: distilbert_sequence_classifier_distilbert_tweet_eval_emotion +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_distilbert_tweet_eval_emotion` is an English model originally trained by philschmid.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_tweet_eval_emotion_en_5.2.0_3.0_1700335874314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_distilbert_tweet_eval_emotion_en_5.2.0_3.0_1700335874314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_tweet_eval_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_distilbert_tweet_eval_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_distilbert_tweet_eval_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/philschmid/DistilBERT-tweet-eval-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_finetuned_distilbert_needmining_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_finetuned_distilbert_needmining_en.md new file mode 100644 index 000000000000..1b4e36e4253e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_finetuned_distilbert_needmining_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Mini Cased model (from svenstahlmann) +author: John Snow Labs +name: distilbert_sequence_classifier_finetuned_distilbert_needmining +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned-distilbert-needmining` is an English model originally trained by `svenstahlmann`.
+ +## Predicted Entities + +`no need`, `contains need` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_finetuned_distilbert_needmining_en_5.2.0_3.0_1700337312220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_finetuned_distilbert_needmining_en_5.2.0_3.0_1700337312220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_finetuned_distilbert_needmining","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_finetuned_distilbert_needmining","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.finetuned.by_svenstahlmann").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_finetuned_distilbert_needmining| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/svenstahlmann/finetuned-distilbert-needmining +- https://www.deepmind.com/blog/population-based-training-of-neural-networks \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_icelandic_legit_kwd_march_27_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_icelandic_legit_kwd_march_27_en.md new file mode 100644 index 000000000000..f4fb580d2612 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_icelandic_legit_kwd_march_27_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sequence_classifier_icelandic_legit_kwd_march_27 DistilBertForSequenceClassification from world-wide +author: John Snow Labs +name: distilbert_sequence_classifier_icelandic_legit_kwd_march_27 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_icelandic_legit_kwd_march_27` is an English model originally trained by world-wide.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_icelandic_legit_kwd_march_27_en_5.2.0_3.0_1700337456292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_icelandic_legit_kwd_march_27_en_5.2.0_3.0_1700337456292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_icelandic_legit_kwd_march_27","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_icelandic_legit_kwd_march_27","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_icelandic_legit_kwd_march_27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/world-wide/is-legit-kwd-march-27 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_industry_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_industry_classification_en.md new file mode 100644 index 000000000000..ee605e31ddf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_industry_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from sampathkethineedi) +author: John Snow Labs +name: distilbert_sequence_classifier_industry_classification +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `industry-classification` is an English model originally trained by `sampathkethineedi`.
+ +## Predicted Entities + +`Semiconductors`, `Data Processing & Outsourced Services`, `Oil & Gas Exploration & Production`, `Industrial Machinery`, `Technology Distributors`, `Apparel Retail`, `Application Software`, `Research & Consulting Services`, `Specialty Stores`, `Diversified Support Services`, `Gold`, `Human Resource & Employment Services`, `Interactive Media & Services`, `Internet & Direct Marketing Retail`, `Auto Parts & Equipment`, `Building Products`, `Personal Products`, `Communications Equipment`, `Electronic Equipment & Instruments`, `Regional Banks`, `Systems Software`, `Health Care Services`, `Health Care Supplies`, `Asset Management & Custody Banks`, `Aerospace & Defense`, `Specialty Chemicals`, `Life Sciences Tools & Services`, `Electric Utilities`, `Commodity Chemicals`, `Health Care Equipment`, `Construction Machinery & Heavy Trucks`, `Environmental & Facilities Services`, `Oil & Gas Equipment & Services`, `Oil & Gas Refining & Marketing`, `Casinos & Gaming`, `Diversified Metals & Mining`, `Property & Casualty Insurance`, `IT Consulting & Other Services`, `Leisure Products`, `Pharmaceuticals`, `Movies & Entertainment`, `Restaurants`, `Steel`, `Thrifts & Mortgage Finance`, `Health Care Facilities`, `Oil & Gas Storage & Transportation`, `Internet Services & Infrastructure`, `Health Care Technology`, `Packaged Foods & Meats`, `Integrated Telecommunication Services`, `Consumer Finance`, `Investment Banking & Brokerage`, `Electrical Components & Equipment`, `Trading Companies & Distributors`, `Construction & Engineering`, `Advertising`, `Homebuilding`, `Biotechnology`, `Real Estate Operating Companies` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_industry_classification_en_5.2.0_3.0_1700337671470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 
URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_industry_classification_en_5.2.0_3.0_1700337671470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_industry_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_industry_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_sampathkethineedi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_industry_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.6 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sampathkethineedi/industry-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_ma_mlc_v7_distil_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_ma_mlc_v7_distil_en.md new file mode 100644 index 000000000000..1aa58d6de366 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_ma_mlc_v7_distil_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from CouchCat) +author: John Snow Labs +name: distilbert_sequence_classifier_ma_mlc_v7_distil +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ma_mlc_v7_distil` is an English model originally trained by `CouchCat`. 
+ +## Predicted Entities + +`delivery`, `monetary`, `product`, `return` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_ma_mlc_v7_distil_en_5.2.0_3.0_1700336921868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_ma_mlc_v7_distil_en_5.2.0_3.0_1700336921868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_ma_mlc_v7_distil","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_ma_mlc_v7_distil","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.ma_mlc.by_couchcat").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_ma_mlc_v7_distil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/CouchCat/ma_mlc_v7_distil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_markingmulticlass_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_markingmulticlass_en.md new file mode 100644 index 000000000000..4607917dd57b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_markingmulticlass_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_markingmulticlass DistilBertForSequenceClassification from FuriouslyAsleep +author: John Snow Labs +name: distilbert_sequence_classifier_markingmulticlass +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_markingmulticlass` is an English model originally trained by FuriouslyAsleep. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_markingmulticlass_en_5.2.0_3.0_1700337488278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_markingmulticlass_en_5.2.0_3.0_1700337488278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_markingmulticlass","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_markingmulticlass","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_markingmulticlass| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/FuriouslyAsleep/markingMultiClass \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion_en.md new file mode 100644 index 000000000000..efcbe6721aff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Base Uncased model (from philschmid) +author: John Snow Labs +name: distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-emotion` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`sadness`, `anger`, `love`, `surprise`, `joy`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700337661650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700337661650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.emotion.uncased_base.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_philschmid_distilbert_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|false| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/philschmid/distilbert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_policy_distilbert_7d_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_policy_distilbert_7d_en.md new file mode 100644 index 000000000000..8cdab7d82dad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_policy_distilbert_7d_en.md @@ -0,0 +1,111 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from MoritzLaurer) +author: John Snow Labs +name: distilbert_sequence_classifier_policy_distilbert_7d +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `policy-distilbert-7d` is an English model originally trained by `MoritzLaurer`. 
+ +## Predicted Entities + +`economy`, `political system`, `welfare and quality of life`, `fabric of society`, `external relations`, `freedom and democracy`, `social groups` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_policy_distilbert_7d_en_5.2.0_3.0_1700337994808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_policy_distilbert_7d_en_5.2.0_3.0_1700337994808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_policy_distilbert_7d","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_policy_distilbert_7d","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_moritzlaurer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_policy_distilbert_7d| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/MoritzLaurer/policy-distilbert-7d +- https://manifesto-project.wzb.eu/down/data/2020b/codebooks/codebook_MPDataset_MPDS2020b.pdf +- https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html +- https://manifesto-project.wzb.eu/information/documents/information +- https://manifesto-project.wzb.eu/datasets +- https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentiment_analysis_sbcbi_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentiment_analysis_sbcbi_en.md new file mode 100644 index 000000000000..4d20902a57c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentiment_analysis_sbcbi_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_sentiment_analysis_sbcbi DistilBertForSequenceClassification from CK42 +author: John Snow Labs +name: distilbert_sequence_classifier_sentiment_analysis_sbcbi +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and 
production-readiness using Spark NLP. `distilbert_sequence_classifier_sentiment_analysis_sbcbi` is an English model originally trained by CK42. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_sentiment_analysis_sbcbi_en_5.2.0_3.0_1700337856182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_sentiment_analysis_sbcbi_en_5.2.0_3.0_1700337856182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_sentiment_analysis_sbcbi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_sentiment_analysis_sbcbi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_sentiment_analysis_sbcbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/CK42/sentiment_analysis_sbcBI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentimentanalysisdistillbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentimentanalysisdistillbert_en.md new file mode 100644 index 000000000000..950dab8fc922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_sentimentanalysisdistillbert_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_sentimentanalysisdistillbert DistilBertForSequenceClassification from Souvikcmsa +author: John Snow Labs +name: distilbert_sequence_classifier_sentimentanalysisdistillbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_sentimentanalysisdistillbert` is an English model originally trained by Souvikcmsa. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_sentimentanalysisdistillbert_en_5.2.0_3.0_1700335650437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_sentimentanalysisdistillbert_en_5.2.0_3.0_1700335650437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_sentimentanalysisdistillbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_sentimentanalysisdistillbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_sentimentanalysisdistillbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/Souvikcmsa/SentimentAnalysisDistillBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_state_op_detector_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_state_op_detector_en.md new file mode 100644 index 000000000000..a676bb25def3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_state_op_detector_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from lingwave-admin) +author: John Snow Labs +name: distilbert_sequence_classifier_state_op_detector +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `state-op-detector` is an English model originally trained by `lingwave-admin`. 
+ +## Predicted Entities + +`State Operator`, `Normal User` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_state_op_detector_en_5.2.0_3.0_1700338191269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_state_op_detector_en_5.2.0_3.0_1700338191269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_state_op_detector","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_state_op_detector","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_lingwave_admin").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_state_op_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/lingwave-admin/state-op-detector +- https://www.nbcnews.com/news/latino/russia-disinformation-ukraine-spreading-spanish-speaking-media-rcna22843 +- https://github.com/curt-tigges/state-social-operator-detection +- https://www.brookings.edu/techstream/china-and-russia-are-joining-forces-to-spread-disinformation/ +- https://journals.sagepub.com/doi/10.1177/19401612221082052 +- https://www.brennancenter.org/our-work/analysis-opinion/new-evidence-shows-how-russias-election-interference-has-gotten-more +- https://www.lawfareblog.com/understanding-pro-china-propaganda-and-disinformation-tool-set-xinjiang +- https://www.bbc.com/news/56364952 +- https://www.forbes.com/sites/petersuciu/2022/03/10/russian-sock-puppets-spreading-misinformation-on-social-media-about-ukraine/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_testing_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_testing_en.md new file mode 100644 index 000000000000..a179214517b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_testing_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from LysandreJik) +author: John Snow Labs +name: distilbert_sequence_classifier_testing +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification 
+article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testing` is an English model originally trained by `LysandreJik`. + +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_testing_en_5.2.0_3.0_1700338368306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_testing_en_5.2.0_3.0_1700338368306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_testing","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_testing","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.glue.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_testing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/LysandreJik/testing +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_toxic_comment_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_toxic_comment_model_en.md new file mode 100644 index 000000000000..f26a3af3777c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_toxic_comment_model_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from martin-ha) +author: John Snow Labs +name: distilbert_sequence_classifier_toxic_comment_model +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic-comment-model` is an English model originally trained by `martin-ha`. 
+ +## Predicted Entities + +`toxic`, `non-toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_toxic_comment_model_en_5.2.0_3.0_1700337852196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_toxic_comment_model_en_5.2.0_3.0_1700337852196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_toxic_comment_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_toxic_comment_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.by_martin_ha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_toxic_comment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/martin-ha/toxic-comment-model +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/overview/evaluation +- https://github.com/MSIA/wenyang_pan_nlp_project_2021 +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_tweet_disaster_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_tweet_disaster_classifier_en.md new file mode 100644 index 000000000000..accd86ad11ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_tweet_disaster_classifier_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English DistilBertForSequenceClassification Cased model (from bgoel4132) +author: John Snow Labs +name: distilbert_sequence_classifier_tweet_disaster_classifier +date: 2023-11-18 +tags: [distilbert, sequence_classification, open_source, en, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet-disaster-classifier` is an English model originally trained by `bgoel4132`. 
+ +## Predicted Entities + +`cyclone`, `earthquake`, `medical`, `hurricane`, `fire`, `typhoon`, `flood`, `pollution`, `accident`, `tornado`, `volcano`, `explosion`, `other` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_tweet_disaster_classifier_en_5.2.0_3.0_1700338554336.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_tweet_disaster_classifier_en_5.2.0_3.0_1700338554336.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_tweet_disaster_classifier","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer,sequenceClassifier_loaded]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols(Array("document")) + .setOutputCol("token") + +val sequenceClassifier_loaded = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_tweet_disaster_classifier","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer,sequenceClassifier_loaded)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_bert.tweet.by_bgoel4132").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_tweet_disaster_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/bgoel4132/tweet-disaster-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_wiki_complexity_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_wiki_complexity_en.md new file mode 100644 index 000000000000..10890b98dedc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sequence_classifier_wiki_complexity_en.md @@ -0,0 +1,99 @@ +--- +layout: model +title: English distilbert_sequence_classifier_wiki_complexity DistilBertForSequenceClassification from hidude562 +author: John Snow Labs +name: distilbert_sequence_classifier_wiki_complexity +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sequence_classifier_wiki_complexity` is an English model originally trained by hidude562. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_wiki_complexity_en_5.2.0_3.0_1700335650388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sequence_classifier_wiki_complexity_en_5.2.0_3.0_1700335650388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_wiki_complexity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sequence_classifier_wiki_complexity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sequence_classifier_wiki_complexity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +https://huggingface.co/hidude562/Wiki-Complexity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sexism_detector_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sexism_detector_en.md new file mode 100644 index 000000000000..d1a9534a6b71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sexism_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sexism_detector DistilBertForSequenceClassification from NLP-LTU +author: John Snow Labs +name: distilbert_sexism_detector +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sexism_detector` is an English model originally trained by NLP-LTU. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sexism_detector_en_5.2.0_3.0_1700342992722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sexism_detector_en_5.2.0_3.0_1700342992722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sexism_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sexism_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sexism_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/NLP-LTU/distilbert-sexism-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_spamemail_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_spamemail_en.md new file mode 100644 index 000000000000..e6f44ba64b39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_spamemail_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_spamemail DistilBertForSequenceClassification from tony4194 +author: John Snow Labs +name: distilbert_spamemail +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_spamemail` is an English model originally trained by tony4194. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_spamemail_en_5.2.0_3.0_1700341042734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_spamemail_en_5.2.0_3.0_1700341042734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_spamemail","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_spamemail","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_spamemail| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tony4194/distilbert-spamEmail \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_sst2_mahtab_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sst2_mahtab_en.md new file mode 100644 index 000000000000..5aef2b8522e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_sst2_mahtab_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sst2_mahtab DistilBertForSequenceClassification from Motahar +author: John Snow Labs +name: distilbert_sst2_mahtab +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sst2_mahtab` is an English model originally trained by Motahar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sst2_mahtab_en_5.2.0_3.0_1700350168719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sst2_mahtab_en_5.2.0_3.0_1700350168719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_mahtab","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_mahtab","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sst2_mahtab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Motahar/distilbert-sst2-mahtab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_en.md new file mode 100644 index 000000000000..f02efa9f29d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_svd DistilBertForSequenceClassification from Manivarsh +author: John Snow Labs +name: distilbert_svd +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_svd` is an English model originally trained by Manivarsh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_svd_en_5.2.0_3.0_1700342814509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_svd_en_5.2.0_3.0_1700342814509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_svd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_svd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_svd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Manivarsh/DistilBERT_SVD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_hypo2_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_hypo2_en.md new file mode 100644 index 000000000000..963a9e9b1763 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_svd_hypo2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_svd_hypo2 DistilBertForSequenceClassification from Manivarsh +author: John Snow Labs +name: distilbert_svd_hypo2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_svd_hypo2` is an English model originally trained by Manivarsh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_svd_hypo2_en_5.2.0_3.0_1700346278967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_svd_hypo2_en_5.2.0_3.0_1700346278967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_svd_hypo2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_svd_hypo2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_svd_hypo2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Manivarsh/DistilBERT_SVD_Hypo2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_toxicity_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_toxicity_classifier_en.md new file mode 100644 index 000000000000..939366153a3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_toxicity_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_toxicity_classifier DistilBertForSequenceClassification from tensor-trek +author: John Snow Labs +name: distilbert_toxicity_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_toxicity_classifier` is an English model originally trained by tensor-trek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_toxicity_classifier_en_5.2.0_3.0_1700340262256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_toxicity_classifier_en_5.2.0_3.0_1700340262256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxicity_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxicity_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_toxicity_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/tensor-trek/distilbert-toxicity-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_undersampled_noweights_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_undersampled_noweights_en.md new file mode 100644 index 000000000000..da1f1c069db9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_undersampled_noweights_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_undersampled_noweights DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilbert_undersampled_noweights +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_undersampled_noweights` is an English model originally trained by Kayvane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_undersampled_noweights_en_5.2.0_3.0_1700350799993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_undersampled_noweights_en_5.2.0_3.0_1700350799993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_undersampled_noweights","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_undersampled_noweights","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_undersampled_noweights| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/Kayvane/distilbert-undersampled-noweights \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_xnli_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_xnli_en.md new file mode 100644 index 000000000000..b319dfa62325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_xnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_xnli DistilBertForSequenceClassification from regisss +author: John Snow Labs +name: distilbert_xnli +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_xnli` is an English model originally trained by regisss. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_xnli_en_5.2.0_3.0_1700349113902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_xnli_en_5.2.0_3.0_1700349113902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_xnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_xnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_xnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/regisss/distilbert_xnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilbert_yes_no_other_intent_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilbert_yes_no_other_intent_en.md new file mode 100644 index 000000000000..f51d42d7e236 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilbert_yes_no_other_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_yes_no_other_intent DistilBertForSequenceClassification from sachin19566 +author: John Snow Labs +name: distilbert_yes_no_other_intent +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_yes_no_other_intent` is an English model originally trained by sachin19566. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_yes_no_other_intent_en_5.2.0_3.0_1700343924835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_yes_no_other_intent_en_5.2.0_3.0_1700343924835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_yes_no_other_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_yes_no_other_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_yes_no_other_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sachin19566/distilbert_Yes_No_Other_Intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distillbert_base_uncased_80_equal_en.md b/docs/_posts/ahmedlone127/2023-11-18-distillbert_base_uncased_80_equal_en.md new file mode 100644 index 000000000000..3569715899d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distillbert_base_uncased_80_equal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_base_uncased_80_equal DistilBertForSequenceClassification from Tejas3 +author: John Snow Labs +name: distillbert_base_uncased_80_equal +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_80_equal` is an English model originally trained by Tejas3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_equal_en_5.2.0_3.0_1700340778325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_equal_en_5.2.0_3.0_1700340778325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80_equal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80_equal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_80_equal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tejas3/distillbert_base_uncased_80_equal \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distillbert_complaints_en.md b/docs/_posts/ahmedlone127/2023-11-18-distillbert_complaints_en.md new file mode 100644 index 000000000000..41ed1704867e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distillbert_complaints_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_complaints DistilBertForSequenceClassification from Davegd +author: John Snow Labs +name: distillbert_complaints +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_complaints` is an English model originally trained by Davegd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_complaints_en_5.2.0_3.0_1700351528528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_complaints_en_5.2.0_3.0_1700351528528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_complaints","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_complaints","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_complaints| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Davegd/distillbert_complaints \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distillbert_conv_quality_score_en.md b/docs/_posts/ahmedlone127/2023-11-18-distillbert_conv_quality_score_en.md new file mode 100644 index 000000000000..b1c96c63f122 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distillbert_conv_quality_score_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_conv_quality_score DistilBertForSequenceClassification from alespalla +author: John Snow Labs +name: distillbert_conv_quality_score +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_conv_quality_score` is an English model originally trained by alespalla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_conv_quality_score_en_5.2.0_3.0_1700345900617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_conv_quality_score_en_5.2.0_3.0_1700345900617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_conv_quality_score","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_conv_quality_score","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_conv_quality_score| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/alespalla/distillbert_conv_quality_score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distillbert_misinformation_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-distillbert_misinformation_classifier_en.md new file mode 100644 index 000000000000..90cac7e47b1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distillbert_misinformation_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_misinformation_classifier DistilBertForSequenceClassification from FriedGil +author: John Snow Labs +name: distillbert_misinformation_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_misinformation_classifier` is an English model originally trained by FriedGil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_misinformation_classifier_en_5.2.0_3.0_1700342993297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_misinformation_classifier_en_5.2.0_3.0_1700342993297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_misinformation_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_misinformation_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_misinformation_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/FriedGil/distillBERT-misinformation-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear_en.md new file mode 100644 index 000000000000..66bce6589974 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear DistilBertForSequenceClassification from mmillet +author: John Snow Labs +name: distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear` is an English model originally trained by mmillet.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear_en_5.2.0_3.0_1700340767180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear_en_5.2.0_3.0_1700340767180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilrubert_tiny_cased_conversational_v1_finetuned_emotion_experiment_augmented_anger_fear| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.2 MB| + +## References + +https://huggingface.co/mmillet/distilrubert-tiny-cased-conversational-v1_finetuned_emotion_experiment_augmented_anger_fear \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-distilvert_complaints_subproduct_en.md b/docs/_posts/ahmedlone127/2023-11-18-distilvert_complaints_subproduct_en.md new file mode 100644 index 000000000000..6453bb230879 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-distilvert_complaints_subproduct_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilvert_complaints_subproduct DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilvert_complaints_subproduct +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilvert_complaints_subproduct` is an English model originally trained by Kayvane.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilvert_complaints_subproduct_en_5.2.0_3.0_1700351931650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilvert_complaints_subproduct_en_5.2.0_3.0_1700351931650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilvert_complaints_subproduct","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilvert_complaints_subproduct","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilvert_complaints_subproduct| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/Kayvane/distilvert-complaints-subproduct \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-ditilbert_spamemail_en.md b/docs/_posts/ahmedlone127/2023-11-18-ditilbert_spamemail_en.md new file mode 100644 index 000000000000..8a00331a4557 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-ditilbert_spamemail_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ditilbert_spamemail DistilBertForSequenceClassification from tony4194 +author: John Snow Labs +name: ditilbert_spamemail +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ditilbert_spamemail` is an English model originally trained by tony4194. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ditilbert_spamemail_en_5.2.0_3.0_1700342308747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ditilbert_spamemail_en_5.2.0_3.0_1700342308747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ditilbert_spamemail","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ditilbert_spamemail","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ditilbert_spamemail| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/tony4194/ditilbert-spamEmail \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-domain_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-domain_classification_en.md new file mode 100644 index 000000000000..ae1698289e96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-domain_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English domain_classification DistilBertForSequenceClassification from phongmt184172 +author: John Snow Labs +name: domain_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `domain_classification` is an English model originally trained by phongmt184172. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/domain_classification_en_5.2.0_3.0_1700343651693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/domain_classification_en_5.2.0_3.0_1700343651693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("domain_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("domain_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|domain_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/phongmt184172/domain_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-dummy_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-dummy_model_en.md new file mode 100644 index 000000000000..c96067b91629 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-dummy_model_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English dummy_model DistilBertEmbeddings from luoweijie +author: John Snow Labs +name: dummy_model +date: 2023-11-18 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model` is an English model originally trained by luoweijie. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_en_5.2.0_3.0_1700347881399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_en_5.2.0_3.0_1700347881399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = DistilBertEmbeddings.pretrained("dummy_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = DistilBertEmbeddings + .pretrained("dummy_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|249.4 MB| + +## References + +References + +https://huggingface.co/luoweijie/dummy-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-emailclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-emailclassifier_en.md new file mode 100644 index 000000000000..b17f528c846c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-emailclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emailclassifier DistilBertForSequenceClassification from fathyshalaby +author: John Snow Labs +name: emailclassifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emailclassifier` is an English model originally trained by fathyshalaby. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emailclassifier_en_5.2.0_3.0_1700349933875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emailclassifier_en_5.2.0_3.0_1700349933875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emailclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emailclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emailclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fathyshalaby/emailclassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-emotion_advance_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-emotion_advance_classifier_en.md new file mode 100644 index 000000000000..2460e69bb31c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-emotion_advance_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_advance_classifier DistilBertForSequenceClassification from neel-jotaniya +author: John Snow Labs +name: emotion_advance_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_advance_classifier` is an English model originally trained by neel-jotaniya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_advance_classifier_en_5.2.0_3.0_1700341060946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_advance_classifier_en_5.2.0_3.0_1700341060946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_advance_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_advance_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_advance_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/neel-jotaniya/emotion-advance-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-emotion_model_giraffewt_en.md b/docs/_posts/ahmedlone127/2023-11-18-emotion_model_giraffewt_en.md new file mode 100644 index 000000000000..282a6705bd0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-emotion_model_giraffewt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_model_giraffewt DistilBertForSequenceClassification from giraffewt +author: John Snow Labs +name: emotion_model_giraffewt +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_model_giraffewt` is an English model originally trained by giraffewt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_model_giraffewt_en_5.2.0_3.0_1700346120020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_model_giraffewt_en_5.2.0_3.0_1700346120020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_model_giraffewt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_model_giraffewt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_model_giraffewt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.8 MB| + +## References + +https://huggingface.co/giraffewt/emotion_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-emotion_trained_final_en.md b/docs/_posts/ahmedlone127/2023-11-18-emotion_trained_final_en.md new file mode 100644 index 000000000000..190a0be42ba6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-emotion_trained_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_trained_final DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: emotion_trained_final +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_trained_final` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_trained_final_en_5.2.0_3.0_1700345110038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_trained_final_en_5.2.0_3.0_1700345110038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_trained_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_trained_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_trained_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/emotion_trained_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-emotions_gssi_en.md b/docs/_posts/ahmedlone127/2023-11-18-emotions_gssi_en.md new file mode 100644 index 000000000000..1c3a05159218 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-emotions_gssi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotions_gssi DistilBertForSequenceClassification from agil +author: John Snow Labs +name: emotions_gssi +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotions_gssi` is an English model originally trained by agil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotions_gssi_en_5.2.0_3.0_1700347010903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotions_gssi_en_5.2.0_3.0_1700347010903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotions_gssi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotions_gssi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotions_gssi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/agil/emotions-gssi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-english_sms_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-english_sms_classification_model_en.md new file mode 100644 index 000000000000..e301f24eebe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-english_sms_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English english_sms_classification_model DistilBertForSequenceClassification from akuysal +author: John Snow Labs +name: english_sms_classification_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_sms_classification_model` is an English model originally trained by akuysal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_sms_classification_model_en_5.2.0_3.0_1700350669534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_sms_classification_model_en_5.2.0_3.0_1700350669534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("english_sms_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("english_sms_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_sms_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/akuysal/English-SMS-classification-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-event_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-event_classification_model_en.md new file mode 100644 index 000000000000..304e21df3779 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-event_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English event_classification_model DistilBertForSequenceClassification from bhuvi +author: John Snow Labs +name: event_classification_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `event_classification_model` is an English model originally trained by bhuvi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/event_classification_model_en_5.2.0_3.0_1700343993938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/event_classification_model_en_5.2.0_3.0_1700343993938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("event_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("event_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|event_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bhuvi/event_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-extended_distilbert_finetuned_resumes_sections_en.md b/docs/_posts/ahmedlone127/2023-11-18-extended_distilbert_finetuned_resumes_sections_en.md new file mode 100644 index 000000000000..d8a9a9b8065d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-extended_distilbert_finetuned_resumes_sections_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English extended_distilbert_finetuned_resumes_sections DistilBertForSequenceClassification from has-abi +author: John Snow Labs +name: extended_distilbert_finetuned_resumes_sections +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extended_distilbert_finetuned_resumes_sections` is an English model originally trained by has-abi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/extended_distilbert_finetuned_resumes_sections_en_5.2.0_3.0_1700339694922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/extended_distilbert_finetuned_resumes_sections_en_5.2.0_3.0_1700339694922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("extended_distilbert_finetuned_resumes_sections","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("extended_distilbert_finetuned_resumes_sections","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|extended_distilbert_finetuned_resumes_sections| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|259.2 MB| + +## References + +https://huggingface.co/has-abi/extended_distilBERT-finetuned-resumes-sections \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-eyy_categorisation_1_0_en.md b/docs/_posts/ahmedlone127/2023-11-18-eyy_categorisation_1_0_en.md new file mode 100644 index 000000000000..4e905cf82c93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-eyy_categorisation_1_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English eyy_categorisation_1_0 DistilBertForSequenceClassification from ICFNext +author: John Snow Labs +name: eyy_categorisation_1_0 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eyy_categorisation_1_0` is an English model originally trained by ICFNext. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eyy_categorisation_1_0_en_5.2.0_3.0_1700339009257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eyy_categorisation_1_0_en_5.2.0_3.0_1700339009257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_categorisation_1_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_categorisation_1_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eyy_categorisation_1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ICFNext/EYY-categorisation-1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-eyy_topic_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-eyy_topic_classification_en.md new file mode 100644 index 000000000000..20fafde4ed63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-eyy_topic_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English eyy_topic_classification DistilBertForSequenceClassification from ebrigham +author: John Snow Labs +name: eyy_topic_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eyy_topic_classification` is an English model originally trained by ebrigham. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eyy_topic_classification_en_5.2.0_3.0_1700343824127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eyy_topic_classification_en_5.2.0_3.0_1700343824127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_topic_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_topic_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eyy_topic_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ebrigham/EYY-Topic-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fake_news_classification_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-fake_news_classification_distilbert_en.md new file mode 100644 index 000000000000..9bbad717fe9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fake_news_classification_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_classification_distilbert DistilBertForSequenceClassification from therealcyberlord +author: John Snow Labs +name: fake_news_classification_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_classification_distilbert` is an English model originally trained by therealcyberlord. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_classification_distilbert_en_5.2.0_3.0_1700347518976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_classification_distilbert_en_5.2.0_3.0_1700347518976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_classification_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_classification_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_classification_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/therealcyberlord/fake-news-classification-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fake_news_detector_qiaozhen_en.md b/docs/_posts/ahmedlone127/2023-11-18-fake_news_detector_qiaozhen_en.md new file mode 100644 index 000000000000..0fb934fa5d65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fake_news_detector_qiaozhen_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_detector_qiaozhen DistilBertForSequenceClassification from Qiaozhen +author: John Snow Labs +name: fake_news_detector_qiaozhen +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_detector_qiaozhen` is an English model originally trained by Qiaozhen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_detector_qiaozhen_en_5.2.0_3.0_1700345267778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_detector_qiaozhen_en_5.2.0_3.0_1700345267778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_detector_qiaozhen","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_detector_qiaozhen","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_detector_qiaozhen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Qiaozhen/fake-news-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_en.md new file mode 100644 index 000000000000..88b33e1b7ded --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_reviews_distilbert DistilBertForSequenceClassification from astrosbd +author: John Snow Labs +name: fake_reviews_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_reviews_distilbert` is an English model originally trained by astrosbd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_reviews_distilbert_en_5.2.0_3.0_1700341670844.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_reviews_distilbert_en_5.2.0_3.0_1700341670844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_reviews_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_reviews_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_reviews_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/astrosbd/fake-reviews-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_v3_en.md b/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_v3_en.md new file mode 100644 index 000000000000..841e4457ecc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fake_reviews_distilbert_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_reviews_distilbert_v3 DistilBertForSequenceClassification from astrosbd +author: John Snow Labs +name: fake_reviews_distilbert_v3 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_reviews_distilbert_v3` is an English model originally trained by astrosbd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_reviews_distilbert_v3_en_5.2.0_3.0_1700347741196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_reviews_distilbert_v3_en_5.2.0_3.0_1700347741196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_reviews_distilbert_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_reviews_distilbert_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_reviews_distilbert_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/astrosbd/fake-reviews-distilbert-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fake_tweet_detect_en.md b/docs/_posts/ahmedlone127/2023-11-18-fake_tweet_detect_en.md new file mode 100644 index 000000000000..53ed374bf09a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fake_tweet_detect_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_tweet_detect DistilBertForSequenceClassification from chinhon +author: John Snow Labs +name: fake_tweet_detect +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_tweet_detect` is an English model originally trained by chinhon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_tweet_detect_en_5.2.0_3.0_1700348510518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_tweet_detect_en_5.2.0_3.0_1700348510518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_tweet_detect","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_tweet_detect","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_tweet_detect| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chinhon/fake_tweet_detect \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_chinese_fast_zh.md b/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_chinese_fast_zh.md new file mode 100644 index 000000000000..e8ba6cb72e20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_chinese_fast_zh.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Chinese finance_sentiment_chinese_fast DistilBertForSequenceClassification from bardsai +author: John Snow Labs +name: finance_sentiment_chinese_fast +date: 2023-11-18 +tags: [bert, zh, open_source, sequence_classification, onnx] +task: Text Classification +language: zh +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_sentiment_chinese_fast` is a Chinese model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_sentiment_chinese_fast_zh_5.2.0_3.0_1700341246811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_sentiment_chinese_fast_zh_5.2.0_3.0_1700341246811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finance_sentiment_chinese_fast","zh")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finance_sentiment_chinese_fast","zh") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_sentiment_chinese_fast| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|zh| +|Size:|510.6 MB| + +## References + +https://huggingface.co/bardsai/finance-sentiment-zh-fast \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_polish_fast_pl.md b/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_polish_fast_pl.md new file mode 100644 index 000000000000..81f9e1fd3399 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finance_sentiment_polish_fast_pl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Polish finance_sentiment_polish_fast DistilBertForSequenceClassification from bardsai +author: John Snow Labs +name: finance_sentiment_polish_fast +date: 2023-11-18 +tags: [bert, pl, open_source, sequence_classification, onnx] +task: Text Classification +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finance_sentiment_polish_fast` is a Polish model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finance_sentiment_polish_fast_pl_5.2.0_3.0_1700342510656.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finance_sentiment_polish_fast_pl_5.2.0_3.0_1700342510656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finance_sentiment_polish_fast","pl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finance_sentiment_polish_fast","pl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finance_sentiment_polish_fast| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pl| +|Size:|510.6 MB| + +## References + +https://huggingface.co/bardsai/finance-sentiment-pl-fast \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_resume_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_resume_model_en.md new file mode 100644 index 000000000000..44ad72a86ae0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_resume_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tuned_resume_model DistilBertForSequenceClassification from Invimatic +author: John Snow Labs +name: fine_tuned_resume_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_resume_model` is an English model originally trained by Invimatic. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_resume_model_en_5.2.0_3.0_1700346054564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_resume_model_en_5.2.0_3.0_1700346054564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_resume_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_resume_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_resume_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Invimatic/fine_tuned_resume_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_sentiment_analysis_customer_feedback_en.md b/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_sentiment_analysis_customer_feedback_en.md new file mode 100644 index 000000000000..1254701b287a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fine_tuned_sentiment_analysis_customer_feedback_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tuned_sentiment_analysis_customer_feedback DistilBertForSequenceClassification from anchit48 +author: John Snow Labs +name: fine_tuned_sentiment_analysis_customer_feedback +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_sentiment_analysis_customer_feedback` is an English model originally trained by anchit48.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_sentiment_analysis_customer_feedback_en_5.2.0_3.0_1700345729348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_sentiment_analysis_customer_feedback_en_5.2.0_3.0_1700345729348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_sentiment_analysis_customer_feedback","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_sentiment_analysis_customer_feedback","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_sentiment_analysis_customer_feedback| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anchit48/fine-tuned-sentiment-analysis-customer-feedback \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetune_sentiment_analysis_model_3000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetune_sentiment_analysis_model_3000_samples_en.md new file mode 100644 index 000000000000..517c72a42694 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetune_sentiment_analysis_model_3000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetune_sentiment_analysis_model_3000_samples DistilBertForSequenceClassification from federicopascual +author: John Snow Labs +name: finetune_sentiment_analysis_model_3000_samples +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune_sentiment_analysis_model_3000_samples` is an English model originally trained by federicopascual.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_sentiment_analysis_model_3000_samples_en_5.2.0_3.0_1700348754632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_sentiment_analysis_model_3000_samples_en_5.2.0_3.0_1700348754632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetune_sentiment_analysis_model_3000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetune_sentiment_analysis_model_3000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_sentiment_analysis_model_3000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/federicopascual/finetune-sentiment-analysis-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_cyberbullying_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_cyberbullying_en.md new file mode 100644 index 000000000000..382371392bae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_cyberbullying_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_cyberbullying DistilBertForSequenceClassification from kingsotn +author: John Snow Labs +name: finetuned_cyberbullying +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_cyberbullying` is an English model originally trained by kingsotn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_cyberbullying_en_5.2.0_3.0_1700340925675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_cyberbullying_en_5.2.0_3.0_1700340925675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_cyberbullying","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_cyberbullying","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_cyberbullying| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kingsotn/finetuned_cyberbullying \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_adult_content_detection_valurank_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_adult_content_detection_valurank_en.md new file mode 100644 index 000000000000..6f10ec68bd38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_adult_content_detection_valurank_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_adult_content_detection_valurank DistilBertForSequenceClassification from valurank +author: John Snow Labs +name: finetuned_distilbert_adult_content_detection_valurank +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_adult_content_detection_valurank` is an English model originally trained by valurank.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_adult_content_detection_valurank_en_5.2.0_3.0_1700340118726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_adult_content_detection_valurank_en_5.2.0_3.0_1700340118726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_adult_content_detection_valurank","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_adult_content_detection_valurank","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_adult_content_detection_valurank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/valurank/finetuned-distilbert-adult-content-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_base_model_kwabenamufasa_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_base_model_kwabenamufasa_en.md new file mode 100644 index 000000000000..d8cebe5082d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_base_model_kwabenamufasa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_base_model_kwabenamufasa DistilBertForSequenceClassification from KwabenaMufasa +author: John Snow Labs +name: finetuned_distilbert_base_model_kwabenamufasa +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_base_model_kwabenamufasa` is an English model originally trained by KwabenaMufasa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_model_kwabenamufasa_en_5.2.0_3.0_1700347300722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_model_kwabenamufasa_en_5.2.0_3.0_1700347300722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_model_kwabenamufasa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_model_kwabenamufasa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_base_model_kwabenamufasa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/KwabenaMufasa/Finetuned-Distilbert-base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_explicit_content_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_explicit_content_detection_en.md new file mode 100644 index 000000000000..afd76ef5dd39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_explicit_content_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_explicit_content_detection DistilBertForSequenceClassification from valurank +author: John Snow Labs +name: finetuned_distilbert_explicit_content_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_explicit_content_detection` is an English model originally trained by valurank.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_explicit_content_detection_en_5.2.0_3.0_1700340630823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_explicit_content_detection_en_5.2.0_3.0_1700340630823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_explicit_content_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_explicit_content_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_explicit_content_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/valurank/finetuned-distilbert-explicit-content-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_mahmoud8_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_mahmoud8_en.md new file mode 100644 index 000000000000..d651c3e3f1b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_mahmoud8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_mahmoud8 DistilBertForSequenceClassification from Mahmoud8 +author: John Snow Labs +name: finetuned_distilbert_mahmoud8 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_mahmoud8` is an English model originally trained by Mahmoud8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_mahmoud8_en_5.2.0_3.0_1700338204974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_mahmoud8_en_5.2.0_3.0_1700338204974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_mahmoud8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_mahmoud8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_mahmoud8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Mahmoud8/finetuned-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_model_en.md new file mode 100644 index 000000000000..7630ad0f6b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_model DistilBertForSequenceClassification from Afia-manubea +author: John Snow Labs +name: finetuned_distilbert_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_model` is an English model originally trained by Afia-manubea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_model_en_5.2.0_3.0_1700340490965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_model_en_5.2.0_3.0_1700340490965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Afia-manubea/FineTuned-DistilBert-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_multi_label_emotion_valurank_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_multi_label_emotion_valurank_en.md new file mode 100644 index 000000000000..5679d86c4b8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_multi_label_emotion_valurank_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_multi_label_emotion_valurank DistilBertForSequenceClassification from valurank +author: John Snow Labs +name: finetuned_distilbert_multi_label_emotion_valurank +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_multi_label_emotion_valurank` is an English model originally trained by valurank.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_valurank_en_5.2.0_3.0_1700339294482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_valurank_en_5.2.0_3.0_1700339294482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_valurank","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_valurank","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_multi_label_emotion_valurank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/valurank/finetuned-distilbert-multi-label-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_news_article_categorization_valurank_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_news_article_categorization_valurank_en.md new file mode 100644 index 000000000000..4f13065ed6ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_news_article_categorization_valurank_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_news_article_categorization_valurank DistilBertForSequenceClassification from valurank +author: John Snow Labs +name: finetuned_distilbert_news_article_categorization_valurank +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_news_article_categorization_valurank` is an English model originally trained by valurank.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_news_article_categorization_valurank_en_5.2.0_3.0_1700339976719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_news_article_categorization_valurank_en_5.2.0_3.0_1700339976719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_news_article_categorization_valurank","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_news_article_categorization_valurank","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_news_article_categorization_valurank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/valurank/finetuned-distilbert-news-article-categorization \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_shidhant_emotion_article_categorization_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_shidhant_emotion_article_categorization_2_en.md new file mode 100644 index 000000000000..825a0eeaba7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_distilbert_shidhant_emotion_article_categorization_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_shidhant_emotion_article_categorization_2 DistilBertForSequenceClassification from almalabs +author: John Snow Labs +name: finetuned_distilbert_shidhant_emotion_article_categorization_2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_shidhant_emotion_article_categorization_2` is an English model originally trained by almalabs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_shidhant_emotion_article_categorization_2_en_5.2.0_3.0_1700350249141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_shidhant_emotion_article_categorization_2_en_5.2.0_3.0_1700350249141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_shidhant_emotion_article_categorization_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_shidhant_emotion_article_categorization_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_shidhant_emotion_article_categorization_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/almalabs/finetuned-distilbert-shidhant-emotion-article-categorization_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_analysis_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_analysis_model_en.md new file mode 100644 index 000000000000..93f2dbd8e870 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_analysis_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentiment_analysis_model DistilBertForSequenceClassification from federicopascual +author: John Snow Labs +name: finetuned_sentiment_analysis_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_analysis_model` is an English model originally trained by federicopascual.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_analysis_model_en_5.2.0_3.0_1700341480377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_analysis_model_en_5.2.0_3.0_1700341480377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_analysis_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_analysis_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentiment_analysis_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/federicopascual/finetuned-sentiment-analysis-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_model_yrajm1997_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_model_yrajm1997_en.md new file mode 100644 index 000000000000..a01b21658c66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuned_sentiment_model_yrajm1997_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentiment_model_yrajm1997 DistilBertForSequenceClassification from yrajm1997 +author: John Snow Labs +name: finetuned_sentiment_model_yrajm1997 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_model_yrajm1997` is an English model originally trained by yrajm1997. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_model_yrajm1997_en_5.2.0_3.0_1700343794392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_model_yrajm1997_en_5.2.0_3.0_1700343794392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_model_yrajm1997","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_model_yrajm1997","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_sentiment_model_yrajm1997|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/yrajm1997/finetuned-sentiment-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_bert_sentiment_reviews_2_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_bert_sentiment_reviews_2_en.md
new file mode 100644
index 000000000000..8816ff82996c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_bert_sentiment_reviews_2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_bert_sentiment_reviews_2 DistilBertForSequenceClassification from IAyoub
+author: John Snow Labs
+name: finetuning_bert_sentiment_reviews_2
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_bert_sentiment_reviews_2` is an English model originally trained by IAyoub.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_bert_sentiment_reviews_2_en_5.2.0_3.0_1700343224929.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_bert_sentiment_reviews_2_en_5.2.0_3.0_1700343224929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_bert_sentiment_reviews_2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_bert_sentiment_reviews_2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_bert_sentiment_reviews_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/IAyoub/finetuning-bert-sentiment-reviews-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_emotion_model_dcssdc_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_emotion_model_dcssdc_en.md
new file mode 100644
index 000000000000..9ccde3d75df6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_emotion_model_dcssdc_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_emotion_model_dcssdc DistilBertForSequenceClassification from dcssdc
+author: John Snow Labs
+name: finetuning_emotion_model_dcssdc
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_emotion_model_dcssdc` is an English model originally trained by dcssdc.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_emotion_model_dcssdc_en_5.2.0_3.0_1700346119424.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_emotion_model_dcssdc_en_5.2.0_3.0_1700346119424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_emotion_model_dcssdc","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_emotion_model_dcssdc","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_emotion_model_dcssdc|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/dcssdc/finetuning-emotion-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_financial_news_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_financial_news_sentiment_en.md
new file mode 100644
index 000000000000..eff9c914fbae
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_financial_news_sentiment_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_financial_news_sentiment DistilBertForSequenceClassification from samayash
+author: John Snow Labs
+name: finetuning_financial_news_sentiment
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_financial_news_sentiment` is an English model originally trained by samayash.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_financial_news_sentiment_en_5.2.0_3.0_1700341842054.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_financial_news_sentiment_en_5.2.0_3.0_1700341842054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_financial_news_sentiment","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_financial_news_sentiment","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_financial_news_sentiment|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/samayash/finetuning-financial-news-sentiment
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_homonymssentimentanalysis_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_homonymssentimentanalysis_model_en.md
new file mode 100644
index 000000000000..098bd0772843
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_homonymssentimentanalysis_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_homonymssentimentanalysis_model DistilBertForSequenceClassification from MahmoudMohsen
+author: John Snow Labs
+name: finetuning_homonymssentimentanalysis_model
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_homonymssentimentanalysis_model` is an English model originally trained by MahmoudMohsen.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_homonymssentimentanalysis_model_en_5.2.0_3.0_1700347172114.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_homonymssentimentanalysis_model_en_5.2.0_3.0_1700347172114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_homonymssentimentanalysis_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_homonymssentimentanalysis_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_homonymssentimentanalysis_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/MahmoudMohsen/finetuning-HomonymsSentimentAnalysis-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_medical_specialty_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_medical_specialty_model_en.md
new file mode 100644
index 000000000000..d2b38677cc02
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_medical_specialty_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_medical_specialty_model DistilBertForSequenceClassification from richtsai1103
+author: John Snow Labs
+name: finetuning_medical_specialty_model
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_medical_specialty_model` is an English model originally trained by richtsai1103.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_medical_specialty_model_en_5.2.0_3.0_1700348502646.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_medical_specialty_model_en_5.2.0_3.0_1700348502646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_medical_specialty_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_medical_specialty_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_medical_specialty_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.6 MB|
+
+## References
+
+https://huggingface.co/richtsai1103/finetuning-medical-specialty-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_all_df_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_all_df_en.md
new file mode 100644
index 000000000000..6025385eb4bd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_all_df_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_all_df DistilBertForSequenceClassification from Timothy1337
+author: John Snow Labs
+name: finetuning_sentiment_all_df
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_all_df` is an English model originally trained by Timothy1337.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_all_df_en_5.2.0_3.0_1700345468890.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_all_df_en_5.2.0_3.0_1700345468890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_all_df","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_all_df","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_all_df|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|507.6 MB|
+
+## References
+
+https://huggingface.co/Timothy1337/finetuning-sentiment-all_df
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_classification_model_with_amazon_appliances_data_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_classification_model_with_amazon_appliances_data_en.md
new file mode 100644
index 000000000000..a9370e96155b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_classification_model_with_amazon_appliances_data_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_classification_model_with_amazon_appliances_data DistilBertForSequenceClassification from m-aamir95
+author: John Snow Labs
+name: finetuning_sentiment_classification_model_with_amazon_appliances_data
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_classification_model_with_amazon_appliances_data` is an English model originally trained by m-aamir95.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_classification_model_with_amazon_appliances_data_en_5.2.0_3.0_1700340796786.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_classification_model_with_amazon_appliances_data_en_5.2.0_3.0_1700340796786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_classification_model_with_amazon_appliances_data","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_classification_model_with_amazon_appliances_data","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_classification_model_with_amazon_appliances_data|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/m-aamir95/finetuning-sentiment-classification-model-with-amazon-appliances-data
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_12000_samples_mansidw_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_12000_samples_mansidw_en.md
new file mode 100644
index 000000000000..7ee7392633fa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_12000_samples_mansidw_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_model_12000_samples_mansidw DistilBertForSequenceClassification from mansidw
+author: John Snow Labs
+name: finetuning_sentiment_model_12000_samples_mansidw
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_12000_samples_mansidw` is an English model originally trained by mansidw.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_12000_samples_mansidw_en_5.2.0_3.0_1700351781011.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_12000_samples_mansidw_en_5.2.0_3.0_1700351781011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_12000_samples_mansidw","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_12000_samples_mansidw","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_model_12000_samples_mansidw|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/mansidw/finetuning-sentiment-model-12000-samples
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_18000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_18000_samples_en.md
new file mode 100644
index 000000000000..25fe1321ac8d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_18000_samples_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_model_18000_samples DistilBertForSequenceClassification from LeakyDishes
+author: John Snow Labs
+name: finetuning_sentiment_model_18000_samples
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_18000_samples` is an English model originally trained by LeakyDishes.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_18000_samples_en_5.2.0_3.0_1700350991520.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_18000_samples_en_5.2.0_3.0_1700350991520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_18000_samples","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_18000_samples","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_model_18000_samples|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/LeakyDishes/finetuning-sentiment-model-18000-samples
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_federicopascual_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_federicopascual_en.md
new file mode 100644
index 000000000000..6b63ee19b8c7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_federicopascual_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_model_3000_samples_federicopascual DistilBertForSequenceClassification from federicopascual
+author: John Snow Labs
+name: finetuning_sentiment_model_3000_samples_federicopascual
+date: 2023-11-18
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_federicopascual` is an English model originally trained by federicopascual.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_federicopascual_en_5.2.0_3.0_1700341673223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_federicopascual_en_5.2.0_3.0_1700341673223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_federicopascual","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_federicopascual","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_federicopascual| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/federicopascual/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_intradiction_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_intradiction_en.md new file mode 100644 index 000000000000..4c3df8e04912 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_intradiction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_intradiction DistilBertForSequenceClassification from Intradiction +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_intradiction +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_intradiction` is an English model originally trained by Intradiction. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_intradiction_en_5.2.0_3.0_1700341056541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_intradiction_en_5.2.0_3.0_1700341056541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_intradiction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_intradiction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_intradiction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Intradiction/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_leakydishes_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_leakydishes_en.md new file mode 100644 index 000000000000..b379886e7369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_3000_samples_leakydishes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_leakydishes DistilBertForSequenceClassification from LeakyDishes +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_leakydishes +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_leakydishes` is an English model originally trained by LeakyDishes. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_leakydishes_en_5.2.0_3.0_1700342154800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_leakydishes_en_5.2.0_3.0_1700342154800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_leakydishes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_leakydishes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_leakydishes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LeakyDishes/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_for_c2er_teomotun_en.md b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_for_c2er_teomotun_en.md new file mode 100644 index 000000000000..034df95e7359 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-finetuning_sentiment_model_for_c2er_teomotun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_for_c2er_teomotun DistilBertForSequenceClassification from teomotun +author: John Snow Labs +name: finetuning_sentiment_model_for_c2er_teomotun +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_for_c2er_teomotun` is an English model originally trained by teomotun. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_for_c2er_teomotun_en_5.2.0_3.0_1700338570308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_for_c2er_teomotun_en_5.2.0_3.0_1700338570308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_for_c2er_teomotun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_for_c2er_teomotun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_for_c2er_teomotun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/teomotun/finetuning-sentiment-model-for-c2er \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fpc_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-fpc_model_en.md new file mode 100644 index 000000000000..2ab734551d9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fpc_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fpc_model DistilBertForSequenceClassification from anth0nyhak1m +author: John Snow Labs +name: fpc_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fpc_model` is an English model originally trained by anth0nyhak1m. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fpc_model_en_5.2.0_3.0_1700349675444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fpc_model_en_5.2.0_3.0_1700349675444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fpc_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fpc_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fpc_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anth0nyhak1m/FPC_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-fraud_text_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-fraud_text_detection_en.md new file mode 100644 index 000000000000..989df26c1433 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-fraud_text_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fraud_text_detection DistilBertForSequenceClassification from austinb +author: John Snow Labs +name: fraud_text_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fraud_text_detection` is an English model originally trained by austinb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fraud_text_detection_en_5.2.0_3.0_1700341003074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fraud_text_detection_en_5.2.0_3.0_1700341003074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fraud_text_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fraud_text_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fraud_text_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/austinb/fraud_text_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-gc_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-gc_model_en.md new file mode 100644 index 000000000000..06c622de38af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-gc_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gc_model DistilBertForSequenceClassification from peterjwms +author: John Snow Labs +name: gc_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gc_model` is an English model originally trained by peterjwms. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gc_model_en_5.2.0_3.0_1700347648096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gc_model_en_5.2.0_3.0_1700347648096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("gc_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("gc_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gc_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/peterjwms/gc_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-gender_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-gender_classification_en.md new file mode 100644 index 000000000000..b1a4db7a6abf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-gender_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gender_classification DistilBertForSequenceClassification from padmajabfrl +author: John Snow Labs +name: gender_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gender_classification` is an English model originally trained by padmajabfrl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gender_classification_en_5.2.0_3.0_1700338938839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gender_classification_en_5.2.0_3.0_1700338938839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("gender_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("gender_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gender_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/padmajabfrl/Gender-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-genre_pred_model_balanced_matthiasr_en.md b/docs/_posts/ahmedlone127/2023-11-18-genre_pred_model_balanced_matthiasr_en.md new file mode 100644 index 000000000000..dc307e25b330 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-genre_pred_model_balanced_matthiasr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English genre_pred_model_balanced_matthiasr DistilBertForSequenceClassification from matthiasr +author: John Snow Labs +name: genre_pred_model_balanced_matthiasr +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `genre_pred_model_balanced_matthiasr` is an English model originally trained by matthiasr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/genre_pred_model_balanced_matthiasr_en_5.2.0_3.0_1700350236617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/genre_pred_model_balanced_matthiasr_en_5.2.0_3.0_1700350236617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("genre_pred_model_balanced_matthiasr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("genre_pred_model_balanced_matthiasr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|genre_pred_model_balanced_matthiasr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/matthiasr/genre_pred_model_balanced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-gibberish_text_detector_en.md b/docs/_posts/ahmedlone127/2023-11-18-gibberish_text_detector_en.md new file mode 100644 index 000000000000..3c3f1fceadea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-gibberish_text_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gibberish_text_detector DistilBertForSequenceClassification from wajidlinux99 +author: John Snow Labs +name: gibberish_text_detector +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gibberish_text_detector` is an English model originally trained by wajidlinux99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gibberish_text_detector_en_5.2.0_3.0_1700339276860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gibberish_text_detector_en_5.2.0_3.0_1700339276860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("gibberish_text_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("gibberish_text_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gibberish_text_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wajidlinux99/gibberish-text-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-goemotions_distilbertbase_en.md b/docs/_posts/ahmedlone127/2023-11-18-goemotions_distilbertbase_en.md new file mode 100644 index 000000000000..18e0aad24da0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-goemotions_distilbertbase_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English goemotions_distilbertbase DistilBertForSequenceClassification from mrovejaxd +author: John Snow Labs +name: goemotions_distilbertbase +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `goemotions_distilbertbase` is an English model originally trained by mrovejaxd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/goemotions_distilbertbase_en_5.2.0_3.0_1700344148635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/goemotions_distilbertbase_en_5.2.0_3.0_1700344148635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("goemotions_distilbertbase","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("goemotions_distilbertbase","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|goemotions_distilbertbase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mrovejaxd/goemotions_distilbertbase \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-handsfree_intent_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-handsfree_intent_classification_en.md new file mode 100644 index 000000000000..5f4dc0fca6da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-handsfree_intent_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English handsfree_intent_classification DistilBertForSequenceClassification from eloi-goncalves +author: John Snow Labs +name: handsfree_intent_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `handsfree_intent_classification` is an English model originally trained by eloi-goncalves. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/handsfree_intent_classification_en_5.2.0_3.0_1700347342427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/handsfree_intent_classification_en_5.2.0_3.0_1700347342427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("handsfree_intent_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("handsfree_intent_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|handsfree_intent_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/eloi-goncalves/handsfree_intent_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-hate_speech_multilabel_classification_with_bert_en.md b/docs/_posts/ahmedlone127/2023-11-18-hate_speech_multilabel_classification_with_bert_en.md new file mode 100644 index 000000000000..bf3eec003303 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-hate_speech_multilabel_classification_with_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_multilabel_classification_with_bert DistilBertForSequenceClassification from wesleyacheng +author: John Snow Labs +name: hate_speech_multilabel_classification_with_bert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_multilabel_classification_with_bert` is an English model originally trained by wesleyacheng.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_multilabel_classification_with_bert_en_5.2.0_3.0_1700345187586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_multilabel_classification_with_bert_en_5.2.0_3.0_1700345187586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_speech_multilabel_classification_with_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_speech_multilabel_classification_with_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_multilabel_classification_with_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wesleyacheng/hate-speech-multilabel-classification-with-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-hf_repo_en.md b/docs/_posts/ahmedlone127/2023-11-18-hf_repo_en.md new file mode 100644 index 000000000000..b51725ac68cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-hf_repo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hf_repo DistilBertForSequenceClassification from htahir1 +author: John Snow Labs +name: hf_repo +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hf_repo` is an English model originally trained by htahir1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hf_repo_en_5.2.0_3.0_1700348149716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hf_repo_en_5.2.0_3.0_1700348149716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hf_repo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hf_repo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hf_repo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|245.8 MB| + +## References + +https://huggingface.co/htahir1/hf-repo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-hinglish_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-hinglish_distilbert_en.md new file mode 100644 index 000000000000..19b46af1832a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-hinglish_distilbert_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English hinglish_distilbert DistilBertEmbeddings from meghanabhange +author: John Snow Labs +name: hinglish_distilbert +date: 2023-11-18 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hinglish_distilbert` is an English model originally trained by meghanabhange. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hinglish_distilbert_en_5.2.0_3.0_1700338170414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hinglish_distilbert_en_5.2.0_3.0_1700338170414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") +embeddings = DistilBertEmbeddings.pretrained("hinglish_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") +val embeddings = DistilBertEmbeddings + .pretrained("hinglish_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hinglish_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[embeddings]| +|Language:|en| +|Size:|245.9 MB| + +## References + + + +https://huggingface.co/meghanabhange/Hinglish-DistilBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-humor_norwegian_humor_en.md b/docs/_posts/ahmedlone127/2023-11-18-humor_norwegian_humor_en.md new file mode 100644 index 000000000000..349f9b2da7b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-humor_norwegian_humor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English humor_norwegian_humor DistilBertForSequenceClassification from mohameddhiab +author: John Snow Labs +name: humor_norwegian_humor +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `humor_norwegian_humor` is an English model originally trained by mohameddhiab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/humor_norwegian_humor_en_5.2.0_3.0_1700338311817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/humor_norwegian_humor_en_5.2.0_3.0_1700338311817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("humor_norwegian_humor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("humor_norwegian_humor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|humor_norwegian_humor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mohameddhiab/humor-no-humor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-hyperpartisan_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-hyperpartisan_classifier_en.md new file mode 100644 index 000000000000..a949044467ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-hyperpartisan_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hyperpartisan_classifier DistilBertForSequenceClassification from alexgshaw +author: John Snow Labs +name: hyperpartisan_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hyperpartisan_classifier` is an English model originally trained by alexgshaw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hyperpartisan_classifier_en_5.2.0_3.0_1700344814320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hyperpartisan_classifier_en_5.2.0_3.0_1700344814320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hyperpartisan_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hyperpartisan_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hyperpartisan_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexgshaw/hyperpartisan-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-imdb_urdusentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-imdb_urdusentiment_model_en.md new file mode 100644 index 000000000000..d87c922b80c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-imdb_urdusentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdb_urdusentiment_model DistilBertForSequenceClassification from Sakil +author: John Snow Labs +name: imdb_urdusentiment_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb_urdusentiment_model` is an English model originally trained by Sakil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb_urdusentiment_model_en_5.2.0_3.0_1700350264433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb_urdusentiment_model_en_5.2.0_3.0_1700350264433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdb_urdusentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdb_urdusentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdb_urdusentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sakil/IMDB_URDUSENTIMENT_MODEL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-imdbsentdistilbertmodel_en.md b/docs/_posts/ahmedlone127/2023-11-18-imdbsentdistilbertmodel_en.md new file mode 100644 index 000000000000..823103ce36df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-imdbsentdistilbertmodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdbsentdistilbertmodel DistilBertForSequenceClassification from Sakil +author: John Snow Labs +name: imdbsentdistilbertmodel +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdbsentdistilbertmodel` is an English model originally trained by Sakil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdbsentdistilbertmodel_en_5.2.0_3.0_1700348817593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdbsentdistilbertmodel_en_5.2.0_3.0_1700348817593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdbsentdistilbertmodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdbsentdistilbertmodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdbsentdistilbertmodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sakil/imdbsentdistilbertmodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-inappropriate_text_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-inappropriate_text_classifier_en.md new file mode 100644 index 000000000000..f1932df017b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-inappropriate_text_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English inappropriate_text_classifier DistilBertForSequenceClassification from michellejieli +author: John Snow Labs +name: inappropriate_text_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `inappropriate_text_classifier` is an English model originally trained by michellejieli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/inappropriate_text_classifier_en_5.2.0_3.0_1700337871935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/inappropriate_text_classifier_en_5.2.0_3.0_1700337871935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("inappropriate_text_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("inappropriate_text_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|inappropriate_text_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/michellejieli/inappropriate_text_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-indo_specific_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-indo_specific_model_en.md new file mode 100644 index 000000000000..8e29b4408ece --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-indo_specific_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indo_specific_model DistilBertForSequenceClassification from mathildeparlo +author: John Snow Labs +name: indo_specific_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indo_specific_model` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indo_specific_model_en_5.2.0_3.0_1700342307794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indo_specific_model_en_5.2.0_3.0_1700342307794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("indo_specific_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("indo_specific_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indo_specific_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mathildeparlo/indo_specific_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-instagram_caption_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-instagram_caption_classifier_en.md new file mode 100644 index 000000000000..db6a1d721608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-instagram_caption_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English instagram_caption_classifier DistilBertForSequenceClassification from prakhars +author: John Snow Labs +name: instagram_caption_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `instagram_caption_classifier` is an English model originally trained by prakhars. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/instagram_caption_classifier_en_5.2.0_3.0_1700340903519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/instagram_caption_classifier_en_5.2.0_3.0_1700340903519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("instagram_caption_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("instagram_caption_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|instagram_caption_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/prakhars/instagram_caption_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-intent_classification_falconsai_en.md b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_falconsai_en.md new file mode 100644 index 000000000000..84862e357f62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_falconsai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_classification_falconsai DistilBertForSequenceClassification from Falconsai +author: John Snow Labs +name: intent_classification_falconsai +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classification_falconsai` is an English model originally trained by Falconsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classification_falconsai_en_5.2.0_3.0_1700340100506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classification_falconsai_en_5.2.0_3.0_1700340100506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_falconsai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_falconsai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classification_falconsai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Falconsai/intent_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-intent_classification_rowdy_store_en.md b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_rowdy_store_en.md new file mode 100644 index 000000000000..7e1fb33fcfa4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_rowdy_store_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_classification_rowdy_store DistilBertForSequenceClassification from rowdy-store +author: John Snow Labs +name: intent_classification_rowdy_store +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classification_rowdy_store` is an English model originally trained by rowdy-store. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classification_rowdy_store_en_5.2.0_3.0_1700339637278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classification_rowdy_store_en_5.2.0_3.0_1700339637278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_rowdy_store","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_rowdy_store","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classification_rowdy_store| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rowdy-store/intent-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-intent_classification_small_en.md b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_small_en.md new file mode 100644 index 000000000000..cce448d2fe1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-intent_classification_small_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_classification_small DistilBertForSequenceClassification from dipesh +author: John Snow Labs +name: intent_classification_small +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classification_small` is an English model originally trained by dipesh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classification_small_en_5.2.0_3.0_1700343335579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classification_small_en_5.2.0_3.0_1700343335579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_small","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_small","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classification_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.2 MB| + +## References + +https://huggingface.co/dipesh/Intent-Classification-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-intent_classifier_call_tonga_tonga_islands_action_en.md b/docs/_posts/ahmedlone127/2023-11-18-intent_classifier_call_tonga_tonga_islands_action_en.md new file mode 100644 index 000000000000..f96836e3c3d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-intent_classifier_call_tonga_tonga_islands_action_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_classifier_call_tonga_tonga_islands_action DistilBertForSequenceClassification from Zain6699 +author: John Snow Labs +name: intent_classifier_call_tonga_tonga_islands_action +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classifier_call_tonga_tonga_islands_action` is an English model originally trained by Zain6699.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classifier_call_tonga_tonga_islands_action_en_5.2.0_3.0_1700339122211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classifier_call_tonga_tonga_islands_action_en_5.2.0_3.0_1700339122211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classifier_call_tonga_tonga_islands_action","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classifier_call_tonga_tonga_islands_action","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classifier_call_tonga_tonga_islands_action| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Zain6699/intent-classifier-call_to_action \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-kodwo_finetuned_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-kodwo_finetuned_distilbert_model_en.md new file mode 100644 index 000000000000..634a1cffe5bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-kodwo_finetuned_distilbert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English kodwo_finetuned_distilbert_model DistilBertForSequenceClassification from Kodwo11 +author: John Snow Labs +name: kodwo_finetuned_distilbert_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kodwo_finetuned_distilbert_model` is an English model originally trained by Kodwo11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kodwo_finetuned_distilbert_model_en_5.2.0_3.0_1700343753824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kodwo_finetuned_distilbert_model_en_5.2.0_3.0_1700343753824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("kodwo_finetuned_distilbert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("kodwo_finetuned_distilbert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kodwo_finetuned_distilbert_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kodwo11/Kodwo-Finetuned-distilbert-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-lie_detection_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-lie_detection_distilbert_en.md new file mode 100644 index 000000000000..59b20c624cdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-lie_detection_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lie_detection_distilbert DistilBertForSequenceClassification from dlentr +author: John Snow Labs +name: lie_detection_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lie_detection_distilbert` is an English model originally trained by dlentr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lie_detection_distilbert_en_5.2.0_3.0_1700339461246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lie_detection_distilbert_en_5.2.0_3.0_1700339461246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("lie_detection_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("lie_detection_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lie_detection_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/dlentr/lie_detection_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-lkd_3_classes_seed_16_en.md b/docs/_posts/ahmedlone127/2023-11-18-lkd_3_classes_seed_16_en.md new file mode 100644 index 000000000000..529029d6bc5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-lkd_3_classes_seed_16_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lkd_3_classes_seed_16 DistilBertForSequenceClassification from joshnielsen876 +author: John Snow Labs +name: lkd_3_classes_seed_16 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lkd_3_classes_seed_16` is an English model originally trained by joshnielsen876. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lkd_3_classes_seed_16_en_5.2.0_3.0_1700349870924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lkd_3_classes_seed_16_en_5.2.0_3.0_1700349870924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("lkd_3_classes_seed_16","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("lkd_3_classes_seed_16","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lkd_3_classes_seed_16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/joshnielsen876/LKD_3_classes_seed_16 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-mbti_classifier_nmcahill_en.md b/docs/_posts/ahmedlone127/2023-11-18-mbti_classifier_nmcahill_en.md new file mode 100644 index 000000000000..b1b9c7d4812f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-mbti_classifier_nmcahill_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbti_classifier_nmcahill DistilBertForSequenceClassification from nmcahill +author: John Snow Labs +name: mbti_classifier_nmcahill +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbti_classifier_nmcahill` is an English model originally trained by nmcahill. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbti_classifier_nmcahill_en_5.2.0_3.0_1700342983135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbti_classifier_nmcahill_en_5.2.0_3.0_1700342983135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mbti_classifier_nmcahill","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mbti_classifier_nmcahill","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbti_classifier_nmcahill| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.2 MB| + +## References + +https://huggingface.co/nmcahill/mbti-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-med_nonmed_en.md b/docs/_posts/ahmedlone127/2023-11-18-med_nonmed_en.md new file mode 100644 index 000000000000..f5774d69d6b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-med_nonmed_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English med_nonmed DistilBertForSequenceClassification from c-s-ale +author: John Snow Labs +name: med_nonmed +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `med_nonmed` is an English model originally trained by c-s-ale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/med_nonmed_en_5.2.0_3.0_1700351144080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/med_nonmed_en_5.2.0_3.0_1700351144080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("med_nonmed","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("med_nonmed","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|med_nonmed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/c-s-ale/med_nonmed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-medium_article_titles_engagement_all_en.md b/docs/_posts/ahmedlone127/2023-11-18-medium_article_titles_engagement_all_en.md new file mode 100644 index 000000000000..9745e5aca6e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-medium_article_titles_engagement_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English medium_article_titles_engagement_all DistilBertForSequenceClassification from dima806 +author: John Snow Labs +name: medium_article_titles_engagement_all +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medium_article_titles_engagement_all` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medium_article_titles_engagement_all_en_5.2.0_3.0_1700345727589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medium_article_titles_engagement_all_en_5.2.0_3.0_1700345727589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("medium_article_titles_engagement_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("medium_article_titles_engagement_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medium_article_titles_engagement_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/dima806/medium-article-titles-engagement-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-meister_mindmap_model_pytorch_en.md b/docs/_posts/ahmedlone127/2023-11-18-meister_mindmap_model_pytorch_en.md new file mode 100644 index 000000000000..41a19cae6110 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-meister_mindmap_model_pytorch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English meister_mindmap_model_pytorch DistilBertForSequenceClassification from soymia +author: John Snow Labs +name: meister_mindmap_model_pytorch +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `meister_mindmap_model_pytorch` is an English model originally trained by soymia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meister_mindmap_model_pytorch_en_5.2.0_3.0_1700349107025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meister_mindmap_model_pytorch_en_5.2.0_3.0_1700349107025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("meister_mindmap_model_pytorch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("meister_mindmap_model_pytorch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meister_mindmap_model_pytorch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/soymia/meister-mindmap-model-pytorch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-message_classification_question_other_smalltalk_modified_en.md b/docs/_posts/ahmedlone127/2023-11-18-message_classification_question_other_smalltalk_modified_en.md new file mode 100644 index 000000000000..30d3e1497ab0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-message_classification_question_other_smalltalk_modified_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English message_classification_question_other_smalltalk_modified DistilBertForSequenceClassification from Wyona +author: John Snow Labs +name: message_classification_question_other_smalltalk_modified +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `message_classification_question_other_smalltalk_modified` is an English model originally trained by Wyona. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/message_classification_question_other_smalltalk_modified_en_5.2.0_3.0_1700347298258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/message_classification_question_other_smalltalk_modified_en_5.2.0_3.0_1700347298258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("message_classification_question_other_smalltalk_modified","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("message_classification_question_other_smalltalk_modified","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|message_classification_question_other_smalltalk_modified| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Wyona/message-classification-question-other-smalltalk-modified \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-message_intent_en.md b/docs/_posts/ahmedlone127/2023-11-18-message_intent_en.md new file mode 100644 index 000000000000..ebb8d512bf98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-message_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English message_intent DistilBertForSequenceClassification from Yanjie +author: John Snow Labs +name: message_intent +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `message_intent` is an English model originally trained by Yanjie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/message_intent_en_5.2.0_3.0_1700345107732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/message_intent_en_5.2.0_3.0_1700345107732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("message_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("message_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|message_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Yanjie/message-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-mgtdetectionmodel_en.md b/docs/_posts/ahmedlone127/2023-11-18-mgtdetectionmodel_en.md new file mode 100644 index 000000000000..16c810df3ac4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-mgtdetectionmodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mgtdetectionmodel DistilBertForSequenceClassification from huyen89 +author: John Snow Labs +name: mgtdetectionmodel +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mgtdetectionmodel` is an English model originally trained by huyen89. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mgtdetectionmodel_en_5.2.0_3.0_1700342157392.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mgtdetectionmodel_en_5.2.0_3.0_1700342157392.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mgtdetectionmodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mgtdetectionmodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mgtdetectionmodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/huyen89/MGTDetectionModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-mlops_demo_en.md b/docs/_posts/ahmedlone127/2023-11-18-mlops_demo_en.md new file mode 100644 index 000000000000..f47166c3a86a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-mlops_demo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mlops_demo DistilBertForSequenceClassification from profoz +author: John Snow Labs +name: mlops_demo +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mlops_demo` is an English model originally trained by profoz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlops_demo_en_5.2.0_3.0_1700349890904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlops_demo_en_5.2.0_3.0_1700349890904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mlops_demo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mlops_demo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlops_demo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/profoz/mlops-demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-multi_class_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-multi_class_classification_en.md new file mode 100644 index 000000000000..1a244b74baf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-multi_class_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multi_class_classification DistilBertForSequenceClassification from autoevaluate +author: John Snow Labs +name: multi_class_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multi_class_classification` is an English model originally trained by autoevaluate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_class_classification_en_5.2.0_3.0_1700349263077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_class_classification_en_5.2.0_3.0_1700349263077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("multi_class_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("multi_class_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_class_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/autoevaluate/multi-class-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-natural_language_inference_autoevaluate_en.md b/docs/_posts/ahmedlone127/2023-11-18-natural_language_inference_autoevaluate_en.md new file mode 100644 index 000000000000..ab9edb3f30e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-natural_language_inference_autoevaluate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English natural_language_inference_autoevaluate DistilBertForSequenceClassification from autoevaluate +author: John Snow Labs +name: natural_language_inference_autoevaluate +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `natural_language_inference_autoevaluate` is an English model originally trained by autoevaluate. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/natural_language_inference_autoevaluate_en_5.2.0_3.0_1700344168959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/natural_language_inference_autoevaluate_en_5.2.0_3.0_1700344168959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("natural_language_inference_autoevaluate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("natural_language_inference_autoevaluate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|natural_language_inference_autoevaluate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/autoevaluate/natural-language-inference \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-nepal_bhasa_text_classifications_en.md b/docs/_posts/ahmedlone127/2023-11-18-nepal_bhasa_text_classifications_en.md new file mode 100644 index 000000000000..9310248952eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-nepal_bhasa_text_classifications_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nepal_bhasa_text_classifications DistilBertForSequenceClassification from rayschwartz +author: John Snow Labs +name: nepal_bhasa_text_classifications +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepal_bhasa_text_classifications` is an English model originally trained by rayschwartz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_text_classifications_en_5.2.0_3.0_1700347962472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_text_classifications_en_5.2.0_3.0_1700347962472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepal_bhasa_text_classifications","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepal_bhasa_text_classifications","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_text_classifications| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rayschwartz/new-text-classifications \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_distillbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_distillbert_en.md new file mode 100644 index 000000000000..edfb10b194de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_distillbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_sentiment_distillbert DistilBertForSequenceClassification from Harvinder6766 +author: John Snow Labs +name: news_sentiment_distillbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_sentiment_distillbert` is an English model originally trained by Harvinder6766. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_sentiment_distillbert_en_5.2.0_3.0_1700340870647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_sentiment_distillbert_en_5.2.0_3.0_1700340870647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_distillbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_distillbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_sentiment_distillbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Harvinder6766/news_sentiment_distillbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_sentence_v1_en.md b/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_sentence_v1_en.md new file mode 100644 index 000000000000..a59166868d97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-news_sentiment_sentence_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_sentiment_sentence_v1 DistilBertForSequenceClassification from Harvinder6766 +author: John Snow Labs +name: news_sentiment_sentence_v1 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_sentiment_sentence_v1` is an English model originally trained by Harvinder6766. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_sentiment_sentence_v1_en_5.2.0_3.0_1700341392770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_sentiment_sentence_v1_en_5.2.0_3.0_1700341392770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_sentence_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_sentence_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_sentiment_sentence_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Harvinder6766/news_sentiment_sentence_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-news_topic_classification_with_bert_en.md b/docs/_posts/ahmedlone127/2023-11-18-news_topic_classification_with_bert_en.md new file mode 100644 index 000000000000..3d10ee3bd2f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-news_topic_classification_with_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_topic_classification_with_bert DistilBertForSequenceClassification from wesleyacheng +author: John Snow Labs +name: news_topic_classification_with_bert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_topic_classification_with_bert` is an English model originally trained by wesleyacheng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_topic_classification_with_bert_en_5.2.0_3.0_1700339800237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_topic_classification_with_bert_en_5.2.0_3.0_1700339800237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_topic_classification_with_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_topic_classification_with_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_topic_classification_with_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wesleyacheng/news-topic-classification-with-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-nigerian_pidgin_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-nigerian_pidgin_model_en.md new file mode 100644 index 000000000000..fe92feeb4abe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-nigerian_pidgin_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nigerian_pidgin_model DistilBertForSequenceClassification from RonTuretzky +author: John Snow Labs +name: nigerian_pidgin_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nigerian_pidgin_model` is an English model originally trained by RonTuretzky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nigerian_pidgin_model_en_5.2.0_3.0_1700350508410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nigerian_pidgin_model_en_5.2.0_3.0_1700350508410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nigerian_pidgin_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nigerian_pidgin_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nigerian_pidgin_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/RonTuretzky/pcm_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-nsfw_prompt_detector_en.md b/docs/_posts/ahmedlone127/2023-11-18-nsfw_prompt_detector_en.md new file mode 100644 index 000000000000..e9923d695cd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-nsfw_prompt_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nsfw_prompt_detector DistilBertForSequenceClassification from Zlatislav +author: John Snow Labs +name: nsfw_prompt_detector +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nsfw_prompt_detector` is an English model originally trained by Zlatislav. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nsfw_prompt_detector_en_5.2.0_3.0_1700340548921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nsfw_prompt_detector_en_5.2.0_3.0_1700340548921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nsfw_prompt_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nsfw_prompt_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nsfw_prompt_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Zlatislav/NSFW-Prompt-Detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-nsfw_text_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-nsfw_text_classifier_en.md new file mode 100644 index 000000000000..88a5e6cd759d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-nsfw_text_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nsfw_text_classifier DistilBertForSequenceClassification from michellejieli +author: John Snow Labs +name: nsfw_text_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nsfw_text_classifier` is an English model originally trained by michellejieli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nsfw_text_classifier_en_5.2.0_3.0_1700339109622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nsfw_text_classifier_en_5.2.0_3.0_1700339109622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nsfw_text_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nsfw_text_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nsfw_text_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/michellejieli/NSFW_text_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-offensive_speech_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-offensive_speech_detection_en.md new file mode 100644 index 000000000000..fd2c903606ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-offensive_speech_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English offensive_speech_detection DistilBertForSequenceClassification from Falconsai +author: John Snow Labs +name: offensive_speech_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `offensive_speech_detection` is an English model originally trained by Falconsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/offensive_speech_detection_en_5.2.0_3.0_1700340396073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/offensive_speech_detection_en_5.2.0_3.0_1700340396073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("offensive_speech_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("offensive_speech_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|offensive_speech_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Falconsai/offensive_speech_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-patronizing_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-patronizing_detection_en.md new file mode 100644 index 000000000000..0c2ea8cb8680 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-patronizing_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English patronizing_detection DistilBertForSequenceClassification from achyut +author: John Snow Labs +name: patronizing_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `patronizing_detection` is an English model originally trained by achyut. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/patronizing_detection_en_5.2.0_3.0_1700351620421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/patronizing_detection_en_5.2.0_3.0_1700351620421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("patronizing_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("patronizing_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|patronizing_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/achyut/patronizing_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-pengmengjie_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-pengmengjie_finetuned_emotion_en.md new file mode 100644 index 000000000000..0295b0ccfeea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-pengmengjie_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pengmengjie_finetuned_emotion DistilBertForSequenceClassification from ASCCCCCCCC +author: John Snow Labs +name: pengmengjie_finetuned_emotion +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pengmengjie_finetuned_emotion` is an English model originally trained by ASCCCCCCCC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pengmengjie_finetuned_emotion_en_5.2.0_3.0_1700351462868.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pengmengjie_finetuned_emotion_en_5.2.0_3.0_1700351462868.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("pengmengjie_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("pengmengjie_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pengmengjie_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ASCCCCCCCC/PENGMENGJIE-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-phishing_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-phishing_distilbert_en.md new file mode 100644 index 000000000000..b7221cd80840 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-phishing_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English phishing_distilbert DistilBertForSequenceClassification from foghlaimeoir +author: John Snow Labs +name: phishing_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phishing_distilbert` is an English model originally trained by foghlaimeoir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phishing_distilbert_en_5.2.0_3.0_1700342475896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phishing_distilbert_en_5.2.0_3.0_1700342475896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("phishing_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("phishing_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phishing_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/foghlaimeoir/phishing-DistilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-phishing_email_detection_en.md b/docs/_posts/ahmedlone127/2023-11-18-phishing_email_detection_en.md new file mode 100644 index 000000000000..c046bd237316 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-phishing_email_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English phishing_email_detection DistilBertForSequenceClassification from dima806 +author: John Snow Labs +name: phishing_email_detection +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phishing_email_detection` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phishing_email_detection_en_5.2.0_3.0_1700339993302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phishing_email_detection_en_5.2.0_3.0_1700339993302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("phishing_email_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("phishing_email_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phishing_email_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/dima806/phishing-email-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-poem_labeler_en.md b/docs/_posts/ahmedlone127/2023-11-18-poem_labeler_en.md new file mode 100644 index 000000000000..ef0381d7985e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-poem_labeler_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English poem_labeler DistilBertForSequenceClassification from jgeselowitz +author: John Snow Labs +name: poem_labeler +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `poem_labeler` is an English model originally trained by jgeselowitz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/poem_labeler_en_5.2.0_3.0_1700342004311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/poem_labeler_en_5.2.0_3.0_1700342004311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("poem_labeler","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("poem_labeler","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|poem_labeler| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jgeselowitz/poem_labeler \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-product_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-product_classifier_en.md new file mode 100644 index 000000000000..560a530f0b97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-product_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English product_classifier DistilBertForSequenceClassification from cnicu +author: John Snow Labs +name: product_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `product_classifier` is an English model originally trained by cnicu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/product_classifier_en_5.2.0_3.0_1700338940109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/product_classifier_en_5.2.0_3.0_1700338940109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("product_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("product_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|product_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cnicu/product_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-psychiq2_en.md b/docs/_posts/ahmedlone127/2023-11-18-psychiq2_en.md new file mode 100644 index 000000000000..f7d4b87913ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-psychiq2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English psychiq2 DistilBertForSequenceClassification from derenrich +author: John Snow Labs +name: psychiq2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psychiq2` is an English model originally trained by derenrich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psychiq2_en_5.2.0_3.0_1700338486891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psychiq2_en_5.2.0_3.0_1700338486891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("psychiq2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("psychiq2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|psychiq2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.3 MB| + +## References + +https://huggingface.co/derenrich/psychiq2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-ptbr_news_classifier_pt.md b/docs/_posts/ahmedlone127/2023-11-18-ptbr_news_classifier_pt.md new file mode 100644 index 000000000000..623f31f9b8fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-ptbr_news_classifier_pt.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Portuguese ptbr_news_classifier DistilBertForSequenceClassification from jmbrito +author: John Snow Labs +name: ptbr_news_classifier +date: 2023-11-18 +tags: [bert, pt, open_source, sequence_classification, onnx] +task: Text Classification +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ptbr_news_classifier` is a Portuguese model originally trained by jmbrito. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ptbr_news_classifier_pt_5.2.0_3.0_1700348150327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ptbr_news_classifier_pt_5.2.0_3.0_1700348150327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ptbr_news_classifier","pt")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ptbr_news_classifier","pt") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ptbr_news_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pt| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jmbrito/ptbr-news-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-quora_helpful_answers_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-quora_helpful_answers_classifier_en.md new file mode 100644 index 000000000000..378d1fab51e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-quora_helpful_answers_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_helpful_answers_classifier DistilBertForSequenceClassification from Radella +author: John Snow Labs +name: quora_helpful_answers_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_helpful_answers_classifier` is an English model originally trained by Radella. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_helpful_answers_classifier_en_5.2.0_3.0_1700346083400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_helpful_answers_classifier_en_5.2.0_3.0_1700346083400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("quora_helpful_answers_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("quora_helpful_answers_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_helpful_answers_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Radella/quora_helpful_answers_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-rea_genderidentification_v1_en.md b/docs/_posts/ahmedlone127/2023-11-18-rea_genderidentification_v1_en.md new file mode 100644 index 000000000000..4ae06258c818 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-rea_genderidentification_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rea_genderidentification_v1 DistilBertForSequenceClassification from malcolm +author: John Snow Labs +name: rea_genderidentification_v1 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rea_genderidentification_v1` is an English model originally trained by malcolm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rea_genderidentification_v1_en_5.2.0_3.0_1700342807827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rea_genderidentification_v1_en_5.2.0_3.0_1700342807827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("rea_genderidentification_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("rea_genderidentification_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rea_genderidentification_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/malcolm/REA_GenderIdentification_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-refutation_detector_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-refutation_detector_distilbert_en.md new file mode 100644 index 000000000000..619437ff459c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-refutation_detector_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English refutation_detector_distilbert DistilBertForSequenceClassification from leondz +author: John Snow Labs +name: refutation_detector_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `refutation_detector_distilbert` is an English model originally trained by leondz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/refutation_detector_distilbert_en_5.2.0_3.0_1700340710799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/refutation_detector_distilbert_en_5.2.0_3.0_1700340710799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("refutation_detector_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("refutation_detector_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|refutation_detector_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/leondz/refutation_detector_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-results_iameberedavid_en.md b/docs/_posts/ahmedlone127/2023-11-18-results_iameberedavid_en.md new file mode 100644 index 000000000000..a86b4f43091e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-results_iameberedavid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English results_iameberedavid DistilBertForSequenceClassification from iameberedavid +author: John Snow Labs +name: results_iameberedavid +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results_iameberedavid` is an English model originally trained by iameberedavid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_iameberedavid_en_5.2.0_3.0_1700346883067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_iameberedavid_en_5.2.0_3.0_1700346883067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("results_iameberedavid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("results_iameberedavid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_iameberedavid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/iameberedavid/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_juliensimon_en.md b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_juliensimon_en.md new file mode 100644 index 000000000000..eb88a903c676 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_juliensimon_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reviews_sentiment_analysis_juliensimon DistilBertForSequenceClassification from juliensimon +author: John Snow Labs +name: reviews_sentiment_analysis_juliensimon +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reviews_sentiment_analysis_juliensimon` is an English model originally trained by juliensimon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_juliensimon_en_5.2.0_3.0_1700338872781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_juliensimon_en_5.2.0_3.0_1700338872781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_juliensimon","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_juliensimon","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reviews_sentiment_analysis_juliensimon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/juliensimon/reviews-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_mmcquade11_en.md b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_mmcquade11_en.md new file mode 100644 index 000000000000..25ae7fb82aca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_mmcquade11_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reviews_sentiment_analysis_mmcquade11 DistilBertForSequenceClassification from mmcquade11 +author: John Snow Labs +name: reviews_sentiment_analysis_mmcquade11 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reviews_sentiment_analysis_mmcquade11` is an English model originally trained by mmcquade11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_mmcquade11_en_5.2.0_3.0_1700348335313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_mmcquade11_en_5.2.0_3.0_1700348335313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_mmcquade11","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_mmcquade11","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reviews_sentiment_analysis_mmcquade11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mmcquade11/reviews-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_two_en.md b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_two_en.md new file mode 100644 index 000000000000..a88402cb8272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-reviews_sentiment_analysis_two_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reviews_sentiment_analysis_two DistilBertForSequenceClassification from mmcquade11 +author: John Snow Labs +name: reviews_sentiment_analysis_two +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reviews_sentiment_analysis_two` is an English model originally trained by mmcquade11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_two_en_5.2.0_3.0_1700344708734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reviews_sentiment_analysis_two_en_5.2.0_3.0_1700344708734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_two","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("reviews_sentiment_analysis_two","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reviews_sentiment_analysis_two| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mmcquade11/reviews-sentiment-analysis-two \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sagemaker_distilbert_emotion_javibj_en.md b/docs/_posts/ahmedlone127/2023-11-18-sagemaker_distilbert_emotion_javibj_en.md new file mode 100644 index 000000000000..0ac2793239ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sagemaker_distilbert_emotion_javibj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sagemaker_distilbert_emotion_javibj DistilBertForSequenceClassification from JaviBJ +author: John Snow Labs +name: sagemaker_distilbert_emotion_javibj +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_distilbert_emotion_javibj` is an English model originally trained by JaviBJ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_javibj_en_5.2.0_3.0_1700350384161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_javibj_en_5.2.0_3.0_1700350384161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_javibj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_javibj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_distilbert_emotion_javibj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JaviBJ/sagemaker-distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sarcasm_model_zykrr_en.md b/docs/_posts/ahmedlone127/2023-11-18-sarcasm_model_zykrr_en.md new file mode 100644 index 000000000000..0ea3bf86b2bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sarcasm_model_zykrr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sarcasm_model_zykrr DistilBertForSequenceClassification from zykrr +author: John Snow Labs +name: sarcasm_model_zykrr +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sarcasm_model_zykrr` is an English model originally trained by zykrr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sarcasm_model_zykrr_en_5.2.0_3.0_1700341503883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sarcasm_model_zykrr_en_5.2.0_3.0_1700341503883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sarcasm_model_zykrr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sarcasm_model_zykrr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sarcasm_model_zykrr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/zykrr/sarcasm_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-scan_u_doc_bool_question_en.md b/docs/_posts/ahmedlone127/2023-11-18-scan_u_doc_bool_question_en.md new file mode 100644 index 000000000000..067ae3fe4b3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-scan_u_doc_bool_question_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English scan_u_doc_bool_question DistilBertForSequenceClassification from roaltopo +author: John Snow Labs +name: scan_u_doc_bool_question +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scan_u_doc_bool_question` is an English model originally trained by roaltopo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scan_u_doc_bool_question_en_5.2.0_3.0_1700340272990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scan_u_doc_bool_question_en_5.2.0_3.0_1700340272990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("scan_u_doc_bool_question","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("scan_u_doc_bool_question","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|scan_u_doc_bool_question| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/roaltopo/scan-u-doc_bool-question \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sent_analysis_cvs_xx.md b/docs/_posts/ahmedlone127/2023-11-18-sent_analysis_cvs_xx.md new file mode 100644 index 000000000000..3bbccd5f2fd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sent_analysis_cvs_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual sent_analysis_cvs DistilBertForSequenceClassification from Softechlb +author: John Snow Labs +name: sent_analysis_cvs +date: 2023-11-18 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_analysis_cvs` is a multilingual model originally trained by Softechlb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_analysis_cvs_xx_5.2.0_3.0_1700343206467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_analysis_cvs_xx_5.2.0_3.0_1700343206467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sent_analysis_cvs","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sent_analysis_cvs","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sent_analysis_cvs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/Softechlb/Sent_analysis_CVs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentence_level_stereotype_detector_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentence_level_stereotype_detector_en.md new file mode 100644 index 000000000000..52ce22d01cf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentence_level_stereotype_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_level_stereotype_detector DistilBertForSequenceClassification from wu981526092 +author: John Snow Labs +name: sentence_level_stereotype_detector +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_level_stereotype_detector` is an English model originally trained by wu981526092. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_level_stereotype_detector_en_5.2.0_3.0_1700342331802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_level_stereotype_detector_en_5.2.0_3.0_1700342331802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentence_level_stereotype_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentence_level_stereotype_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_level_stereotype_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wu981526092/Sentence-Level-Stereotype-Detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_generic_dataset_seethal_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_generic_dataset_seethal_en.md new file mode 100644 index 000000000000..75897fa02eb8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_generic_dataset_seethal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_generic_dataset_seethal DistilBertForSequenceClassification from Seethal +author: John Snow Labs +name: sentiment_analysis_generic_dataset_seethal +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_generic_dataset_seethal` is an English model originally trained by Seethal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_generic_dataset_seethal_en_5.2.0_3.0_1700337790087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_generic_dataset_seethal_en_5.2.0_3.0_1700337790087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_generic_dataset_seethal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_generic_dataset_seethal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_generic_dataset_seethal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Seethal/sentiment_analysis_generic_dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_for_socialmedia_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_for_socialmedia_en.md new file mode 100644 index 000000000000..a5402220d98c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_for_socialmedia_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_model_for_socialmedia DistilBertForSequenceClassification from Remicm +author: John Snow Labs +name: sentiment_analysis_model_for_socialmedia +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_model_for_socialmedia` is an English model originally trained by Remicm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_model_for_socialmedia_en_5.2.0_3.0_1700339631163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_model_for_socialmedia_en_5.2.0_3.0_1700339631163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_model_for_socialmedia","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_model_for_socialmedia","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_model_for_socialmedia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Remicm/sentiment-analysis-model-for-socialmedia \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_sbcbi_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_sbcbi_en.md new file mode 100644 index 000000000000..92df232b43d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_model_sbcbi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_model_sbcbi DistilBertForSequenceClassification from sbcBI +author: John Snow Labs +name: sentiment_analysis_model_sbcbi +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_model_sbcbi` is an English model originally trained by sbcBI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_model_sbcbi_en_5.2.0_3.0_1700337168046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_model_sbcbi_en_5.2.0_3.0_1700337168046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_model_sbcbi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_model_sbcbi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_model_sbcbi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sbcBI/sentiment_analysis_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_using_bert_with_10000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_using_bert_with_10000_samples_en.md new file mode 100644 index 000000000000..5f49b8adfb4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_using_bert_with_10000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_using_bert_with_10000_samples DistilBertForSequenceClassification from Satyajithchary +author: John Snow Labs +name: sentiment_analysis_using_bert_with_10000_samples +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_using_bert_with_10000_samples` is an English model originally trained by Satyajithchary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_using_bert_with_10000_samples_en_5.2.0_3.0_1700351244638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_using_bert_with_10000_samples_en_5.2.0_3.0_1700351244638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_using_bert_with_10000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_using_bert_with_10000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_using_bert_with_10000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Satyajithchary/Sentiment_Analysis_using_BERT_With_10000_Samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_xaqren_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_xaqren_en.md new file mode 100644 index 000000000000..4d81d78b7276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_analysis_xaqren_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_xaqren DistilBertForSequenceClassification from xaqren +author: John Snow Labs +name: sentiment_analysis_xaqren +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_xaqren` is an English model originally trained by xaqren. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_xaqren_en_5.2.0_3.0_1700343671268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_xaqren_en_5.2.0_3.0_1700343671268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_xaqren","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_xaqren","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_xaqren| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/xaqren/sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_finetuning_using_steam_data_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_finetuning_using_steam_data_en.md new file mode 100644 index 000000000000..fd5ddf281105 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_finetuning_using_steam_data_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_finetuning_using_steam_data DistilBertForSequenceClassification from PJHinAI +author: John Snow Labs +name: sentiment_model_finetuning_using_steam_data +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_finetuning_using_steam_data` is an English model originally trained by PJHinAI.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_finetuning_using_steam_data_en_5.2.0_3.0_1700343528823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_finetuning_using_steam_data_en_5.2.0_3.0_1700343528823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_finetuning_using_steam_data","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_finetuning_using_steam_data","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_finetuning_using_steam_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/PJHinAI/sentiment-model-finetuning-using-steam-data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_zykrr_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_zykrr_en.md new file mode 100644 index 000000000000..71499a946d7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentiment_model_zykrr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_zykrr DistilBertForSequenceClassification from zykrr +author: John Snow Labs +name: sentiment_model_zykrr +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_zykrr` is an English model originally trained by zykrr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_zykrr_en_5.2.0_3.0_1700341353745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_zykrr_en_5.2.0_3.0_1700341353745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_zykrr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_zykrr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_zykrr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/zykrr/sentiment_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentimental_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentimental_analysis_en.md new file mode 100644 index 000000000000..e4cb612bc134 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentimental_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentimental_analysis DistilBertForSequenceClassification from Dmyadav2001 +author: John Snow Labs +name: sentimental_analysis +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentimental_analysis` is an English model originally trained by Dmyadav2001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentimental_analysis_en_5.2.0_3.0_1700341205504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentimental_analysis_en_5.2.0_3.0_1700341205504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentimental_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentimental_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentimental_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Dmyadav2001/Sentimental-Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sentimentclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-sentimentclassifier_en.md new file mode 100644 index 000000000000..337ab8587f5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sentimentclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentimentclassifier DistilBertForSequenceClassification from BaxterAI +author: John Snow Labs +name: sentimentclassifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentimentclassifier` is an English model originally trained by BaxterAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentimentclassifier_en_5.2.0_3.0_1700351311819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentimentclassifier_en_5.2.0_3.0_1700351311819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentimentclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentimentclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentimentclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/BaxterAI/SentimentClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-shakespeare_classifier_model_en.md b/docs/_posts/ahmedlone127/2023-11-18-shakespeare_classifier_model_en.md new file mode 100644 index 000000000000..2f7b338d0d37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-shakespeare_classifier_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English shakespeare_classifier_model DistilBertForSequenceClassification from notaphoenix +author: John Snow Labs +name: shakespeare_classifier_model +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shakespeare_classifier_model` is an English model originally trained by notaphoenix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shakespeare_classifier_model_en_5.2.0_3.0_1700347163676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shakespeare_classifier_model_en_5.2.0_3.0_1700347163676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("shakespeare_classifier_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("shakespeare_classifier_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|shakespeare_classifier_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/notaphoenix/shakespeare_classifier_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-skills_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-skills_classifier_en.md new file mode 100644 index 000000000000..6605ae757a12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-skills_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English skills_classifier DistilBertForSequenceClassification from tkuye +author: John Snow Labs +name: skills_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skills_classifier` is an English model originally trained by tkuye. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skills_classifier_en_5.2.0_3.0_1700340394699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skills_classifier_en_5.2.0_3.0_1700340394699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skills_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tkuye/skills-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-skills_description_big_en.md b/docs/_posts/ahmedlone127/2023-11-18-skills_description_big_en.md new file mode 100644 index 000000000000..fcfda73e5260 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-skills_description_big_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English skills_description_big DistilBertForSequenceClassification from joblift-julian +author: John Snow Labs +name: skills_description_big +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skills_description_big` is an English model originally trained by joblift-julian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skills_description_big_en_5.2.0_3.0_1700350810439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skills_description_big_en_5.2.0_3.0_1700350810439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_description_big","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_description_big","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skills_description_big| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/joblift-julian/skills_description_big \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-skills_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-18-skills_trainer_en.md new file mode 100644 index 000000000000..3f104cb1e11b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-skills_trainer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English skills_trainer DistilBertForSequenceClassification from bkane2 +author: John Snow Labs +name: skills_trainer +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skills_trainer` is an English model originally trained by bkane2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skills_trainer_en_5.2.0_3.0_1700344265535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skills_trainer_en_5.2.0_3.0_1700344265535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_trainer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_trainer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skills_trainer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bkane2/skills-trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-slurp_intent_baseline_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-18-slurp_intent_baseline_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..e895811ced12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-slurp_intent_baseline_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English slurp_intent_baseline_distilbert_base_uncased DistilBertForSequenceClassification from sankar1535 +author: John Snow Labs +name: slurp_intent_baseline_distilbert_base_uncased +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `slurp_intent_baseline_distilbert_base_uncased` is an English model originally trained by sankar1535. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/slurp_intent_baseline_distilbert_base_uncased_en_5.2.0_3.0_1700347487030.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/slurp_intent_baseline_distilbert_base_uncased_en_5.2.0_3.0_1700347487030.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("slurp_intent_baseline_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("slurp_intent_baseline_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|slurp_intent_baseline_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/sankar1535/slurp-intent_baseline-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-sms_spam_classification_with_bert_en.md b/docs/_posts/ahmedlone127/2023-11-18-sms_spam_classification_with_bert_en.md new file mode 100644 index 000000000000..f23ba811a149 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-sms_spam_classification_with_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sms_spam_classification_with_bert DistilBertForSequenceClassification from wesleyacheng +author: John Snow Labs +name: sms_spam_classification_with_bert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sms_spam_classification_with_bert` is an English model originally trained by wesleyacheng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sms_spam_classification_with_bert_en_5.2.0_3.0_1700339863289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sms_spam_classification_with_bert_en_5.2.0_3.0_1700339863289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sms_spam_classification_with_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sms_spam_classification_with_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sms_spam_classification_with_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wesleyacheng/sms-spam-classification-with-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-snli_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-18-snli_distilbert_base_cased_en.md new file mode 100644 index 000000000000..5623a7cedf69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-snli_distilbert_base_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English snli_distilbert_base_cased DistilBertForSequenceClassification from boychaboy +author: John Snow Labs +name: snli_distilbert_base_cased +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `snli_distilbert_base_cased` is an English model originally trained by boychaboy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/snli_distilbert_base_cased_en_5.2.0_3.0_1700346627588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/snli_distilbert_base_cased_en_5.2.0_3.0_1700346627588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("snli_distilbert_base_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("snli_distilbert_base_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|snli_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/boychaboy/SNLI_distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-spam_classifier_skandavivek2_en.md b/docs/_posts/ahmedlone127/2023-11-18-spam_classifier_skandavivek2_en.md new file mode 100644 index 000000000000..40eaad0075eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-spam_classifier_skandavivek2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spam_classifier_skandavivek2 DistilBertForSequenceClassification from skandavivek2 +author: John Snow Labs +name: spam_classifier_skandavivek2 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spam_classifier_skandavivek2` is an English model originally trained by skandavivek2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spam_classifier_skandavivek2_en_5.2.0_3.0_1700341017147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spam_classifier_skandavivek2_en_5.2.0_3.0_1700341017147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_classifier_skandavivek2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_classifier_skandavivek2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spam_classifier_skandavivek2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/skandavivek2/spam-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-spam_message_classification_bw7898_en.md b/docs/_posts/ahmedlone127/2023-11-18-spam_message_classification_bw7898_en.md new file mode 100644 index 000000000000..d4b1d923a9b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-spam_message_classification_bw7898_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spam_message_classification_bw7898 DistilBertForSequenceClassification from BW7898 +author: John Snow Labs +name: spam_message_classification_bw7898 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spam_message_classification_bw7898` is an English model originally trained by BW7898. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spam_message_classification_bw7898_en_5.2.0_3.0_1700344105211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spam_message_classification_bw7898_en_5.2.0_3.0_1700344105211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_message_classification_bw7898","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_message_classification_bw7898","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spam_message_classification_bw7898| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/BW7898/spam_message_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-stock_news_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-stock_news_distilbert_en.md new file mode 100644 index 000000000000..ed70dfba99fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-stock_news_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stock_news_distilbert DistilBertForSequenceClassification from KernAI +author: John Snow Labs +name: stock_news_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stock_news_distilbert` is an English model originally trained by KernAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stock_news_distilbert_en_5.2.0_3.0_1700339798974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stock_news_distilbert_en_5.2.0_3.0_1700339798974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("stock_news_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("stock_news_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stock_news_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/KernAI/stock-news-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-subreddit_predictor_en.md b/docs/_posts/ahmedlone127/2023-11-18-subreddit_predictor_en.md new file mode 100644 index 000000000000..6771cc13463a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-subreddit_predictor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English subreddit_predictor DistilBertForSequenceClassification from daspartho +author: John Snow Labs +name: subreddit_predictor +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `subreddit_predictor` is an English model originally trained by daspartho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/subreddit_predictor_en_5.2.0_3.0_1700347340487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/subreddit_predictor_en_5.2.0_3.0_1700347340487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("subreddit_predictor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("subreddit_predictor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|subreddit_predictor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.2 MB| + +## References + +https://huggingface.co/daspartho/subreddit-predictor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-symptom2disease_en.md b/docs/_posts/ahmedlone127/2023-11-18-symptom2disease_en.md new file mode 100644 index 000000000000..298e02a49af4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-symptom2disease_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English symptom2disease DistilBertForSequenceClassification from alibidaran +author: John Snow Labs +name: symptom2disease +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `symptom2disease` is an English model originally trained by alibidaran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/symptom2disease_en_5.2.0_3.0_1700339459054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/symptom2disease_en_5.2.0_3.0_1700339459054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("symptom2disease","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("symptom2disease","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|symptom2disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alibidaran/Symptom2disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tamil_sentiment_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-tamil_sentiment_distilbert_en.md new file mode 100644 index 000000000000..20d15ed40b51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tamil_sentiment_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tamil_sentiment_distilbert DistilBertForSequenceClassification from Vasanth +author: John Snow Labs +name: tamil_sentiment_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tamil_sentiment_distilbert` is an English model originally trained by Vasanth. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tamil_sentiment_distilbert_en_5.2.0_3.0_1700348660934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tamil_sentiment_distilbert_en_5.2.0_3.0_1700348660934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tamil_sentiment_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tamil_sentiment_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tamil_sentiment_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/Vasanth/tamil-sentiment-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-taser_en.md b/docs/_posts/ahmedlone127/2023-11-18-taser_en.md new file mode 100644 index 000000000000..446968edde21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-taser_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English taser DistilBertForSequenceClassification from dwsunimannheim +author: John Snow Labs +name: taser +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taser` is an English model originally trained by dwsunimannheim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taser_en_5.2.0_3.0_1700345609626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taser_en_5.2.0_3.0_1700345609626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("taser","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("taser","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taser| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dwsunimannheim/TaSeR \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-telugu_sentiment_movie_te.md b/docs/_posts/ahmedlone127/2023-11-18-telugu_sentiment_movie_te.md new file mode 100644 index 000000000000..e689ee501c0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-telugu_sentiment_movie_te.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Telugu telugu_sentiment_movie DistilBertForSequenceClassification from Sanath369 +author: John Snow Labs +name: telugu_sentiment_movie +date: 2023-11-18 +tags: [bert, te, open_source, sequence_classification, onnx] +task: Text Classification +language: te +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `telugu_sentiment_movie` is a Telugu model originally trained by Sanath369. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/telugu_sentiment_movie_te_5.2.0_3.0_1700350992125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/telugu_sentiment_movie_te_5.2.0_3.0_1700350992125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("telugu_sentiment_movie","te")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("telugu_sentiment_movie","te") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|telugu_sentiment_movie| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|te| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sanath369/Telugu_sentiment_movie \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-testpytorchclassification_en.md b/docs/_posts/ahmedlone127/2023-11-18-testpytorchclassification_en.md new file mode 100644 index 000000000000..b987c41520c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-testpytorchclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English testpytorchclassification DistilBertForSequenceClassification from Andranik +author: John Snow Labs +name: testpytorchclassification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testpytorchclassification` is an English model originally trained by Andranik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testpytorchclassification_en_5.2.0_3.0_1700350114443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testpytorchclassification_en_5.2.0_3.0_1700350114443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("testpytorchclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("testpytorchclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testpytorchclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Andranik/TestPytorchClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-text_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-18-text_analysis_en.md new file mode 100644 index 000000000000..6736feedc0f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-text_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_analysis DistilBertForSequenceClassification from abcdda +author: John Snow Labs +name: text_analysis +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_analysis` is an English model originally trained by abcdda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_analysis_en_5.2.0_3.0_1700351685769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_analysis_en_5.2.0_3.0_1700351685769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/abcdda/text_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-text_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-18-text_emotion_en.md new file mode 100644 index 000000000000..5b3b9980a649 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-text_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_emotion DistilBertForSequenceClassification from daspartho +author: John Snow Labs +name: text_emotion +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_emotion` is an English model originally trained by daspartho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_emotion_en_5.2.0_3.0_1700348513255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_emotion_en_5.2.0_3.0_1700348513255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/daspartho/text-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_classification_fast_6_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_classification_fast_6_en.md new file mode 100644 index 000000000000..22de26d4189b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_classification_fast_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_classification_fast_6 DistilBertForSequenceClassification from Elytum +author: John Snow Labs +name: tiny_classification_fast_6 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_classification_fast_6` is an English model originally trained by Elytum. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_classification_fast_6_en_5.2.0_3.0_1700343511592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_classification_fast_6_en_5.2.0_3.0_1700343511592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_classification_fast_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_classification_fast_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_classification_fast_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Elytum/tiny-classification-fast-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_philschmid_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_philschmid_en.md new file mode 100644 index 000000000000..79adf9bea3be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_philschmid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_distilbert_classification_philschmid DistilBertForSequenceClassification from philschmid +author: John Snow Labs +name: tiny_distilbert_classification_philschmid +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_distilbert_classification_philschmid` is an English model originally trained by philschmid. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_distilbert_classification_philschmid_en_5.2.0_3.0_1700340452025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_distilbert_classification_philschmid_en_5.2.0_3.0_1700340452025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_classification_philschmid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_classification_philschmid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_distilbert_classification_philschmid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|552.8 KB| + +## References + +https://huggingface.co/philschmid/tiny-distilbert-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_sgugger_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_sgugger_en.md new file mode 100644 index 000000000000..bc57312e53dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_classification_sgugger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_distilbert_classification_sgugger DistilBertForSequenceClassification from sgugger +author: John Snow Labs +name: tiny_distilbert_classification_sgugger +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_distilbert_classification_sgugger` is an English model originally trained by sgugger. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_distilbert_classification_sgugger_en_5.2.0_3.0_1700338292646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_distilbert_classification_sgugger_en_5.2.0_3.0_1700338292646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_classification_sgugger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_classification_sgugger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_distilbert_classification_sgugger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|552.8 KB| + +## References + +https://huggingface.co/sgugger/tiny-distilbert-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_sequence_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_sequence_classification_en.md new file mode 100644 index 000000000000..2611f763dd10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_distilbert_sequence_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_distilbert_sequence_classification DistilBertForSequenceClassification from Narsil +author: John Snow Labs +name: tiny_distilbert_sequence_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_distilbert_sequence_classification` is an English model originally trained by Narsil. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_distilbert_sequence_classification_en_5.2.0_3.0_1700348619335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_distilbert_sequence_classification_en_5.2.0_3.0_1700348619335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_sequence_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_distilbert_sequence_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_distilbert_sequence_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|220.1 KB| + +## References + +https://huggingface.co/Narsil/tiny-distilbert-sequence-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbert_hf_internal_testing_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbert_hf_internal_testing_en.md new file mode 100644 index 000000000000..530e30cdb40d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbert_hf_internal_testing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_random_distilbert_hf_internal_testing DistilBertForSequenceClassification from hf-internal-testing +author: John Snow Labs +name: tiny_random_distilbert_hf_internal_testing +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbert_hf_internal_testing` is an English model originally trained by hf-internal-testing. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbert_hf_internal_testing_en_5.2.0_3.0_1700337036515.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbert_hf_internal_testing_en_5.2.0_3.0_1700337036515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbert_hf_internal_testing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbert_hf_internal_testing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbert_hf_internal_testing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|351.4 KB| + +## References + +https://huggingface.co/hf-internal-testing/tiny-random-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_internal_testing_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_internal_testing_en.md new file mode 100644 index 000000000000..3385bae47e06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_internal_testing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_random_distilbertforsequenceclassification_hf_internal_testing DistilBertForSequenceClassification from hf-internal-testing +author: John Snow Labs +name: tiny_random_distilbertforsequenceclassification_hf_internal_testing +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbertforsequenceclassification_hf_internal_testing` is an English model originally trained by hf-internal-testing. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforsequenceclassification_hf_internal_testing_en_5.2.0_3.0_1700338654216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforsequenceclassification_hf_internal_testing_en_5.2.0_3.0_1700338654216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbertforsequenceclassification_hf_internal_testing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbertforsequenceclassification_hf_internal_testing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbertforsequenceclassification_hf_internal_testing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|351.3 KB| + +## References + +https://huggingface.co/hf-internal-testing/tiny-random-DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_tiny_model_private_en.md b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_tiny_model_private_en.md new file mode 100644 index 000000000000..20b1fd3f1531 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tiny_random_distilbertforsequenceclassification_hf_tiny_model_private_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tiny_random_distilbertforsequenceclassification_hf_tiny_model_private DistilBertForSequenceClassification from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_distilbertforsequenceclassification_hf_tiny_model_private +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbertforsequenceclassification_hf_tiny_model_private` is an English model originally trained by hf-tiny-model-private. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforsequenceclassification_hf_tiny_model_private_en_5.2.0_3.0_1700347755173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforsequenceclassification_hf_tiny_model_private_en_5.2.0_3.0_1700347755173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbertforsequenceclassification_hf_tiny_model_private","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tiny_random_distilbertforsequenceclassification_hf_tiny_model_private","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbertforsequenceclassification_hf_tiny_model_private| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|351.3 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-DistilBertForSequenceClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tonality_sree2910_en.md b/docs/_posts/ahmedlone127/2023-11-18-tonality_sree2910_en.md new file mode 100644 index 000000000000..fa84df26d24f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tonality_sree2910_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tonality_sree2910 DistilBertForSequenceClassification from sree2910 +author: John Snow Labs +name: tonality_sree2910 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tonality_sree2910` is an English model originally trained by sree2910. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tonality_sree2910_en_5.2.0_3.0_1700350394820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tonality_sree2910_en_5.2.0_3.0_1700350394820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tonality_sree2910","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tonality_sree2910","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tonality_sree2910| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/sree2910/tonality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-toxic_tweets_en.md b/docs/_posts/ahmedlone127/2023-11-18-toxic_tweets_en.md new file mode 100644 index 000000000000..67af78247702 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-toxic_tweets_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English toxic_tweets DistilBertForSequenceClassification from Kev07 +author: John Snow Labs +name: toxic_tweets +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_tweets` is an English model originally trained by Kev07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_tweets_en_5.2.0_3.0_1700341501290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_tweets_en_5.2.0_3.0_1700341501290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("toxic_tweets","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("toxic_tweets","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kev07/Toxic-Tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-tweet_sentiment_analysis_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-tweet_sentiment_analysis_distilbert_en.md new file mode 100644 index 000000000000..f1649b8dc3c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-tweet_sentiment_analysis_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweet_sentiment_analysis_distilbert DistilBertForSequenceClassification from bambadij +author: John Snow Labs +name: tweet_sentiment_analysis_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_sentiment_analysis_distilbert` is an English model originally trained by bambadij. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_sentiment_analysis_distilbert_en_5.2.0_3.0_1700342464504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_sentiment_analysis_distilbert_en_5.2.0_3.0_1700342464504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tweet_sentiment_analysis_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tweet_sentiment_analysis_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweet_sentiment_analysis_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/bambadij/Tweet_sentiment_analysis_Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-twitter_disaster_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-18-twitter_disaster_distilbert_en.md new file mode 100644 index 000000000000..8c54a1e9259f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-twitter_disaster_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_disaster_distilbert DistilBertForSequenceClassification from ReynaQuita +author: John Snow Labs +name: twitter_disaster_distilbert +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_disaster_distilbert` is an English model originally trained by ReynaQuita. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_disaster_distilbert_en_5.2.0_3.0_1700347650545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_disaster_distilbert_en_5.2.0_3.0_1700347650545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_disaster_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_disaster_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_disaster_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ReynaQuita/twitter_disaster_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-twitter_emotion_polish_fast_pl.md b/docs/_posts/ahmedlone127/2023-11-18-twitter_emotion_polish_fast_pl.md new file mode 100644 index 000000000000..ed1975c7d0fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-twitter_emotion_polish_fast_pl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Polish twitter_emotion_polish_fast DistilBertForSequenceClassification from bardsai +author: John Snow Labs +name: twitter_emotion_polish_fast +date: 2023-11-18 +tags: [bert, pl, open_source, sequence_classification, onnx] +task: Text Classification +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_emotion_polish_fast` is a Polish model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_emotion_polish_fast_pl_5.2.0_3.0_1700341694313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_emotion_polish_fast_pl_5.2.0_3.0_1700341694313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_emotion_polish_fast","pl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_emotion_polish_fast","pl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_emotion_polish_fast| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pl| +|Size:|510.7 MB| + +## References + +https://huggingface.co/bardsai/twitter-emotion-pl-fast \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-twitter_input_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-18-twitter_input_classifier_en.md new file mode 100644 index 000000000000..ed093023c264 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-twitter_input_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_input_classifier DistilBertForSequenceClassification from Abris +author: John Snow Labs +name: twitter_input_classifier +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_input_classifier` is an English model originally trained by Abris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_input_classifier_en_5.2.0_3.0_1700350646379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_input_classifier_en_5.2.0_3.0_1700350646379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_input_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_input_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_input_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Abris/twitter-input-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_analysis_alexgeh196_en.md b/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_analysis_alexgeh196_en.md new file mode 100644 index 000000000000..73b2bd6b15e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_analysis_alexgeh196_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_sentiment_analysis_alexgeh196 DistilBertForSequenceClassification from alexgeh196 +author: John Snow Labs +name: twitter_sentiment_analysis_alexgeh196 +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_sentiment_analysis_alexgeh196` is an English model originally trained by alexgeh196. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_sentiment_analysis_alexgeh196_en_5.2.0_3.0_1700340685642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_sentiment_analysis_alexgeh196_en_5.2.0_3.0_1700340685642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_sentiment_analysis_alexgeh196","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_sentiment_analysis_alexgeh196","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_sentiment_analysis_alexgeh196| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexgeh196/twitter-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_polish_fast_pl.md b/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_polish_fast_pl.md new file mode 100644 index 000000000000..4be0411dca4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-twitter_sentiment_polish_fast_pl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Polish twitter_sentiment_polish_fast DistilBertForSequenceClassification from bardsai +author: John Snow Labs +name: twitter_sentiment_polish_fast +date: 2023-11-18 +tags: [bert, pl, open_source, sequence_classification, onnx] +task: Text Classification +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_sentiment_polish_fast` is a Polish model originally trained by bardsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_sentiment_polish_fast_pl_5.2.0_3.0_1700346260226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_sentiment_polish_fast_pl_5.2.0_3.0_1700346260226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_sentiment_polish_fast","pl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_sentiment_polish_fast","pl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_sentiment_polish_fast| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|pl| +|Size:|510.6 MB| + +## References + +https://huggingface.co/bardsai/twitter-sentiment-pl-fast \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-website_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-website_classification_en.md new file mode 100644 index 000000000000..1e9c4724c26a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-website_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English website_classification DistilBertForSequenceClassification from alimazhar-110 +author: John Snow Labs +name: website_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `website_classification` is an English model originally trained by alimazhar-110. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/website_classification_en_5.2.0_3.0_1700338157294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/website_classification_en_5.2.0_3.0_1700338157294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("website_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("website_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|website_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alimazhar-110/website_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-wildlife_classification_en.md b/docs/_posts/ahmedlone127/2023-11-18-wildlife_classification_en.md new file mode 100644 index 000000000000..2903b2cc35cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-wildlife_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English wildlife_classification DistilBertForSequenceClassification from julesbarbosa +author: John Snow Labs +name: wildlife_classification +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wildlife_classification` is an English model originally trained by julesbarbosa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wildlife_classification_en_5.2.0_3.0_1700338940196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wildlife_classification_en_5.2.0_3.0_1700338940196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("wildlife_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("wildlife_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wildlife_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/julesbarbosa/wildlife-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-yelp_restaurant_review_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-18-yelp_restaurant_review_sentiment_analysis_en.md new file mode 100644 index 000000000000..30f2bce054b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-yelp_restaurant_review_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English yelp_restaurant_review_sentiment_analysis DistilBertForSequenceClassification from mrcaelumn +author: John Snow Labs +name: yelp_restaurant_review_sentiment_analysis +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yelp_restaurant_review_sentiment_analysis` is an English model originally trained by mrcaelumn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yelp_restaurant_review_sentiment_analysis_en_5.2.0_3.0_1700340570020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yelp_restaurant_review_sentiment_analysis_en_5.2.0_3.0_1700340570020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("yelp_restaurant_review_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("yelp_restaurant_review_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yelp_restaurant_review_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mrcaelumn/yelp_restaurant_review_sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-18-yt_title_grader_en.md b/docs/_posts/ahmedlone127/2023-11-18-yt_title_grader_en.md new file mode 100644 index 000000000000..4a4a5737fbf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-18-yt_title_grader_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English yt_title_grader DistilBertForSequenceClassification from focia +author: John Snow Labs +name: yt_title_grader +date: 2023-11-18 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yt_title_grader` is an English model originally trained by focia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yt_title_grader_en_5.2.0_3.0_1700339846592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yt_title_grader_en_5.2.0_3.0_1700339846592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("yt_title_grader","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("yt_title_grader","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yt_title_grader| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/focia/yt-title-grader \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-04_model_sales_external_en.md b/docs/_posts/ahmedlone127/2023-11-19-04_model_sales_external_en.md new file mode 100644 index 000000000000..3968e1cc2ba2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-04_model_sales_external_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 04_model_sales_external DistilBertForSequenceClassification from hannoh +author: John Snow Labs +name: 04_model_sales_external +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `04_model_sales_external` is an English model originally trained by hannoh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/04_model_sales_external_en_5.2.0_3.0_1700370035745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/04_model_sales_external_en_5.2.0_3.0_1700370035745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("04_model_sales_external","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("04_model_sales_external","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|04_model_sales_external| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hannoh/04_model_sales_external \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-1026_en.md b/docs/_posts/ahmedlone127/2023-11-19-1026_en.md new file mode 100644 index 000000000000..8d24909cbc45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-1026_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 1026 DistilBertForSequenceClassification from tingchih +author: John Snow Labs +name: 1026 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1026` is an English model originally trained by tingchih. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1026_en_5.2.0_3.0_1700387903703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1026_en_5.2.0_3.0_1700387903703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("1026","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("1026","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|1026| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tingchih/1026 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-1104_en.md b/docs/_posts/ahmedlone127/2023-11-19-1104_en.md new file mode 100644 index 000000000000..7440795202a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-1104_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 1104 DistilBertForSequenceClassification from tingchih +author: John Snow Labs +name: 1104 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1104` is an English model originally trained by tingchih. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1104_en_5.2.0_3.0_1700359782439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1104_en_5.2.0_3.0_1700359782439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("1104","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("1104","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|1104| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tingchih/1104 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-20_news_group_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-20_news_group_classifier_en.md new file mode 100644 index 000000000000..0af67dead9dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-20_news_group_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 20_news_group_classifier DistilBertForSequenceClassification from paragsmhatre +author: John Snow Labs +name: 20_news_group_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `20_news_group_classifier` is an English model originally trained by paragsmhatre. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/20_news_group_classifier_en_5.2.0_3.0_1700356234135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/20_news_group_classifier_en_5.2.0_3.0_1700356234135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("20_news_group_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("20_news_group_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|20_news_group_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/paragsmhatre/20_news_group_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-ad_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-ad_classifier_en.md new file mode 100644 index 000000000000..db7be05396f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-ad_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ad_classifier DistilBertForSequenceClassification from Joshnicholas +author: John Snow Labs +name: ad_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ad_classifier` is an English model originally trained by Joshnicholas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ad_classifier_en_5.2.0_3.0_1700361825803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ad_classifier_en_5.2.0_3.0_1700361825803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ad_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ad_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ad_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Joshnicholas/ad-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-aesthetic_attribute_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-aesthetic_attribute_classifier_en.md new file mode 100644 index 000000000000..ab2de16124cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-aesthetic_attribute_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English aesthetic_attribute_classifier DistilBertForSequenceClassification from daveni +author: John Snow Labs +name: aesthetic_attribute_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aesthetic_attribute_classifier` is an English model originally trained by daveni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aesthetic_attribute_classifier_en_5.2.0_3.0_1700354415798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aesthetic_attribute_classifier_en_5.2.0_3.0_1700354415798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("aesthetic_attribute_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("aesthetic_attribute_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aesthetic_attribute_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/daveni/aesthetic_attribute_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-age_predict_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-age_predict_model_en.md new file mode 100644 index 000000000000..3169a08e6a01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-age_predict_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English age_predict_model DistilBertForSequenceClassification from priyabrat +author: John Snow Labs +name: age_predict_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `age_predict_model` is an English model originally trained by priyabrat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/age_predict_model_en_5.2.0_3.0_1700379705287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/age_predict_model_en_5.2.0_3.0_1700379705287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("age_predict_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("age_predict_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|age_predict_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/priyabrat/AGE_predict_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-amazon_review_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-19-amazon_review_sentiment_analysis_en.md new file mode 100644 index 000000000000..3b54b4843f08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-amazon_review_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English amazon_review_sentiment_analysis DistilBertForSequenceClassification from Christian2903 +author: John Snow Labs +name: amazon_review_sentiment_analysis +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_review_sentiment_analysis` is an English model originally trained by Christian2903. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_review_sentiment_analysis_en_5.2.0_3.0_1700382930888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_review_sentiment_analysis_en_5.2.0_3.0_1700382930888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("amazon_review_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("amazon_review_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_review_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Christian2903/amazon-review-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-anchor_classification_dmv_en.md b/docs/_posts/ahmedlone127/2023-11-19-anchor_classification_dmv_en.md new file mode 100644 index 000000000000..07ef81feae95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-anchor_classification_dmv_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English anchor_classification_dmv DistilBertForSequenceClassification from VictorZhu +author: John Snow Labs +name: anchor_classification_dmv +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `anchor_classification_dmv` is an English model originally trained by VictorZhu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/anchor_classification_dmv_en_5.2.0_3.0_1700374979052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/anchor_classification_dmv_en_5.2.0_3.0_1700374979052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("anchor_classification_dmv","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("anchor_classification_dmv","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|anchor_classification_dmv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/VictorZhu/Anchor-Classification-DMV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-anime_oriya_not_en.md b/docs/_posts/ahmedlone127/2023-11-19-anime_oriya_not_en.md new file mode 100644 index 000000000000..3836b71e6d54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-anime_oriya_not_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English anime_oriya_not DistilBertForSequenceClassification from daspartho +author: John Snow Labs +name: anime_oriya_not +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `anime_oriya_not` is an English model originally trained by daspartho. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/anime_oriya_not_en_5.2.0_3.0_1700424153119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/anime_oriya_not_en_5.2.0_3.0_1700424153119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("anime_oriya_not","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("anime_oriya_not","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|anime_oriya_not| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/daspartho/anime-or-not \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-article_classification_model_from_the_arxiv_en.md b/docs/_posts/ahmedlone127/2023-11-19-article_classification_model_from_the_arxiv_en.md new file mode 100644 index 000000000000..1a861dc0ab4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-article_classification_model_from_the_arxiv_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English article_classification_model_from_the_arxiv DistilBertForSequenceClassification from combat-helicopter +author: John Snow Labs +name: article_classification_model_from_the_arxiv +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `article_classification_model_from_the_arxiv` is an English model originally trained by combat-helicopter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/article_classification_model_from_the_arxiv_en_5.2.0_3.0_1700395263171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/article_classification_model_from_the_arxiv_en_5.2.0_3.0_1700395263171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("article_classification_model_from_the_arxiv","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("article_classification_model_from_the_arxiv","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|article_classification_model_from_the_arxiv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/combat-helicopter/article-classification-model-from-the-arxiv \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-asrs_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-19-asrs_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..4fdf0d497aa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-asrs_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English asrs_distilbert_base_uncased DistilBertForSequenceClassification from jfernsler +author: John Snow Labs +name: asrs_distilbert_base_uncased +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asrs_distilbert_base_uncased` is an English model originally trained by jfernsler. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asrs_distilbert_base_uncased_en_5.2.0_3.0_1700367166125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asrs_distilbert_base_uncased_en_5.2.0_3.0_1700367166125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("asrs_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("asrs_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|asrs_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jfernsler/ASRS_distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-autonlp_tc_13522454_en.md b/docs/_posts/ahmedlone127/2023-11-19-autonlp_tc_13522454_en.md new file mode 100644 index 000000000000..702c79fca5fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-autonlp_tc_13522454_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autonlp_tc_13522454 DistilBertForSequenceClassification from Kceilord +author: John Snow Labs +name: autonlp_tc_13522454 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_tc_13522454` is an English model originally trained by Kceilord. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_tc_13522454_en_5.2.0_3.0_1700352769518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_tc_13522454_en_5.2.0_3.0_1700352769518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("autonlp_tc_13522454","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("autonlp_tc_13522454","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_tc_13522454| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Kceilord/autonlp-tc-13522454 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-autotrain_distilbert_risk_ranker_1593356256_en.md b/docs/_posts/ahmedlone127/2023-11-19-autotrain_distilbert_risk_ranker_1593356256_en.md new file mode 100644 index 000000000000..b4c443d694dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-autotrain_distilbert_risk_ranker_1593356256_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_distilbert_risk_ranker_1593356256 DistilBertForSequenceClassification from mrosinski +author: John Snow Labs +name: autotrain_distilbert_risk_ranker_1593356256 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_distilbert_risk_ranker_1593356256` is an English model originally trained by mrosinski. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_distilbert_risk_ranker_1593356256_en_5.2.0_3.0_1700394895914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_distilbert_risk_ranker_1593356256_en_5.2.0_3.0_1700394895914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("autotrain_distilbert_risk_ranker_1593356256","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("autotrain_distilbert_risk_ranker_1593356256","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_distilbert_risk_ranker_1593356256| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mrosinski/autotrain-distilbert-risk-ranker-1593356256 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bart_large_en.md b/docs/_posts/ahmedlone127/2023-11-19-bart_large_en.md new file mode 100644 index 000000000000..d5468ba05131 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bart_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bart_large DistilBertForSequenceClassification from Mahmoud8 +author: John Snow Labs +name: bart_large +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bart_large` is an English model originally trained by Mahmoud8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_en_5.2.0_3.0_1700355138007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_en_5.2.0_3.0_1700355138007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bart_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bart_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bart_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Mahmoud8/bart-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-based_bert_sardinian_en.md b/docs/_posts/ahmedlone127/2023-11-19-based_bert_sardinian_en.md new file mode 100644 index 000000000000..9ee7201979ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-based_bert_sardinian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English based_bert_sardinian DistilBertForSequenceClassification from 0xMaka +author: John Snow Labs +name: based_bert_sardinian +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `based_bert_sardinian` is an English model originally trained by 0xMaka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/based_bert_sardinian_en_5.2.0_3.0_1700387903687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/based_bert_sardinian_en_5.2.0_3.0_1700387903687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("based_bert_sardinian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("based_bert_sardinian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|based_bert_sardinian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/0xMaka/based-bert-sc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-ben_base_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-ben_base_model_en.md new file mode 100644 index 000000000000..342c1816ef0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-ben_base_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ben_base_model DistilBertForSequenceClassification from mathildeparlo +author: John Snow Labs +name: ben_base_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ben_base_model` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ben_base_model_en_5.2.0_3.0_1700355149512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ben_base_model_en_5.2.0_3.0_1700355149512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ben_base_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ben_base_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ben_base_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mathildeparlo/ben_base_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_095ey11_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_095ey11_en.md new file mode 100644 index 000000000000..b68dc208fa17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_095ey11_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_095ey11 DistilBertForSequenceClassification from 095ey11 +author: John Snow Labs +name: bert_emotion_095ey11 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_095ey11` is an English model originally trained by 095ey11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_095ey11_en_5.2.0_3.0_1700395902438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_095ey11_en_5.2.0_3.0_1700395902438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_095ey11","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_095ey11","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_095ey11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/095ey11/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_apetulante_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_apetulante_en.md new file mode 100644 index 000000000000..a834533aee10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_apetulante_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_apetulante DistilBertForSequenceClassification from apetulante +author: John Snow Labs +name: bert_emotion_apetulante +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_apetulante` is an English model originally trained by apetulante. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_apetulante_en_5.2.0_3.0_1700426488010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_apetulante_en_5.2.0_3.0_1700426488010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_apetulante","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_apetulante","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_apetulante| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/apetulante/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_mehnaazasad_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_mehnaazasad_en.md new file mode 100644 index 000000000000..c0646c76b6d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_mehnaazasad_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_mehnaazasad DistilBertForSequenceClassification from mehnaazasad +author: John Snow Labs +name: bert_emotion_mehnaazasad +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_mehnaazasad` is an English model originally trained by mehnaazasad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_mehnaazasad_en_5.2.0_3.0_1700383745593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_mehnaazasad_en_5.2.0_3.0_1700383745593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_mehnaazasad","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_mehnaazasad","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_mehnaazasad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/mehnaazasad/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_umangchaudhry_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_umangchaudhry_en.md new file mode 100644 index 000000000000..1d462ea74c46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_emotion_umangchaudhry_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_umangchaudhry DistilBertForSequenceClassification from umangchaudhry +author: John Snow Labs +name: bert_emotion_umangchaudhry +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_umangchaudhry` is an English model originally trained by umangchaudhry. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_umangchaudhry_en_5.2.0_3.0_1700405308040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_umangchaudhry_en_5.2.0_3.0_1700405308040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_umangchaudhry","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_umangchaudhry","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_umangchaudhry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/umangchaudhry/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_model_notpretrained_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_model_notpretrained_en.md new file mode 100644 index 000000000000..1e94eb056fb4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_model_notpretrained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_model_notpretrained DistilBertForSequenceClassification from soumyasinha +author: John Snow Labs +name: bert_model_notpretrained +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_model_notpretrained` is an English model originally trained by soumyasinha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_model_notpretrained_en_5.2.0_3.0_1700352005443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_model_notpretrained_en_5.2.0_3.0_1700352005443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_notpretrained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_notpretrained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_model_notpretrained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.1 MB| + +## References + +https://huggingface.co/soumyasinha/BERT_model_notpretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_model_reddit_tsla_tracked_actions_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_model_reddit_tsla_tracked_actions_en.md new file mode 100644 index 000000000000..930e159904f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_model_reddit_tsla_tracked_actions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_model_reddit_tsla_tracked_actions DistilBertForSequenceClassification from fourthbrain-demo +author: John Snow Labs +name: bert_model_reddit_tsla_tracked_actions +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_model_reddit_tsla_tracked_actions` is an English model originally trained by fourthbrain-demo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_model_reddit_tsla_tracked_actions_en_5.2.0_3.0_1700400247464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_model_reddit_tsla_tracked_actions_en_5.2.0_3.0_1700400247464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_reddit_tsla_tracked_actions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_reddit_tsla_tracked_actions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_model_reddit_tsla_tracked_actions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fourthbrain-demo/bert_model_reddit_tsla_tracked_actions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-bert_model_tianyib_en.md b/docs/_posts/ahmedlone127/2023-11-19-bert_model_tianyib_en.md new file mode 100644 index 000000000000..d83fed9aac55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-bert_model_tianyib_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_model_tianyib DistilBertForSequenceClassification from tianyib +author: John Snow Labs +name: bert_model_tianyib +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_model_tianyib` is an English model originally trained by tianyib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_model_tianyib_en_5.2.0_3.0_1700352605435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_model_tianyib_en_5.2.0_3.0_1700352605435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_tianyib","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_model_tianyib","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_model_tianyib| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tianyib/bert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-binary_compqa_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-binary_compqa_classifier_en.md new file mode 100644 index 000000000000..e8484ee4673c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-binary_compqa_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English binary_compqa_classifier DistilBertForSequenceClassification from uhhlt +author: John Snow Labs +name: binary_compqa_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `binary_compqa_classifier` is an English model originally trained by uhhlt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/binary_compqa_classifier_en_5.2.0_3.0_1700372035281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/binary_compqa_classifier_en_5.2.0_3.0_1700372035281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_compqa_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("binary_compqa_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|binary_compqa_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/uhhlt/binary-compqa-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_icd_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_icd_model_en.md new file mode 100644 index 000000000000..99c32a5831ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_icd_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_icd_model DistilBertForSequenceClassification from jacksprat +author: John Snow Labs +name: burmese_awesome_icd_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_icd_model` is an English model originally trained by jacksprat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_icd_model_en_5.2.0_3.0_1700356973910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_icd_model_en_5.2.0_3.0_1700356973910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_icd_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_icd_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_icd_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/jacksprat/my_awesome_ICD_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_andriidemk_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_andriidemk_en.md new file mode 100644 index 000000000000..de2f33486d76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_andriidemk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_andriidemk DistilBertForSequenceClassification from AndriiDemk +author: John Snow Labs +name: burmese_awesome_model_andriidemk +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_andriidemk` is an English model originally trained by AndriiDemk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_andriidemk_en_5.2.0_3.0_1700379777064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_andriidemk_en_5.2.0_3.0_1700379777064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_andriidemk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_andriidemk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_andriidemk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AndriiDemk/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_jaheroth_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_jaheroth_en.md new file mode 100644 index 000000000000..3ca203e94986 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_jaheroth_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_jaheroth DistilBertForSequenceClassification from jaheroth +author: John Snow Labs +name: burmese_awesome_model_jaheroth +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_jaheroth` is an English model originally trained by jaheroth. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_jaheroth_en_5.2.0_3.0_1700356417519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_jaheroth_en_5.2.0_3.0_1700356417519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_jaheroth","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_jaheroth","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_jaheroth| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jaheroth/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_tiemnd_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_tiemnd_en.md new file mode 100644 index 000000000000..cc8e5ac3fd7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_awesome_model_tiemnd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_tiemnd DistilBertForSequenceClassification from tiemnd +author: John Snow Labs +name: burmese_awesome_model_tiemnd +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_tiemnd` is an English model originally trained by tiemnd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_tiemnd_en_5.2.0_3.0_1700356414723.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_tiemnd_en_5.2.0_3.0_1700356414723.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_tiemnd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_tiemnd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_tiemnd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tiemnd/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_en.md new file mode 100644 index 000000000000..fe8bc53db062 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_classifier_label26 DistilBertForSequenceClassification from passionMan +author: John Snow Labs +name: burmese_classifier_label26 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_classifier_label26` is an English model originally trained by passionMan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_classifier_label26_en_5.2.0_3.0_1700374198376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_classifier_label26_en_5.2.0_3.0_1700374198376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label26","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label26","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_classifier_label26| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/passionMan/my_classifier_label26 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_transductive_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_transductive_en.md new file mode 100644 index 000000000000..a66d74c472a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label26_transductive_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_classifier_label26_transductive DistilBertForSequenceClassification from passionMan +author: John Snow Labs +name: burmese_classifier_label26_transductive +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_classifier_label26_transductive` is an English model originally trained by passionMan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_classifier_label26_transductive_en_5.2.0_3.0_1700401999113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_classifier_label26_transductive_en_5.2.0_3.0_1700401999113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label26_transductive","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label26_transductive","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_classifier_label26_transductive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/passionMan/my_classifier_label26_transductive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label41_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label41_en.md new file mode 100644 index 000000000000..c537665b6a72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_classifier_label41_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_classifier_label41 DistilBertForSequenceClassification from passionMan +author: John Snow Labs +name: burmese_classifier_label41 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_classifier_label41` is an English model originally trained by passionMan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_classifier_label41_en_5.2.0_3.0_1700435830851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_classifier_label41_en_5.2.0_3.0_1700435830851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label41","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_classifier_label41","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_classifier_label41| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/passionMan/my_classifier_label41 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_distilbert_model_adirobot_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_distilbert_model_adirobot_en.md new file mode 100644 index 000000000000..2436e3f2e019 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_distilbert_model_adirobot_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_distilbert_model_adirobot DistilBertForSequenceClassification from Adirobot +author: John Snow Labs +name: burmese_distilbert_model_adirobot +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_distilbert_model_adirobot` is an English model originally trained by Adirobot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_distilbert_model_adirobot_en_5.2.0_3.0_1700416377304.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_distilbert_model_adirobot_en_5.2.0_3.0_1700416377304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_distilbert_model_adirobot","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_distilbert_model_adirobot","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_distilbert_model_adirobot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Adirobot/my_distilbert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_left_padding_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_left_padding_model_en.md new file mode 100644 index 000000000000..d2a946f4fb2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_left_padding_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_left_padding_model DistilBertForSequenceClassification from Realgon +author: John Snow Labs +name: burmese_left_padding_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_left_padding_model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_left_padding_model_en_5.2.0_3.0_1700432390374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_left_padding_model_en_5.2.0_3.0_1700432390374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_left_padding_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_left_padding_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_left_padding_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Realgon/my_left_padding_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_misinformation_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_misinformation_distilbert_model_en.md new file mode 100644 index 000000000000..6852e76b7b31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_misinformation_distilbert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_misinformation_distilbert_model DistilBertForSequenceClassification from jojo0616 +author: John Snow Labs +name: burmese_misinformation_distilbert_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_misinformation_distilbert_model` is an English model originally trained by jojo0616. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_misinformation_distilbert_model_en_5.2.0_3.0_1700365662229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_misinformation_distilbert_model_en_5.2.0_3.0_1700365662229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_misinformation_distilbert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_misinformation_distilbert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_misinformation_distilbert_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jojo0616/my_Misinformation_distilbert_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-burmese_tc_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-burmese_tc_model_en.md new file mode 100644 index 000000000000..bd6a37cca582 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-burmese_tc_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_tc_model DistilBertForSequenceClassification from bryjaco +author: John Snow Labs +name: burmese_tc_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_tc_model` is an English model originally trained by bryjaco. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_tc_model_en_5.2.0_3.0_1700436711170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_tc_model_en_5.2.0_3.0_1700436711170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_tc_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_tc_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_tc_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bryjaco/my_tc_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-cancertextv1_en.md b/docs/_posts/ahmedlone127/2023-11-19-cancertextv1_en.md new file mode 100644 index 000000000000..77f12ffaf0ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-cancertextv1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cancertextv1 DistilBertForSequenceClassification from Dinithi +author: John Snow Labs +name: cancertextv1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cancertextv1` is an English model originally trained by Dinithi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cancertextv1_en_5.2.0_3.0_1700370437458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cancertextv1_en_5.2.0_3.0_1700370437458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cancertextv1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cancertextv1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cancertextv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Dinithi/CancerTextV1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-claim2_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-19-claim2_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..2844f2b01f30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-claim2_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English claim2_distilbert_base_uncased DistilBertForSequenceClassification from SCORE +author: John Snow Labs +name: claim2_distilbert_base_uncased +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claim2_distilbert_base_uncased` is an English model originally trained by SCORE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claim2_distilbert_base_uncased_en_5.2.0_3.0_1700355612095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claim2_distilbert_base_uncased_en_5.2.0_3.0_1700355612095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim2_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim2_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claim2_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SCORE/claim2-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-claim3b_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-19-claim3b_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..ab8e38ce1f80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-claim3b_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English claim3b_distilbert_base_uncased DistilBertForSequenceClassification from SCORE +author: John Snow Labs +name: claim3b_distilbert_base_uncased +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claim3b_distilbert_base_uncased` is an English model originally trained by SCORE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claim3b_distilbert_base_uncased_en_5.2.0_3.0_1700356077533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claim3b_distilbert_base_uncased_en_5.2.0_3.0_1700356077533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim3b_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("claim3b_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claim3b_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SCORE/claim3b-distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-clasificador_news_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-clasificador_news_2_en.md new file mode 100644 index 000000000000..6ecf5008ec4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-clasificador_news_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English clasificador_news_2 DistilBertForSequenceClassification from Alesteba +author: John Snow Labs +name: clasificador_news_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clasificador_news_2` is an English model originally trained by Alesteba. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clasificador_news_2_en_5.2.0_3.0_1700352132250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clasificador_news_2_en_5.2.0_3.0_1700352132250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificador_news_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("clasificador_news_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clasificador_news_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Alesteba/clasificador-news-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-classifier_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-classifier_distilbert_en.md new file mode 100644 index 000000000000..47366f7bf73b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-classifier_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English classifier_distilbert DistilBertForSequenceClassification from MateuszW +author: John Snow Labs +name: classifier_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `classifier_distilbert` is an English model originally trained by MateuszW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/classifier_distilbert_en_5.2.0_3.0_1700355643611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/classifier_distilbert_en_5.2.0_3.0_1700355643611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("classifier_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("classifier_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|classifier_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/MateuszW/classifier-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-codenetclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-codenetclassifier_en.md new file mode 100644 index 000000000000..f2897cd562b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-codenetclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codenetclassifier DistilBertForSequenceClassification from petersa2 +author: John Snow Labs +name: codenetclassifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codenetclassifier` is an English model originally trained by petersa2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codenetclassifier_en_5.2.0_3.0_1700427308081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codenetclassifier_en_5.2.0_3.0_1700427308081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("codenetclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("codenetclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codenetclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/petersa2/CodeNetClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-coherence_classifier_final_en.md b/docs/_posts/ahmedlone127/2023-11-19-coherence_classifier_final_en.md new file mode 100644 index 000000000000..ba096754ba28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-coherence_classifier_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English coherence_classifier_final DistilBertForSequenceClassification from kccheng1988 +author: John Snow Labs +name: coherence_classifier_final +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `coherence_classifier_final` is an English model originally trained by kccheng1988. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/coherence_classifier_final_en_5.2.0_3.0_1700355930202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/coherence_classifier_final_en_5.2.0_3.0_1700355930202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("coherence_classifier_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("coherence_classifier_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|coherence_classifier_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kccheng1988/coherence-classifier-final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-compliance_monitoring_oms_en.md b/docs/_posts/ahmedlone127/2023-11-19-compliance_monitoring_oms_en.md new file mode 100644 index 000000000000..8b64f0c12f52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-compliance_monitoring_oms_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English compliance_monitoring_oms DistilBertForSequenceClassification from Vineetttt +author: John Snow Labs +name: compliance_monitoring_oms +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `compliance_monitoring_oms` is an English model originally trained by Vineetttt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/compliance_monitoring_oms_en_5.2.0_3.0_1700390529101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/compliance_monitoring_oms_en_5.2.0_3.0_1700390529101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("compliance_monitoring_oms","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("compliance_monitoring_oms","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|compliance_monitoring_oms| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Vineetttt/compliance_monitoring_oms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-covid_tweet_sentiment_analyzer_distilbert_kwameoo_en.md b/docs/_posts/ahmedlone127/2023-11-19-covid_tweet_sentiment_analyzer_distilbert_kwameoo_en.md new file mode 100644 index 000000000000..01e5aca512d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-covid_tweet_sentiment_analyzer_distilbert_kwameoo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_tweet_sentiment_analyzer_distilbert_kwameoo DistilBertForSequenceClassification from KwameOO +author: John Snow Labs +name: covid_tweet_sentiment_analyzer_distilbert_kwameoo +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_tweet_sentiment_analyzer_distilbert_kwameoo` is an English model originally trained by KwameOO. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_distilbert_kwameoo_en_5.2.0_3.0_1700436712233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_distilbert_kwameoo_en_5.2.0_3.0_1700436712233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_distilbert_kwameoo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_distilbert_kwameoo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_tweet_sentiment_analyzer_distilbert_kwameoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/KwameOO/covid-tweet-sentiment-analyzer-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-cross_encoder_msmarco_distilbert_word2vec256k_en.md b/docs/_posts/ahmedlone127/2023-11-19-cross_encoder_msmarco_distilbert_word2vec256k_en.md new file mode 100644 index 000000000000..6b798c946b51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-cross_encoder_msmarco_distilbert_word2vec256k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_msmarco_distilbert_word2vec256k DistilBertForSequenceClassification from vocab-transformers +author: John Snow Labs +name: cross_encoder_msmarco_distilbert_word2vec256k +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_msmarco_distilbert_word2vec256k` is an English model originally trained by vocab-transformers. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_en_5.2.0_3.0_1700387785281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_msmarco_distilbert_word2vec256k_en_5.2.0_3.0_1700387785281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("cross_encoder_msmarco_distilbert_word2vec256k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_msmarco_distilbert_word2vec256k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|887.9 MB| + +## References + +https://huggingface.co/vocab-transformers/cross_encoder-msmarco-distilbert-word2vec256k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-custom_handler_tutorial_en.md b/docs/_posts/ahmedlone127/2023-11-19-custom_handler_tutorial_en.md new file mode 100644 index 000000000000..959f54fc3aba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-custom_handler_tutorial_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English custom_handler_tutorial DistilBertForSequenceClassification from joelb +author: John Snow Labs +name: custom_handler_tutorial +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `custom_handler_tutorial` is an English model originally trained by joelb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/custom_handler_tutorial_en_5.2.0_3.0_1700355071342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/custom_handler_tutorial_en_5.2.0_3.0_1700355071342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("custom_handler_tutorial","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("custom_handler_tutorial","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|custom_handler_tutorial| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/joelb/custom-handler-tutorial \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-data_science_article_titles_engagement_en.md b/docs/_posts/ahmedlone127/2023-11-19-data_science_article_titles_engagement_en.md new file mode 100644 index 000000000000..d286b719668d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-data_science_article_titles_engagement_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English data_science_article_titles_engagement DistilBertForSequenceClassification from dima806 +author: John Snow Labs +name: data_science_article_titles_engagement +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `data_science_article_titles_engagement` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/data_science_article_titles_engagement_en_5.2.0_3.0_1700382773581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/data_science_article_titles_engagement_en_5.2.0_3.0_1700382773581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("data_science_article_titles_engagement","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("data_science_article_titles_engagement","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|data_science_article_titles_engagement| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/dima806/data-science-article-titles-engagement \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md b/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md new file mode 100644 index 000000000000..8bfcab5009a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds` is an English model originally trained by francisco-perez-sorrosal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en_5.2.0_3.0_1700381604154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en_5.2.0_3.0_1700381604154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/dccuchile-distilbert-base-spanish-uncased-finetuned-with-spanish-tweets-clf-cleaned-ds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_en.md b/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_en.md new file mode 100644 index 000000000000..b360485bd708 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf` is an English model originally trained by francisco-perez-sorrosal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_en_5.2.0_3.0_1700393596260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf_en_5.2.0_3.0_1700393596260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dccuchile_distilbert_base_spanish_uncased_finetuned_with_spanish_tweets_clf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/dccuchile-distilbert-base-spanish-uncased-finetuned-with-spanish-tweets-clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-demo_emotion_42_en.md b/docs/_posts/ahmedlone127/2023-11-19-demo_emotion_42_en.md new file mode 100644 index 000000000000..7200eb09ea4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-demo_emotion_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_emotion_42 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: demo_emotion_42 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_emotion_42` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_emotion_42_en_5.2.0_3.0_1700389669941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_emotion_42_en_5.2.0_3.0_1700389669941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_emotion_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_emotion_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_emotion_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/demo_emotion_42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-demo_hate_1234567_en.md b/docs/_posts/ahmedlone127/2023-11-19-demo_hate_1234567_en.md new file mode 100644 index 000000000000..77dab19e88f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-demo_hate_1234567_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_hate_1234567 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: demo_hate_1234567 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_hate_1234567` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_hate_1234567_en_5.2.0_3.0_1700381799895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_hate_1234567_en_5.2.0_3.0_1700381799895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_hate_1234567","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_hate_1234567","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_hate_1234567| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/aXhyra/demo_hate_1234567 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-demo_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-demo_model_en.md new file mode 100644 index 000000000000..21d7e196b1e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-demo_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_model DistilBertForSequenceClassification from anth0nyhak1m +author: John Snow Labs +name: demo_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_model` is an English model originally trained by anth0nyhak1m. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_model_en_5.2.0_3.0_1700352282603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_model_en_5.2.0_3.0_1700352282603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anth0nyhak1m/demo_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-dibt_confidence_score_v9_en.md b/docs/_posts/ahmedlone127/2023-11-19-dibt_confidence_score_v9_en.md new file mode 100644 index 000000000000..3978bec911e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-dibt_confidence_score_v9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dibt_confidence_score_v9 DistilBertForSequenceClassification from mjbolanos9 +author: John Snow Labs +name: dibt_confidence_score_v9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dibt_confidence_score_v9` is an English model originally trained by mjbolanos9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dibt_confidence_score_v9_en_5.2.0_3.0_1700410234135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dibt_confidence_score_v9_en_5.2.0_3.0_1700410234135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dibt_confidence_score_v9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dibt_confidence_score_v9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dibt_confidence_score_v9|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/mjbolanos9/dibt-confidence-score_v9
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-dilbert_uncased_product_categories_en.md b/docs/_posts/ahmedlone127/2023-11-19-dilbert_uncased_product_categories_en.md
new file mode 100644
index 000000000000..7af36c4138a8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-dilbert_uncased_product_categories_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English dilbert_uncased_product_categories DistilBertForSequenceClassification from osiloke
+author: John Snow Labs
+name: dilbert_uncased_product_categories
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dilbert_uncased_product_categories` is an English model originally trained by osiloke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dilbert_uncased_product_categories_en_5.2.0_3.0_1700355280828.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dilbert_uncased_product_categories_en_5.2.0_3.0_1700355280828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("dilbert_uncased_product_categories","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dilbert_uncased_product_categories","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dilbert_uncased_product_categories|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|252.8 MB|
+
+## References
+
+https://huggingface.co/osiloke/dilbert_uncased_product_categories
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-disaster_tweet_distilbert_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-disaster_tweet_distilbert_4_en.md
new file mode 100644
index 000000000000..3c5a8c55396a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-disaster_tweet_distilbert_4_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English disaster_tweet_distilbert_4 DistilBertForSequenceClassification from aellxx
+author: John Snow Labs
+name: disaster_tweet_distilbert_4
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disaster_tweet_distilbert_4` is an English model originally trained by aellxx.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disaster_tweet_distilbert_4_en_5.2.0_3.0_1700390580258.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disaster_tweet_distilbert_4_en_5.2.0_3.0_1700390580258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("disaster_tweet_distilbert_4","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("disaster_tweet_distilbert_4","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|disaster_tweet_distilbert_4|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/aellxx/disaster-tweet-distilbert-4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-discourse_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-discourse_classification_en.md
new file mode 100644
index 000000000000..711ca23a363a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-discourse_classification_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English discourse_classification DistilBertForSequenceClassification from Manishkalra
+author: John Snow Labs
+name: discourse_classification
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `discourse_classification` is an English model originally trained by Manishkalra.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/discourse_classification_en_5.2.0_3.0_1700406290656.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/discourse_classification_en_5.2.0_3.0_1700406290656.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("discourse_classification","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("discourse_classification","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|discourse_classification|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/Manishkalra/discourse_classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distil_bert_uncased_finetuned_github_issues_en.md b/docs/_posts/ahmedlone127/2023-11-19-distil_bert_uncased_finetuned_github_issues_en.md
new file mode 100644
index 000000000000..08753db418e0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distil_bert_uncased_finetuned_github_issues_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distil_bert_uncased_finetuned_github_issues DistilBertForSequenceClassification from ivanlau
+author: John Snow Labs
+name: distil_bert_uncased_finetuned_github_issues
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_uncased_finetuned_github_issues` is an English model originally trained by ivanlau.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_uncased_finetuned_github_issues_en_5.2.0_3.0_1700382701258.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_uncased_finetuned_github_issues_en_5.2.0_3.0_1700382701258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_uncased_finetuned_github_issues","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_uncased_finetuned_github_issues","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_bert_uncased_finetuned_github_issues|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/ivanlau/distil-bert-uncased-finetuned-github-issues
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_cola_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_cola_en.md
new file mode 100644
index 000000000000..b478500a308d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_cola_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_add_glue_experiment_logit_kd_cola DistilBertForSequenceClassification from gokuls
+author: John Snow Labs
+name: distilbert_add_glue_experiment_logit_kd_cola
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_cola` is an English model originally trained by gokuls.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_cola_en_5.2.0_3.0_1700388742521.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_cola_en_5.2.0_3.0_1700388742521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_cola","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_cola","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_add_glue_experiment_logit_kd_cola|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|250.7 MB|
+
+## References
+
+https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_cola
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_pretrain_cola_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_pretrain_cola_en.md
new file mode 100644
index 000000000000..9c2146b4cfab
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_add_glue_experiment_logit_kd_pretrain_cola_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_add_glue_experiment_logit_kd_pretrain_cola DistilBertForSequenceClassification from gokuls
+author: John Snow Labs
+name: distilbert_add_glue_experiment_logit_kd_pretrain_cola
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_pretrain_cola` is an English model originally trained by gokuls.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_pretrain_cola_en_5.2.0_3.0_1700409379163.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_pretrain_cola_en_5.2.0_3.0_1700409379163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_pretrain_cola","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_pretrain_cola","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_add_glue_experiment_logit_kd_pretrain_cola|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_pretrain_cola
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_sentiment_aspect_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_sentiment_aspect_en.md
new file mode 100644
index 000000000000..ed641367ce98
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_sentiment_aspect_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_amazon_sentiment_aspect DistilBertForSequenceClassification from MTOrange
+author: John Snow Labs
+name: distilbert_amazon_sentiment_aspect
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_amazon_sentiment_aspect` is an English model originally trained by MTOrange.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_amazon_sentiment_aspect_en_5.2.0_3.0_1700369216568.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_amazon_sentiment_aspect_en_5.2.0_3.0_1700369216568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_sentiment_aspect","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_sentiment_aspect","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_amazon_sentiment_aspect|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.4 MB|
+
+## References
+
+https://huggingface.co/MTOrange/distilbert-amazon-sentiment-aspect
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_shoe_reviews_ubuntu_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_shoe_reviews_ubuntu_en.md
new file mode 100644
index 000000000000..d897bca2acb4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_amazon_shoe_reviews_ubuntu_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_amazon_shoe_reviews_ubuntu DistilBertForSequenceClassification from excode
+author: John Snow Labs
+name: distilbert_amazon_shoe_reviews_ubuntu
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_amazon_shoe_reviews_ubuntu` is an English model originally trained by excode.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_ubuntu_en_5.2.0_3.0_1700358440235.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_ubuntu_en_5.2.0_3.0_1700358440235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_ubuntu","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_ubuntu","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_amazon_shoe_reviews_ubuntu|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/excode/distilbert-amazon-shoe-reviews_ubuntu
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_emotion_en.md
new file mode 100644
index 000000000000..14db691a613e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_emotion_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_emotion DistilBertForSequenceClassification from morenolq
+author: John Snow Labs
+name: distilbert_base_cased_emotion
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_emotion` is an English model originally trained by morenolq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_emotion_en_5.2.0_3.0_1700356160149.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_emotion_en_5.2.0_3.0_1700356160149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_emotion","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_emotion","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_emotion|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|246.0 MB|
+
+## References
+
+https://huggingface.co/morenolq/distilbert-base-cased-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_finetuned_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_finetuned_fake_news_detection_en.md
new file mode 100644
index 000000000000..8ef32246b24f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_cased_finetuned_fake_news_detection_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_cased_finetuned_fake_news_detection DistilBertForSequenceClassification from raileymontalan
+author: John Snow Labs
+name: distilbert_base_cased_finetuned_fake_news_detection
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_fake_news_detection` is an English model originally trained by raileymontalan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_fake_news_detection_en_5.2.0_3.0_1700380573571.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_fake_news_detection_en_5.2.0_3.0_1700380573571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_fake_news_detection","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_fake_news_detection","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_fake_news_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/raileymontalan/distilbert-base-cased-finetuned-fake-news-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_casedfinetuned_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_casedfinetuned_fake_news_detection_en.md new file mode 100644 index 000000000000..e18d18efc4c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_casedfinetuned_fake_news_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_casedfinetuned_fake_news_detection DistilBertForSequenceClassification from raileymontalan +author: John Snow Labs +name: distilbert_base_casedfinetuned_fake_news_detection +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_casedfinetuned_fake_news_detection` is an English model originally trained by raileymontalan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_casedfinetuned_fake_news_detection_en_5.2.0_3.0_1700431385180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_casedfinetuned_fake_news_detection_en_5.2.0_3.0_1700431385180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_casedfinetuned_fake_news_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_casedfinetuned_fake_news_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_casedfinetuned_fake_news_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/raileymontalan/distilbert-base-casedfinetuned-fake-news-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_xx.md new file mode 100644 index 000000000000..ba6f8d19c76f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf` is a multilingual model originally trained by francisco-perez-sorrosal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_xx_5.2.0_3.0_1700391703687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_xx_5.2.0_3.0_1700391703687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/distilbert-base-multilingual-cased-finetuned-with-spanish-tweets-clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_jobcategory_1m_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_jobcategory_1m_xx.md new file mode 100644 index 000000000000..f280776b8746 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_jobcategory_1m_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_jobcategory_1m DistilBertForSequenceClassification from serbog +author: John Snow Labs +name: distilbert_base_multilingual_cased_jobcategory_1m +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_jobcategory_1m` is a multilingual model originally trained by serbog.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_jobcategory_1m_xx_5.2.0_3.0_1700435652347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_jobcategory_1m_xx_5.2.0_3.0_1700435652347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_jobcategory_1m","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_jobcategory_1m","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_jobcategory_1m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/serbog/distilbert-base-multilingual-cased-jobCategory_1m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_lg_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_lg_xx.md new file mode 100644 index 000000000000..a20e1da2ffc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_lg_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_language_detection_lg DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_multilingual_cased_language_detection_lg +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_language_detection_lg` is a multilingual model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_lg_xx_5.2.0_3.0_1700354971257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_lg_xx_5.2.0_3.0_1700354971257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection_lg","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection_lg","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_language_detection_lg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.7 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-multilingual-cased-language_detection-LG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_tweets_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_tweets_xx.md new file mode 100644 index 000000000000..5e99c2010339 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_language_detection_tweets_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_language_detection_tweets DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_multilingual_cased_language_detection_tweets +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_language_detection_tweets` is a multilingual model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_tweets_xx_5.2.0_3.0_1700377033331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_language_detection_tweets_xx_5.2.0_3.0_1700377033331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection_tweets","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_language_detection_tweets","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_language_detection_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-multilingual-cased-language_detection_tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer_xx.md new file mode 100644 index 000000000000..16a9d966ae4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer DistilBertForSequenceClassification from arjuntheprogrammer +author: John Snow Labs +name: distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer` is a multilingual model originally trained by arjuntheprogrammer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer_xx_5.2.0_3.0_1700354630549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer_xx_5.2.0_3.0_1700354630549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_sentiment_2_arjuntheprogrammer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/arjuntheprogrammer/distilbert-base-multilingual-cased-sentiment-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sentiment_en.md new file mode 100644 index 000000000000..b6e3f6e97cd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_sentiment DistilBertForSequenceClassification from 51la5 +author: John Snow Labs +name: distilbert_base_sentiment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_sentiment` is an English model originally trained by 51la5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_sentiment_en_5.2.0_3.0_1700354585957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_sentiment_en_5.2.0_3.0_1700354585957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/51la5/distilbert-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_spanish_uncased_finetuned_pawsx_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_spanish_uncased_finetuned_pawsx_en.md new file mode 100644 index 000000000000..160fa2e4c180 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_spanish_uncased_finetuned_pawsx_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_pawsx DistilBertForSequenceClassification from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_pawsx +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_pawsx` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_pawsx_en_5.2.0_3.0_1700355002674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_pawsx_en_5.2.0_3.0_1700355002674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_pawsx","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_spanish_uncased_finetuned_pawsx","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_pawsx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-pawsx \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sst_2_english_finetuned_question_v_statement_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sst_2_english_finetuned_question_v_statement_en.md new file mode 100644 index 000000000000..35cb12dd5a5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_sst_2_english_finetuned_question_v_statement_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_sst_2_english_finetuned_question_v_statement DistilBertForSequenceClassification from mafwalter +author: John Snow Labs +name: distilbert_base_sst_2_english_finetuned_question_v_statement +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_sst_2_english_finetuned_question_v_statement` is an English model originally trained by mafwalter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_sst_2_english_finetuned_question_v_statement_en_5.2.0_3.0_1700389612473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_sst_2_english_finetuned_question_v_statement_en_5.2.0_3.0_1700389612473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_sst_2_english_finetuned_question_v_statement","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_sst_2_english_finetuned_question_v_statement","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_sst_2_english_finetuned_question_v_statement| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mafwalter/distilbert-base-sst-2-english-finetuned-question-v-statement \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_tag_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_tag_classification_en.md new file mode 100644 index 000000000000..dca32e936186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_tag_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_tag_classification DistilBertForSequenceClassification from chinmayapani +author: John Snow Labs +name: distilbert_base_tag_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_tag_classification` is an English model originally trained by chinmayapani.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_tag_classification_en_5.2.0_3.0_1700430375940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_tag_classification_en_5.2.0_3.0_1700430375940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_tag_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_tag_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
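The two arguments passed to `pretrained(...)` above appear to mirror the first two fields of the downloadable artifact's file name. As a rough sketch, assuming the file name follows the layout `<name>_<lang>_<Spark NLP version>_<Spark version>_<epoch milliseconds>.zip` (an inference from the download link and front matter of this post, not a documented contract), the fields can be pulled apart with the standard library. The helper name `parse_artifact` and the pattern `ARTIFACT_RE` are hypothetical and are not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed layout (inferred from the download link, not a documented format):
#   <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_artifact(file_name):
    """Split an artifact file name into its assumed components (hypothetical helper)."""
    match = ARTIFACT_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected artifact name: {file_name!r}")
    info = match.groupdict()
    # The trailing field is treated as a Unix timestamp in milliseconds (UTC).
    millis = int(info.pop("millis"))
    info["uploaded"] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return info


info = parse_artifact(
    "distilbert_base_tag_classification_en_5.2.0_3.0_1700430375940.zip"
)
print(info["name"], info["lang"], info["sparknlp"], info["spark"], info["uploaded"].date())
```

Under that assumption, the parsed name and language match the `pretrained("distilbert_base_tag_classification","en")` call, and the decoded timestamp falls on the `date` given in the front matter.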
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_tag_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chinmayapani/distilBert-base-tag-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_turkish_cased_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_turkish_cased_sentiment_en.md new file mode 100644 index 000000000000..c217fa718e80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_turkish_cased_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_turkish_cased_sentiment DistilBertForSequenceClassification from azizbarank +author: John Snow Labs +name: distilbert_base_turkish_cased_sentiment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_turkish_cased_sentiment` is an English model originally trained by azizbarank. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_sentiment_en_5.2.0_3.0_1700353695504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_sentiment_en_5.2.0_3.0_1700353695504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|254.0 MB| + +## References + +https://huggingface.co/azizbarank/distilbert-base-turkish-cased-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__ethos_binary__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__ethos_binary__all_train_en.md new file mode 100644 index 000000000000..f25faa6c2a7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__ethos_binary__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__ethos_binary__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__ethos_binary__all_train +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__ethos_binary__all_train` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__ethos_binary__all_train_en_5.2.0_3.0_1700355909253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__ethos_binary__all_train_en_5.2.0_3.0_1700355909253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__ethos_binary__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__ethos_binary__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__ethos_binary__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__ethos_binary__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__all_train_en.md new file mode 100644 index 000000000000..c9fe0415e9ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__all_train +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__all_train` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__all_train_en_5.2.0_3.0_1700354916004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__all_train_en_5.2.0_3.0_1700354916004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_1_en.md new file mode 100644 index 000000000000..5ff43f7d3851 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_1` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_1_en_5.2.0_3.0_1700356659730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_1_en_5.2.0_3.0_1700356659730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_3_en.md new file mode 100644 index 000000000000..982203308546 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_3` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_3_en_5.2.0_3.0_1700356623234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_3_en_5.2.0_3.0_1700356623234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_4_en.md new file mode 100644 index 000000000000..9b8e8175d33f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_4` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_4_en_5.2.0_3.0_1700356921106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_4_en_5.2.0_3.0_1700356921106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_5_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_5_en.md new file mode 100644 index 000000000000..e5aa1828a80d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_5 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_5` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_5_en_5.2.0_3.0_1700356819703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_5_en_5.2.0_3.0_1700356819703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_6_en.md new file mode 100644 index 000000000000..873562230828 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_6` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_6_en_5.2.0_3.0_1700352436962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_6_en_5.2.0_3.0_1700352436962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_8_en.md new file mode 100644 index 000000000000..cc26540214de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_8` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_8_en_5.2.0_3.0_1700355133744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_8_en_5.2.0_3.0_1700355133744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_9_en.md new file mode 100644 index 000000000000..81c1fa182c40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_16_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_16_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_16_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_16_9` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_9_en_5.2.0_3.0_1700352439400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_16_9_en_5.2.0_3.0_1700352439400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_16_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_16_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-16-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_0_en.md new file mode 100644 index 000000000000..74180c2be510 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_0` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_0_en_5.2.0_3.0_1700354785706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_0_en_5.2.0_3.0_1700354785706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_1_en.md new file mode 100644 index 000000000000..d2e7bb5af0e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_1` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_1_en_5.2.0_3.0_1700353999482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_1_en_5.2.0_3.0_1700353999482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_2_en.md new file mode 100644 index 000000000000..b1143a8c42c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_2` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_2_en_5.2.0_3.0_1700357475762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_2_en_5.2.0_3.0_1700357475762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_4_en.md new file mode 100644 index 000000000000..6cfb8df639e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_4` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_4_en_5.2.0_3.0_1700367583091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_4_en_5.2.0_3.0_1700367583091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_6_en.md new file mode 100644 index 000000000000..708231a977a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_6` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_6_en_5.2.0_3.0_1700354418779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_6_en_5.2.0_3.0_1700354418779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_7_en.md new file mode 100644 index 000000000000..4473a1884fe0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_7` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_7_en_5.2.0_3.0_1700354274943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_7_en_5.2.0_3.0_1700354274943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_8_en.md new file mode 100644 index 000000000000..f00bd45f88fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_8` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_8_en_5.2.0_3.0_1700360576916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_8_en_5.2.0_3.0_1700360576916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_9_en.md new file mode 100644 index 000000000000..69eec44a01b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_32_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_32_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_32_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_32_9` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_9_en_5.2.0_3.0_1700353825251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_32_9_en_5.2.0_3.0_1700353825251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_32_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_32_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-32-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_0_en.md new file mode 100644 index 000000000000..2451e856b3d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_0` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_0_en_5.2.0_3.0_1700363870674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_0_en_5.2.0_3.0_1700363870674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_1_en.md new file mode 100644 index 000000000000..ae4d46bad979 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_1` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_1_en_5.2.0_3.0_1700354141072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_1_en_5.2.0_3.0_1700354141072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_2_en.md new file mode 100644 index 000000000000..e7287210c8c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_2` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_2_en_5.2.0_3.0_1700355925890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_2_en_5.2.0_3.0_1700355925890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_3_en.md new file mode 100644 index 000000000000..116149298ef1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_3` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_3_en_5.2.0_3.0_1700356780784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_3_en_5.2.0_3.0_1700356780784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_6_en.md new file mode 100644 index 000000000000..417666479f58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_6` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_6_en_5.2.0_3.0_1700353018583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_6_en_5.2.0_3.0_1700353018583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_7_en.md new file mode 100644 index 000000000000..e00e2e3279a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_7` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_7_en_5.2.0_3.0_1700353980130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_7_en_5.2.0_3.0_1700353980130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_8_en.md new file mode 100644 index 000000000000..e94d418f37e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_8` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_8_en_5.2.0_3.0_1700352095746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_8_en_5.2.0_3.0_1700352095746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_9_en.md new file mode 100644 index 000000000000..371a1c544d19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__hate_speech_offensive__train_8_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__hate_speech_offensive__train_8_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__hate_speech_offensive__train_8_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__hate_speech_offensive__train_8_9` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_9_en_5.2.0_3.0_1700353509004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__hate_speech_offensive__train_8_9_en_5.2.0_3.0_1700353509004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__hate_speech_offensive__train_8_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__hate_speech_offensive__train_8_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__hate_speech_offensive__train-8-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__all_train_en.md new file mode 100644 index 000000000000..77a21a2e7771 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__all_train +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__all_train` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__all_train_en_5.2.0_3.0_1700353851382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__all_train_en_5.2.0_3.0_1700353851382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_0_en.md new file mode 100644 index 000000000000..d9adfdb5140f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_0` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_0_en_5.2.0_3.0_1700353202594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_0_en_5.2.0_3.0_1700353202594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_1_en.md new file mode 100644 index 000000000000..5a88f39e6536 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_1` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_1_en_5.2.0_3.0_1700356233865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_1_en_5.2.0_3.0_1700356233865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_2_en.md new file mode 100644 index 000000000000..c5c6452f4081 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_2` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_2_en_5.2.0_3.0_1700354754388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_2_en_5.2.0_3.0_1700354754388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_3_en.md new file mode 100644 index 000000000000..00adce0d9566 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_3` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_3_en_5.2.0_3.0_1700354234588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_3_en_5.2.0_3.0_1700354234588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_4_en.md new file mode 100644 index 000000000000..5fc9103a3f7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_4` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_4_en_5.2.0_3.0_1700366715638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_4_en_5.2.0_3.0_1700366715638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_5_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_5_en.md new file mode 100644 index 000000000000..b6ec672c982c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_5 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_5` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_5_en_5.2.0_3.0_1700356960548.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_5_en_5.2.0_3.0_1700356960548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_6_en.md new file mode 100644 index 000000000000..121c1f059638 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_6` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_6_en_5.2.0_3.0_1700354592491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_6_en_5.2.0_3.0_1700354592491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_7_en.md new file mode 100644 index 000000000000..209398657d4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_7` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_7_en_5.2.0_3.0_1700364695862.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_7_en_5.2.0_3.0_1700364695862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_8_en.md new file mode 100644 index 000000000000..524de8379ee7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_8` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_8_en_5.2.0_3.0_1700370856407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_8_en_5.2.0_3.0_1700370856407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_9_en.md new file mode 100644 index 000000000000..240edbf66c90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_16_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_16_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_16_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_16_9` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_9_en_5.2.0_3.0_1700359782638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_16_9_en_5.2.0_3.0_1700359782638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_16_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_16_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-16-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_0_en.md new file mode 100644 index 000000000000..f5f73009c8c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_0` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_0_en_5.2.0_3.0_1700366716172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_0_en_5.2.0_3.0_1700366716172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_1_en.md new file mode 100644 index 000000000000..e7c5c3f70f94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_1` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_1_en_5.2.0_3.0_1700355069172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_1_en_5.2.0_3.0_1700355069172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_2_en.md new file mode 100644 index 000000000000..848912b68892 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_2` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_2_en_5.2.0_3.0_1700352699581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_2_en_5.2.0_3.0_1700352699581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_3_en.md new file mode 100644 index 000000000000..107c43206db1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_3` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_3_en_5.2.0_3.0_1700353371755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_3_en_5.2.0_3.0_1700353371755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_4_en.md new file mode 100644 index 000000000000..5902b46e46af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_4` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_4_en_5.2.0_3.0_1700372207600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_4_en_5.2.0_3.0_1700372207600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_5_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_5_en.md new file mode 100644 index 000000000000..5a5961bdcc9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_5 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_5` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_5_en_5.2.0_3.0_1700360766420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_5_en_5.2.0_3.0_1700360766420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_6_en.md new file mode 100644 index 000000000000..49c37cc15ba5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_6` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_6_en_5.2.0_3.0_1700353553162.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_6_en_5.2.0_3.0_1700353553162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_7_en.md new file mode 100644 index 000000000000..9dda13e1e8b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_7` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_7_en_5.2.0_3.0_1700363705374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_7_en_5.2.0_3.0_1700363705374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_8_en.md new file mode 100644 index 000000000000..d3d8059180c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_8` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_8_en_5.2.0_3.0_1700352095521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_8_en_5.2.0_3.0_1700352095521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_9_en.md new file mode 100644 index 000000000000..22610d1e845d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_32_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_32_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_32_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_32_9` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_9_en_5.2.0_3.0_1700359597259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_32_9_en_5.2.0_3.0_1700359597259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_32_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_32_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-32-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_0_en.md new file mode 100644 index 000000000000..0c49cfd1f81d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_0` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_0_en_5.2.0_3.0_1700355221401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_0_en_5.2.0_3.0_1700355221401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_1_en.md new file mode 100644 index 000000000000..64c23c32a009 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_1` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_1_en_5.2.0_3.0_1700355401539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_1_en_5.2.0_3.0_1700355401539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_2_en.md new file mode 100644 index 000000000000..160048f844b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_2` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_2_en_5.2.0_3.0_1700354750456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_2_en_5.2.0_3.0_1700354750456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_3_en.md new file mode 100644 index 000000000000..b04a401f6596 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_3` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_3_en_5.2.0_3.0_1700361773019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_3_en_5.2.0_3.0_1700361773019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_4_en.md new file mode 100644 index 000000000000..caffeb8f3638 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_4` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_4_en_5.2.0_3.0_1700356905669.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_4_en_5.2.0_3.0_1700356905669.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_6_en.md new file mode 100644 index 000000000000..b8cc372cbabf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_6` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_6_en_5.2.0_3.0_1700353396137.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_6_en_5.2.0_3.0_1700353396137.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_7_en.md new file mode 100644 index 000000000000..b0717762c269 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_7` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_7_en_5.2.0_3.0_1700352850525.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_7_en_5.2.0_3.0_1700352850525.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_8_en.md new file mode 100644 index 000000000000..372bff8e4b67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_8` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_8_en_5.2.0_3.0_1700363260529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_8_en_5.2.0_3.0_1700363260529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_9_en.md new file mode 100644 index 000000000000..e22a652554db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__sst2__train_8_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__sst2__train_8_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__sst2__train_8_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__sst2__train_8_9` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_9_en_5.2.0_3.0_1700365763223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__sst2__train_8_9_en_5.2.0_3.0_1700365763223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__sst2__train_8_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__sst2__train_8_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__sst2__train-8-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_0_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_0_en.md new file mode 100644 index 000000000000..8d4d4c97dee5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_0 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_0` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_0_en_5.2.0_3.0_1700367583703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_0_en_5.2.0_3.0_1700367583703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_1_en.md new file mode 100644 index 000000000000..b2e90d7ecbaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_1 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_1` is an English model originally trained by SetFit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_1_en_5.2.0_3.0_1700353210364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_1_en_5.2.0_3.0_1700353210364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_2_en.md new file mode 100644 index 000000000000..a29f45149843 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_2 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_2` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_2_en_5.2.0_3.0_1700357048519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_2_en_5.2.0_3.0_1700357048519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_3_en.md new file mode 100644 index 000000000000..c1638e2139b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_3 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_3` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_3_en_5.2.0_3.0_1700353021218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_3_en_5.2.0_3.0_1700353021218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_4_en.md new file mode 100644 index 000000000000..d04f98caa26e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_4 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_4` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_4_en_5.2.0_3.0_1700362785612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_4_en_5.2.0_3.0_1700362785612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_5_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_5_en.md new file mode 100644 index 000000000000..c122574a5f58 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_5 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_5 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_5` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_5_en_5.2.0_3.0_1700404786394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_5_en_5.2.0_3.0_1700404786394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_6_en.md new file mode 100644 index 000000000000..4e3376ca74ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_6 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_6` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_6_en_5.2.0_3.0_1700392448953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_6_en_5.2.0_3.0_1700392448953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_7_en.md new file mode 100644 index 000000000000..fff0fd32dc54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_7 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_7` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_7_en_5.2.0_3.0_1700419236038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_7_en_5.2.0_3.0_1700419236038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_8_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_8_en.md new file mode 100644 index 000000000000..70f596e25cc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_8 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_8 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_8` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_8_en_5.2.0_3.0_1700431327946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_8_en_5.2.0_3.0_1700431327946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_9_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_9_en.md new file mode 100644 index 000000000000..2f6a74eea5b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__subj__train_8_9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__subj__train_8_9 DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__subj__train_8_9 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__subj__train_8_9` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_9_en_5.2.0_3.0_1700431385171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__subj__train_8_9_en_5.2.0_3.0_1700431385171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__subj__train_8_9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__subj__train_8_9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__subj__train-8-9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__trec_qc__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__trec_qc__all_train_en.md new file mode 100644 index 000000000000..aa78f0013301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__trec_qc__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__trec_qc__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__trec_qc__all_train +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__trec_qc__all_train` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__trec_qc__all_train_en_5.2.0_3.0_1700355750258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__trec_qc__all_train_en_5.2.0_3.0_1700355750258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__trec_qc__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__trec_qc__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__trec_qc__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__TREC-QC__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__tweet_eval_stance__all_train_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__tweet_eval_stance__all_train_en.md new file mode 100644 index 000000000000..0369bd22d7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased__tweet_eval_stance__all_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased__tweet_eval_stance__all_train DistilBertForSequenceClassification from SetFit +author: John Snow Labs +name: distilbert_base_uncased__tweet_eval_stance__all_train +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased__tweet_eval_stance__all_train` is an English model originally trained by SetFit. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__tweet_eval_stance__all_train_en_5.2.0_3.0_1700355783760.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased__tweet_eval_stance__all_train_en_5.2.0_3.0_1700355783760.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__tweet_eval_stance__all_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased__tweet_eval_stance__all_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased__tweet_eval_stance__all_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SetFit/distilbert-base-uncased__tweet_eval_stance__all-train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_allagree3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_allagree3_en.md new file mode 100644 index 000000000000..0ce8feeb9b6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_allagree3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_allagree3 DistilBertForSequenceClassification from Farshid +author: John Snow Labs +name: distilbert_base_uncased_allagree3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_allagree3` is an English model originally trained by Farshid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_allagree3_en_5.2.0_3.0_1700379702065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_allagree3_en_5.2.0_3.0_1700379702065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_allagree3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_allagree3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_allagree3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Farshid/distilbert-base-uncased_allagree3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_cola_sciarrilli_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_cola_sciarrilli_en.md new file mode 100644 index 000000000000..3e2bb790992d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_cola_sciarrilli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_cola_sciarrilli DistilBertForSequenceClassification from sciarrilli +author: John Snow Labs +name: distilbert_base_uncased_cola_sciarrilli +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_cola_sciarrilli` is an English model originally trained by sciarrilli. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_cola_sciarrilli_en_5.2.0_3.0_1700356837124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_cola_sciarrilli_en_5.2.0_3.0_1700356837124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_cola_sciarrilli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_cola_sciarrilli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_cola_sciarrilli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/sciarrilli/distilbert-base-uncased-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_augustbang_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_augustbang_en.md new file mode 100644 index 000000000000..802377d5f305 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_augustbang_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_augustbang DistilBertForSequenceClassification from Augustbang +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_augustbang +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_augustbang` is an English model originally trained by Augustbang. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_augustbang_en_5.2.0_3.0_1700363260912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_augustbang_en_5.2.0_3.0_1700363260912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_augustbang","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_augustbang","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_augustbang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Augustbang/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_frahman_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_frahman_en.md new file mode 100644 index 000000000000..c26816872e5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_frahman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_frahman DistilBertForSequenceClassification from frahman +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_frahman +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_frahman` is an English model originally trained by frahman. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_frahman_en_5.2.0_3.0_1700403973061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_frahman_en_5.2.0_3.0_1700403973061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_frahman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_frahman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_frahman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/frahman/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gguichard_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gguichard_en.md new file mode 100644 index 000000000000..e834792a7792 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gguichard_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_gguichard DistilBertForSequenceClassification from gguichard +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_gguichard +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_gguichard` is an English model originally trained by gguichard. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_gguichard_en_5.2.0_3.0_1700401132610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_gguichard_en_5.2.0_3.0_1700401132610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_gguichard","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_gguichard","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_gguichard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/gguichard/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gouse_73_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gouse_73_en.md new file mode 100644 index 000000000000..de028dc4fbfb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_gouse_73_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_gouse_73 DistilBertForSequenceClassification from gouse-73 +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_gouse_73 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_gouse_73` is an English model originally trained by gouse-73. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_gouse_73_en_5.2.0_3.0_1700371635016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_gouse_73_en_5.2.0_3.0_1700371635016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_gouse_73","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_gouse_73","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_gouse_73| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/gouse-73/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_omar95farag_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_omar95farag_en.md new file mode 100644 index 000000000000..884af3898f79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_distilled_clinc_omar95farag_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_omar95farag DistilBertForSequenceClassification from Omar95farag +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_omar95farag +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_omar95farag` is an English model originally trained by Omar95farag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_omar95farag_en_5.2.0_3.0_1700353210532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_omar95farag_en_5.2.0_3.0_1700353210532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_omar95farag","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_omar95farag","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_omar95farag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Omar95farag/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emoji_mask_wearing_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emoji_mask_wearing_en.md new file mode 100644 index 000000000000..0a9961c35fe8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emoji_mask_wearing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emoji_mask_wearing DistilBertForSequenceClassification from Suhong +author: John Snow Labs +name: distilbert_base_uncased_emoji_mask_wearing +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emoji_mask_wearing` is an English model originally trained by Suhong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emoji_mask_wearing_en_5.2.0_3.0_1700410319121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emoji_mask_wearing_en_5.2.0_3.0_1700410319121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emoji_mask_wearing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emoji_mask_wearing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emoji_mask_wearing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Suhong/distilbert-base-uncased-emoji_mask_wearing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_climatechange_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_climatechange_en.md new file mode 100644 index 000000000000..5d1c0e448f41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_climatechange_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_climatechange DistilBertForSequenceClassification from Suhong +author: John Snow Labs +name: distilbert_base_uncased_emotion_climatechange +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_climatechange` is an English model originally trained by Suhong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_climatechange_en_5.2.0_3.0_1700372035494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_climatechange_en_5.2.0_3.0_1700372035494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_climatechange","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_climatechange","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_climatechange| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Suhong/distilbert-base-uncased-emotion-climateChange \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0416_lanchunhui_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0416_lanchunhui_en.md new file mode 100644 index 000000000000..501c4849d21c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0416_lanchunhui_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_ft_0416_lanchunhui DistilBertForSequenceClassification from lanchunhui +author: John Snow Labs +name: distilbert_base_uncased_emotion_ft_0416_lanchunhui +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_ft_0416_lanchunhui` is an English model originally trained by lanchunhui. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0416_lanchunhui_en_5.2.0_3.0_1700363705372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0416_lanchunhui_en_5.2.0_3.0_1700363705372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0416_lanchunhui","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0416_lanchunhui","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_ft_0416_lanchunhui| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/lanchunhui/distilbert-base-uncased_emotion_ft_0416 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0520_zhouzk_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0520_zhouzk_en.md new file mode 100644 index 000000000000..e738fd8189f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_0520_zhouzk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_ft_0520_zhouzk DistilBertForSequenceClassification from Zhouzk +author: John Snow Labs +name: distilbert_base_uncased_emotion_ft_0520_zhouzk +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_ft_0520_zhouzk` is an English model originally trained by Zhouzk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0520_zhouzk_en_5.2.0_3.0_1700371427082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0520_zhouzk_en_5.2.0_3.0_1700371427082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0520_zhouzk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0520_zhouzk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_ft_0520_zhouzk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Zhouzk/distilbert-base-uncased_emotion_ft_0520 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_lincong_logs_1027_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_lincong_logs_1027_en.md new file mode 100644 index 000000000000..d5a21dba26dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotion_ft_lincong_logs_1027_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_ft_lincong_logs_1027 DistilBertForSequenceClassification from gemlincong +author: John Snow Labs +name: distilbert_base_uncased_emotion_ft_lincong_logs_1027 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_ft_lincong_logs_1027` is an English model originally trained by gemlincong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_lincong_logs_1027_en_5.2.0_3.0_1700365730214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_lincong_logs_1027_en_5.2.0_3.0_1700365730214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_lincong_logs_1027","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_lincong_logs_1027","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_ft_lincong_logs_1027| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/gemlincong/distilbert-base-uncased_emotion_ft_lincong_logs_1027 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotions_detection_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotions_detection_en.md new file mode 100644 index 000000000000..c7d767003ab1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_emotions_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotions_detection DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_emotions_detection +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotions_detection` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotions_detection_en_5.2.0_3.0_1700374970564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotions_detection_en_5.2.0_3.0_1700374970564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotions_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotions_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotions_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Emotions_Detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3_en.md new file mode 100644 index 000000000000..1b9a2e06c21d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3 DistilBertForSequenceClassification from hafidikhsan +author: John Snow Labs +name: distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3` is an English model originally trained by hafidikhsan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3_en_5.2.0_3.0_1700357906658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3_en_5.2.0_3.0_1700357906658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_english_cefr_lexical_evaluation_bosnian_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hafidikhsan/distilbert-base-uncased-english-cefr-lexical-evaluation-bs-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6_en.md new file mode 100644 index 000000000000..5b0ec897904d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6 DistilBertForSequenceClassification from hafidikhsan +author: John Snow Labs +name: distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6` is an English model originally trained by hafidikhsan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6_en_5.2.0_3.0_1700429541011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6_en_5.2.0_3.0_1700429541011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hafidikhsan/distilbert-base-uncased-english-cefr-lexical-evaluation-dt-v6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_profane_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_profane_en.md new file mode 100644 index 000000000000..55fd495240ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_profane_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_english_sentweet_profane DistilBertForSequenceClassification from jayanta +author: John Snow Labs +name: distilbert_base_uncased_english_sentweet_profane +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_english_sentweet_profane` is an English model originally trained by jayanta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_sentweet_profane_en_5.2.0_3.0_1700419615262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_sentweet_profane_en_5.2.0_3.0_1700419615262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_sentweet_profane","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_sentweet_profane","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_english_sentweet_profane| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/jayanta/distilbert-base-uncased-english-sentweet-profane \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_sentiment_en.md new file mode 100644 index 000000000000..d18277ce9ed6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_english_sentweet_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_english_sentweet_sentiment DistilBertForSequenceClassification from jayanta +author: John Snow Labs +name: distilbert_base_uncased_english_sentweet_sentiment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_english_sentweet_sentiment` is an English model originally trained by jayanta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_sentweet_sentiment_en_5.2.0_3.0_1700426488016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_sentweet_sentiment_en_5.2.0_3.0_1700426488016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_sentweet_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_sentweet_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_english_sentweet_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayanta/distilbert-base-uncased-english-sentweet-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_9th_auc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_9th_auc_en.md new file mode 100644 index 000000000000..78915aae09e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_9th_auc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_9th_auc DistilBertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: distilbert_base_uncased_finetuned_9th_auc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_9th_auc` is an English model originally trained by Katsiaryna. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_9th_auc_en_5.2.0_3.0_1700352604828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_9th_auc_en_5.2.0_3.0_1700352604828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_9th_auc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_9th_auc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_9th_auc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Katsiaryna/distilbert-base-uncased-finetuned_9th_auc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_alerts_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_alerts_en.md new file mode 100644 index 000000000000..395d491874c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_alerts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_alerts DistilBertForSequenceClassification from renbtt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_alerts +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_alerts` is an English model originally trained by renbtt. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_alerts_en_5.2.0_3.0_1700438364047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_alerts_en_5.2.0_3.0_1700438364047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_alerts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_alerts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_alerts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/renbtt/distilbert-base-uncased-finetuned-alerts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_fine_food_lite_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_fine_food_lite_en.md new file mode 100644 index 000000000000..864f69e4c4eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_fine_food_lite_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_amazon_fine_food_lite DistilBertForSequenceClassification from MarioAvolio99 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_amazon_fine_food_lite +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_amazon_fine_food_lite` is an English model originally trained by MarioAvolio99.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_fine_food_lite_en_5.2.0_3.0_1700412353680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_fine_food_lite_en_5.2.0_3.0_1700412353680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_fine_food_lite","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_fine_food_lite","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_amazon_fine_food_lite| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MarioAvolio99/distilbert-base-uncased-finetuned-amazon-fine-food-lite \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_review_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_review_en.md new file mode 100644 index 000000000000..3727d70a56c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_amazon_review_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_amazon_review DistilBertEmbeddings from soyisauce +author: John Snow Labs +name: distilbert_base_uncased_finetuned_amazon_review +date: 2023-11-19 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_amazon_review` is an English model originally trained by soyisauce.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_review_en_5.2.0_3.0_1700436711396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_review_en_5.2.0_3.0_1700436711396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = DistilBertEmbeddings.pretrained("distilbert_base_uncased_finetuned_amazon_review","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = DistilBertEmbeddings + .pretrained("distilbert_base_uncased_finetuned_amazon_review", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_amazon_review| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +References + +https://huggingface.co/soyisauce/distilbert-base-uncased-finetuned-amazon_review \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_anuragrawal_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_anuragrawal_en.md new file mode 100644 index 000000000000..208c75aee0b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_anuragrawal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_anuragrawal DistilBertForSequenceClassification from anuragrawal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_anuragrawal +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_anuragrawal` is an English model originally trained by anuragrawal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_anuragrawal_en_5.2.0_3.0_1700387782177.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_anuragrawal_en_5.2.0_3.0_1700387782177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_anuragrawal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_anuragrawal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_anuragrawal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anuragrawal/distilbert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_binary_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_binary_classifier_en.md new file mode 100644 index 000000000000..fbaa934f582a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_binary_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_binary_classifier DistilBertForSequenceClassification from celise88 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_binary_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_binary_classifier` is an English model originally trained by celise88.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_binary_classifier_en_5.2.0_3.0_1700436288774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_binary_classifier_en_5.2.0_3.0_1700436288774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_binary_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_binary_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_binary_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/celise88/distilbert-base-uncased-finetuned-binary-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_btc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_btc_en.md new file mode 100644 index 000000000000..e3350e2d0845 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_btc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_btc DistilBertForSequenceClassification from mmohamme +author: John Snow Labs +name: distilbert_base_uncased_finetuned_btc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_btc` is an English model originally trained by mmohamme.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_btc_en_5.2.0_3.0_1700412830495.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_btc_en_5.2.0_3.0_1700412830495.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_btc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_btc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_btc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mmohamme/distilbert-base-uncased-finetuned-btc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_abdelkader_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_abdelkader_en.md new file mode 100644 index 000000000000..cb13b6827538 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_abdelkader_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_abdelkader DistilBertForSequenceClassification from abdelkader +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_abdelkader +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_abdelkader` is an English model originally trained by abdelkader.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_abdelkader_en_5.2.0_3.0_1700409443475.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_abdelkader_en_5.2.0_3.0_1700409443475.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_abdelkader","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_abdelkader","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_abdelkader| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/abdelkader/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_arianpasquali_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_arianpasquali_en.md new file mode 100644 index 000000000000..135666d2570a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_arianpasquali_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_arianpasquali DistilBertForSequenceClassification from arianpasquali +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_arianpasquali +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_arianpasquali` is an English model originally trained by arianpasquali.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_arianpasquali_en_5.2.0_3.0_1700422839449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_arianpasquali_en_5.2.0_3.0_1700422839449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_arianpasquali","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_arianpasquali","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_arianpasquali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/arianpasquali/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_augustbang_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_augustbang_en.md new file mode 100644 index 000000000000..8912afc28b41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_augustbang_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_augustbang DistilBertForSequenceClassification from Augustbang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_augustbang +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_augustbang` is an English model originally trained by Augustbang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_augustbang_en_5.2.0_3.0_1700354453648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_augustbang_en_5.2.0_3.0_1700354453648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_augustbang","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_augustbang","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_augustbang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Augustbang/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_gouse_73_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_gouse_73_en.md new file mode 100644 index 000000000000..a2550a78e649 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_gouse_73_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_gouse_73 DistilBertForSequenceClassification from gouse-73 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_gouse_73 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_gouse_73` is an English model originally trained by gouse-73.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_gouse_73_en_5.2.0_3.0_1700358574435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_gouse_73_en_5.2.0_3.0_1700358574435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_gouse_73","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_gouse_73","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_gouse_73| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/gouse-73/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_haesun_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_haesun_en.md new file mode 100644 index 000000000000..d4eed90f2137 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_haesun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_haesun DistilBertForSequenceClassification from haesun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_haesun +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_haesun` is an English model originally trained by haesun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_haesun_en_5.2.0_3.0_1700365666265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_haesun_en_5.2.0_3.0_1700365666265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_haesun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_haesun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_haesun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/haesun/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_k4west_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_k4west_en.md new file mode 100644 index 000000000000..d0fff3c8364a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_k4west_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_k4west DistilBertForSequenceClassification from k4west +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_k4west +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_k4west` is an English model originally trained by k4west.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_k4west_en_5.2.0_3.0_1700357633691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_k4west_en_5.2.0_3.0_1700357633691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_k4west","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_k4west","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_k4west| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/k4west/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_lijingxin_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_lijingxin_en.md new file mode 100644 index 000000000000..b1e549e5101e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_lijingxin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_lijingxin DistilBertForSequenceClassification from lijingxin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_lijingxin +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_lijingxin` is an English model originally trained by lijingxin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_lijingxin_en_5.2.0_3.0_1700375815978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_lijingxin_en_5.2.0_3.0_1700375815978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_lijingxin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_lijingxin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_lijingxin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/lijingxin/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_omar95farag_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_omar95farag_en.md new file mode 100644 index 000000000000..c0952592a33a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_omar95farag_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_omar95farag DistilBertForSequenceClassification from Omar95farag +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_omar95farag +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_omar95farag` is an English model originally trained by Omar95farag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_omar95farag_en_5.2.0_3.0_1700353548341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_omar95farag_en_5.2.0_3.0_1700353548341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_omar95farag","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_omar95farag","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_omar95farag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Omar95farag/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md new file mode 100644 index 000000000000..1167941d9891 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan DistilBertForSequenceClassification from nikitakapitan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan` is an English model originally trained by nikitakapitan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan_en_5.2.0_3.0_1700422900292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan_en_5.2.0_3.0_1700422900292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_oos_nikitakapitan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/nikitakapitan/distilbert-base-uncased-finetuned-clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_patnelt60_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_patnelt60_en.md new file mode 100644 index 000000000000..2e9b629776c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_patnelt60_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_patnelt60 DistilBertForSequenceClassification from patnelt60 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_patnelt60 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_patnelt60` is an English model originally trained by patnelt60.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_patnelt60_en_5.2.0_3.0_1700355543500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_patnelt60_en_5.2.0_3.0_1700355543500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_patnelt60","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_patnelt60","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_patnelt60| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/patnelt60/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_shiou0601_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_shiou0601_en.md new file mode 100644 index 000000000000..482244c9dd57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_clinc_shiou0601_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_shiou0601 DistilBertForSequenceClassification from Shiou0601 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_shiou0601 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_shiou0601` is an English model originally trained by Shiou0601.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_shiou0601_en_5.2.0_3.0_1700407139956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_shiou0601_en_5.2.0_3.0_1700407139956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_shiou0601","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_shiou0601","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_shiou0601| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Shiou0601/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_3_en.md new file mode 100644 index 000000000000..f7b29a5513c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_3 DistilBertForSequenceClassification from fadhilarkan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_3` is an English model originally trained by fadhilarkan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_3_en_5.2.0_3.0_1700408481332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_3_en_5.2.0_3.0_1700408481332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fadhilarkan/distilbert-base-uncased-finetuned-cola-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_4_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_4_en.md new file mode 100644 index 000000000000..5ab68cbcbfea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_4 DistilBertForSequenceClassification from fadhilarkan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_4` is an English model originally trained by fadhilarkan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_4_en_5.2.0_3.0_1700418639006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_4_en_5.2.0_3.0_1700418639006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fadhilarkan/distilbert-base-uncased-finetuned-cola-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_akshara23_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_akshara23_en.md new file mode 100644 index 000000000000..9c8d05fca5a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_akshara23_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_akshara23 DistilBertForSequenceClassification from akshara23 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_akshara23 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_akshara23` is an English model originally trained by akshara23.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_akshara23_en_5.2.0_3.0_1700417039506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_akshara23_en_5.2.0_3.0_1700417039506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_akshara23","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_akshara23","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_akshara23| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/akshara23/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_athar_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_athar_en.md new file mode 100644 index 000000000000..7493c04993c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_athar_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_athar DistilBertForSequenceClassification from athar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_athar +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_athar` is an English model originally trained by athar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_athar_en_5.2.0_3.0_1700417638169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_athar_en_5.2.0_3.0_1700417638169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_athar","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_athar","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_athar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/athar/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_avneet_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_avneet_en.md new file mode 100644 index 000000000000..d074a59c0152 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_avneet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_avneet DistilBertForSequenceClassification from avneet +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_avneet +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_avneet` is an English model originally trained by avneet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_avneet_en_5.2.0_3.0_1700398113516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_avneet_en_5.2.0_3.0_1700398113516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_avneet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_avneet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_avneet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/avneet/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_banri_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_banri_en.md new file mode 100644 index 000000000000..1fd5823c5346 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_banri_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_banri DistilBertForSequenceClassification from banri +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_banri +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_banri` is an English model originally trained by banri. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_banri_en_5.2.0_3.0_1700428354347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_banri_en_5.2.0_3.0_1700428354347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_banri","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_banri","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_banri| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/banri/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_beomi_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_beomi_en.md new file mode 100644 index 000000000000..237616d30163 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_beomi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_beomi DistilBertForSequenceClassification from beomi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_beomi +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_beomi` is an English model originally trained by beomi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_beomi_en_5.2.0_3.0_1700384958145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_beomi_en_5.2.0_3.0_1700384958145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_beomi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_beomi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_beomi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/beomi/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_caioamb_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_caioamb_en.md new file mode 100644 index 000000000000..ab9c09e0c787 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_caioamb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_caioamb DistilBertForSequenceClassification from caioamb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_caioamb +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_caioamb` is an English model originally trained by caioamb. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_caioamb_en_5.2.0_3.0_1700416280098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_caioamb_en_5.2.0_3.0_1700416280098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_caioamb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_caioamb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_caioamb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/caioamb/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_donghyounglee_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_donghyounglee_en.md new file mode 100644 index 000000000000..eb845fb9168f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_donghyounglee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_donghyounglee DistilBertForSequenceClassification from DongHyoungLee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_donghyounglee +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_donghyounglee` is an English model originally trained by DongHyoungLee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_donghyounglee_en_5.2.0_3.0_1700354234833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_donghyounglee_en_5.2.0_3.0_1700354234833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_donghyounglee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_donghyounglee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_donghyounglee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DongHyoungLee/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fadhilarkan_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fadhilarkan_en.md new file mode 100644 index 000000000000..a76a290366ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fadhilarkan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_fadhilarkan DistilBertForSequenceClassification from fadhilarkan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_fadhilarkan +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_fadhilarkan` is an English model originally trained by fadhilarkan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fadhilarkan_en_5.2.0_3.0_1700395391116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fadhilarkan_en_5.2.0_3.0_1700395391116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fadhilarkan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fadhilarkan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_fadhilarkan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fadhilarkan/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_federicopascual_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_federicopascual_en.md new file mode 100644 index 000000000000..29b422e07ad7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_federicopascual_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_federicopascual DistilBertForSequenceClassification from federicopascual +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_federicopascual +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_federicopascual` is an English model originally trained by federicopascual. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_federicopascual_en_5.2.0_3.0_1700435191805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_federicopascual_en_5.2.0_3.0_1700435191805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_federicopascual","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_federicopascual","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_federicopascual| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/federicopascual/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fiona99_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fiona99_en.md new file mode 100644 index 000000000000..8986bd41c479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fiona99_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_fiona99 DistilBertForSequenceClassification from Fiona99 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_fiona99 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_fiona99` is an English model originally trained by Fiona99. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fiona99_en_5.2.0_3.0_1700352896954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fiona99_en_5.2.0_3.0_1700352896954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fiona99","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fiona99","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_fiona99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Fiona99/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fuck_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fuck_en.md new file mode 100644 index 000000000000..5cb2b97ed622 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_fuck_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_fuck DistilBertForSequenceClassification from fuck +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_fuck +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_fuck` is an English model originally trained by fuck. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fuck_en_5.2.0_3.0_1700391568980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fuck_en_5.2.0_3.0_1700391568980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fuck","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fuck","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_fuck| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/fuck/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_gauravtripathy_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_gauravtripathy_en.md new file mode 100644 index 000000000000..f5bfaebd80d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_gauravtripathy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_gauravtripathy DistilBertForSequenceClassification from gauravtripathy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_gauravtripathy +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_gauravtripathy` is an English model originally trained by gauravtripathy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_gauravtripathy_en_5.2.0_3.0_1700431241591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_gauravtripathy_en_5.2.0_3.0_1700431241591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_gauravtripathy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_gauravtripathy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_gauravtripathy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/gauravtripathy/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hchc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hchc_en.md new file mode 100644 index 000000000000..86614530c4bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hchc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_hchc DistilBertForSequenceClassification from hchc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_hchc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_hchc` is an English model originally trained by hchc.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hchc_en_5.2.0_3.0_1700414221739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hchc_en_5.2.0_3.0_1700414221739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hchc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hchc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_hchc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hchc/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hcjang1987_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hcjang1987_en.md new file mode 100644 index 000000000000..429afa361417 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hcjang1987_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_hcjang1987 DistilBertForSequenceClassification from hcjang1987 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_hcjang1987 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_hcjang1987` is an English model originally trained by hcjang1987.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hcjang1987_en_5.2.0_3.0_1700400996391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hcjang1987_en_5.2.0_3.0_1700400996391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hcjang1987","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hcjang1987","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_hcjang1987| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hcjang1987/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_himanshubeniwal_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_himanshubeniwal_en.md new file mode 100644 index 000000000000..33f59e887f1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_himanshubeniwal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_himanshubeniwal DistilBertForSequenceClassification from himanshubeniwal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_himanshubeniwal +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_himanshubeniwal` is an English model originally trained by himanshubeniwal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_himanshubeniwal_en_5.2.0_3.0_1700390474969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_himanshubeniwal_en_5.2.0_3.0_1700390474969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_himanshubeniwal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_himanshubeniwal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_himanshubeniwal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/himanshubeniwal/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hinova_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hinova_en.md new file mode 100644 index 000000000000..1ca968d76040 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_hinova_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_hinova DistilBertForSequenceClassification from Hinova +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_hinova +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_hinova` is an English model originally trained by Hinova.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hinova_en_5.2.0_3.0_1700354088423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_hinova_en_5.2.0_3.0_1700354088423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hinova","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_hinova","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_hinova| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Hinova/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_histinct7002_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_histinct7002_en.md new file mode 100644 index 000000000000..e79b933fcab8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_histinct7002_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_histinct7002 DistilBertForSequenceClassification from histinct7002 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_histinct7002 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_histinct7002` is an English model originally trained by histinct7002.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_histinct7002_en_5.2.0_3.0_1700427605400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_histinct7002_en_5.2.0_3.0_1700427605400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_histinct7002","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_histinct7002","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_histinct7002| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/histinct7002/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_jbnlry_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_jbnlry_en.md new file mode 100644 index 000000000000..553353726684 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_jbnlry_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_jbnlry DistilBertForSequenceClassification from JBNLRY +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_jbnlry +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_jbnlry` is an English model originally trained by JBNLRY.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jbnlry_en_5.2.0_3.0_1700352428155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jbnlry_en_5.2.0_3.0_1700352428155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jbnlry","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jbnlry","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_jbnlry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/JBNLRY/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_ldacunto_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_ldacunto_en.md new file mode 100644 index 000000000000..186f1cfc6491 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_ldacunto_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_ldacunto DistilBertForSequenceClassification from ldacunto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_ldacunto +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_ldacunto` is an English model originally trained by ldacunto.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_ldacunto_en_5.2.0_3.0_1700424422871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_ldacunto_en_5.2.0_3.0_1700424422871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_ldacunto","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_ldacunto","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_ldacunto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ldacunto/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nalinik_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nalinik_en.md new file mode 100644 index 000000000000..847e81907e3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nalinik_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_nalinik DistilBertForSequenceClassification from NaliniK +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_nalinik +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_nalinik` is an English model originally trained by NaliniK.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_nalinik_en_5.2.0_3.0_1700353015931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_nalinik_en_5.2.0_3.0_1700353015931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` (e.g. from sparknlp.start()) +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_nalinik","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_nalinik","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_nalinik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/NaliniK/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nxtcoder19_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nxtcoder19_en.md new file mode 100644 index 000000000000..b0aca92b831c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_nxtcoder19_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_nxtcoder19 DistilBertForSequenceClassification from nxtcoder19 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_nxtcoder19 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_nxtcoder19` is an English model originally trained by nxtcoder19.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_nxtcoder19_en_5.2.0_3.0_1700370061011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_nxtcoder19_en_5.2.0_3.0_1700370061011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_nxtcoder19","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_nxtcoder19","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_nxtcoder19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/nxtcoder19/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_seanghay_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_seanghay_en.md new file mode 100644 index 000000000000..2875a4ef9dcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_seanghay_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_seanghay DistilBertForSequenceClassification from seanghay +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_seanghay +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_seanghay` is an English model originally trained by seanghay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_seanghay_en_5.2.0_3.0_1700400209145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_seanghay_en_5.2.0_3.0_1700400209145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_seanghay","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_seanghay","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_seanghay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/seanghay/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_songrb_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_songrb_en.md new file mode 100644 index 000000000000..ce8d8836957c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_songrb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_songrb DistilBertForSequenceClassification from SongRb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_songrb +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_songrb` is an English model originally trained by SongRb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_songrb_en_5.2.0_3.0_1700363870892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_songrb_en_5.2.0_3.0_1700363870892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_songrb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_songrb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_songrb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SongRb/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_stuser2023_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_stuser2023_en.md new file mode 100644 index 000000000000..7108acb49cdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_stuser2023_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_stuser2023 DistilBertForSequenceClassification from stuser2023 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_stuser2023 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_stuser2023` is an English model originally trained by stuser2023.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_stuser2023_en_5.2.0_3.0_1700425444712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_stuser2023_en_5.2.0_3.0_1700425444712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_stuser2023","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_stuser2023","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_stuser2023| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/stuser2023/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_v3rx2000_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_v3rx2000_en.md new file mode 100644 index 000000000000..2c0932ffa7fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_v3rx2000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_v3rx2000 DistilBertForSequenceClassification from V3RX2000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_v3rx2000 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_v3rx2000` is an English model originally trained by V3RX2000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_v3rx2000_en_5.2.0_3.0_1700428445964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_v3rx2000_en_5.2.0_3.0_1700428445964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_v3rx2000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_v3rx2000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_v3rx2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/V3RX2000/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_virens13117_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_virens13117_en.md new file mode 100644 index 000000000000..bd4fb9d1258c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_virens13117_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_virens13117 DistilBertForSequenceClassification from VirenS13117 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_virens13117 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_virens13117` is an English model originally trained by VirenS13117.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_virens13117_en_5.2.0_3.0_1700416280100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_virens13117_en_5.2.0_3.0_1700416280100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_virens13117","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_virens13117","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_virens13117| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/VirenS13117/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_zzddbbcc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_zzddbbcc_en.md new file mode 100644 index 000000000000..bf5dd6950159 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_cola_zzddbbcc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_zzddbbcc DistilBertForSequenceClassification from ZZDDBBCC +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_zzddbbcc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_zzddbbcc` is an English model originally trained by ZZDDBBCC.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_zzddbbcc_en_5.2.0_3.0_1700415196038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_zzddbbcc_en_5.2.0_3.0_1700415196038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_zzddbbcc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_zzddbbcc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_zzddbbcc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ZZDDBBCC/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_diabetes_sentences_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_diabetes_sentences_en.md new file mode 100644 index 000000000000..df7b7e70e3af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_diabetes_sentences_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_diabetes_sentences DistilBertForSequenceClassification from conorjudge +author: John Snow Labs +name: distilbert_base_uncased_finetuned_diabetes_sentences +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_diabetes_sentences` is an English model originally trained by conorjudge.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_sentences_en_5.2.0_3.0_1700432236985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_sentences_en_5.2.0_3.0_1700432236985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_diabetes_sentences","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_diabetes_sentences","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_diabetes_sentences| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/conorjudge/distilbert-base-uncased-finetuned-diabetes_sentences \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_aatmasidha_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_aatmasidha_en.md new file mode 100644 index 000000000000..707d8cf2f707 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_aatmasidha_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_aatmasidha DistilBertForSequenceClassification from aatmasidha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_aatmasidha +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_aatmasidha` is an English model originally trained by aatmasidha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_aatmasidha_en_5.2.0_3.0_1700397230879.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_aatmasidha_en_5.2.0_3.0_1700397230879.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_aatmasidha","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_aatmasidha","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_aatmasidha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aatmasidha/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_abdelkader_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_abdelkader_en.md new file mode 100644 index 000000000000..bc2518b3581b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_abdelkader_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_abdelkader DistilBertForSequenceClassification from abdelkader +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_abdelkader +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_abdelkader` is an English model originally trained by abdelkader.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_abdelkader_en_5.2.0_3.0_1700403101004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_abdelkader_en_5.2.0_3.0_1700403101004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_abdelkader","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_abdelkader","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_abdelkader|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/abdelkader/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmed007_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmed007_en.md
new file mode 100644
index 000000000000..5591b78f62c6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmed007_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_ahmed007 DistilBertForSequenceClassification from Ahmed007
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_ahmed007
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ahmed007` is an English model originally trained by Ahmed007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ahmed007_en_5.2.0_3.0_1700377654354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ahmed007_en_5.2.0_3.0_1700377654354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ahmed007","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ahmed007","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_ahmed007|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/Ahmed007/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmettasdemir_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmettasdemir_en.md
new file mode 100644
index 000000000000..575c63b3ef75
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ahmettasdemir_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_ahmettasdemir DistilBertForSequenceClassification from ahmettasdemir
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_ahmettasdemir
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ahmettasdemir` is an English model originally trained by ahmettasdemir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ahmettasdemir_en_5.2.0_3.0_1700418639017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ahmettasdemir_en_5.2.0_3.0_1700418639017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ahmettasdemir","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ahmettasdemir","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_ahmettasdemir|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/ahmettasdemir/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_andreastgram_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_andreastgram_en.md
new file mode 100644
index 000000000000..26d0212f323d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_andreastgram_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_andreastgram DistilBertForSequenceClassification from andreastgram
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_andreastgram
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_andreastgram` is an English model originally trained by andreastgram.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_andreastgram_en_5.2.0_3.0_1700435381235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_andreastgram_en_5.2.0_3.0_1700435381235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_andreastgram","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_andreastgram","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_andreastgram|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/andreastgram/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_asalics_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_asalics_en.md
new file mode 100644
index 000000000000..2c4f83bf79fc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_asalics_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_asalics DistilBertForSequenceClassification from asalics
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_asalics
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_asalics` is an English model originally trained by asalics.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_asalics_en_5.2.0_3.0_1700436509010.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_asalics_en_5.2.0_3.0_1700436509010.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_asalics","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_asalics","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_asalics|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/asalics/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_atsstagram_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_atsstagram_en.md
new file mode 100644
index 000000000000..3cd08b1a1990
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_atsstagram_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_atsstagram DistilBertForSequenceClassification from atsstagram
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_atsstagram
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_atsstagram` is an English model originally trained by atsstagram.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_atsstagram_en_5.2.0_3.0_1700378665941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_atsstagram_en_5.2.0_3.0_1700378665941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_atsstagram","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_atsstagram","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_atsstagram|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/atsstagram/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_balanced_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_balanced_en.md
new file mode 100644
index 000000000000..7c2d779a370f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_balanced_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_balanced DistilBertForSequenceClassification from AdamCodd
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_balanced
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_balanced` is an English model originally trained by AdamCodd.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_balanced_en_5.2.0_3.0_1700356694289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_balanced_en_5.2.0_3.0_1700356694289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_balanced","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_balanced","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_balanced|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/AdamCodd/distilbert-base-uncased-finetuned-emotion-balanced
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bellaandbria_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bellaandbria_en.md
new file mode 100644
index 000000000000..ad9843fda7b5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bellaandbria_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_bellaandbria DistilBertForSequenceClassification from BellaAndBria
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_bellaandbria
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_bellaandbria` is an English model originally trained by BellaAndBria.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bellaandbria_en_5.2.0_3.0_1700374198407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bellaandbria_en_5.2.0_3.0_1700374198407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bellaandbria","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bellaandbria","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_bellaandbria|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/BellaAndBria/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bhadresh_savani_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bhadresh_savani_en.md
new file mode 100644
index 000000000000..d546057aad9b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_bhadresh_savani_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_bhadresh_savani DistilBertForSequenceClassification from bhadresh-savani
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_bhadresh_savani
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_bhadresh_savani` is an English model originally trained by bhadresh-savani.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bhadresh_savani_en_5.2.0_3.0_1700429966870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bhadresh_savani_en_5.2.0_3.0_1700429966870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bhadresh_savani","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bhadresh_savani","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_bhadresh_savani|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/bhadresh-savani/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carlosaguayo_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carlosaguayo_en.md
new file mode 100644
index 000000000000..7fc0bce73882
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carlosaguayo_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_carlosaguayo DistilBertForSequenceClassification from carlosaguayo
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_carlosaguayo
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_carlosaguayo` is an English model originally trained by carlosaguayo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carlosaguayo_en_5.2.0_3.0_1700409930063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carlosaguayo_en_5.2.0_3.0_1700409930063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carlosaguayo","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carlosaguayo","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_emotion_carlosaguayo|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|249.5 MB|
+
+## References
+
+https://huggingface.co/carlosaguayo/distilbert-base-uncased-finetuned-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carnival13_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carnival13_en.md
new file mode 100644
index 000000000000..90f22a3ff5ca
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_carnival13_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_emotion_carnival13 DistilBertForSequenceClassification from carnival13
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_emotion_carnival13
+date: 2023-11-19
+tags: [bert, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_carnival13` is an English model originally trained by carnival13.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carnival13_en_5.2.0_3.0_1700367165604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carnival13_en_5.2.0_3.0_1700367165604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carnival13","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carnival13","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_carnival13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/carnival13/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cementtaco_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cementtaco_en.md new file mode 100644 index 000000000000..52c573df6068 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cementtaco_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cementtaco DistilBertForSequenceClassification from cementtaco +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cementtaco +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cementtaco` is an English model originally trained by cementtaco.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cementtaco_en_5.2.0_3.0_1700361702601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cementtaco_en_5.2.0_3.0_1700361702601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cementtaco","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cementtaco","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cementtaco| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cementtaco/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_chsafouane_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_chsafouane_en.md new file mode 100644 index 000000000000..31827d2802b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_chsafouane_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_chsafouane DistilBertForSequenceClassification from chsafouane +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_chsafouane +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_chsafouane` is an English model originally trained by chsafouane.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_chsafouane_en_5.2.0_3.0_1700394895874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_chsafouane_en_5.2.0_3.0_1700394895874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_chsafouane","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_chsafouane","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_chsafouane| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chsafouane/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_comic_owl_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_comic_owl_en.md new file mode 100644 index 000000000000..78cacb986ae9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_comic_owl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_comic_owl DistilBertForSequenceClassification from comic-owl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_comic_owl +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_comic_owl` is an English model originally trained by comic-owl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_comic_owl_en_5.2.0_3.0_1700393091813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_comic_owl_en_5.2.0_3.0_1700393091813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_comic_owl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_comic_owl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_comic_owl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/comic-owl/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cscottp27_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cscottp27_en.md new file mode 100644 index 000000000000..d70c717bb5b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_cscottp27_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cscottp27 DistilBertForSequenceClassification from cscottp27 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cscottp27 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cscottp27` is an English model originally trained by cscottp27.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cscottp27_en_5.2.0_3.0_1700411371068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cscottp27_en_5.2.0_3.0_1700411371068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cscottp27","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cscottp27","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cscottp27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cscottp27/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_daigo1126_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_daigo1126_en.md new file mode 100644 index 000000000000..bdccf82302b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_daigo1126_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_daigo1126 DistilBertForSequenceClassification from DAIGO1126 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_daigo1126 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_daigo1126` is an English model originally trained by DAIGO1126.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_daigo1126_en_5.2.0_3.0_1700398216936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_daigo1126_en_5.2.0_3.0_1700398216936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_daigo1126","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_daigo1126","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_daigo1126| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DAIGO1126/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_detector_from_text_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_detector_from_text_en.md new file mode 100644 index 000000000000..dc7c15b436c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_detector_from_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_detector_from_text DistilBertForSequenceClassification from ali619 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_detector_from_text +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_detector_from_text` is an English model originally trained by ali619.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_detector_from_text_en_5.2.0_3.0_1700399087478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_detector_from_text_en_5.2.0_3.0_1700399087478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_detector_from_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_detector_from_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_detector_from_text| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ali619/distilbert-base-uncased-finetuned-emotion-detector-from-text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_duyne_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_duyne_en.md new file mode 100644 index 000000000000..c4fda827121a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_duyne_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_duyne DistilBertForSequenceClassification from duyne +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_duyne +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_duyne` is an English model originally trained by duyne.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_duyne_en_5.2.0_3.0_1700409110917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_duyne_en_5.2.0_3.0_1700409110917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_duyne","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_duyne","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_duyne| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/duyne/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_edmon02_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_edmon02_en.md new file mode 100644 index 000000000000..77e4466613a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_edmon02_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_edmon02 DistilBertForSequenceClassification from Edmon02 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_edmon02 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_edmon02` is an English model originally trained by Edmon02.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_edmon02_en_5.2.0_3.0_1700403833986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_edmon02_en_5.2.0_3.0_1700403833986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_edmon02","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_edmon02","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_edmon02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Edmon02/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_eleven_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_eleven_en.md new file mode 100644 index 000000000000..95476550e77b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_eleven_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_eleven DistilBertForSequenceClassification from Eleven +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_eleven +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_eleven` is an English model originally trained by Eleven.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_eleven_en_5.2.0_3.0_1700415197427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_eleven_en_5.2.0_3.0_1700415197427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_eleven","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_eleven","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_eleven| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Eleven/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_energytrain7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_energytrain7_en.md new file mode 100644 index 000000000000..daa18eb3c71d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_energytrain7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_energytrain7 DistilBertForSequenceClassification from energytrain7 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_energytrain7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_energytrain7` is an English model originally trained by energytrain7. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_energytrain7_en_5.2.0_3.0_1700392823203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_energytrain7_en_5.2.0_3.0_1700392823203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_energytrain7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_energytrain7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_energytrain7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/energytrain7/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ensaremirali_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ensaremirali_en.md new file mode 100644 index 000000000000..0867a485840b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ensaremirali_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ensaremirali DistilBertForSequenceClassification from EnsarEmirali +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ensaremirali +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ensaremirali` is an English model originally trained by EnsarEmirali. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ensaremirali_en_5.2.0_3.0_1700353396110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ensaremirali_en_5.2.0_3.0_1700353396110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ensaremirali","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ensaremirali","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ensaremirali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/EnsarEmirali/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_facehugger135_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_facehugger135_en.md new file mode 100644 index 000000000000..be503a715850 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_facehugger135_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_facehugger135 DistilBertForSequenceClassification from Facehugger135 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_facehugger135 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_facehugger135` is an English model originally trained by Facehugger135. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_facehugger135_en_5.2.0_3.0_1700359603662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_facehugger135_en_5.2.0_3.0_1700359603662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_facehugger135","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_facehugger135","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_facehugger135| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Facehugger135/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_feladorhet_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_feladorhet_en.md new file mode 100644 index 000000000000..a564c5703360 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_feladorhet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_feladorhet DistilBertForSequenceClassification from feladorhet +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_feladorhet +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_feladorhet` is an English model originally trained by feladorhet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_feladorhet_en_5.2.0_3.0_1700411371053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_feladorhet_en_5.2.0_3.0_1700411371053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_feladorhet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_feladorhet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_feladorhet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/feladorhet/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ffalcao_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ffalcao_en.md new file mode 100644 index 000000000000..73847998698f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_ffalcao_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ffalcao DistilBertForSequenceClassification from ffalcao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ffalcao +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ffalcao` is an English model originally trained by ffalcao. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ffalcao_en_5.2.0_3.0_1700378989172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ffalcao_en_5.2.0_3.0_1700378989172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ffalcao","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ffalcao","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ffalcao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ffalcao/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_fouad_shammary_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_fouad_shammary_en.md new file mode 100644 index 000000000000..38c1a4ff1e48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_fouad_shammary_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_fouad_shammary DistilBertForSequenceClassification from fouad-shammary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_fouad_shammary +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_fouad_shammary` is an English model originally trained by fouad-shammary. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_fouad_shammary_en_5.2.0_3.0_1700417638566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_fouad_shammary_en_5.2.0_3.0_1700417638566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_fouad_shammary","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_fouad_shammary","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_fouad_shammary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/fouad-shammary/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_frahman_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_frahman_en.md new file mode 100644 index 000000000000..97f48456b5f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_frahman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_frahman DistilBertForSequenceClassification from frahman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_frahman +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_frahman` is an English model originally trained by frahman. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_frahman_en_5.2.0_3.0_1700419845613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_frahman_en_5.2.0_3.0_1700419845613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_frahman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_frahman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_frahman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/frahman/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_gcmsrc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_gcmsrc_en.md new file mode 100644 index 000000000000..874612a07084 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_gcmsrc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_gcmsrc DistilBertForSequenceClassification from gcmsrc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_gcmsrc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_gcmsrc` is an English model originally trained by gcmsrc. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_gcmsrc_en_5.2.0_3.0_1700392587117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_gcmsrc_en_5.2.0_3.0_1700392587117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_gcmsrc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_gcmsrc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_gcmsrc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/gcmsrc/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_haesun_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_haesun_en.md new file mode 100644 index 000000000000..7ca82504a98b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_haesun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_haesun DistilBertForSequenceClassification from haesun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_haesun +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_haesun` is an English model originally trained by haesun. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_haesun_en_5.2.0_3.0_1700352606359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_haesun_en_5.2.0_3.0_1700352606359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_haesun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_haesun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_haesun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/haesun/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hbtemari_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hbtemari_en.md new file mode 100644 index 000000000000..1b3c5ab32c86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hbtemari_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_hbtemari DistilBertForSequenceClassification from HBtemari +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_hbtemari +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_hbtemari` is an English model originally trained by HBtemari. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hbtemari_en_5.2.0_3.0_1700383788863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hbtemari_en_5.2.0_3.0_1700383788863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hbtemari","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hbtemari","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_hbtemari| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/HBtemari/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hidetai_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hidetai_en.md new file mode 100644 index 000000000000..024a3943a9ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hidetai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_hidetai DistilBertForSequenceClassification from hidetai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_hidetai +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_hidetai` is an English model originally trained by hidetai. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hidetai_en_5.2.0_3.0_1700361702598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hidetai_en_5.2.0_3.0_1700361702598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hidetai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hidetai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_hidetai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hidetai/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hpl_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hpl_en.md new file mode 100644 index 000000000000..bae672e00e49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_hpl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_hpl DistilBertForSequenceClassification from HPL +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_hpl +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_hpl` is an English model originally trained by HPL. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hpl_en_5.2.0_3.0_1700406653320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_hpl_en_5.2.0_3.0_1700406653320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hpl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_hpl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_hpl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/HPL/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_huseyn92_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_huseyn92_en.md new file mode 100644 index 000000000000..66e87a3ace63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_huseyn92_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_huseyn92 DistilBertForSequenceClassification from Huseyn92 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_huseyn92 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_huseyn92` is an English model originally trained by Huseyn92. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_huseyn92_en_5.2.0_3.0_1700365666127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_huseyn92_en_5.2.0_3.0_1700365666127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_huseyn92","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_huseyn92","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_huseyn92| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Huseyn92/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jerrym_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jerrym_en.md new file mode 100644 index 000000000000..bac52044ee80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jerrym_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jerrym DistilBertForSequenceClassification from JerryM +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jerrym +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jerrym` is an English model originally trained by JerryM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jerrym_en_5.2.0_3.0_1700434226351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jerrym_en_5.2.0_3.0_1700434226351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jerrym","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jerrym","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jerrym| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JerryM/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jliew_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jliew_en.md new file mode 100644 index 000000000000..78ab75f6ad9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_jliew_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jliew DistilBertForSequenceClassification from jliew +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jliew +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jliew` is an English model originally trained by jliew. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jliew_en_5.2.0_3.0_1700434875538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jliew_en_5.2.0_3.0_1700434875538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jliew","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jliew","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jliew| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jliew/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kidzy_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kidzy_en.md new file mode 100644 index 000000000000..a669d7a68e2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kidzy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_kidzy DistilBertForSequenceClassification from kidzy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_kidzy +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_kidzy` is an English model originally trained by kidzy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kidzy_en_5.2.0_3.0_1700397733666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kidzy_en_5.2.0_3.0_1700397733666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kidzy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kidzy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_kidzy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kidzy/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kiran146_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kiran146_en.md new file mode 100644 index 000000000000..6b53792dc18f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kiran146_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_kiran146 DistilBertForSequenceClassification from Kiran146 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_kiran146 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_kiran146` is an English model originally trained by Kiran146. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kiran146_en_5.2.0_3.0_1700353545981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kiran146_en_5.2.0_3.0_1700353545981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kiran146","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kiran146","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_kiran146| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kiran146/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kjunelee_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kjunelee_en.md new file mode 100644 index 000000000000..1352f31ea52c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_kjunelee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_kjunelee DistilBertForSequenceClassification from kjunelee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_kjunelee +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_kjunelee` is an English model originally trained by kjunelee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kjunelee_en_5.2.0_3.0_1700438002021.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kjunelee_en_5.2.0_3.0_1700438002021.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kjunelee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kjunelee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_kjunelee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kjunelee/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lauer_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lauer_en.md new file mode 100644 index 000000000000..d484832c1be2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lauer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_lauer DistilBertForSequenceClassification from lauer +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_lauer +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_lauer` is an English model originally trained by lauer. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lauer_en_5.2.0_3.0_1700399369031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lauer_en_5.2.0_3.0_1700399369031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lauer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lauer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_lauer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/lauer/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lewiswatson_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lewiswatson_en.md new file mode 100644 index 000000000000..d1d64e839180 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lewiswatson_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_lewiswatson DistilBertForSequenceClassification from lewiswatson +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_lewiswatson +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_lewiswatson` is an English model originally trained by lewiswatson. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lewiswatson_en_5.2.0_3.0_1700386843962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lewiswatson_en_5.2.0_3.0_1700386843962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lewiswatson","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lewiswatson","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_lewiswatson| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/lewiswatson/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_leyuxzhang_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_leyuxzhang_en.md new file mode 100644 index 000000000000..f32d298cee1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_leyuxzhang_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_leyuxzhang DistilBertForSequenceClassification from leyuxzhang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_leyuxzhang +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_leyuxzhang` is an English model originally trained by leyuxzhang. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_leyuxzhang_en_5.2.0_3.0_1700353673498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_leyuxzhang_en_5.2.0_3.0_1700353673498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_leyuxzhang","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_leyuxzhang","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_leyuxzhang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/leyuxzhang/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_linuxcoder_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_linuxcoder_en.md new file mode 100644 index 000000000000..493f36f61b02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_linuxcoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_linuxcoder DistilBertForSequenceClassification from linuxcoder +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_linuxcoder +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_linuxcoder` is an English model originally trained by linuxcoder. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_linuxcoder_en_5.2.0_3.0_1700374879058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_linuxcoder_en_5.2.0_3.0_1700374879058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_linuxcoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_linuxcoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_linuxcoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/linuxcoder/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_loveplay1983_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_loveplay1983_en.md new file mode 100644 index 000000000000..7a7bb5232e78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_loveplay1983_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_loveplay1983 DistilBertForSequenceClassification from loveplay1983 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_loveplay1983 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_loveplay1983` is an English model originally trained by loveplay1983. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_loveplay1983_en_5.2.0_3.0_1700358440207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_loveplay1983_en_5.2.0_3.0_1700358440207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_loveplay1983","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_loveplay1983","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_loveplay1983| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/loveplay1983/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003_en.md new file mode 100644 index 000000000000..7187658393fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003 DistilBertForSequenceClassification from sayakpaul +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003` is an English model originally trained by sayakpaul. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003_en_5.2.0_3.0_1700393418847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003_en_5.2.0_3.0_1700393418847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_lr_0_0003_wd_003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sayakpaul/distilbert-base-uncased-finetuned-emotion-lr-0.0003-wd-003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_movies_186k_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_movies_186k_en.md new file mode 100644 index 000000000000..75026ecd4d99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_movies_186k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_movies_186k DistilBertForSequenceClassification from TFMUNIR +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_movies_186k +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_movies_186k` is an English model originally trained by TFMUNIR. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_movies_186k_en_5.2.0_3.0_1700402177258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_movies_186k_en_5.2.0_3.0_1700402177258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_movies_186k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_movies_186k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_movies_186k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/TFMUNIR/distilbert-base-uncased-finetuned-emotion-movies-186k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_nebo333_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_nebo333_en.md new file mode 100644 index 000000000000..f44c8799ee22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_nebo333_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_nebo333 DistilBertForSequenceClassification from nebo333 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_nebo333 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_nebo333` is an English model originally trained by nebo333. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nebo333_en_5.2.0_3.0_1700425376468.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nebo333_en_5.2.0_3.0_1700425376468.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nebo333","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nebo333","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_nebo333| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nebo333/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_part_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_part_2_en.md new file mode 100644 index 000000000000..8bd55b41ae89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_part_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_part_2 DistilBertForSequenceClassification from Svngoku +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_part_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_part_2` is an English model originally trained by Svngoku. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_part_2_en_5.2.0_3.0_1700394585014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_part_2_en_5.2.0_3.0_1700394585014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_part_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_part_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_part_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Svngoku/distilbert-base-uncased-finetuned-emotion-part-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_postrational_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_postrational_en.md new file mode 100644 index 000000000000..4f9a91ec2fa2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_postrational_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_postrational DistilBertForSequenceClassification from postrational +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_postrational +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_postrational` is an English model originally trained by postrational. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_postrational_en_5.2.0_3.0_1700369584055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_postrational_en_5.2.0_3.0_1700369584055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_postrational","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_postrational","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_postrational| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/postrational/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_realyinchen_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_realyinchen_en.md new file mode 100644 index 000000000000..9c06ff32ab39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_realyinchen_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_realyinchen DistilBertForSequenceClassification from realyinchen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_realyinchen +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_realyinchen` is an English model originally trained by realyinchen. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_realyinchen_en_5.2.0_3.0_1700401044763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_realyinchen_en_5.2.0_3.0_1700401044763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_realyinchen","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_realyinchen","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_realyinchen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/realyinchen/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_reatiny_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_reatiny_en.md new file mode 100644 index 000000000000..55c70d5380c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_reatiny_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_reatiny DistilBertForSequenceClassification from reatiny +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_reatiny +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_reatiny` is an English model originally trained by reatiny.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_reatiny_en_5.2.0_3.0_1700420900389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_reatiny_en_5.2.0_3.0_1700420900389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_reatiny","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_reatiny","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_reatiny| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/reatiny/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sabre_code_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sabre_code_en.md new file mode 100644 index 000000000000..b70a1e5a6160 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sabre_code_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sabre_code DistilBertForSequenceClassification from sabre-code +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sabre_code +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sabre_code` is an English model originally trained by sabre-code.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sabre_code_en_5.2.0_3.0_1700422938301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sabre_code_en_5.2.0_3.0_1700422938301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sabre_code","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sabre_code","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sabre_code| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sabre-code/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sajjad333_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sajjad333_en.md new file mode 100644 index 000000000000..71717ee0af1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sajjad333_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sajjad333 DistilBertForSequenceClassification from sajjad333 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sajjad333 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sajjad333` is an English model originally trained by sajjad333.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sajjad333_en_5.2.0_3.0_1700356883331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sajjad333_en_5.2.0_3.0_1700356883331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sajjad333","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sajjad333","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sajjad333| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sajjad333/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_samrfreitas_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_samrfreitas_en.md new file mode 100644 index 000000000000..597ebe9a5e0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_samrfreitas_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_samrfreitas DistilBertForSequenceClassification from samrfreitas +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_samrfreitas +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_samrfreitas` is an English model originally trained by samrfreitas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_samrfreitas_en_5.2.0_3.0_1700401132501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_samrfreitas_en_5.2.0_3.0_1700401132501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_samrfreitas","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_samrfreitas","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_samrfreitas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/samrfreitas/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_songys_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_songys_en.md new file mode 100644 index 000000000000..2a287304a20b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_songys_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_songys DistilBertForSequenceClassification from songys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_songys +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_songys` is an English model originally trained by songys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_songys_en_5.2.0_3.0_1700429444948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_songys_en_5.2.0_3.0_1700429444948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_songys","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_songys","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_songys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/songys/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sunidhishetty_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sunidhishetty_en.md new file mode 100644 index 000000000000..0a1ddd27cb76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_sunidhishetty_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sunidhishetty DistilBertForSequenceClassification from sunidhishetty +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sunidhishetty +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sunidhishetty` is an English model originally trained by sunidhishetty.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sunidhishetty_en_5.2.0_3.0_1700378806729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sunidhishetty_en_5.2.0_3.0_1700378806729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sunidhishetty","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sunidhishetty","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sunidhishetty| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sunidhishetty/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_tirendaz_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_tirendaz_en.md new file mode 100644 index 000000000000..87c8535893c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_tirendaz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_tirendaz DistilBertForSequenceClassification from Tirendaz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_tirendaz +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_tirendaz` is an English model originally trained by Tirendaz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_tirendaz_en_5.2.0_3.0_1700381604141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_tirendaz_en_5.2.0_3.0_1700381604141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_tirendaz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_tirendaz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_tirendaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tirendaz/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_worldman_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_worldman_en.md new file mode 100644 index 000000000000..65b66ff34a6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_worldman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_worldman DistilBertForSequenceClassification from Worldman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_worldman +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_worldman` is an English model originally trained by Worldman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_worldman_en_5.2.0_3.0_1700424538239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_worldman_en_5.2.0_3.0_1700424538239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_worldman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_worldman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_worldman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Worldman/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_yejinkim_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_yejinkim_en.md new file mode 100644 index 000000000000..afbb707b7f8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_yejinkim_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_yejinkim DistilBertForSequenceClassification from yejinkim +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_yejinkim +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_yejinkim` is an English model originally trained by yejinkim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yejinkim_en_5.2.0_3.0_1700410234104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yejinkim_en_5.2.0_3.0_1700410234104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yejinkim","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yejinkim","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_yejinkim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yejinkim/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_zorualyh_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_zorualyh_en.md new file mode 100644 index 000000000000..2602a211ea81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_emotion_zorualyh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_zorualyh DistilBertForSequenceClassification from Zorualyh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_zorualyh +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_zorualyh` is an English model originally trained by Zorualyh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_zorualyh_en_5.2.0_3.0_1700378677431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_zorualyh_en_5.2.0_3.0_1700378677431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_zorualyh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_zorualyh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_zorualyh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Zorualyh/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_fashion_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_fashion_en.md new file mode 100644 index 000000000000..cd57a83aa2f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_fashion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_fashion DistilBertForSequenceClassification from rasta +author: John Snow Labs +name: distilbert_base_uncased_finetuned_fashion +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_fashion` is an English model originally trained by rasta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_fashion_en_5.2.0_3.0_1700377196972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_fashion_en_5.2.0_3.0_1700377196972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_fashion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_fashion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_fashion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rasta/distilbert-base-uncased-finetuned-fashion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_go_emotions_20220608_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_go_emotions_20220608_1_en.md new file mode 100644 index 000000000000..83f1917977b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_go_emotions_20220608_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_go_emotions_20220608_1 DistilBertForSequenceClassification from jungealexander +author: John Snow Labs +name: distilbert_base_uncased_finetuned_go_emotions_20220608_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_go_emotions_20220608_1` is an English model originally trained by jungealexander.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_go_emotions_20220608_1_en_5.2.0_3.0_1700353390789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_go_emotions_20220608_1_en_5.2.0_3.0_1700353390789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_go_emotions_20220608_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_go_emotions_20220608_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_go_emotions_20220608_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jungealexander/distilbert-base-uncased-finetuned-go_emotions_20220608_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_greenplastics_small_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_greenplastics_small_en.md new file mode 100644 index 000000000000..256bee6694be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_greenplastics_small_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_greenplastics_small DistilBertForSequenceClassification from cwinkler +author: John Snow Labs +name: distilbert_base_uncased_finetuned_greenplastics_small +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_greenplastics_small` is an English model originally trained by cwinkler.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenplastics_small_en_5.2.0_3.0_1700369056309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenplastics_small_en_5.2.0_3.0_1700369056309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenplastics_small","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenplastics_small","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_greenplastics_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/cwinkler/distilbert-base-uncased-finetuned-greenplastics-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_health_facts_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_health_facts_en.md new file mode 100644 index 000000000000..c0c84aee1f4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_health_facts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_health_facts DistilBertForSequenceClassification from austinmw +author: John Snow Labs +name: distilbert_base_uncased_finetuned_health_facts +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_health_facts` is an English model originally trained by austinmw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_health_facts_en_5.2.0_3.0_1700401882354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_health_facts_en_5.2.0_3.0_1700401882354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_health_facts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_health_facts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_health_facts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/austinmw/distilbert-base-uncased-finetuned-health_facts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_blur_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_blur_en.md new file mode 100644 index 000000000000..8e6d021652a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_blur_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_blur DistilBertForSequenceClassification from sahn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_blur +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_blur` is an English model originally trained by sahn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blur_en_5.2.0_3.0_1700372164137.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_blur_en_5.2.0_3.0_1700372164137.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_blur","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_blur","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_blur| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sahn/distilbert-base-uncased-finetuned-imdb-blur \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_sahn_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_sahn_en.md new file mode 100644 index 000000000000..84ca94b48e35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_sahn_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_sahn DistilBertForSequenceClassification from sahn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_sahn +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_sahn` is an English model originally trained by sahn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sahn_en_5.2.0_3.0_1700422938422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_sahn_en_5.2.0_3.0_1700422938422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_sahn","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_sahn","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_sahn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sahn/distilbert-base-uncased-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_subtle_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_subtle_en.md new file mode 100644 index 000000000000..a0eb62735ceb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_imdb_subtle_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_imdb_subtle DistilBertForSequenceClassification from sahn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_imdb_subtle +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_imdb_subtle` is an English model originally trained by sahn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_subtle_en_5.2.0_3.0_1700422044785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_imdb_subtle_en_5.2.0_3.0_1700422044785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_subtle","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_imdb_subtle","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_imdb_subtle| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sahn/distilbert-base-uncased-finetuned-imdb-subtle \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_katsiaryna_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_katsiaryna_en.md new file mode 100644 index 000000000000..92269f8e436e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_katsiaryna_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_katsiaryna DistilBertForSequenceClassification from Katsiaryna +author: John Snow Labs +name: distilbert_base_uncased_finetuned_katsiaryna +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_katsiaryna` is an English model originally trained by Katsiaryna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_katsiaryna_en_5.2.0_3.0_1700353848963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_katsiaryna_en_5.2.0_3.0_1700353848963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_katsiaryna","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_katsiaryna","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_katsiaryna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Katsiaryna/distilbert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_blizrys_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_blizrys_en.md new file mode 100644 index 000000000000..66152c9c7bc4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_blizrys_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_mnli_blizrys DistilBertForSequenceClassification from blizrys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_mnli_blizrys +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_mnli_blizrys` is an English model originally trained by blizrys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_blizrys_en_5.2.0_3.0_1700391534419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_blizrys_en_5.2.0_3.0_1700391534419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_blizrys","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_blizrys","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_mnli_blizrys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/blizrys/distilbert-base-uncased-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_seishin_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_seishin_en.md new file mode 100644 index 000000000000..467006a88fac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mnli_seishin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_mnli_seishin DistilBertForSequenceClassification from SEISHIN +author: John Snow Labs +name: distilbert_base_uncased_finetuned_mnli_seishin +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_mnli_seishin` is an English model originally trained by SEISHIN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_seishin_en_5.2.0_3.0_1700353196055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mnli_seishin_en_5.2.0_3.0_1700353196055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_seishin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mnli_seishin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_mnli_seishin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SEISHIN/distilbert-base-uncased-finetuned-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_moral_action_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_moral_action_en.md new file mode 100644 index 000000000000..cb82cb082350 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_moral_action_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_moral_action DistilBertForSequenceClassification from agi-css +author: John Snow Labs +name: distilbert_base_uncased_finetuned_moral_action +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_moral_action` is an English model originally trained by agi-css.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_moral_action_en_5.2.0_3.0_1700421835888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_moral_action_en_5.2.0_3.0_1700421835888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_moral_action","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_moral_action","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_moral_action| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/agi-css/distilbert-base-uncased-finetuned-moral-action \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mrpc_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mrpc_anirudh21_en.md new file mode 100644 index 000000000000..7f70646db8c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_mrpc_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_mrpc_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_mrpc_anirudh21 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_mrpc_anirudh21` is an English model originally trained by anirudh21.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mrpc_anirudh21_en_5.2.0_3.0_1700407453567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_mrpc_anirudh21_en_5.2.0_3.0_1700407453567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mrpc_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_mrpc_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_mrpc_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_category_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_category_en.md new file mode 100644 index 000000000000..61d9d4191b9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_category_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_news_category DistilBertForSequenceClassification from Neha2608 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_news_category +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_news_category` is an English model originally trained by Neha2608.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_category_en_5.2.0_3.0_1700388839479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_category_en_5.2.0_3.0_1700388839479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_category","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_category","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_news_category| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Neha2608/distilbert-base-uncased-finetuned-news-category \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_mosesju_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_mosesju_en.md new file mode 100644 index 000000000000..48013b4936c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_news_mosesju_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_news_mosesju DistilBertForSequenceClassification from mosesju +author: John Snow Labs +name: distilbert_base_uncased_finetuned_news_mosesju +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_news_mosesju` is an English model originally trained by mosesju.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_mosesju_en_5.2.0_3.0_1700396760878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_news_mosesju_en_5.2.0_3.0_1700396760878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_mosesju","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_news_mosesju","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_news_mosesju| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mosesju/distilbert-base-uncased-finetuned-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_1_en.md new file mode 100644 index 000000000000..e61ed9df80b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pre_requisite_finder_1 DistilBertForSequenceClassification from bhattronak14 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pre_requisite_finder_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pre_requisite_finder_1` is an English model originally trained by bhattronak14.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_1_en_5.2.0_3.0_1700355687999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_1_en_5.2.0_3.0_1700355687999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pre_requisite_finder_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bhattronak14/distilbert-base-uncased-finetuned-Pre_requisite_finder_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14_en.md new file mode 100644 index 000000000000..2fe63f11a73d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14 DistilBertForSequenceClassification from bhattronak14 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14` is an English model originally trained by bhattronak14.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14_en_5.2.0_3.0_1700364695867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14_en_5.2.0_3.0_1700364695867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pre_requisite_finder_bhattronak14| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/bhattronak14/distilbert-base-uncased-finetuned-Pre_requisite_finder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma_en.md new file mode 100644 index 000000000000..545ecbd2680a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma DistilBertForSequenceClassification from satyamverma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma` is an English model originally trained by satyamverma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma_en_5.2.0_3.0_1700425376509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma_en_5.2.0_3.0_1700425376509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pre_requisite_finder_satyamverma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/satyamverma/distilbert-base-uncased-finetuned-Pre_requisite_finder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_qnli_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_qnli_anirudh21_en.md new file mode 100644 index 000000000000..bac459f48bdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_qnli_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_qnli_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_qnli_anirudh21 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_qnli_anirudh21` is an English model originally trained by anirudh21.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qnli_anirudh21_en_5.2.0_3.0_1700425376517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qnli_anirudh21_en_5.2.0_3.0_1700425376517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qnli_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_qnli_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_qnli_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_rte_danlou_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_rte_danlou_en.md new file mode 100644 index 000000000000..b51992d022ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_rte_danlou_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_rte_danlou DistilBertForSequenceClassification from danlou +author: John Snow Labs +name: distilbert_base_uncased_finetuned_rte_danlou +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_rte_danlou` is an English model originally trained by danlou. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_rte_danlou_en_5.2.0_3.0_1700433407692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_rte_danlou_en_5.2.0_3.0_1700433407692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_rte_danlou","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_rte_danlou","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_rte_danlou| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/danlou/distilbert-base-uncased-finetuned-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sentence_intent_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sentence_intent_en.md new file mode 100644 index 000000000000..4613e4551cd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sentence_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sentence_intent DistilBertForSequenceClassification from Bukun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sentence_intent +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sentence_intent` is an English model originally trained by Bukun. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sentence_intent_en_5.2.0_3.0_1700406653317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sentence_intent_en_5.2.0_3.0_1700406653317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sentence_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sentence_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sentence_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Bukun/distilbert-base-uncased-finetuned-sentence-intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_short_answer_assessment_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_short_answer_assessment_en.md new file mode 100644 index 000000000000..c717360f1252 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_short_answer_assessment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_short_answer_assessment DistilBertForSequenceClassification from Giyaseddin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_short_answer_assessment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_short_answer_assessment` is an English model originally trained by Giyaseddin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_short_answer_assessment_en_5.2.0_3.0_1700360721458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_short_answer_assessment_en_5.2.0_3.0_1700360721458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_short_answer_assessment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_short_answer_assessment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_short_answer_assessment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Giyaseddin/distilbert-base-uncased-finetuned-short-answer-assessment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_spam_jg_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_spam_jg_en.md new file mode 100644 index 000000000000..0c5804869474 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_spam_jg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_spam_jg DistilBertForSequenceClassification from jg +author: John Snow Labs +name: distilbert_base_uncased_finetuned_spam_jg +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_spam_jg` is an English model originally trained by jg. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_spam_jg_en_5.2.0_3.0_1700355000292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_spam_jg_en_5.2.0_3.0_1700355000292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_spam_jg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_spam_jg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_spam_jg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/jg/distilbert-base-uncased-finetuned-spam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_anirudh21_en.md new file mode 100644 index 000000000000..caa2d92aa922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_anirudh21 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_anirudh21` is an English model originally trained by anirudh21. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_anirudh21_en_5.2.0_3.0_1700429250240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_anirudh21_en_5.2.0_3.0_1700429250240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_avneet_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_avneet_en.md new file mode 100644 index 000000000000..0610cfb21e2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_avneet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_avneet DistilBertForSequenceClassification from avneet +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_avneet +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_avneet` is an English model originally trained by avneet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_avneet_en_5.2.0_3.0_1700426488006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_avneet_en_5.2.0_3.0_1700426488006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_avneet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_avneet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_avneet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/avneet/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_rwang5688_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_rwang5688_en.md new file mode 100644 index 000000000000..920ed50b174e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst2_rwang5688_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_rwang5688 DistilBertForSequenceClassification from rwang5688 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_rwang5688 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_rwang5688` is an English model originally trained by rwang5688. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_rwang5688_en_5.2.0_3.0_1700418156529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_rwang5688_en_5.2.0_3.0_1700418156529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_rwang5688","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_rwang5688","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_rwang5688| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/rwang5688/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb_en.md new file mode 100644 index 000000000000..3ea42f0c8a0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb DistilBertForSequenceClassification from kurianbenoy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb` is an English model originally trained by kurianbenoy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb_en_5.2.0_3.0_1700376774702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb_en_5.2.0_3.0_1700376774702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst_2_english_finetuned_imdb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/kurianbenoy/distilbert-base-uncased-finetuned-sst-2-english-finetuned-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_switchboard_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_switchboard_2_en.md new file mode 100644 index 000000000000..a28adbed8f30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_switchboard_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_switchboard_2 DistilBertForSequenceClassification from goldenk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_switchboard_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_switchboard_2` is an English model originally trained by goldenk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_switchboard_2_en_5.2.0_3.0_1700421630004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_switchboard_2_en_5.2.0_3.0_1700421630004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_switchboard_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_switchboard_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_switchboard_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/goldenk/distilbert-base-uncased-finetuned-switchboard-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_text_2_disease_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_text_2_disease_en.md new file mode 100644 index 000000000000..92e0c373e6fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_text_2_disease_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_text_2_disease DistilBertForSequenceClassification from venetis +author: John Snow Labs +name: distilbert_base_uncased_finetuned_text_2_disease +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_text_2_disease` is an English model originally trained by venetis. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_en_5.2.0_3.0_1700367583094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_en_5.2.0_3.0_1700367583094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_text_2_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/venetis/distilbert-base-uncased_finetuned_text_2_disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md new file mode 100644 index 000000000000..af50421be883 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds` is an English model originally trained by francisco-perez-sorrosal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en_5.2.0_3.0_1700438198037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds_en_5.2.0_3.0_1700438198037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_with_spanish_tweets_clf_cleaned_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/distilbert-base-uncased-finetuned-with-spanish-tweets-clf-cleaned-ds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_wnli_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_wnli_anirudh21_en.md new file mode 100644 index 000000000000..14b8b394f899 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_wnli_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_wnli_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_wnli_anirudh21 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_wnli_anirudh21` is an English model originally trained by anirudh21. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_wnli_anirudh21_en_5.2.0_3.0_1700429444932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_wnli_anirudh21_en_5.2.0_3.0_1700429444932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_wnli_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_wnli_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_wnli_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-wnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_yelp_reviews_tmp_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_yelp_reviews_tmp_en.md new file mode 100644 index 000000000000..f62da55085dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_finetuned_yelp_reviews_tmp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_yelp_reviews_tmp DistilBertForSequenceClassification from Ramamurthi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_yelp_reviews_tmp +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_yelp_reviews_tmp` is an English model originally trained by Ramamurthi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_yelp_reviews_tmp_en_5.2.0_3.0_1700362672026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_yelp_reviews_tmp_en_5.2.0_3.0_1700362672026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_yelp_reviews_tmp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_yelp_reviews_tmp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_yelp_reviews_tmp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Ramamurthi/distilbert-base-uncased-finetuned-yelp-reviews-tmp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_ft_m3_lc_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_ft_m3_lc_en.md new file mode 100644 index 000000000000..eea6524caf03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_ft_m3_lc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_ft_m3_lc DistilBertForSequenceClassification from sarahmiller137 +author: John Snow Labs +name: distilbert_base_uncased_ft_m3_lc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ft_m3_lc` is an English model originally trained by sarahmiller137. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ft_m3_lc_en_5.2.0_3.0_1700385771260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ft_m3_lc_en_5.2.0_3.0_1700385771260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_ft_m3_lc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_ft_m3_lc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ft_m3_lc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/sarahmiller137/distilbert-base-uncased-ft-m3-lc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_hate_offensive_oriya_normal_speech_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_hate_offensive_oriya_normal_speech_en.md new file mode 100644 index 000000000000..ab5864af0076 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_hate_offensive_oriya_normal_speech_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_hate_offensive_oriya_normal_speech DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_hate_offensive_oriya_normal_speech +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_hate_offensive_oriya_normal_speech` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_hate_offensive_oriya_normal_speech_en_5.2.0_3.0_1700373234139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_hate_offensive_oriya_normal_speech_en_5.2.0_3.0_1700373234139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_hate_offensive_oriya_normal_speech","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_hate_offensive_oriya_normal_speech","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_hate_offensive_oriya_normal_speech| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Hate_Offensive_or_Normal_Speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_mvonwyl_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_mvonwyl_en.md new file mode 100644 index 000000000000..b92abdb8ba62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_mvonwyl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_mvonwyl DistilBertForSequenceClassification from mvonwyl +author: John Snow Labs +name: distilbert_base_uncased_imdb_mvonwyl +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_mvonwyl` is an English model originally trained by mvonwyl. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_mvonwyl_en_5.2.0_3.0_1700407453574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_mvonwyl_en_5.2.0_3.0_1700407453574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_mvonwyl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_mvonwyl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_mvonwyl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mvonwyl/distilbert-base-uncased-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_reviews_en.md new file mode 100644 index 000000000000..fb07ab7caf2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_reviews DistilBertForSequenceClassification from minhhoque +author: John Snow Labs +name: distilbert_base_uncased_imdb_reviews +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_reviews` is an English model originally trained by minhhoque. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_reviews_en_5.2.0_3.0_1700385239377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_reviews_en_5.2.0_3.0_1700385239377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/minhhoque/distilbert-base-uncased_imdb_reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_saved_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_saved_en.md new file mode 100644 index 000000000000..2624c8df3444 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_imdb_saved_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_saved DistilBertForSequenceClassification from thaile +author: John Snow Labs +name: distilbert_base_uncased_imdb_saved +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_saved` is an English model originally trained by thaile. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_saved_en_5.2.0_3.0_1700416280078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_saved_en_5.2.0_3.0_1700416280078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_saved","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_saved","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_saved| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/thaile/distilbert-base-uncased-imdb-saved \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_jigsaw_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_jigsaw_en.md new file mode 100644 index 000000000000..3d31bf12f5e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_jigsaw_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_jigsaw DistilBertForSequenceClassification from hohyun312 +author: John Snow Labs +name: distilbert_base_uncased_jigsaw +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_jigsaw` is an English model originally trained by hohyun312. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_jigsaw_en_5.2.0_3.0_1700380512166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_jigsaw_en_5.2.0_3.0_1700380512166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_jigsaw","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_jigsaw","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_jigsaw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hohyun312/distilbert-base-uncased-jigsaw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_log_classfication_v1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_log_classfication_v1_en.md new file mode 100644 index 000000000000..34e577e35b2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_log_classfication_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_log_classfication_v1 DistilBertForSequenceClassification from gemlincong +author: John Snow Labs +name: distilbert_base_uncased_log_classfication_v1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_log_classfication_v1` is an English model originally trained by gemlincong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_log_classfication_v1_en_5.2.0_3.0_1700358628436.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_log_classfication_v1_en_5.2.0_3.0_1700358628436.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_log_classfication_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_log_classfication_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_log_classfication_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/gemlincong/distilbert-base-uncased_log_classfication_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_onionornot_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_onionornot_en.md new file mode 100644 index 000000000000..22366839f025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_onionornot_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_onionornot DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_onionornot +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_onionornot` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_onionornot_en_5.2.0_3.0_1700374473883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_onionornot_en_5.2.0_3.0_1700374473883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_onionornot","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_onionornot","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_onionornot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-OnionOrNot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_pakornor_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_pakornor_en.md new file mode 100644 index 000000000000..73f473f1c6a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_pakornor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_pakornor DistilBertForSequenceClassification from pakornor +author: John Snow Labs +name: distilbert_base_uncased_pakornor +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_pakornor` is an English model originally trained by pakornor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pakornor_en_5.2.0_3.0_1700356742805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pakornor_en_5.2.0_3.0_1700356742805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_pakornor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_pakornor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_pakornor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/pakornor/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands_en.md new file mode 100644 index 000000000000..29334d701e39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands_en_5.2.0_3.0_1700394098954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands_en_5.2.0_3.0_1700394098954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_regression_edmunds_car_reviews_all_car_brands| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Regression-Edmunds_Car_Reviews-all_car_brands \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_american_made_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_american_made_en.md new file mode 100644 index 000000000000..e89447a4553b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_american_made_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_regression_edmunds_car_reviews_american_made DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_regression_edmunds_car_reviews_american_made +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_regression_edmunds_car_reviews_american_made` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_american_made_en_5.2.0_3.0_1700413214566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_american_made_en_5.2.0_3.0_1700413214566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_american_made","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_american_made","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_regression_edmunds_car_reviews_american_made| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Regression-Edmunds_Car_Reviews-American_Made \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_european_made_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_european_made_en.md new file mode 100644 index 000000000000..d036cb4412d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_european_made_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_regression_edmunds_car_reviews_european_made DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_regression_edmunds_car_reviews_european_made +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_regression_edmunds_car_reviews_european_made` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_european_made_en_5.2.0_3.0_1700430374767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_european_made_en_5.2.0_3.0_1700430374767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_european_made","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_european_made","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_regression_edmunds_car_reviews_european_made| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Regression-Edmunds_Car_Reviews-European_Made \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports_en.md new file mode 100644 index 000000000000..5c9de70e7d90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports_en_5.2.0_3.0_1700356720185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports_en_5.2.0_3.0_1700356720185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_regression_edmunds_car_reviews_non_european_imports| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Regression-Edmunds_Car_Reviews-Non_European_Imports \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_simpsons_plus_others_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_simpsons_plus_others_en.md new file mode 100644 index 000000000000..dd14a964cd07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_regression_simpsons_plus_others_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_regression_simpsons_plus_others DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_regression_simpsons_plus_others +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_regression_simpsons_plus_others` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_simpsons_plus_others_en_5.2.0_3.0_1700426678287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_regression_simpsons_plus_others_en_5.2.0_3.0_1700426678287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_simpsons_plus_others","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_regression_simpsons_plus_others","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_regression_simpsons_plus_others| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-Regression-Simpsons_Plus_Others \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_research_articles_multilabel_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_research_articles_multilabel_en.md new file mode 100644 index 000000000000..78a8e382539e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_research_articles_multilabel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_research_articles_multilabel DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_research_articles_multilabel +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_research_articles_multilabel` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_research_articles_multilabel_en_5.2.0_3.0_1700428493381.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_research_articles_multilabel_en_5.2.0_3.0_1700428493381.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_research_articles_multilabel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_research_articles_multilabel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_research_articles_multilabel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased_research_articles_multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_reviews_multilabel_clf_v2_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_reviews_multilabel_clf_v2_en.md new file mode 100644 index 000000000000..f025a4061f1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_reviews_multilabel_clf_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_reviews_multilabel_clf_v2 DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_reviews_multilabel_clf_v2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_reviews_multilabel_clf_v2` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_reviews_multilabel_clf_v2_en_5.2.0_3.0_1700355277740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_reviews_multilabel_clf_v2_en_5.2.0_3.0_1700355277740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_reviews_multilabel_clf_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_reviews_multilabel_clf_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_reviews_multilabel_clf_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-reviews_multilabel_clf_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_rotten_tomatoes_xianzhew_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_rotten_tomatoes_xianzhew_en.md new file mode 100644 index 000000000000..37e40a2dc7e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_rotten_tomatoes_xianzhew_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_rotten_tomatoes_xianzhew DistilBertForSequenceClassification from xianzhew +author: John Snow Labs +name: distilbert_base_uncased_rotten_tomatoes_xianzhew +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_rotten_tomatoes_xianzhew` is an English model originally trained by xianzhew. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_rotten_tomatoes_xianzhew_en_5.2.0_3.0_1700356178939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_rotten_tomatoes_xianzhew_en_5.2.0_3.0_1700356178939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_rotten_tomatoes_xianzhew","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_rotten_tomatoes_xianzhew","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_rotten_tomatoes_xianzhew| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/xianzhew/distilbert-base-uncased_rotten_tomatoes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_sexist_epoch2_norwegian_config_json_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_sexist_epoch2_norwegian_config_json_en.md new file mode 100644 index 000000000000..ff11f8c00e28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_sexist_epoch2_norwegian_config_json_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sexist_epoch2_norwegian_config_json DistilBertForSequenceClassification from l-tran +author: John Snow Labs +name: distilbert_base_uncased_sexist_epoch2_norwegian_config_json +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sexist_epoch2_norwegian_config_json` is an English model originally trained by l-tran. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sexist_epoch2_norwegian_config_json_en_5.2.0_3.0_1700403973072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sexist_epoch2_norwegian_config_json_en_5.2.0_3.0_1700403973072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sexist_epoch2_norwegian_config_json","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sexist_epoch2_norwegian_config_json","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sexist_epoch2_norwegian_config_json| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/l-tran/distilbert-base-uncased-sexist-epoch2-no-config-json \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_spamfilter_lg_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_spamfilter_lg_en.md new file mode 100644 index 000000000000..79740a4fc2fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_spamfilter_lg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_spamfilter_lg DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_spamfilter_lg +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_spamfilter_lg` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_spamfilter_lg_en_5.2.0_3.0_1700412343586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_spamfilter_lg_en_5.2.0_3.0_1700412343586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_spamfilter_lg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_spamfilter_lg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_spamfilter_lg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-SpamFilter-LG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1_en.md new file mode 100644 index 000000000000..ff57fdf59d3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1 DistilBertForSequenceClassification from rajesh426 +author: John Snow Labs +name: distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1` is an English model originally trained by rajesh426. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1_en_5.2.0_3.0_1700419137554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1_en_5.2.0_3.0_1700419137554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_up_sampling_sub_category_speech_text_display_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rajesh426/distilbert-base-uncased_Up_Sampling_Sub_Category_SPEECH_TEXT_DISPLAY_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22_en.md new file mode 100644 index 000000000000..b0c392a38c0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22 DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22` is an English model originally trained by DunnBC22. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22_en_5.2.0_3.0_1700395909578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22_en_5.2.0_3.0_1700395909578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_us_airline_twitter_sentiment_analysis_dunnbc22| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-US_Airline_Twitter_Sentiment_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234_en.md new file mode 100644 index 000000000000..078309df2674 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234 DistilBertForSequenceClassification from Vin1234 +author: John Snow Labs +name: distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234` is an English model originally trained by Vin1234. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234_en_5.2.0_3.0_1700353553203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234_en_5.2.0_3.0_1700353553203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_us_airline_twitter_sentiment_analysis_vin1234| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Vin1234/distilbert-base-uncased-US_Airline_Twitter_Sentiment_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_wandb_week_3_complaints_classifier_1500_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_wandb_week_3_complaints_classifier_1500_en.md new file mode 100644 index 000000000000..2db3823b80e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_base_uncased_wandb_week_3_complaints_classifier_1500_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_wandb_week_3_complaints_classifier_1500 DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilbert_base_uncased_wandb_week_3_complaints_classifier_1500 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_wandb_week_3_complaints_classifier_1500` is an English model originally trained by Kayvane. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_wandb_week_3_complaints_classifier_1500_en_5.2.0_3.0_1700355932529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_wandb_week_3_complaints_classifier_1500_en_5.2.0_3.0_1700355932529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_wandb_week_3_complaints_classifier_1500","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_wandb_week_3_complaints_classifier_1500","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_wandb_week_3_complaints_classifier_1500| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kayvane/distilbert-base-uncased-wandb-week-3-complaints-classifier-1500 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_bookreviews_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_bookreviews_en.md new file mode 100644 index 000000000000..ee9d8ba47d60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_bookreviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_bookreviews DistilBertForSequenceClassification from bierus +author: John Snow Labs +name: distilbert_bookreviews +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_bookreviews` is an English model originally trained by bierus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_bookreviews_en_5.2.0_3.0_1700352444264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_bookreviews_en_5.2.0_3.0_1700352444264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bookreviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bookreviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_bookreviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bierus/distilbert_bookreviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_bug_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_bug_classifier_en.md new file mode 100644 index 000000000000..a2ed8cbcef47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_bug_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_bug_classifier DistilBertForSequenceClassification from Peterard +author: John Snow Labs +name: distilbert_bug_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_bug_classifier` is an English model originally trained by Peterard. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_bug_classifier_en_5.2.0_3.0_1700353015275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_bug_classifier_en_5.2.0_3.0_1700353015275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from pyspark.ml import Pipeline +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification + +# Assumes an active SparkSession named spark, e.g. one created with sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bug_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_bug_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_bug_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.4 MB| + +## References + +https://huggingface.co/Peterard/distilbert_bug_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_cnn_news_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_cnn_news_en.md new file mode 100644 index 000000000000..894b6094a90c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_cnn_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_cnn_news DistilBertForSequenceClassification from AyoubChLin +author: John Snow Labs +name: distilbert_cnn_news +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cnn_news` is an English model originally trained by AyoubChLin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cnn_news_en_5.2.0_3.0_1700403099834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cnn_news_en_5.2.0_3.0_1700403099834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cnn_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cnn_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cnn_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AyoubChLin/distilbert_cnn_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_cohere_v7_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_cohere_v7_en.md new file mode 100644 index 000000000000..d24044905d6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_cohere_v7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_cohere_v7 DistilBertForSequenceClassification from clam004 +author: John Snow Labs +name: distilbert_cohere_v7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cohere_v7` is an English model originally trained by clam004. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cohere_v7_en_5.2.0_3.0_1700399210458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cohere_v7_en_5.2.0_3.0_1700399210458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cohere_v7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_cohere_v7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cohere_v7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/clam004/distilbert-cohere-v7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_coherent_v3_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_coherent_v3_en.md new file mode 100644 index 000000000000..ac23e9f467eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_coherent_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_coherent_v3 DistilBertForSequenceClassification from clam004 +author: John Snow Labs +name: distilbert_coherent_v3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_coherent_v3` is an English model originally trained by clam004. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_coherent_v3_en_5.2.0_3.0_1700388840832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_coherent_v3_en_5.2.0_3.0_1700388840832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_coherent_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_coherent_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_coherent_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/clam004/distilbert-coherent-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_combined_large_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_combined_large_en.md new file mode 100644 index 000000000000..a05c2f3e3fec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_combined_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_combined_large DistilBertForSequenceClassification from asparius +author: John Snow Labs +name: distilbert_combined_large +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_combined_large` is an English model originally trained by asparius. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_combined_large_en_5.2.0_3.0_1700431385136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_combined_large_en_5.2.0_3.0_1700431385136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_combined_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_combined_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_combined_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|254.0 MB| + +## References + +https://huggingface.co/asparius/distilbert-combined-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_complaints_wandb_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_complaints_wandb_en.md new file mode 100644 index 000000000000..3899dfb318dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_complaints_wandb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_complaints_wandb DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilbert_complaints_wandb +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_complaints_wandb` is an English model originally trained by Kayvane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_complaints_wandb_en_5.2.0_3.0_1700411371165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_complaints_wandb_en_5.2.0_3.0_1700411371165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_complaints_wandb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_complaints_wandb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_complaints_wandb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kayvane/distilbert-complaints-wandb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_emotion_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_emotion_analysis_en.md new file mode 100644 index 000000000000..955583e814d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_emotion_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_emotion_analysis DistilBertForSequenceClassification from WillyWilliam +author: John Snow Labs +name: distilbert_emotion_analysis +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_emotion_analysis` is an English model originally trained by WillyWilliam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_emotion_analysis_en_5.2.0_3.0_1700432361519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_emotion_analysis_en_5.2.0_3.0_1700432361519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_emotion_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_emotion_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/WillyWilliam/distilbert-emotion-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finance_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finance_en.md new file mode 100644 index 000000000000..8ae41081167d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finance_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finance DistilBertForSequenceClassification from IngeniousArtist +author: John Snow Labs +name: distilbert_finance +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finance` is an English model originally trained by IngeniousArtist. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finance_en_5.2.0_3.0_1700411201948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finance_en_5.2.0_3.0_1700411201948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finance","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finance","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/IngeniousArtist/distilbert-finance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tune_questionvsanswer_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tune_questionvsanswer_en.md new file mode 100644 index 000000000000..ef8589161036 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tune_questionvsanswer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_fine_tune_questionvsanswer DistilBertForSequenceClassification from Ahmedgr +author: John Snow Labs +name: distilbert_fine_tune_questionvsanswer +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fine_tune_questionvsanswer` is an English model originally trained by Ahmedgr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fine_tune_questionvsanswer_en_5.2.0_3.0_1700383573090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fine_tune_questionvsanswer_en_5.2.0_3.0_1700383573090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tune_questionvsanswer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tune_questionvsanswer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fine_tune_questionvsanswer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Ahmedgr/DistilBert_Fine_tune_QuestionVsAnswer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tuned_terms_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tuned_terms_en.md new file mode 100644 index 000000000000..05aa3bb5ceba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_fine_tuned_terms_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_fine_tuned_terms DistilBertForSequenceClassification from alexskrn +author: John Snow Labs +name: distilbert_fine_tuned_terms +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fine_tuned_terms` is an English model originally trained by alexskrn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_terms_en_5.2.0_3.0_1700424950540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_terms_en_5.2.0_3.0_1700424950540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_terms","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_terms","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fine_tuned_terms| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexskrn/distilbert-fine-tuned-terms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_emotion_en.md new file mode 100644 index 000000000000..4d32f4fc1a4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_emotion DistilBertForSequenceClassification from ADRIANRICO +author: John Snow Labs +name: distilbert_finetuned_emotion +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_emotion` is an English model originally trained by ADRIANRICO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_emotion_en_5.2.0_3.0_1700355401593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_emotion_en_5.2.0_3.0_1700355401593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ADRIANRICO/Distilbert-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_fakenews_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_fakenews_en.md new file mode 100644 index 000000000000..156fb433ec88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_fakenews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_fakenews DistilBertForSequenceClassification from Tahsin-Mayeesha +author: John Snow Labs +name: distilbert_finetuned_fakenews +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_fakenews` is an English model originally trained by Tahsin-Mayeesha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_fakenews_en_5.2.0_3.0_1700390474797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_fakenews_en_5.2.0_3.0_1700390474797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_fakenews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_fakenews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_fakenews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tahsin-Mayeesha/distilbert-finetuned-fakenews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_flipkart_product_reviews_kaggle_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_flipkart_product_reviews_kaggle_en.md new file mode 100644 index 000000000000..69c91b19f273 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_flipkart_product_reviews_kaggle_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_flipkart_product_reviews_kaggle DistilBertForSequenceClassification from prajwalkhairnar +author: John Snow Labs +name: distilbert_finetuned_flipkart_product_reviews_kaggle +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_flipkart_product_reviews_kaggle` is an English model originally trained by prajwalkhairnar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_flipkart_product_reviews_kaggle_en_5.2.0_3.0_1700353698424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_flipkart_product_reviews_kaggle_en_5.2.0_3.0_1700353698424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_flipkart_product_reviews_kaggle","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_flipkart_product_reviews_kaggle","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_flipkart_product_reviews_kaggle| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/prajwalkhairnar/distilbert_finetuned_flipkart_product_reviews_kaggle \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_reuters21578_multilabel_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_reuters21578_multilabel_en.md new file mode 100644 index 000000000000..6b66455fd4fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_reuters21578_multilabel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_reuters21578_multilabel DistilBertForSequenceClassification from lxyuan +author: John Snow Labs +name: distilbert_finetuned_reuters21578_multilabel +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_reuters21578_multilabel` is an English model originally trained by lxyuan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_reuters21578_multilabel_en_5.2.0_3.0_1700434875916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_reuters21578_multilabel_en_5.2.0_3.0_1700434875916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_reuters21578_multilabel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_reuters21578_multilabel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_reuters21578_multilabel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.3 MB| + +## References + +https://huggingface.co/lxyuan/distilbert-finetuned-reuters21578-multilabel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_vietnamese_question_type_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_vietnamese_question_type_en.md new file mode 100644 index 000000000000..7b8f525a21ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_vietnamese_question_type_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_vietnamese_question_type DistilBertForSequenceClassification from EddieChen372 +author: John Snow Labs +name: distilbert_finetuned_vietnamese_question_type +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_vietnamese_question_type` is an English model originally trained by EddieChen372.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_vietnamese_question_type_en_5.2.0_3.0_1700414567864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_vietnamese_question_type_en_5.2.0_3.0_1700414567864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_vietnamese_question_type","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_vietnamese_question_type","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_vietnamese_question_type| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/EddieChen372/distilbert-finetuned-vi-question_type \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_yahoo_answers_topics_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_yahoo_answers_topics_en.md new file mode 100644 index 000000000000..ddfbf4167cac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_finetuned_yahoo_answers_topics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_yahoo_answers_topics DistilBertForSequenceClassification from gavulsim +author: John Snow Labs +name: distilbert_finetuned_yahoo_answers_topics +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_yahoo_answers_topics` is an English model originally trained by gavulsim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_yahoo_answers_topics_en_5.2.0_3.0_1700408449498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_yahoo_answers_topics_en_5.2.0_3.0_1700408449498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_yahoo_answers_topics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_yahoo_answers_topics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_yahoo_answers_topics| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/gavulsim/distilbert_finetuned_yahoo_answers_topics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_for_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_for_text_classification_en.md new file mode 100644 index 000000000000..c44218c6706f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_for_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_for_text_classification DistilBertForSequenceClassification from RavenK +author: John Snow Labs +name: distilbert_for_text_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_for_text_classification` is an English model originally trained by RavenK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_for_text_classification_en_5.2.0_3.0_1700353983598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_for_text_classification_en_5.2.0_3.0_1700353983598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_for_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_for_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_for_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/RavenK/distilbert_for_text_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_gsa_eula_opp_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_gsa_eula_opp_en.md new file mode 100644 index 000000000000..09fec6f8af0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_gsa_eula_opp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_gsa_eula_opp DistilBertForSequenceClassification from adelevie +author: John Snow Labs +name: distilbert_gsa_eula_opp +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_gsa_eula_opp` is an English model originally trained by adelevie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_gsa_eula_opp_en_5.2.0_3.0_1700384717029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_gsa_eula_opp_en_5.2.0_3.0_1700384717029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_gsa_eula_opp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_gsa_eula_opp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_gsa_eula_opp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/adelevie/distilbert-gsa-eula-opp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_icekingbing_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_icekingbing_en.md new file mode 100644 index 000000000000..33ff2253a3c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_icekingbing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_icekingbing DistilBertForSequenceClassification from IceKingBing +author: John Snow Labs +name: distilbert_imdb_icekingbing +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_icekingbing` is an English model originally trained by IceKingBing. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_icekingbing_en_5.2.0_3.0_1700383911098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_icekingbing_en_5.2.0_3.0_1700383911098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_icekingbing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_icekingbing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_icekingbing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/IceKingBing/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_jzonthemtn_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_jzonthemtn_en.md new file mode 100644 index 000000000000..03685dd1ee22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_jzonthemtn_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_jzonthemtn DistilBertForSequenceClassification from jzonthemtn +author: John Snow Labs +name: distilbert_imdb_jzonthemtn +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_jzonthemtn` is an English model originally trained by jzonthemtn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_jzonthemtn_en_5.2.0_3.0_1700375815982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_jzonthemtn_en_5.2.0_3.0_1700375815982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_jzonthemtn","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_jzonthemtn","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_jzonthemtn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jzonthemtn/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_pos_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_pos_en.md new file mode 100644 index 000000000000..a4a1a675ad75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_pos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_pos DistilBertForSequenceClassification from ScandinavianMrT +author: John Snow Labs +name: distilbert_imdb_pos +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_pos` is an English model originally trained by ScandinavianMrT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_pos_en_5.2.0_3.0_1700377197022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_pos_en_5.2.0_3.0_1700377197022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_pos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_pos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_pos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ScandinavianMrT/distilbert-IMDB-POS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_xianzhew_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_xianzhew_en.md new file mode 100644 index 000000000000..4dbc25850fdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_imdb_xianzhew_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_xianzhew DistilBertForSequenceClassification from xianzhew +author: John Snow Labs +name: distilbert_imdb_xianzhew +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_xianzhew` is an English model originally trained by xianzhew. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_xianzhew_en_5.2.0_3.0_1700402177182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_xianzhew_en_5.2.0_3.0_1700402177182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_xianzhew","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_xianzhew","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_xianzhew| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/xianzhew/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_joke_detector_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_joke_detector_en.md new file mode 100644 index 000000000000..a296a12279df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_joke_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_joke_detector DistilBertForSequenceClassification from Reggie +author: John Snow Labs +name: distilbert_joke_detector +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_joke_detector` is an English model originally trained by Reggie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_joke_detector_en_5.2.0_3.0_1700380573598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_joke_detector_en_5.2.0_3.0_1700380573598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_joke_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_joke_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_joke_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Reggie/distilbert-joke_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_media_bias_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_media_bias_en.md new file mode 100644 index 000000000000..2dbd65762e7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_media_bias_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_media_bias DistilBertForSequenceClassification from rinapch +author: John Snow Labs +name: distilbert_media_bias +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_media_bias` is an English model originally trained by rinapch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_media_bias_en_5.2.0_3.0_1700384844109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_media_bias_en_5.2.0_3.0_1700384844109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_media_bias","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_media_bias","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_media_bias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rinapch/distilbert-media-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_finetuned_sentiment_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_finetuned_sentiment_xx.md new file mode 100644 index 000000000000..7270afb22399 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_finetuned_sentiment_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_finetuned_sentiment DistilBertForSequenceClassification from mgb-dx-meetup +author: John Snow Labs +name: distilbert_multilingual_finetuned_sentiment +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_finetuned_sentiment` is a multilingual model originally trained by mgb-dx-meetup.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_finetuned_sentiment_xx_5.2.0_3.0_1700355766126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_finetuned_sentiment_xx_5.2.0_3.0_1700355766126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_finetuned_sentiment","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_finetuned_sentiment","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_finetuned_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/mgb-dx-meetup/distilbert-multilingual-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_uncased_gpt_mar_13_epoch_1_xx.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_uncased_gpt_mar_13_epoch_1_xx.md new file mode 100644 index 000000000000..37ded79b68d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_multilingual_uncased_gpt_mar_13_epoch_1_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_uncased_gpt_mar_13_epoch_1 DistilBertForSequenceClassification from SmilestheSad +author: John Snow Labs +name: distilbert_multilingual_uncased_gpt_mar_13_epoch_1 +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_uncased_gpt_mar_13_epoch_1` is a multilingual model originally trained by SmilestheSad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_gpt_mar_13_epoch_1_xx_5.2.0_3.0_1700412237285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_gpt_mar_13_epoch_1_xx_5.2.0_3.0_1700412237285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_gpt_mar_13_epoch_1","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_gpt_mar_13_epoch_1","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_uncased_gpt_mar_13_epoch_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/SmilestheSad/distilbert-multilingual-uncased-gpt-mar-13-epoch-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_optimised_finetuned_financial_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_optimised_finetuned_financial_sentiment_en.md new file mode 100644 index 000000000000..39974dac6d0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_optimised_finetuned_financial_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_optimised_finetuned_financial_sentiment DistilBertForSequenceClassification from hazrulakmal +author: John Snow Labs +name: distilbert_optimised_finetuned_financial_sentiment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_optimised_finetuned_financial_sentiment` is an English model originally trained by hazrulakmal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_optimised_finetuned_financial_sentiment_en_5.2.0_3.0_1700352267401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_optimised_finetuned_financial_sentiment_en_5.2.0_3.0_1700352267401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_optimised_finetuned_financial_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_optimised_finetuned_financial_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_optimised_finetuned_financial_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hazrulakmal/distilbert-optimised-finetuned-financial-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_persian_farsi_description_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_persian_farsi_description_classifier_en.md new file mode 100644 index 000000000000..6861452d12a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_persian_farsi_description_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_persian_farsi_description_classifier DistilBertForSequenceClassification from mohsenfayyaz +author: John Snow Labs +name: distilbert_persian_farsi_description_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_persian_farsi_description_classifier` is an English model originally trained by mohsenfayyaz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_description_classifier_en_5.2.0_3.0_1700390580347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_description_classifier_en_5.2.0_3.0_1700390580347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_persian_farsi_description_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_persian_farsi_description_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_persian_farsi_description_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|284.5 MB| + +## References + +https://huggingface.co/mohsenfayyaz/distilbert-fa-description-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_poem_key_words_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_poem_key_words_en.md new file mode 100644 index 000000000000..f2b1a142b4e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_poem_key_words_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_poem_key_words DistilBertForSequenceClassification from guhuawuli +author: John Snow Labs +name: distilbert_poem_key_words +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_poem_key_words` is an English model originally trained by guhuawuli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_poem_key_words_en_5.2.0_3.0_1700354600929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_poem_key_words_en_5.2.0_3.0_1700354600929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_poem_key_words","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_poem_key_words","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_poem_key_words| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/guhuawuli/distilbert-poem_key_words \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_reviews_with_context_drift_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_reviews_with_context_drift_en.md new file mode 100644 index 000000000000..7526afdabf72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_reviews_with_context_drift_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_reviews_with_context_drift DistilBertForSequenceClassification from arize-ai +author: John Snow Labs +name: distilbert_reviews_with_context_drift +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_reviews_with_context_drift` is an English model originally trained by arize-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_reviews_with_context_drift_en_5.2.0_3.0_1700411528009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_reviews_with_context_drift_en_5.2.0_3.0_1700411528009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_reviews_with_context_drift","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_reviews_with_context_drift","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_reviews_with_context_drift| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/arize-ai/distilbert_reviews_with_context_drift \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_rotten_tomatoes_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_rotten_tomatoes_sentiment_classifier_en.md new file mode 100644 index 000000000000..e039f4aebafb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_rotten_tomatoes_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_rotten_tomatoes_sentiment_classifier DistilBertForSequenceClassification from RJZauner +author: John Snow Labs +name: distilbert_rotten_tomatoes_sentiment_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_rotten_tomatoes_sentiment_classifier` is an English model originally trained by RJZauner.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_rotten_tomatoes_sentiment_classifier_en_5.2.0_3.0_1700357473479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_rotten_tomatoes_sentiment_classifier_en_5.2.0_3.0_1700357473479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_rotten_tomatoes_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_rotten_tomatoes_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_rotten_tomatoes_sentiment_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/RJZauner/distilbert_rotten_tomatoes_sentiment_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_sagemaker_1609802168_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sagemaker_1609802168_en.md new file mode 100644 index 000000000000..231c0eb81faf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sagemaker_1609802168_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sagemaker_1609802168 DistilBertForSequenceClassification from julien-c +author: John Snow Labs +name: distilbert_sagemaker_1609802168 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sagemaker_1609802168` is an English model originally trained by julien-c. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sagemaker_1609802168_en_5.2.0_3.0_1700392543753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sagemaker_1609802168_en_5.2.0_3.0_1700392543753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sagemaker_1609802168","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sagemaker_1609802168","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sagemaker_1609802168| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/julien-c/distilbert-sagemaker-1609802168 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentiment_analysis_en.md new file mode 100644 index 000000000000..e57bef4ac797 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sentiment_analysis DistilBertForSequenceClassification from DracoHugging +author: John Snow Labs +name: distilbert_sentiment_analysis +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentiment_analysis` is an English model originally trained by DracoHugging. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_analysis_en_5.2.0_3.0_1700417039502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_analysis_en_5.2.0_3.0_1700417039502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DracoHugging/Distilbert-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentimentanalysis_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentimentanalysis_en.md new file mode 100644 index 000000000000..1189856b2aca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sentimentanalysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sentimentanalysis DistilBertForSequenceClassification from annamalai-s +author: John Snow Labs +name: distilbert_sentimentanalysis +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentimentanalysis` is an English model originally trained by annamalai-s. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentimentanalysis_en_5.2.0_3.0_1700430168746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentimentanalysis_en_5.2.0_3.0_1700430168746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentimentanalysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentimentanalysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentimentanalysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/annamalai-s/distilbert-sentimentanalysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1_en.md new file mode 100644 index 000000000000..464460c0e253 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1 DistilBertForSequenceClassification from echarlaix +author: John Snow Labs +name: distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1` is an English model originally trained by echarlaix.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1_en_5.2.0_3.0_1700372035344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1_en_5.2.0_3.0_1700372035344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sst2_indic_languages_dynamic_quantization_magnitude_pruning_0_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|243.4 MB| + +## References + +https://huggingface.co/echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_runglue_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_runglue_en.md new file mode 100644 index 000000000000..e9f6dd825736 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_sst2_runglue_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sst2_runglue DistilBertForSequenceClassification from neal49 +author: John Snow Labs +name: distilbert_sst2_runglue +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sst2_runglue` is an English model originally trained by neal49. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sst2_runglue_en_5.2.0_3.0_1700383908953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sst2_runglue_en_5.2.0_3.0_1700383908953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_runglue","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sst2_runglue","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sst2_runglue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/neal49/distilbert-sst2-runglue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_suicidal_content_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_suicidal_content_reviews_en.md new file mode 100644 index 000000000000..0548481276c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_suicidal_content_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_suicidal_content_reviews DistilBertForSequenceClassification from Prashant-karwasra +author: John Snow Labs +name: distilbert_suicidal_content_reviews +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_suicidal_content_reviews` is an English model originally trained by Prashant-karwasra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_suicidal_content_reviews_en_5.2.0_3.0_1700355548466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_suicidal_content_reviews_en_5.2.0_3.0_1700355548466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_suicidal_content_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_suicidal_content_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_suicidal_content_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/Prashant-karwasra/DistilBert-suicidal-content-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_summarization_reward_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_summarization_reward_model_en.md new file mode 100644 index 000000000000..868a5ca18f70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_summarization_reward_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_summarization_reward_model DistilBertForSequenceClassification from Tristan +author: John Snow Labs +name: distilbert_summarization_reward_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_summarization_reward_model` is an English model originally trained by Tristan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_summarization_reward_model_en_5.2.0_3.0_1700387868440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_summarization_reward_model_en_5.2.0_3.0_1700387868440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_summarization_reward_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_summarization_reward_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_summarization_reward_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tristan/distilbert_summarization_reward_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_demo_01_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_demo_01_en.md new file mode 100644 index 000000000000..4fc0f502d767 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_demo_01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_toxic_classifier_demo_01 DistilBertForSequenceClassification from ravi2k1 +author: John Snow Labs +name: distilbert_toxic_classifier_demo_01 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_toxic_classifier_demo_01` is an English model originally trained by ravi2k1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_toxic_classifier_demo_01_en_5.2.0_3.0_1700396318171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_toxic_classifier_demo_01_en_5.2.0_3.0_1700396318171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_classifier_demo_01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_classifier_demo_01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_toxic_classifier_demo_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ravi2k1/distilbert-toxic-classifier-demo-01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_profoz_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_profoz_en.md new file mode 100644 index 000000000000..b40db2955189 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_classifier_profoz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_toxic_classifier_profoz DistilBertForSequenceClassification from profoz +author: John Snow Labs +name: distilbert_toxic_classifier_profoz +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_toxic_classifier_profoz` is an English model originally trained by profoz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_toxic_classifier_profoz_en_5.2.0_3.0_1700413214549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_toxic_classifier_profoz_en_5.2.0_3.0_1700413214549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_classifier_profoz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_classifier_profoz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_toxic_classifier_profoz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/profoz/distilbert-toxic-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_deepbiz_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_deepbiz_en.md new file mode 100644 index 000000000000..a52cbcf02832 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_toxic_deepbiz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_toxic_deepbiz DistilBertForSequenceClassification from deepBiz +author: John Snow Labs +name: distilbert_toxic_deepbiz +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_toxic_deepbiz` is an English model originally trained by deepBiz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_toxic_deepbiz_en_5.2.0_3.0_1700420900407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_toxic_deepbiz_en_5.2.0_3.0_1700420900407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_deepbiz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_toxic_deepbiz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_toxic_deepbiz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/deepBiz/distilbert-toxic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_undersampled_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_undersampled_en.md new file mode 100644 index 000000000000..01d85ad28a9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_undersampled_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_undersampled DistilBertForSequenceClassification from Kayvane +author: John Snow Labs +name: distilbert_undersampled +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_undersampled` is an English model originally trained by Kayvane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_undersampled_en_5.2.0_3.0_1700352263635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_undersampled_en_5.2.0_3.0_1700352263635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_undersampled","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_undersampled","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_undersampled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/Kayvane/distilbert-undersampled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilbert_yes_no_intent_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilbert_yes_no_intent_en.md new file mode 100644 index 000000000000..8bf359a7bd59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilbert_yes_no_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_yes_no_intent DistilBertForSequenceClassification from sachin19566 +author: John Snow Labs +name: distilbert_yes_no_intent +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_yes_no_intent` is an English model originally trained by sachin19566. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_yes_no_intent_en_5.2.0_3.0_1700379704442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_yes_no_intent_en_5.2.0_3.0_1700379704442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_yes_no_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_yes_no_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_yes_no_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sachin19566/distilbert_Yes_No_Intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilgender_spanish_2m_emptor_es.md b/docs/_posts/ahmedlone127/2023-11-19-distilgender_spanish_2m_emptor_es.md new file mode 100644 index 000000000000..46b82e1a28eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilgender_spanish_2m_emptor_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish distilgender_spanish_2m_emptor DistilBertForSequenceClassification from emptor +author: John Snow Labs +name: distilgender_spanish_2m_emptor +date: 2023-11-19 +tags: [bert, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilgender_spanish_2m_emptor` is a Castilian, Spanish model originally trained by emptor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilgender_spanish_2m_emptor_es_5.2.0_3.0_1700381416474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilgender_spanish_2m_emptor_es_5.2.0_3.0_1700381416474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilgender_spanish_2m_emptor","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilgender_spanish_2m_emptor","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilgender_spanish_2m_emptor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|249.5 MB| + +## References + +https://huggingface.co/emptor/distilgender-es-2M \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distill_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-distill_model_en.md new file mode 100644 index 000000000000..0d6fe2c239cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distill_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distill_model DistilBertForSequenceClassification from stevendee5 +author: John Snow Labs +name: distill_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distill_model` is an English model originally trained by stevendee5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distill_model_en_5.2.0_3.0_1700368106400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distill_model_en_5.2.0_3.0_1700368106400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distill_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distill_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distill_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/stevendee5/distill-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_movie_genre_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_movie_genre_en.md new file mode 100644 index 000000000000..9dec730fd2fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_movie_genre_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_110_uncased_movie_genre DistilBertForSequenceClassification from Tejas3 +author: John Snow Labs +name: distillbert_110_uncased_movie_genre +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_110_uncased_movie_genre` is an English model originally trained by Tejas3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_110_uncased_movie_genre_en_5.2.0_3.0_1700355786361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_110_uncased_movie_genre_en_5.2.0_3.0_1700355786361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_110_uncased_movie_genre","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_110_uncased_movie_genre","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_110_uncased_movie_genre| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tejas3/distillbert_110_uncased_movie_genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_v1_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_v1_en.md new file mode 100644 index 000000000000..4edf7f53df0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_110_uncased_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_110_uncased_v1 DistilBertForSequenceClassification from Tejas3 +author: John Snow Labs +name: distillbert_110_uncased_v1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_110_uncased_v1` is an English model originally trained by Tejas3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_110_uncased_v1_en_5.2.0_3.0_1700395390615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_110_uncased_v1_en_5.2.0_3.0_1700395390615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_110_uncased_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_110_uncased_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_110_uncased_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tejas3/distillbert_110_uncased_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_all_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_all_en.md new file mode 100644 index 000000000000..a4b25dc9869b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_base_uncased_80_all DistilBertForSequenceClassification from Tejas3 +author: John Snow Labs +name: distillbert_base_uncased_80_all +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_80_all` is an English model originally trained by Tejas3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_all_en_5.2.0_3.0_1700385864161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_all_en_5.2.0_3.0_1700385864161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_80_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tejas3/distillbert_base_uncased_80_all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_en.md new file mode 100644 index 000000000000..33ac24c2c6bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_80_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_base_uncased_80 DistilBertForSequenceClassification from Tejas3 +author: John Snow Labs +name: distillbert_base_uncased_80 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_80` is an English model originally trained by Tejas3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_en_5.2.0_3.0_1700357046237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_80_en_5.2.0_3.0_1700357046237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_80","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_80| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Tejas3/distillbert_base_uncased_80 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_finetuned_clinc_adsjklfsd_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_finetuned_clinc_adsjklfsd_en.md new file mode 100644 index 000000000000..abc11d4f1bcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_base_uncased_finetuned_clinc_adsjklfsd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_base_uncased_finetuned_clinc_adsjklfsd DistilBertForSequenceClassification from adsjklfsd +author: John Snow Labs +name: distillbert_base_uncased_finetuned_clinc_adsjklfsd +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_finetuned_clinc_adsjklfsd` is an English model originally trained by adsjklfsd.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_finetuned_clinc_adsjklfsd_en_5.2.0_3.0_1700403452972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_finetuned_clinc_adsjklfsd_en_5.2.0_3.0_1700403452972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_finetuned_clinc_adsjklfsd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_finetuned_clinc_adsjklfsd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_finetuned_clinc_adsjklfsd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/adsjklfsd/distillbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distillbert_finetuned_indonlusmsa_en.md b/docs/_posts/ahmedlone127/2023-11-19-distillbert_finetuned_indonlusmsa_en.md new file mode 100644 index 000000000000..799d23443ff1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distillbert_finetuned_indonlusmsa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_finetuned_indonlusmsa DistilBertForSequenceClassification from mkhairil +author: John Snow Labs +name: distillbert_finetuned_indonlusmsa +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_finetuned_indonlusmsa` is an English model originally trained by mkhairil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_finetuned_indonlusmsa_en_5.2.0_3.0_1700403100657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_finetuned_indonlusmsa_en_5.2.0_3.0_1700403100657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_finetuned_indonlusmsa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_finetuned_indonlusmsa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_finetuned_indonlusmsa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mkhairil/distillbert-finetuned-indonlusmsa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01_en.md new file mode 100644 index 000000000000..98a1715b3746 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01 DistilBertForSequenceClassification from karolill +author: John Snow Labs +name: distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01` is an English model originally trained by karolill.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01_en_5.2.0_3.0_1700374833759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01_en_5.2.0_3.0_1700374833759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilmbert_lr3e_05_wr0_1_optimadamw_hf_wd0_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/karolill/distilmbert_LR3e-05_WR0.1_OPTIMadamw_hf_WD0.01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier_en.md new file mode 100644 index 000000000000..fc1e83593a4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier DistilBertForSequenceClassification from mmillet +author: John Snow Labs +name: distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier` is an English model originally trained by mmillet.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier_en_5.2.0_3.0_1700412237058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier_en_5.2.0_3.0_1700412237058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilrubert_tiny_cased_conversational_v1_finetuned_empathy_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.2 MB| + +## References + +https://huggingface.co/mmillet/distilrubert-tiny-cased-conversational-v1_finetuned_empathy_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier_en.md new file mode 100644 index 000000000000..9973ffbd5f2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier DistilBertForSequenceClassification from mmillet +author: John Snow Labs +name: distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier` is an English model originally trained by mmillet.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier_en_5.2.0_3.0_1700392221437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier_en_5.2.0_3.0_1700392221437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilrubert_tiny_cased_conversational_v1_single_finetuned_empathy_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.2 MB| + +## References + +https://huggingface.co/mmillet/distilrubert-tiny-cased-conversational-v1_single_finetuned_empathy_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-duplicatesunique_en.md b/docs/_posts/ahmedlone127/2023-11-19-duplicatesunique_en.md new file mode 100644 index 000000000000..a2ad353e476c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-duplicatesunique_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English duplicatesunique DistilBertForSequenceClassification from Kamer +author: John Snow Labs +name: duplicatesunique +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `duplicatesunique` is an English model originally trained by Kamer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/duplicatesunique_en_5.2.0_3.0_1700409930111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/duplicatesunique_en_5.2.0_3.0_1700409930111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("duplicatesunique","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("duplicatesunique","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|duplicatesunique| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Kamer/DuplicatesUnique \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-edos_2023_baseline_distilbert_base_uncased_label_sexist_en.md b/docs/_posts/ahmedlone127/2023-11-19-edos_2023_baseline_distilbert_base_uncased_label_sexist_en.md new file mode 100644 index 000000000000..03d284622e17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-edos_2023_baseline_distilbert_base_uncased_label_sexist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English edos_2023_baseline_distilbert_base_uncased_label_sexist DistilBertForSequenceClassification from lct-rug-2022 +author: John Snow Labs +name: edos_2023_baseline_distilbert_base_uncased_label_sexist +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `edos_2023_baseline_distilbert_base_uncased_label_sexist` is an English model originally trained by lct-rug-2022.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_distilbert_base_uncased_label_sexist_en_5.2.0_3.0_1700352883115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_distilbert_base_uncased_label_sexist_en_5.2.0_3.0_1700352883115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("edos_2023_baseline_distilbert_base_uncased_label_sexist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("edos_2023_baseline_distilbert_base_uncased_label_sexist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|edos_2023_baseline_distilbert_base_uncased_label_sexist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|157.8 MB| + +## References + +https://huggingface.co/lct-rug-2022/edos-2023-baseline-distilbert-base-uncased-label_sexist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-emotion_analysis_of_twitter_comments_en.md b/docs/_posts/ahmedlone127/2023-11-19-emotion_analysis_of_twitter_comments_en.md new file mode 100644 index 000000000000..c386cb28c676 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-emotion_analysis_of_twitter_comments_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_analysis_of_twitter_comments DistilBertForSequenceClassification from shubhambawiskar +author: John Snow Labs +name: emotion_analysis_of_twitter_comments +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_analysis_of_twitter_comments` is an English model originally trained by shubhambawiskar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_analysis_of_twitter_comments_en_5.2.0_3.0_1700353205382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_analysis_of_twitter_comments_en_5.2.0_3.0_1700353205382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_analysis_of_twitter_comments","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_analysis_of_twitter_comments","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_analysis_of_twitter_comments| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/shubhambawiskar/Emotion_Analysis_of_Twitter_Comments \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-emotion_detection_conversations_en.md b/docs/_posts/ahmedlone127/2023-11-19-emotion_detection_conversations_en.md new file mode 100644 index 000000000000..b02a23778df4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-emotion_detection_conversations_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_detection_conversations DistilBertForSequenceClassification from Deysi +author: John Snow Labs +name: emotion_detection_conversations +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_detection_conversations` is an English model originally trained by Deysi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_detection_conversations_en_5.2.0_3.0_1700418166818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_detection_conversations_en_5.2.0_3.0_1700418166818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_detection_conversations","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_detection_conversations","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_detection_conversations| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Deysi/emotion_detection_conversations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-emotion_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-emotion_distilbert_en.md new file mode 100644 index 000000000000..876da0b72b15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-emotion_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_distilbert DistilBertForSequenceClassification from sbenel +author: John Snow Labs +name: emotion_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_distilbert` is an English model originally trained by sbenel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_distilbert_en_5.2.0_3.0_1700386430041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_distilbert_en_5.2.0_3.0_1700386430041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emotion_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sbenel/emotion-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-emtract_distilbert_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-19-emtract_distilbert_base_uncased_emotion_en.md new file mode 100644 index 000000000000..5ff4e4262adb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-emtract_distilbert_base_uncased_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emtract_distilbert_base_uncased_emotion DistilBertForSequenceClassification from vamossyd +author: John Snow Labs +name: emtract_distilbert_base_uncased_emotion +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emtract_distilbert_base_uncased_emotion` is an English model originally trained by vamossyd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emtract_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700354415893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emtract_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700354415893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("emtract_distilbert_base_uncased_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("emtract_distilbert_base_uncased_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emtract_distilbert_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|263.4 MB| + +## References + +https://huggingface.co/vamossyd/emtract-distilbert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-english_grammar_en.md b/docs/_posts/ahmedlone127/2023-11-19-english_grammar_en.md new file mode 100644 index 000000000000..1178fd8eb02f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-english_grammar_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English english_grammar DistilBertForSequenceClassification from adgw +author: John Snow Labs +name: english_grammar +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `english_grammar` is an English model originally trained by adgw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/english_grammar_en_5.2.0_3.0_1700354757228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/english_grammar_en_5.2.0_3.0_1700354757228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("english_grammar","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("english_grammar","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|english_grammar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/adgw/english_grammar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-enron_spam_checker_10000_en.md b/docs/_posts/ahmedlone127/2023-11-19-enron_spam_checker_10000_en.md new file mode 100644 index 000000000000..48465efeb3a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-enron_spam_checker_10000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English enron_spam_checker_10000 DistilBertForSequenceClassification from CalamitousVisibility +author: John Snow Labs +name: enron_spam_checker_10000 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `enron_spam_checker_10000` is an English model originally trained by CalamitousVisibility. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/enron_spam_checker_10000_en_5.2.0_3.0_1700433302352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/enron_spam_checker_10000_en_5.2.0_3.0_1700433302352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("enron_spam_checker_10000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("enron_spam_checker_10000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|enron_spam_checker_10000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/CalamitousVisibility/enron-spam-checker-10000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-ethnicity_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-ethnicity_classification_en.md new file mode 100644 index 000000000000..2f73e7ba808b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-ethnicity_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ethnicity_classification DistilBertForSequenceClassification from padmajabfrl +author: John Snow Labs +name: ethnicity_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ethnicity_classification` is an English model originally trained by padmajabfrl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ethnicity_classification_en_5.2.0_3.0_1700354668601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ethnicity_classification_en_5.2.0_3.0_1700354668601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ethnicity_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ethnicity_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ethnicity_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/padmajabfrl/Ethnicity-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-ethnicity_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-ethnicity_model_en.md new file mode 100644 index 000000000000..6bfe58b692cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-ethnicity_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ethnicity_model DistilBertForSequenceClassification from padmajabfrl +author: John Snow Labs +name: ethnicity_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ethnicity_model` is an English model originally trained by padmajabfrl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ethnicity_model_en_5.2.0_3.0_1700377779740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ethnicity_model_en_5.2.0_3.0_1700377779740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ethnicity_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ethnicity_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ethnicity_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/padmajabfrl/Ethnicity-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fakenewsclassifierdistilbert_cased_en.md b/docs/_posts/ahmedlone127/2023-11-19-fakenewsclassifierdistilbert_cased_en.md new file mode 100644 index 000000000000..52617ad61bf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fakenewsclassifierdistilbert_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fakenewsclassifierdistilbert_cased DistilBertForSequenceClassification from caballeroch +author: John Snow Labs +name: fakenewsclassifierdistilbert_cased +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenewsclassifierdistilbert_cased` is an English model originally trained by caballeroch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenewsclassifierdistilbert_cased_en_5.2.0_3.0_1700364797316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenewsclassifierdistilbert_cased_en_5.2.0_3.0_1700364797316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fakenewsclassifierdistilbert_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fakenewsclassifierdistilbert_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenewsclassifierdistilbert_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/caballeroch/FakeNewsClassifierDistilBert-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fanfics_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-fanfics_classification_en.md new file mode 100644 index 000000000000..99755da683a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fanfics_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fanfics_classification DistilBertForSequenceClassification from Helwyn +author: John Snow Labs +name: fanfics_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fanfics_classification` is an English model originally trained by Helwyn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fanfics_classification_en_5.2.0_3.0_1700414792225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fanfics_classification_en_5.2.0_3.0_1700414792225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fanfics_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fanfics_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fanfics_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Helwyn/Fanfics_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fatimah_fake_news_bert_en.md b/docs/_posts/ahmedlone127/2023-11-19-fatimah_fake_news_bert_en.md new file mode 100644 index 000000000000..d7e9cd1cb204 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fatimah_fake_news_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fatimah_fake_news_bert DistilBertForSequenceClassification from yinde +author: John Snow Labs +name: fatimah_fake_news_bert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fatimah_fake_news_bert` is an English model originally trained by yinde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fatimah_fake_news_bert_en_5.2.0_3.0_1700397230803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fatimah_fake_news_bert_en_5.2.0_3.0_1700397230803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fatimah_fake_news_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fatimah_fake_news_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fatimah_fake_news_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/yinde/fatimah_fake_news_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fear_mongering_detection_en.md b/docs/_posts/ahmedlone127/2023-11-19-fear_mongering_detection_en.md new file mode 100644 index 000000000000..23d3ab3d5155 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fear_mongering_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fear_mongering_detection DistilBertForSequenceClassification from Falconsai +author: John Snow Labs +name: fear_mongering_detection +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fear_mongering_detection` is an English model originally trained by Falconsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fear_mongering_detection_en_5.2.0_3.0_1700354918278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fear_mongering_detection_en_5.2.0_3.0_1700354918278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fear_mongering_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fear_mongering_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fear_mongering_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Falconsai/fear_mongering_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fin_sentiment_rajistics_en.md b/docs/_posts/ahmedlone127/2023-11-19-fin_sentiment_rajistics_en.md new file mode 100644 index 000000000000..96026e407714 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fin_sentiment_rajistics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fin_sentiment_rajistics DistilBertForSequenceClassification from rajistics +author: John Snow Labs +name: fin_sentiment_rajistics +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fin_sentiment_rajistics` is an English model originally trained by rajistics. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fin_sentiment_rajistics_en_5.2.0_3.0_1700434477813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fin_sentiment_rajistics_en_5.2.0_3.0_1700434477813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fin_sentiment_rajistics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fin_sentiment_rajistics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fin_sentiment_rajistics| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rajistics/fin_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-financial_twitter_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-financial_twitter_sentiment_model_en.md new file mode 100644 index 000000000000..90455bc950c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-financial_twitter_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English financial_twitter_sentiment_model DistilBertForSequenceClassification from HugMaik +author: John Snow Labs +name: financial_twitter_sentiment_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `financial_twitter_sentiment_model` is an English model originally trained by HugMaik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/financial_twitter_sentiment_model_en_5.2.0_3.0_1700428505169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/financial_twitter_sentiment_model_en_5.2.0_3.0_1700428505169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("financial_twitter_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("financial_twitter_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|financial_twitter_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/HugMaik/financial-twitter-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-fine_tuned_fake_news_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-fine_tuned_fake_news_classifier_en.md new file mode 100644 index 000000000000..fb087b054a53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-fine_tuned_fake_news_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fine_tuned_fake_news_classifier DistilBertForSequenceClassification from koliskos +author: John Snow Labs +name: fine_tuned_fake_news_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_fake_news_classifier` is an English model originally trained by koliskos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_fake_news_classifier_en_5.2.0_3.0_1700399210519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_fake_news_classifier_en_5.2.0_3.0_1700399210519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_fake_news_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fine_tuned_fake_news_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_fake_news_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/koliskos/fine_tuned_fake_news_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_adult_content_detection_nicolaskaram_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_adult_content_detection_nicolaskaram_en.md new file mode 100644 index 000000000000..6928904adc22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_adult_content_detection_nicolaskaram_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_adult_content_detection_nicolaskaram DistilBertForSequenceClassification from NicolasKaram +author: John Snow Labs +name: finetuned_distilbert_adult_content_detection_nicolaskaram +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_adult_content_detection_nicolaskaram` is an English model originally trained by NicolasKaram.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_adult_content_detection_nicolaskaram_en_5.2.0_3.0_1700413753995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_adult_content_detection_nicolaskaram_en_5.2.0_3.0_1700413753995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_adult_content_detection_nicolaskaram","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_adult_content_detection_nicolaskaram","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_adult_content_detection_nicolaskaram| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/NicolasKaram/finetuned-distilbert-adult-content-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_model_flokabukie_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_model_flokabukie_en.md new file mode 100644 index 000000000000..1874da3f4cba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_model_flokabukie_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_base_model_flokabukie DistilBertForSequenceClassification from flokabukie +author: John Snow Labs +name: finetuned_distilbert_base_model_flokabukie +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_base_model_flokabukie` is an English model originally trained by flokabukie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_model_flokabukie_en_5.2.0_3.0_1700353039494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_model_flokabukie_en_5.2.0_3.0_1700353039494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_model_flokabukie","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_model_flokabukie","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_base_model_flokabukie| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/flokabukie/Finetuned-Distilbert-base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_emotion_en.md new file mode 100644 index 000000000000..aa849dc9c6b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_base_uncased_emotion DistilBertForSequenceClassification from 02shanky +author: John Snow Labs +name: finetuned_distilbert_base_uncased_emotion +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_base_uncased_emotion` is an English model originally trained by 02shanky.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700398451710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_uncased_emotion_en_5.2.0_3.0_1700398451710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_uncased_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_uncased_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/02shanky/finetuned-distilbert-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_logiczmaksimka_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_logiczmaksimka_en.md new file mode 100644 index 000000000000..3ae4ee3add86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_base_uncased_logiczmaksimka_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_base_uncased_logiczmaksimka DistilBertForSequenceClassification from logiczmaksimka +author: John Snow Labs +name: finetuned_distilbert_base_uncased_logiczmaksimka +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_base_uncased_logiczmaksimka` is an English model originally trained by logiczmaksimka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_uncased_logiczmaksimka_en_5.2.0_3.0_1700355222309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_base_uncased_logiczmaksimka_en_5.2.0_3.0_1700355222309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_uncased_logiczmaksimka","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_base_uncased_logiczmaksimka","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_base_uncased_logiczmaksimka| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/logiczmaksimka/finetuned_distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_multi_label_emotion_6_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_multi_label_emotion_6_en.md new file mode 100644 index 000000000000..996cc81d6f7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_distilbert_multi_label_emotion_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_multi_label_emotion_6 DistilBertForSequenceClassification from abdulmatinomotoso +author: John Snow Labs +name: finetuned_distilbert_multi_label_emotion_6 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_multi_label_emotion_6` is an English model originally trained by abdulmatinomotoso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_6_en_5.2.0_3.0_1700385771385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_6_en_5.2.0_3.0_1700385771385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_multi_label_emotion_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abdulmatinomotoso/finetuned-distilbert-multi-label-emotion_6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40_en.md new file mode 100644 index 000000000000..986e2b67c146 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40_en_5.2.0_3.0_1700379705251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40_en_5.2.0_3.0_1700379705251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr0_2e_05_essays_01_03_2022_13_20_40| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr0_2e-05_essays_01_03_2022-13_20_40 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentiment_analysis_model_3000_samples_base_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentiment_analysis_model_3000_samples_base_distilbert_en.md new file mode 100644 index 000000000000..1062ddcc25cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuned_sentiment_analysis_model_3000_samples_base_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentiment_analysis_model_3000_samples_base_distilbert DistilBertForSequenceClassification from Godspower +author: John Snow Labs +name: finetuned_sentiment_analysis_model_3000_samples_base_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_analysis_model_3000_samples_base_distilbert` is an English model originally trained by Godspower.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_analysis_model_3000_samples_base_distilbert_en_5.2.0_3.0_1700355500761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_analysis_model_3000_samples_base_distilbert_en_5.2.0_3.0_1700355500761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_analysis_model_3000_samples_base_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentiment_analysis_model_3000_samples_base_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentiment_analysis_model_3000_samples_base_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Godspower/finetuned-sentiment-analysis-model-3000-samples-base-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_distilbert_movie_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_distilbert_movie_classification_en.md new file mode 100644 index 000000000000..b5587839e830 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_distilbert_movie_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_distilbert_movie_classification DistilBertForSequenceClassification from geniusguy777 +author: John Snow Labs +name: finetuning_distilbert_movie_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_distilbert_movie_classification` is an English model originally trained by geniusguy777.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_movie_classification_en_5.2.0_3.0_1700355396722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_movie_classification_en_5.2.0_3.0_1700355396722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_movie_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_movie_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_distilbert_movie_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/geniusguy777/finetuning-distilbert-movie-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_emotion_model_v_shukla_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_emotion_model_v_shukla_en.md new file mode 100644 index 000000000000..dff74318a135 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_emotion_model_v_shukla_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_emotion_model_v_shukla DistilBertForSequenceClassification from V-Shukla +author: John Snow Labs +name: finetuning_emotion_model_v_shukla +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_emotion_model_v_shukla` is an English model originally trained by V-Shukla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_emotion_model_v_shukla_en_5.2.0_3.0_1700394585002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_emotion_model_v_shukla_en_5.2.0_3.0_1700394585002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_emotion_model_v_shukla","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_emotion_model_v_shukla","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_emotion_model_v_shukla| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/V-Shukla/finetuning-emotion-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_movie_sentiment_model_9000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_movie_sentiment_model_9000_samples_en.md new file mode 100644 index 000000000000..34fb855442f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_movie_sentiment_model_9000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_movie_sentiment_model_9000_samples DistilBertForSequenceClassification from Manishkalra +author: John Snow Labs +name: finetuning_movie_sentiment_model_9000_samples +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_movie_sentiment_model_9000_samples` is an English model originally trained by Manishkalra.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_movie_sentiment_model_9000_samples_en_5.2.0_3.0_1700408208763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_movie_sentiment_model_9000_samples_en_5.2.0_3.0_1700408208763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_movie_sentiment_model_9000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_movie_sentiment_model_9000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_movie_sentiment_model_9000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Manishkalra/finetuning-movie-sentiment-model-9000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_reviews_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_reviews_sentiment_model_en.md new file mode 100644 index 000000000000..5e6f937c714b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_reviews_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_reviews_sentiment_model DistilBertForSequenceClassification from rbhowsden +author: John Snow Labs +name: finetuning_reviews_sentiment_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_reviews_sentiment_model` is an English model originally trained by rbhowsden. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_reviews_sentiment_model_en_5.2.0_3.0_1700403973060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_reviews_sentiment_model_en_5.2.0_3.0_1700403973060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_reviews_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_reviews_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_reviews_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rbhowsden/finetuning-reviews-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_analysis_model_3000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_analysis_model_3000_samples_en.md new file mode 100644 index 000000000000..b70dd0691fa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_analysis_model_3000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_analysis_model_3000_samples DistilBertForSequenceClassification from federicopascual +author: John Snow Labs +name: finetuning_sentiment_analysis_model_3000_samples +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_analysis_model_3000_samples` is an English model originally trained by federicopascual.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_analysis_model_3000_samples_en_5.2.0_3.0_1700404792215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_analysis_model_3000_samples_en_5.2.0_3.0_1700404792215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_analysis_model_3000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_analysis_model_3000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_analysis_model_3000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/federicopascual/finetuning-sentiment-analysis-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_10000_samples_js21_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_10000_samples_js21_en.md new file mode 100644 index 000000000000..68dc7d408cae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_10000_samples_js21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_10000_samples_js21 DistilBertForSequenceClassification from JS21 +author: John Snow Labs +name: finetuning_sentiment_model_10000_samples_js21 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_10000_samples_js21` is an English model originally trained by JS21. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_10000_samples_js21_en_5.2.0_3.0_1700355404111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_10000_samples_js21_en_5.2.0_3.0_1700355404111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_10000_samples_js21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_10000_samples_js21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_10000_samples_js21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JS21/finetuning-sentiment-model-10000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_25000_samples_youlun77_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_25000_samples_youlun77_en.md new file mode 100644 index 000000000000..59b2fff6ecf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_25000_samples_youlun77_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_25000_samples_youlun77 DistilBertForSequenceClassification from youlun77 +author: John Snow Labs +name: finetuning_sentiment_model_25000_samples_youlun77 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_25000_samples_youlun77` is an English model originally trained by youlun77. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_25000_samples_youlun77_en_5.2.0_3.0_1700428504376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_25000_samples_youlun77_en_5.2.0_3.0_1700428504376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_25000_samples_youlun77","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_25000_samples_youlun77","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_25000_samples_youlun77| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/youlun77/finetuning-sentiment-model-25000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_baxterai_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_baxterai_en.md new file mode 100644 index 000000000000..5ec80f7bd7dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_baxterai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_baxterai DistilBertForSequenceClassification from BaxterAI +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_baxterai +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_baxterai` is an English model originally trained by BaxterAI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_baxterai_en_5.2.0_3.0_1700402177170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_baxterai_en_5.2.0_3.0_1700402177170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_baxterai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_baxterai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_baxterai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/BaxterAI/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_csam_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_csam_en.md new file mode 100644 index 000000000000..a567283b2e98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_csam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_csam DistilBertForSequenceClassification from csam +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_csam +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_csam` is an English model originally trained by csam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_csam_en_5.2.0_3.0_1700438198635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_csam_en_5.2.0_3.0_1700438198635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_csam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_csam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_csam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/csam/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_daniel780_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_daniel780_en.md new file mode 100644 index 000000000000..949fb32c74d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_daniel780_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_daniel780 DistilBertForSequenceClassification from daniel780 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_daniel780 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_daniel780` is an English model originally trained by daniel780. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_daniel780_en_5.2.0_3.0_1700415303708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_daniel780_en_5.2.0_3.0_1700415303708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_daniel780","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_daniel780","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_daniel780| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/daniel780/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_duboij_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_duboij_en.md new file mode 100644 index 000000000000..8d6a974eec4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_duboij_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_duboij DistilBertForSequenceClassification from DuboiJ +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_duboij +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_duboij` is an English model originally trained by DuboiJ. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_duboij_en_5.2.0_3.0_1700359475275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_duboij_en_5.2.0_3.0_1700359475275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_duboij","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_duboij","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_duboij| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DuboiJ/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_manishkalra_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_manishkalra_en.md new file mode 100644 index 000000000000..2fac4340b87f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_manishkalra_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_manishkalra DistilBertForSequenceClassification from Manishkalra +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_manishkalra +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_manishkalra` is an English model originally trained by Manishkalra. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_manishkalra_en_5.2.0_3.0_1700386843389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_manishkalra_en_5.2.0_3.0_1700386843389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_manishkalra","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_manishkalra","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_manishkalra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Manishkalra/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_monusingh_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_monusingh_en.md new file mode 100644 index 000000000000..e99306214990 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_monusingh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_monusingh DistilBertForSequenceClassification from monusingh +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_monusingh +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_monusingh` is an English model originally trained by monusingh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_monusingh_en_5.2.0_3.0_1700420590514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_monusingh_en_5.2.0_3.0_1700420590514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_monusingh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_monusingh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_monusingh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/monusingh/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_norwegiangoat_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_norwegiangoat_en.md new file mode 100644 index 000000000000..d6e1cd1937a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_norwegiangoat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_norwegiangoat DistilBertForSequenceClassification from NorwegianGoat +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_norwegiangoat +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_norwegiangoat` is an English model originally trained by NorwegianGoat. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_norwegiangoat_en_5.2.0_3.0_1700356082492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_norwegiangoat_en_5.2.0_3.0_1700356082492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_norwegiangoat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_norwegiangoat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_norwegiangoat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/NorwegianGoat/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_palmer0_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_palmer0_en.md new file mode 100644 index 000000000000..3627d9c59ae2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_palmer0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_palmer0 DistilBertForSequenceClassification from palmer0 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_palmer0 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_palmer0` is an English model originally trained by palmer0. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_palmer0_en_5.2.0_3.0_1700363705358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_palmer0_en_5.2.0_3.0_1700363705358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_palmer0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_palmer0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_palmer0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/palmer0/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sahara_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sahara_en.md new file mode 100644 index 000000000000..2458485c202b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sahara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_sahara DistilBertForSequenceClassification from Sahara +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_sahara +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_sahara` is an English model originally trained by Sahara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sahara_en_5.2.0_3.0_1700384844075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sahara_en_5.2.0_3.0_1700384844075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sahara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sahara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_sahara| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Sahara/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sgraf202_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sgraf202_en.md new file mode 100644 index 000000000000..96e940b2266e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_sgraf202_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_sgraf202 DistilBertForSequenceClassification from sgraf202 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_sgraf202 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_sgraf202` is an English model originally trained by sgraf202.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sgraf202_en_5.2.0_3.0_1700437728440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sgraf202_en_5.2.0_3.0_1700437728440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sgraf202","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sgraf202","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_sgraf202| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sgraf202/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_snake12b_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_snake12b_en.md new file mode 100644 index 000000000000..e831f0795635 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_snake12b_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_snake12b DistilBertForSequenceClassification from Snake12b +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_snake12b +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_snake12b` is an English model originally trained by Snake12b.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_snake12b_en_5.2.0_3.0_1700401094861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_snake12b_en_5.2.0_3.0_1700401094861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_snake12b","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_snake12b","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_snake12b| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Snake12b/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_testcopy_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_testcopy_en.md new file mode 100644 index 000000000000..c89164a22ef7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_testcopy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_testcopy DistilBertForSequenceClassification from federicopascual +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_testcopy +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_testcopy` is an English model originally trained by federicopascual.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_testcopy_en_5.2.0_3.0_1700369058603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_testcopy_en_5.2.0_3.0_1700369058603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_testcopy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_testcopy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_testcopy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/federicopascual/finetuning-sentiment-model-3000-samples-testcopy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_thenacl_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_thenacl_en.md new file mode 100644 index 000000000000..f99ed323a977 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_3000_samples_thenacl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_thenacl DistilBertForSequenceClassification from TheNaCL +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_thenacl +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_thenacl` is an English model originally trained by TheNaCL.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_thenacl_en_5.2.0_3.0_1700420718141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_thenacl_en_5.2.0_3.0_1700420718141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_thenacl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_thenacl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_thenacl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/TheNaCL/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_distilbert_base_25000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_distilbert_base_25000_samples_en.md new file mode 100644 index 000000000000..fad6f0636369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_distilbert_base_25000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_distilbert_base_25000_samples DistilBertForSequenceClassification from choidf +author: John Snow Labs +name: finetuning_sentiment_model_distilbert_base_25000_samples +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_distilbert_base_25000_samples` is an English model originally trained by choidf.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_distilbert_base_25000_samples_en_5.2.0_3.0_1700380431443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_distilbert_base_25000_samples_en_5.2.0_3.0_1700380431443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_distilbert_base_25000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_distilbert_base_25000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_distilbert_base_25000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/choidf/finetuning-sentiment-model-distilbert-base-25000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_for_c2er_soumyaranjan_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_for_c2er_soumyaranjan_en.md new file mode 100644 index 000000000000..fa48633d61ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_for_c2er_soumyaranjan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_for_c2er_soumyaranjan DistilBertForSequenceClassification from soumyaranjan +author: John Snow Labs +name: finetuning_sentiment_model_for_c2er_soumyaranjan +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_for_c2er_soumyaranjan` is an English model originally trained by soumyaranjan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_for_c2er_soumyaranjan_en_5.2.0_3.0_1700356291847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_for_c2er_soumyaranjan_en_5.2.0_3.0_1700356291847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_for_c2er_soumyaranjan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_for_c2er_soumyaranjan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_for_c2er_soumyaranjan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/soumyaranjan/finetuning-sentiment-model-for-c2er \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_test_seema09_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_test_seema09_en.md new file mode 100644 index 000000000000..a49a4a03a72a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_model_test_seema09_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_test_seema09 DistilBertForSequenceClassification from Seema09 +author: John Snow Labs +name: finetuning_sentiment_model_test_seema09 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_test_seema09` is an English model originally trained by Seema09.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_test_seema09_en_5.2.0_3.0_1700369289087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_test_seema09_en_5.2.0_3.0_1700369289087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_test_seema09","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_test_seema09","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_test_seema09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Seema09/finetuning-sentiment-model-Test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_yelp_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_yelp_reviews_en.md new file mode 100644 index 000000000000..d97cc7716b2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-finetuning_sentiment_yelp_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_yelp_reviews DistilBertForSequenceClassification from rachtxxy +author: John Snow Labs +name: finetuning_sentiment_yelp_reviews +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_yelp_reviews` is an English model originally trained by rachtxxy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_yelp_reviews_en_5.2.0_3.0_1700400309459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_yelp_reviews_en_5.2.0_3.0_1700400309459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_yelp_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_yelp_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_yelp_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rachtxxy/finetuning-sentiment-yelp-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-french_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-french_model_en.md new file mode 100644 index 000000000000..c721de5ff040 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-french_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English french_model DistilBertForSequenceClassification from TheoLepere +author: John Snow Labs +name: french_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `french_model` is an English model originally trained by TheoLepere. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/french_model_en_5.2.0_3.0_1700358440219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/french_model_en_5.2.0_3.0_1700358440219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("french_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("french_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|french_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/TheoLepere/french_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-gendec_with_distilmbert_ja.md b/docs/_posts/ahmedlone127/2023-11-19-gendec_with_distilmbert_ja.md new file mode 100644 index 000000000000..e218fc801c07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-gendec_with_distilmbert_ja.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Japanese gendec_with_distilmbert DistilBertForSequenceClassification from tarudesu +author: John Snow Labs +name: gendec_with_distilmbert +date: 2023-11-19 +tags: [bert, ja, open_source, sequence_classification, onnx] +task: Text Classification +language: ja +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gendec_with_distilmbert` is a Japanese model originally trained by tarudesu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gendec_with_distilmbert_ja_5.2.0_3.0_1700371053259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gendec_with_distilmbert_ja_5.2.0_3.0_1700371053259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("gendec_with_distilmbert","ja")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("gendec_with_distilmbert","ja") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gendec_with_distilmbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ja| +|Size:|507.6 MB| + +## References + +https://huggingface.co/tarudesu/gendec-with-distilmbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-gender_en.md b/docs/_posts/ahmedlone127/2023-11-19-gender_en.md new file mode 100644 index 000000000000..fdcd8177f266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-gender_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gender DistilBertForSequenceClassification from priyabrat +author: John Snow Labs +name: gender +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gender` is an English model originally trained by priyabrat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gender_en_5.2.0_3.0_1700354759708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gender_en_5.2.0_3.0_1700354759708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("gender","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("gender","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gender| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/priyabrat/gender \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-genre_pred_model_reduced_en.md b/docs/_posts/ahmedlone127/2023-11-19-genre_pred_model_reduced_en.md new file mode 100644 index 000000000000..bf56129572f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-genre_pred_model_reduced_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English genre_pred_model_reduced DistilBertForSequenceClassification from matthiasr +author: John Snow Labs +name: genre_pred_model_reduced +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `genre_pred_model_reduced` is an English model originally trained by matthiasr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/genre_pred_model_reduced_en_5.2.0_3.0_1700405695076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/genre_pred_model_reduced_en_5.2.0_3.0_1700405695076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("genre_pred_model_reduced","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("genre_pred_model_reduced","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|genre_pred_model_reduced| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/matthiasr/genre_pred_model_reduced \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-good_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-good_sentiment_model_en.md new file mode 100644 index 000000000000..b8e559c28b29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-good_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English good_sentiment_model DistilBertForSequenceClassification from TheJournal +author: John Snow Labs +name: good_sentiment_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `good_sentiment_model` is an English model originally trained by TheJournal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/good_sentiment_model_en_5.2.0_3.0_1700433210743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/good_sentiment_model_en_5.2.0_3.0_1700433210743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("good_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("good_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|good_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/TheJournal/good_sentiment_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-green_sentiment_latest14_en.md b/docs/_posts/ahmedlone127/2023-11-19-green_sentiment_latest14_en.md new file mode 100644 index 000000000000..b6eb07636f0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-green_sentiment_latest14_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English green_sentiment_latest14 DistilBertForSequenceClassification from manjinder +author: John Snow Labs +name: green_sentiment_latest14 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `green_sentiment_latest14` is an English model originally trained by manjinder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/green_sentiment_latest14_en_5.2.0_3.0_1700410722363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/green_sentiment_latest14_en_5.2.0_3.0_1700410722363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("green_sentiment_latest14","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("green_sentiment_latest14","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|green_sentiment_latest14| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/manjinder/green_sentiment_latest14 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-guardian_news_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-19-guardian_news_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..dcdd6d6c3762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-guardian_news_distilbert_base_uncased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English guardian_news_distilbert_base_uncased DistilBertForSequenceClassification from cambridgeltl +author: John Snow Labs +name: guardian_news_distilbert_base_uncased +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `guardian_news_distilbert_base_uncased` is an English model originally trained by cambridgeltl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/guardian_news_distilbert_base_uncased_en_5.2.0_3.0_1700408485124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/guardian_news_distilbert_base_uncased_en_5.2.0_3.0_1700408485124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("guardian_news_distilbert_base_uncased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("guardian_news_distilbert_base_uncased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|guardian_news_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cambridgeltl/guardian_news_distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hate_classification_distilbert_base_multilingual_cased_sentiments_student2_xx.md b/docs/_posts/ahmedlone127/2023-11-19-hate_classification_distilbert_base_multilingual_cased_sentiments_student2_xx.md new file mode 100644 index 000000000000..e0a6bfe86234 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hate_classification_distilbert_base_multilingual_cased_sentiments_student2_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual hate_classification_distilbert_base_multilingual_cased_sentiments_student2 DistilBertForSequenceClassification from Jairnetojp +author: John Snow Labs +name: hate_classification_distilbert_base_multilingual_cased_sentiments_student2 +date: 2023-11-19 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_classification_distilbert_base_multilingual_cased_sentiments_student2` is a Multilingual model originally trained by Jairnetojp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_classification_distilbert_base_multilingual_cased_sentiments_student2_xx_5.2.0_3.0_1700354271807.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_classification_distilbert_base_multilingual_cased_sentiments_student2_xx_5.2.0_3.0_1700354271807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_classification_distilbert_base_multilingual_cased_sentiments_student2","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_classification_distilbert_base_multilingual_cased_sentiments_student2","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_classification_distilbert_base_multilingual_cased_sentiments_student2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/Jairnetojp/hate-classification-distilbert-base-multilingual-cased-sentiments-student2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hate_speech_targets_dutch_nl.md b/docs/_posts/ahmedlone127/2023-11-19-hate_speech_targets_dutch_nl.md new file mode 100644 index 000000000000..2068f4d5ae31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hate_speech_targets_dutch_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish hate_speech_targets_dutch DistilBertForSequenceClassification from IMSyPP +author: John Snow Labs +name: hate_speech_targets_dutch +date: 2023-11-19 +tags: [bert, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_targets_dutch` is a Dutch, Flemish model originally trained by IMSyPP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_targets_dutch_nl_5.2.0_3.0_1700404814290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_targets_dutch_nl_5.2.0_3.0_1700404814290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_speech_targets_dutch","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_speech_targets_dutch","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_targets_dutch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|507.6 MB| + +## References + +https://huggingface.co/IMSyPP/hate_speech_targets_nl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hate_trained_final_en.md b/docs/_posts/ahmedlone127/2023-11-19-hate_trained_final_en.md new file mode 100644 index 000000000000..6486a76abe95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hate_trained_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_trained_final DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: hate_trained_final +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_trained_final` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_trained_final_en_5.2.0_3.0_1700389734147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_trained_final_en_5.2.0_3.0_1700389734147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_trained_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hate_trained_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_trained_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/aXhyra/hate_trained_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hello_custom_en.md b/docs/_posts/ahmedlone127/2023-11-19-hello_custom_en.md new file mode 100644 index 000000000000..305657255b63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hello_custom_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hello_custom DistilBertForSequenceClassification from ljh1 +author: John Snow Labs +name: hello_custom +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hello_custom` is an English model originally trained by ljh1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hello_custom_en_5.2.0_3.0_1700406290626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hello_custom_en_5.2.0_3.0_1700406290626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hello_custom","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hello_custom","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hello_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ljh1/hello-custom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hface_mlops_demo_dbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-hface_mlops_demo_dbert_en.md new file mode 100644 index 000000000000..dcc63443d2a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hface_mlops_demo_dbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hface_mlops_demo_dbert DistilBertForSequenceClassification from naga-jay +author: John Snow Labs +name: hface_mlops_demo_dbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hface_mlops_demo_dbert` is an English model originally trained by naga-jay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hface_mlops_demo_dbert_en_5.2.0_3.0_1700356767181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hface_mlops_demo_dbert_en_5.2.0_3.0_1700356767181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hface_mlops_demo_dbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hface_mlops_demo_dbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hface_mlops_demo_dbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/naga-jay/hface_mlops_demo_dbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hugging_face_en.md b/docs/_posts/ahmedlone127/2023-11-19-hugging_face_en.md new file mode 100644 index 000000000000..1e691ed61b90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hugging_face_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hugging_face DistilBertForSequenceClassification from chrishistewandb +author: John Snow Labs +name: hugging_face +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hugging_face` is an English model originally trained by chrishistewandb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hugging_face_en_5.2.0_3.0_1700398176384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hugging_face_en_5.2.0_3.0_1700398176384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hugging_face","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hugging_face","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hugging_face| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chrishistewandb/hugging-face \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-huggingface_sequence_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-huggingface_sequence_classification_en.md new file mode 100644 index 000000000000..d51cb4ff33bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-huggingface_sequence_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English huggingface_sequence_classification DistilBertForSequenceClassification from epicmobile181 +author: John Snow Labs +name: huggingface_sequence_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `huggingface_sequence_classification` is an English model originally trained by epicmobile181. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/huggingface_sequence_classification_en_5.2.0_3.0_1700400247420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/huggingface_sequence_classification_en_5.2.0_3.0_1700400247420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("huggingface_sequence_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("huggingface_sequence_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|huggingface_sequence_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/epicmobile181/huggingface_sequence_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-humor_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-humor_classifier_en.md new file mode 100644 index 000000000000..c2d13f899b55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-humor_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English humor_classifier DistilBertForSequenceClassification from r3b3lj3l +author: John Snow Labs +name: humor_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `humor_classifier` is an English model originally trained by r3b3lj3l. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/humor_classifier_en_5.2.0_3.0_1700396056809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/humor_classifier_en_5.2.0_3.0_1700396056809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("humor_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("humor_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|humor_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/r3b3lj3l/humor_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-hupd_distilbert_claims_en.md b/docs/_posts/ahmedlone127/2023-11-19-hupd_distilbert_claims_en.md new file mode 100644 index 000000000000..7a356f87a032 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-hupd_distilbert_claims_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hupd_distilbert_claims DistilBertForSequenceClassification from theresatvan +author: John Snow Labs +name: hupd_distilbert_claims +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hupd_distilbert_claims` is an English model originally trained by theresatvan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hupd_distilbert_claims_en_5.2.0_3.0_1700421836243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hupd_distilbert_claims_en_5.2.0_3.0_1700421836243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hupd_distilbert_claims","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hupd_distilbert_claims","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hupd_distilbert_claims| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.9 MB| + +## References + +https://huggingface.co/theresatvan/hupd-distilbert-claims \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-idpintents_key_value_en.md b/docs/_posts/ahmedlone127/2023-11-19-idpintents_key_value_en.md new file mode 100644 index 000000000000..7c79dfc9c16f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-idpintents_key_value_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English idpintents_key_value DistilBertForSequenceClassification from Anurag0961 +author: John Snow Labs +name: idpintents_key_value +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idpintents_key_value` is an English model originally trained by Anurag0961. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idpintents_key_value_en_5.2.0_3.0_1700356559647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idpintents_key_value_en_5.2.0_3.0_1700356559647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("idpintents_key_value","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("idpintents_key_value","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idpintents_key_value| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Anurag0961/idpintents-key-value \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-imdb_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-imdb_classification_en.md new file mode 100644 index 000000000000..9e2a410b6846 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-imdb_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdb_classification DistilBertForSequenceClassification from abhiShek1061 +author: John Snow Labs +name: imdb_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb_classification` is an English model originally trained by abhiShek1061. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb_classification_en_5.2.0_3.0_1700424422935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb_classification_en_5.2.0_3.0_1700424422935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdb_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("imdb_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdb_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abhiShek1061/imdb-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-indobert_distilled_optimized_for_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-indobert_distilled_optimized_for_classification_en.md new file mode 100644 index 000000000000..425c97359339 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-indobert_distilled_optimized_for_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English indobert_distilled_optimized_for_classification DistilBertForSequenceClassification from afbudiman +author: John Snow Labs +name: indobert_distilled_optimized_for_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indobert_distilled_optimized_for_classification` is an English model originally trained by afbudiman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indobert_distilled_optimized_for_classification_en_5.2.0_3.0_1700353846176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indobert_distilled_optimized_for_classification_en_5.2.0_3.0_1700353846176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("indobert_distilled_optimized_for_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("indobert_distilled_optimized_for_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indobert_distilled_optimized_for_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/afbudiman/indobert-distilled-optimized-for-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-intent_classification_large_en.md b/docs/_posts/ahmedlone127/2023-11-19-intent_classification_large_en.md new file mode 100644 index 000000000000..bb0e94463d35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-intent_classification_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_classification_large DistilBertForSequenceClassification from dipesh +author: John Snow Labs +name: intent_classification_large +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_classification_large` is an English model originally trained by dipesh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_classification_large_en_5.2.0_3.0_1700354138295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_classification_large_en_5.2.0_3.0_1700354138295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_classification_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_classification_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.3 MB| + +## References + +https://huggingface.co/dipesh/Intent-Classification-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-intent_recognition_en.md b/docs/_posts/ahmedlone127/2023-11-19-intent_recognition_en.md new file mode 100644 index 000000000000..6bd9108774e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-intent_recognition_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English intent_recognition DistilBertForSequenceClassification from alibidaran +author: John Snow Labs +name: intent_recognition +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `intent_recognition` is an English model originally trained by alibidaran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/intent_recognition_en_5.2.0_3.0_1700415194715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/intent_recognition_en_5.2.0_3.0_1700415194715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_recognition","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("intent_recognition","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|intent_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/alibidaran/intent_recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-internet2_en.md b/docs/_posts/ahmedlone127/2023-11-19-internet2_en.md new file mode 100644 index 000000000000..cec6308baec4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-internet2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English internet2 DistilBertForSequenceClassification from Majed +author: John Snow Labs +name: internet2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `internet2` is an English model originally trained by Majed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/internet2_en_5.2.0_3.0_1700356050654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/internet2_en_5.2.0_3.0_1700356050654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("internet2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("internet2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|internet2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Majed/internet2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-jq_emo_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-jq_emo_distilbert_en.md new file mode 100644 index 000000000000..d1234d07b351 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-jq_emo_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jq_emo_distilbert DistilBertForSequenceClassification from tingtone +author: John Snow Labs +name: jq_emo_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jq_emo_distilbert` is an English model originally trained by tingtone. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jq_emo_distilbert_en_5.2.0_3.0_1700398216970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jq_emo_distilbert_en_5.2.0_3.0_1700398216970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("jq_emo_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("jq_emo_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jq_emo_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tingtone/jq_emo_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-jqed_qa_question_classifer_final_en.md b/docs/_posts/ahmedlone127/2023-11-19-jqed_qa_question_classifer_final_en.md new file mode 100644 index 000000000000..5a244d8971a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-jqed_qa_question_classifer_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jqed_qa_question_classifer_final DistilBertForSequenceClassification from dflcmu +author: John Snow Labs +name: jqed_qa_question_classifer_final +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jqed_qa_question_classifer_final` is an English model originally trained by dflcmu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jqed_qa_question_classifer_final_en_5.2.0_3.0_1700376650256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jqed_qa_question_classifer_final_en_5.2.0_3.0_1700376650256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("jqed_qa_question_classifer_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("jqed_qa_question_classifer_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jqed_qa_question_classifer_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dflcmu/JQED_QA_question_classifer_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_en.md new file mode 100644 index 000000000000..2db0da220c45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English keyword_category_classifier DistilBertForSequenceClassification from Nalenczewski +author: John Snow Labs +name: keyword_category_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword_category_classifier` is an English model originally trained by Nalenczewski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_en_5.2.0_3.0_1700352085738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_en_5.2.0_3.0_1700352085738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_category_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Nalenczewski/keyword_category_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v2_en.md b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v2_en.md new file mode 100644 index 000000000000..e99513fb942e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English keyword_category_classifier_v2 DistilBertForSequenceClassification from Nalenczewski +author: John Snow Labs +name: keyword_category_classifier_v2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword_category_classifier_v2` is an English model originally trained by Nalenczewski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v2_en_5.2.0_3.0_1700378806652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v2_en_5.2.0_3.0_1700378806652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_category_classifier_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Nalenczewski/keyword_category_classifier_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v3_en.md b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v3_en.md new file mode 100644 index 000000000000..a3cf74090361 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English keyword_category_classifier_v3 DistilBertForSequenceClassification from Nalenczewski +author: John Snow Labs +name: keyword_category_classifier_v3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword_category_classifier_v3` is an English model originally trained by Nalenczewski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v3_en_5.2.0_3.0_1700404237975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v3_en_5.2.0_3.0_1700404237975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_category_classifier_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Nalenczewski/keyword_category_classifier_v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v7_en.md b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v7_en.md new file mode 100644 index 000000000000..36bb685916a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-keyword_category_classifier_v7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English keyword_category_classifier_v7 DistilBertForSequenceClassification from Nalenczewski +author: John Snow Labs +name: keyword_category_classifier_v7 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyword_category_classifier_v7` is an English model originally trained by Nalenczewski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v7_en_5.2.0_3.0_1700376168594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/keyword_category_classifier_v7_en_5.2.0_3.0_1700376168594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("keyword_category_classifier_v7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|keyword_category_classifier_v7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Nalenczewski/keyword_category_classifier_v7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-korean_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-korean_classification_en.md new file mode 100644 index 000000000000..7db031b7cd3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-korean_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English korean_classification DistilBertForSequenceClassification from devhee +author: John Snow Labs +name: korean_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `korean_classification` is an English model originally trained by devhee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/korean_classification_en_5.2.0_3.0_1700433407711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/korean_classification_en_5.2.0_3.0_1700433407711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("korean_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("korean_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|korean_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/devhee/ko_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-learning_sentiment_analysis_through_imdb_ds_en.md b/docs/_posts/ahmedlone127/2023-11-19-learning_sentiment_analysis_through_imdb_ds_en.md new file mode 100644 index 000000000000..b3e74a7ce969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-learning_sentiment_analysis_through_imdb_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English learning_sentiment_analysis_through_imdb_ds DistilBertForSequenceClassification from SeNSiTivE +author: John Snow Labs +name: learning_sentiment_analysis_through_imdb_ds +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `learning_sentiment_analysis_through_imdb_ds` is an English model originally trained by SeNSiTivE.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/learning_sentiment_analysis_through_imdb_ds_en_5.2.0_3.0_1700385552399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/learning_sentiment_analysis_through_imdb_ds_en_5.2.0_3.0_1700385552399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("learning_sentiment_analysis_through_imdb_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("learning_sentiment_analysis_through_imdb_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|learning_sentiment_analysis_through_imdb_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SeNSiTivE/Learning-sentiment-analysis-through-imdb-ds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-left_padding50_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-left_padding50_model_en.md new file mode 100644 index 000000000000..567e7ff9862b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-left_padding50_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English left_padding50_model DistilBertForSequenceClassification from Realgon +author: John Snow Labs +name: left_padding50_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `left_padding50_model` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/left_padding50_model_en_5.2.0_3.0_1700356072339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/left_padding50_model_en_5.2.0_3.0_1700356072339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("left_padding50_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("left_padding50_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|left_padding50_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Realgon/left_padding50_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-light_recipes_italian_en.md b/docs/_posts/ahmedlone127/2023-11-19-light_recipes_italian_en.md new file mode 100644 index 000000000000..147bc8076f95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-light_recipes_italian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English light_recipes_italian DistilBertForSequenceClassification from paola-md +author: John Snow Labs +name: light_recipes_italian +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `light_recipes_italian` is an English model originally trained by paola-md. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/light_recipes_italian_en_5.2.0_3.0_1700402679899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/light_recipes_italian_en_5.2.0_3.0_1700402679899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("light_recipes_italian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("light_recipes_italian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|light_recipes_italian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/paola-md/light-recipes-italian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-mbti_classifier_parka735_en.md b/docs/_posts/ahmedlone127/2023-11-19-mbti_classifier_parka735_en.md new file mode 100644 index 000000000000..5b6c5727e30c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-mbti_classifier_parka735_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbti_classifier_parka735 DistilBertForSequenceClassification from parka735 +author: John Snow Labs +name: mbti_classifier_parka735 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbti_classifier_parka735` is an English model originally trained by parka735. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbti_classifier_parka735_en_5.2.0_3.0_1700353387744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbti_classifier_parka735_en_5.2.0_3.0_1700353387744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mbti_classifier_parka735","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mbti_classifier_parka735","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbti_classifier_parka735| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/parka735/mbti-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-me_sensus_en.md b/docs/_posts/ahmedlone127/2023-11-19-me_sensus_en.md new file mode 100644 index 000000000000..0a537b8f1cf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-me_sensus_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English me_sensus DistilBertForSequenceClassification from afiqlol +author: John Snow Labs +name: me_sensus +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `me_sensus` is an English model originally trained by afiqlol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_sensus_en_5.2.0_3.0_1700352824309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_sensus_en_5.2.0_3.0_1700352824309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("me_sensus","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("me_sensus","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_sensus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/afiqlol/me_sensus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-med_qa_intent_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-med_qa_intent_classification_en.md new file mode 100644 index 000000000000..54d39e29005e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-med_qa_intent_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English med_qa_intent_classification DistilBertForSequenceClassification from GEDISA +author: John Snow Labs +name: med_qa_intent_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `med_qa_intent_classification` is an English model originally trained by GEDISA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/med_qa_intent_classification_en_5.2.0_3.0_1700430374927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/med_qa_intent_classification_en_5.2.0_3.0_1700430374927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("med_qa_intent_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("med_qa_intent_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|med_qa_intent_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.5 MB| + +## References + +https://huggingface.co/GEDISA/med-qa-intent-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_en.md b/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_en.md new file mode 100644 index 000000000000..2d87cea4f67f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English media_bias_ukraine_dataset_all DistilBertForSequenceClassification from franfj +author: John Snow Labs +name: media_bias_ukraine_dataset_all +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `media_bias_ukraine_dataset_all` is an English model originally trained by franfj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_en_5.2.0_3.0_1700396760798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_en_5.2.0_3.0_1700396760798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|media_bias_ukraine_dataset_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/franfj/media-bias-ukraine-dataset-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_minus_ukraine_masked_en.md b/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_minus_ukraine_masked_en.md new file mode 100644 index 000000000000..ae524951892e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-media_bias_ukraine_dataset_all_minus_ukraine_masked_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English media_bias_ukraine_dataset_all_minus_ukraine_masked DistilBertForSequenceClassification from franfj +author: John Snow Labs +name: media_bias_ukraine_dataset_all_minus_ukraine_masked +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `media_bias_ukraine_dataset_all_minus_ukraine_masked` is an English model originally trained by franfj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_minus_ukraine_masked_en_5.2.0_3.0_1700375730171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_minus_ukraine_masked_en_5.2.0_3.0_1700375730171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all_minus_ukraine_masked","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all_minus_ukraine_masked","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|media_bias_ukraine_dataset_all_minus_ukraine_masked| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/franfj/media-bias-ukraine-dataset-all-minus-ukraine-masked \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-medicalappreview_en.md b/docs/_posts/ahmedlone127/2023-11-19-medicalappreview_en.md new file mode 100644 index 000000000000..47e698fcdbb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-medicalappreview_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English medicalappreview DistilBertForSequenceClassification from KarolPaczocha +author: John Snow Labs +name: medicalappreview +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medicalappreview` is an English model originally trained by KarolPaczocha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medicalappreview_en_5.2.0_3.0_1700380573577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medicalappreview_en_5.2.0_3.0_1700380573577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("medicalappreview","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("medicalappreview","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medicalappreview| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/KarolPaczocha/medicalappreview \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-medium_article_titles_engagement_en.md b/docs/_posts/ahmedlone127/2023-11-19-medium_article_titles_engagement_en.md new file mode 100644 index 000000000000..ce15085059b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-medium_article_titles_engagement_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English medium_article_titles_engagement DistilBertForSequenceClassification from dima806 +author: John Snow Labs +name: medium_article_titles_engagement +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medium_article_titles_engagement` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medium_article_titles_engagement_en_5.2.0_3.0_1700354410812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medium_article_titles_engagement_en_5.2.0_3.0_1700354410812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("medium_article_titles_engagement","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("medium_article_titles_engagement","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medium_article_titles_engagement| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/dima806/medium-article-titles-engagement \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-movie_genre_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-movie_genre_classification_en.md new file mode 100644 index 000000000000..87c15816c7bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-movie_genre_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English movie_genre_classification DistilBertForSequenceClassification from santis2 +author: John Snow Labs +name: movie_genre_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_genre_classification` is an English model originally trained by santis2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_genre_classification_en_5.2.0_3.0_1700386725423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_genre_classification_en_5.2.0_3.0_1700386725423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("movie_genre_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("movie_genre_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|movie_genre_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/santis2/movie-genre-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-movie_review_sentiment_classifier_with_bert_en.md b/docs/_posts/ahmedlone127/2023-11-19-movie_review_sentiment_classifier_with_bert_en.md new file mode 100644 index 000000000000..66ed966ea7f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-movie_review_sentiment_classifier_with_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English movie_review_sentiment_classifier_with_bert DistilBertForSequenceClassification from wesleyacheng +author: John Snow Labs +name: movie_review_sentiment_classifier_with_bert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_review_sentiment_classifier_with_bert` is an English model originally trained by wesleyacheng. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_review_sentiment_classifier_with_bert_en_5.2.0_3.0_1700381604157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_review_sentiment_classifier_with_bert_en_5.2.0_3.0_1700381604157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("movie_review_sentiment_classifier_with_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("movie_review_sentiment_classifier_with_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|movie_review_sentiment_classifier_with_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wesleyacheng/movie-review-sentiment-classifier-with-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-moviesreview5classroberta_en.md b/docs/_posts/ahmedlone127/2023-11-19-moviesreview5classroberta_en.md new file mode 100644 index 000000000000..ebf8906e603b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-moviesreview5classroberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English moviesreview5classroberta DistilBertForSequenceClassification from AhmedTaha012 +author: John Snow Labs +name: moviesreview5classroberta +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviesreview5classroberta` is an English model originally trained by AhmedTaha012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviesreview5classroberta_en_5.2.0_3.0_1700399475866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviesreview5classroberta_en_5.2.0_3.0_1700399475866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("moviesreview5classroberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("moviesreview5classroberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviesreview5classroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AhmedTaha012/moviesReview5classRoberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-mulan_methyl_distilbert_5hmc_en.md b/docs/_posts/ahmedlone127/2023-11-19-mulan_methyl_distilbert_5hmc_en.md new file mode 100644 index 000000000000..db5265de4e04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-mulan_methyl_distilbert_5hmc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mulan_methyl_distilbert_5hmc DistilBertForSequenceClassification from wenhuan +author: John Snow Labs +name: mulan_methyl_distilbert_5hmc +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mulan_methyl_distilbert_5hmc` is an English model originally trained by wenhuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mulan_methyl_distilbert_5hmc_en_5.2.0_3.0_1700391446991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mulan_methyl_distilbert_5hmc_en_5.2.0_3.0_1700391446991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mulan_methyl_distilbert_5hmc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mulan_methyl_distilbert_5hmc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mulan_methyl_distilbert_5hmc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|235.3 MB| + +## References + +https://huggingface.co/wenhuan/MuLan-Methyl-DistilBERT_5hmC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-naadedei_finetuned_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-naadedei_finetuned_distilbert_model_en.md new file mode 100644 index 000000000000..6c71ba472286 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-naadedei_finetuned_distilbert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English naadedei_finetuned_distilbert_model DistilBertForSequenceClassification from reginandcrabbe +author: John Snow Labs +name: naadedei_finetuned_distilbert_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `naadedei_finetuned_distilbert_model` is an English model originally trained by reginandcrabbe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/naadedei_finetuned_distilbert_model_en_5.2.0_3.0_1700352444189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/naadedei_finetuned_distilbert_model_en_5.2.0_3.0_1700352444189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("naadedei_finetuned_distilbert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("naadedei_finetuned_distilbert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|naadedei_finetuned_distilbert_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/reginandcrabbe/naadedei-Finetuned-distilbert-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-natural_language_inference_not_evaluated_en.md b/docs/_posts/ahmedlone127/2023-11-19-natural_language_inference_not_evaluated_en.md new file mode 100644 index 000000000000..ba3eacf23be1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-natural_language_inference_not_evaluated_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English natural_language_inference_not_evaluated DistilBertForSequenceClassification from autoevaluate +author: John Snow Labs +name: natural_language_inference_not_evaluated +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `natural_language_inference_not_evaluated` is an English model originally trained by autoevaluate. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/natural_language_inference_not_evaluated_en_5.2.0_3.0_1700360305049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/natural_language_inference_not_evaluated_en_5.2.0_3.0_1700360305049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("natural_language_inference_not_evaluated","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("natural_language_inference_not_evaluated","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|natural_language_inference_not_evaluated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/autoevaluate/natural-language-inference-not-evaluated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nepal_bhasa_doc_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-nepal_bhasa_doc_classifier_en.md new file mode 100644 index 000000000000..cbf77e0b8da5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nepal_bhasa_doc_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nepal_bhasa_doc_classifier DistilBertForSequenceClassification from debjyoti007 +author: John Snow Labs +name: nepal_bhasa_doc_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepal_bhasa_doc_classifier` is an English model originally trained by debjyoti007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_doc_classifier_en_5.2.0_3.0_1700368641980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_doc_classifier_en_5.2.0_3.0_1700368641980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepal_bhasa_doc_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepal_bhasa_doc_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_doc_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/debjyoti007/new_doc_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nepali_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-nepali_distilbert_en.md new file mode 100644 index 000000000000..32c673565f5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nepali_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nepali_distilbert DistilBertForSequenceClassification from dexhrestha +author: John Snow Labs +name: nepali_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepali_distilbert` is an English model originally trained by dexhrestha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepali_distilbert_en_5.2.0_3.0_1700400153724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepali_distilbert_en_5.2.0_3.0_1700400153724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepali_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nepali_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepali_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.2 MB| + +## References + +https://huggingface.co/dexhrestha/Nepali-DistilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-netflix_rating_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-netflix_rating_classifier_en.md new file mode 100644 index 000000000000..78fb4477d582 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-netflix_rating_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English netflix_rating_classifier DistilBertForSequenceClassification from austinphamm +author: John Snow Labs +name: netflix_rating_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `netflix_rating_classifier` is an English model originally trained by austinphamm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/netflix_rating_classifier_en_5.2.0_3.0_1700424950040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/netflix_rating_classifier_en_5.2.0_3.0_1700424950040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("netflix_rating_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("netflix_rating_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|netflix_rating_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/austinphamm/netflix_rating_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-news_classification_johnhwang_en.md b/docs/_posts/ahmedlone127/2023-11-19-news_classification_johnhwang_en.md new file mode 100644 index 000000000000..f5052f467575 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-news_classification_johnhwang_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_classification_johnhwang DistilBertForSequenceClassification from JohnHwang +author: John Snow Labs +name: news_classification_johnhwang +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_classification_johnhwang` is an English model originally trained by JohnHwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_classification_johnhwang_en_5.2.0_3.0_1700356233927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_classification_johnhwang_en_5.2.0_3.0_1700356233927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_classification_johnhwang","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_classification_johnhwang","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_classification_johnhwang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JohnHwang/news_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-news_sentiment_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-news_sentiment_distilbert_en.md new file mode 100644 index 000000000000..164862840d1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-news_sentiment_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_sentiment_distilbert DistilBertForSequenceClassification from harvinder676 +author: John Snow Labs +name: news_sentiment_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_sentiment_distilbert` is an English model originally trained by harvinder676. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_sentiment_distilbert_en_5.2.0_3.0_1700352094482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_sentiment_distilbert_en_5.2.0_3.0_1700352094482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_sentiment_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_sentiment_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/harvinder676/news_sentiment_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nice_distilbert_v2_en.md b/docs/_posts/ahmedlone127/2023-11-19-nice_distilbert_v2_en.md new file mode 100644 index 000000000000..903468f58402 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nice_distilbert_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nice_distilbert_v2 DistilBertForSequenceClassification from chisadi +author: John Snow Labs +name: nice_distilbert_v2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nice_distilbert_v2` is an English model originally trained by chisadi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nice_distilbert_v2_en_5.2.0_3.0_1700432357301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nice_distilbert_v2_en_5.2.0_3.0_1700432357301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nice_distilbert_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nice_distilbert_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nice_distilbert_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.6 MB| + +## References + +https://huggingface.co/chisadi/nice-distilbert-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nima_test_bert_glue_en.md b/docs/_posts/ahmedlone127/2023-11-19-nima_test_bert_glue_en.md new file mode 100644 index 000000000000..14c4ac002910 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nima_test_bert_glue_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nima_test_bert_glue DistilBertForSequenceClassification from Sphere-Fall2022 +author: John Snow Labs +name: nima_test_bert_glue +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nima_test_bert_glue` is an English model originally trained by Sphere-Fall2022. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nima_test_bert_glue_en_5.2.0_3.0_1700396382725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nima_test_bert_glue_en_5.2.0_3.0_1700396382725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nima_test_bert_glue","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nima_test_bert_glue","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nima_test_bert_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Sphere-Fall2022/nima-test-bert-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nlp_deep_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-nlp_deep_2_en.md new file mode 100644 index 000000000000..6344903d9f57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nlp_deep_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_deep_2 DistilBertForSequenceClassification from Bictole +author: John Snow Labs +name: nlp_deep_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_deep_2` is an English model originally trained by Bictole. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_deep_2_en_5.2.0_3.0_1700407348577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_deep_2_en_5.2.0_3.0_1700407348577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_deep_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_deep_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_deep_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Bictole/NLP_DEEP_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-19-nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion_en.md new file mode 100644 index 000000000000..aa72ff8cbdce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion DistilBertForSequenceClassification from ChaoLi +author: John Snow Labs +name: nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion` is an English model originally trained by ChaoLi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion_en_5.2.0_3.0_1700377731095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion_en_5.2.0_3.0_1700377731095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_for_transformer_book_distilbert_base_uncased_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ChaoLi/nlp_for_transformer_book_distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-nlp_sentimental_analysis_using_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-nlp_sentimental_analysis_using_distilbert_model_en.md new file mode 100644 index 000000000000..b40e8dd6ccf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-nlp_sentimental_analysis_using_distilbert_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_sentimental_analysis_using_distilbert_model DistilBertForSequenceClassification from Achar +author: John Snow Labs +name: nlp_sentimental_analysis_using_distilbert_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_sentimental_analysis_using_distilbert_model` is an English model originally trained by Achar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_sentimental_analysis_using_distilbert_model_en_5.2.0_3.0_1700389670112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_sentimental_analysis_using_distilbert_model_en_5.2.0_3.0_1700389670112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_sentimental_analysis_using_distilbert_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_sentimental_analysis_using_distilbert_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_sentimental_analysis_using_distilbert_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Achar/NLP-Sentimental-Analysis-using-DistilBERT-ModeL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-office_character_en.md b/docs/_posts/ahmedlone127/2023-11-19-office_character_en.md new file mode 100644 index 000000000000..86c5f6365654 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-office_character_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English office_character DistilBertForSequenceClassification from kearney +author: John Snow Labs +name: office_character +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `office_character` is an English model originally trained by kearney. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/office_character_en_5.2.0_3.0_1700423845795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/office_character_en_5.2.0_3.0_1700423845795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("office_character","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("office_character","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|office_character| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kearney/office-character \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-playground_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-playground_sentiment_model_en.md new file mode 100644 index 000000000000..90209e6d4492 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-playground_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English playground_sentiment_model DistilBertForSequenceClassification from GIanlucaRub +author: John Snow Labs +name: playground_sentiment_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `playground_sentiment_model` is an English model originally trained by GIanlucaRub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/playground_sentiment_model_en_5.2.0_3.0_1700427603803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/playground_sentiment_model_en_5.2.0_3.0_1700427603803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("playground_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("playground_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|playground_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/GIanlucaRub/playground-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-postcard_multilabel_classifier_ru.md b/docs/_posts/ahmedlone127/2023-11-19-postcard_multilabel_classifier_ru.md new file mode 100644 index 000000000000..4d73988f99ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-postcard_multilabel_classifier_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian postcard_multilabel_classifier DistilBertForSequenceClassification from pa-shk +author: John Snow Labs +name: postcard_multilabel_classifier +date: 2023-11-19 +tags: [bert, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `postcard_multilabel_classifier` is a Russian model originally trained by pa-shk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/postcard_multilabel_classifier_ru_5.2.0_3.0_1700369942333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/postcard_multilabel_classifier_ru_5.2.0_3.0_1700369942333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("postcard_multilabel_classifier","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("postcard_multilabel_classifier","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|postcard_multilabel_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|39.2 MB| + +## References + +https://huggingface.co/pa-shk/postcard_multilabel_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-pre_requisite_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-pre_requisite_model_en.md new file mode 100644 index 000000000000..90c4e285002c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-pre_requisite_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pre_requisite_model DistilBertForSequenceClassification from satyamverma +author: John Snow Labs +name: pre_requisite_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pre_requisite_model` is an English model originally trained by satyamverma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pre_requisite_model_en_5.2.0_3.0_1700420089307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pre_requisite_model_en_5.2.0_3.0_1700420089307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("pre_requisite_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("pre_requisite_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pre_requisite_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/satyamverma/Pre-requisite_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_object_en.md b/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_object_en.md new file mode 100644 index 000000000000..509a1b956ea4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_object_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English predict_perception_bertino_focus_object DistilBertForSequenceClassification from gossminn +author: John Snow Labs +name: predict_perception_bertino_focus_object +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `predict_perception_bertino_focus_object` is an English model originally trained by gossminn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_focus_object_en_5.2.0_3.0_1700420590546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_focus_object_en_5.2.0_3.0_1700420590546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_focus_object","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_focus_object","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|predict_perception_bertino_focus_object| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.2 MB| + +## References + +https://huggingface.co/gossminn/predict-perception-bertino-focus-object \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_victim_en.md b/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_victim_en.md new file mode 100644 index 000000000000..7ac399bded85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-predict_perception_bertino_focus_victim_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English predict_perception_bertino_focus_victim DistilBertForSequenceClassification from gossminn +author: John Snow Labs +name: predict_perception_bertino_focus_victim +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `predict_perception_bertino_focus_victim` is an English model originally trained by gossminn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_focus_victim_en_5.2.0_3.0_1700373320417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_focus_victim_en_5.2.0_3.0_1700373320417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_focus_victim","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_focus_victim","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|predict_perception_bertino_focus_victim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.2 MB| + +## References + +https://huggingface.co/gossminn/predict-perception-bertino-focus-victim \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-presentation_emotion_1234567_en.md b/docs/_posts/ahmedlone127/2023-11-19-presentation_emotion_1234567_en.md new file mode 100644 index 000000000000..38a5cc6a06a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-presentation_emotion_1234567_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English presentation_emotion_1234567 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: presentation_emotion_1234567 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `presentation_emotion_1234567` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/presentation_emotion_1234567_en_5.2.0_3.0_1700432237116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/presentation_emotion_1234567_en_5.2.0_3.0_1700432237116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("presentation_emotion_1234567","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("presentation_emotion_1234567","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|presentation_emotion_1234567| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/presentation_emotion_1234567 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-presentation_irony_1234567_en.md b/docs/_posts/ahmedlone127/2023-11-19-presentation_irony_1234567_en.md new file mode 100644 index 000000000000..75a90e7f9af4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-presentation_irony_1234567_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English presentation_irony_1234567 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: presentation_irony_1234567 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `presentation_irony_1234567` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/presentation_irony_1234567_en_5.2.0_3.0_1700388761739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/presentation_irony_1234567_en_5.2.0_3.0_1700388761739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("presentation_irony_1234567","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("presentation_irony_1234567","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|presentation_irony_1234567| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/aXhyra/presentation_irony_1234567 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-psychiq_en.md b/docs/_posts/ahmedlone127/2023-11-19-psychiq_en.md new file mode 100644 index 000000000000..a33947841d0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-psychiq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English psychiq DistilBertForSequenceClassification from derenrich +author: John Snow Labs +name: psychiq +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psychiq` is an English model originally trained by derenrich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psychiq_en_5.2.0_3.0_1700406290876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psychiq_en_5.2.0_3.0_1700406290876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("psychiq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("psychiq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|psychiq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.3 MB| + +## References + +https://huggingface.co/derenrich/psychiq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-qd_dialog_distilbert_base_turkish_en.md b/docs/_posts/ahmedlone127/2023-11-19-qd_dialog_distilbert_base_turkish_en.md new file mode 100644 index 000000000000..05280da25ee6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-qd_dialog_distilbert_base_turkish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English qd_dialog_distilbert_base_turkish DistilBertForSequenceClassification from Izzet +author: John Snow Labs +name: qd_dialog_distilbert_base_turkish +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qd_dialog_distilbert_base_turkish` is an English model originally trained by Izzet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qd_dialog_distilbert_base_turkish_en_5.2.0_3.0_1700388755959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qd_dialog_distilbert_base_turkish_en_5.2.0_3.0_1700388755959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("qd_dialog_distilbert_base_turkish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("qd_dialog_distilbert_base_turkish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qd_dialog_distilbert_base_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|254.0 MB| + +## References + +https://huggingface.co/Izzet/qd_dialog_distilbert-base-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-question_classifier_v2_en.md b/docs/_posts/ahmedlone127/2023-11-19-question_classifier_v2_en.md new file mode 100644 index 000000000000..f8d47c579e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-question_classifier_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English question_classifier_v2 DistilBertForSequenceClassification from alangpp255 +author: John Snow Labs +name: question_classifier_v2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_classifier_v2` is an English model originally trained by alangpp255. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_classifier_v2_en_5.2.0_3.0_1700373291992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_classifier_v2_en_5.2.0_3.0_1700373291992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("question_classifier_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("question_classifier_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_classifier_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/alangpp255/Question_classifier_V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-racial_bias_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-racial_bias_classification_en.md new file mode 100644 index 000000000000..7744ad94c4fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-racial_bias_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English racial_bias_classification DistilBertForSequenceClassification from BogdanTurbal +author: John Snow Labs +name: racial_bias_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `racial_bias_classification` is an English model originally trained by BogdanTurbal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/racial_bias_classification_en_5.2.0_3.0_1700414222997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/racial_bias_classification_en_5.2.0_3.0_1700414222997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("racial_bias_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("racial_bias_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|racial_bias_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/BogdanTurbal/racial_bias_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-redbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-redbert_en.md new file mode 100644 index 000000000000..16bc92d0ed38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-redbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English redbert DistilBertForSequenceClassification from traberph +author: John Snow Labs +name: redbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `redbert` is an English model originally trained by traberph. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/redbert_en_5.2.0_3.0_1700364696135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/redbert_en_5.2.0_3.0_1700364696135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("redbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("redbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|redbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|330.0 MB| + +## References + +https://huggingface.co/traberph/RedBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-reddit_comment_sentiment_final_en.md b/docs/_posts/ahmedlone127/2023-11-19-reddit_comment_sentiment_final_en.md new file mode 100644 index 000000000000..ffd0c1c64367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-reddit_comment_sentiment_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reddit_comment_sentiment_final DistilBertForSequenceClassification from AG6019 +author: John Snow Labs +name: reddit_comment_sentiment_final +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_comment_sentiment_final` is an English model originally trained by AG6019. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_comment_sentiment_final_en_5.2.0_3.0_1700385771192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_comment_sentiment_final_en_5.2.0_3.0_1700385771192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("reddit_comment_sentiment_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("reddit_comment_sentiment_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_comment_sentiment_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/AG6019/reddit-comment-sentiment-final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-religion_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-religion_classification_en.md new file mode 100644 index 000000000000..98305442cfb9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-religion_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English religion_classification DistilBertForSequenceClassification from padmajabfrl +author: John Snow Labs +name: religion_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `religion_classification` is an English model originally trained by padmajabfrl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/religion_classification_en_5.2.0_3.0_1700386844039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/religion_classification_en_5.2.0_3.0_1700386844039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("religion_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("religion_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|religion_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/padmajabfrl/Religion-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-resumeclassification_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-resumeclassification_distilbert_en.md new file mode 100644 index 000000000000..de584649ab0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-resumeclassification_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English resumeclassification_distilbert DistilBertForSequenceClassification from runaksh +author: John Snow Labs +name: resumeclassification_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `resumeclassification_distilbert` is an English model originally trained by runaksh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/resumeclassification_distilbert_en_5.2.0_3.0_1700373069332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/resumeclassification_distilbert_en_5.2.0_3.0_1700373069332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("resumeclassification_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("resumeclassification_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|resumeclassification_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/runaksh/ResumeClassification_distilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-rightpartisan_en.md b/docs/_posts/ahmedlone127/2023-11-19-rightpartisan_en.md new file mode 100644 index 000000000000..687cd9d7fc67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-rightpartisan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rightpartisan DistilBertForSequenceClassification from spencerh +author: John Snow Labs +name: rightpartisan +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rightpartisan` is an English model originally trained by spencerh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rightpartisan_en_5.2.0_3.0_1700357017446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rightpartisan_en_5.2.0_3.0_1700357017446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("rightpartisan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("rightpartisan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rightpartisan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/spencerh/rightpartisan \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentance_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentance_analysis_en.md new file mode 100644 index 000000000000..235c899103f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentance_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentance_analysis DistilBertForSequenceClassification from sentientconch +author: John Snow Labs +name: sentance_analysis +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentance_analysis` is an English model originally trained by sentientconch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentance_analysis_en_5.2.0_3.0_1700357906631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentance_analysis_en_5.2.0_3.0_1700357906631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentance_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentance_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentance_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sentientconch/sentance_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment_en.md new file mode 100644 index 000000000000..7cccd07df863 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment DistilBertForSequenceClassification from Theivaprakasham +author: John Snow Labs +name: sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment` is an English model originally trained by Theivaprakasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment_en_5.2.0_3.0_1700352768480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment_en_5.2.0_3.0_1700352768480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_transformers_msmarco_distilbert_base_tas_b_twitter_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Theivaprakasham/sentence-transformers-msmarco-distilbert-base-tas-b-twitter_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentiment_analysis_on_covid_tweets_edusei_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentiment_analysis_on_covid_tweets_edusei_en.md new file mode 100644 index 000000000000..bb0b6ab4db9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentiment_analysis_on_covid_tweets_edusei_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_on_covid_tweets_edusei DistilBertForSequenceClassification from edusei +author: John Snow Labs +name: sentiment_analysis_on_covid_tweets_edusei +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_on_covid_tweets_edusei` is an English model originally trained by edusei.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_on_covid_tweets_edusei_en_5.2.0_3.0_1700391533914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_on_covid_tweets_edusei_en_5.2.0_3.0_1700391533914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_on_covid_tweets_edusei","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_on_covid_tweets_edusei","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_on_covid_tweets_edusei| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/edusei/sentiment_analysis_on_covid_tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_imdb_small_demo_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_imdb_small_demo_en.md new file mode 100644 index 000000000000..70a4fd5fdea3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_imdb_small_demo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_imdb_small_demo DistilBertForSequenceClassification from sachinshinde +author: John Snow Labs +name: sentiment_model_imdb_small_demo +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_imdb_small_demo` is an English model originally trained by sachinshinde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_imdb_small_demo_en_5.2.0_3.0_1700402899210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_imdb_small_demo_en_5.2.0_3.0_1700402899210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_imdb_small_demo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_imdb_small_demo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_imdb_small_demo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/sachinshinde/sentiment-model-imdb-small-demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_on_imdb_dataset_tirendaz_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_on_imdb_dataset_tirendaz_en.md new file mode 100644 index 000000000000..2ab52c3dac9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentiment_model_on_imdb_dataset_tirendaz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_on_imdb_dataset_tirendaz DistilBertForSequenceClassification from Tirendaz +author: John Snow Labs +name: sentiment_model_on_imdb_dataset_tirendaz +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_on_imdb_dataset_tirendaz` is an English model originally trained by Tirendaz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_on_imdb_dataset_tirendaz_en_5.2.0_3.0_1700405695050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_on_imdb_dataset_tirendaz_en_5.2.0_3.0_1700405695050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_on_imdb_dataset_tirendaz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_on_imdb_dataset_tirendaz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_on_imdb_dataset_tirendaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Tirendaz/sentiment-model-on-imdb-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sentiments_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-sentiments_classifier_en.md new file mode 100644 index 000000000000..48a1b83c333d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sentiments_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiments_classifier DistilBertForSequenceClassification from neuroapps +author: John Snow Labs +name: sentiments_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiments_classifier` is an English model originally trained by neuroapps. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiments_classifier_en_5.2.0_3.0_1700419615259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiments_classifier_en_5.2.0_3.0_1700419615259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiments_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiments_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiments_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/neuroapps/sentiments_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-simpsons_character_discriminator_en.md b/docs/_posts/ahmedlone127/2023-11-19-simpsons_character_discriminator_en.md new file mode 100644 index 000000000000..f434ef5b9a32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-simpsons_character_discriminator_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English simpsons_character_discriminator DistilBertForSequenceClassification from Rbanerjee +author: John Snow Labs +name: simpsons_character_discriminator +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `simpsons_character_discriminator` is an English model originally trained by Rbanerjee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/simpsons_character_discriminator_en_5.2.0_3.0_1700362786251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/simpsons_character_discriminator_en_5.2.0_3.0_1700362786251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("simpsons_character_discriminator","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("simpsons_character_discriminator","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|simpsons_character_discriminator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/Rbanerjee/simpsons-character-discriminator \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sincere_question_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-sincere_question_classification_en.md new file mode 100644 index 000000000000..c5199c2b2239 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sincere_question_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sincere_question_classification DistilBertForSequenceClassification from amyma21 +author: John Snow Labs +name: sincere_question_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sincere_question_classification` is an English model originally trained by amyma21. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sincere_question_classification_en_5.2.0_3.0_1700373069355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sincere_question_classification_en_5.2.0_3.0_1700373069355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sincere_question_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sincere_question_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sincere_question_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/amyma21/sincere_question_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-small_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-small_sentiment_model_en.md new file mode 100644 index 000000000000..92faf5e5d91a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-small_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English small_sentiment_model DistilBertForSequenceClassification from AlexAnge +author: John Snow Labs +name: small_sentiment_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `small_sentiment_model` is an English model originally trained by AlexAnge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/small_sentiment_model_en_5.2.0_3.0_1700353701204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/small_sentiment_model_en_5.2.0_3.0_1700353701204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("small_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("small_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|small_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/AlexAnge/small-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes_en.md b/docs/_posts/ahmedlone127/2023-11-19-smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes_en.md new file mode 100644 index 000000000000..e48c489c3f2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes DistilBertForSequenceClassification from jayantapaul888 +author: John Snow Labs +name: smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes` is an English model originally trained by jayantapaul888.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes_en_5.2.0_3.0_1700356409891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes_en_5.2.0_3.0_1700356409891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|smalldata_distilbert_base_uncasede_eng_only_sentiment_single_finetuned_memes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayantapaul888/smalldata-distilbert-base-uncasede-eng-only-sentiment-single-finetuned-memes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-smart_home_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-smart_home_model_en.md new file mode 100644 index 000000000000..90957a5c1427 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-smart_home_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English smart_home_model DistilBertForSequenceClassification from FDuCHeS +author: John Snow Labs +name: smart_home_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `smart_home_model` is an English model originally trained by FDuCHeS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/smart_home_model_en_5.2.0_3.0_1700416279961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/smart_home_model_en_5.2.0_3.0_1700416279961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("smart_home_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("smart_home_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|smart_home_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/FDuCHeS/smart_home_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-sms_spam_detection_manning_en.md b/docs/_posts/ahmedlone127/2023-11-19-sms_spam_detection_manning_en.md new file mode 100644 index 000000000000..7697b93311e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-sms_spam_detection_manning_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sms_spam_detection_manning DistilBertForSequenceClassification from satish860 +author: John Snow Labs +name: sms_spam_detection_manning +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sms_spam_detection_manning` is an English model originally trained by satish860. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sms_spam_detection_manning_en_5.2.0_3.0_1700434449176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sms_spam_detection_manning_en_5.2.0_3.0_1700434449176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sms_spam_detection_manning","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sms_spam_detection_manning","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sms_spam_detection_manning| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/satish860/sms_spam_detection-manning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-smsa_distilbert_indo_id.md b/docs/_posts/ahmedlone127/2023-11-19-smsa_distilbert_indo_id.md new file mode 100644 index 000000000000..e3004e078391 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-smsa_distilbert_indo_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian smsa_distilbert_indo DistilBertForSequenceClassification from karuniaperjuangan +author: John Snow Labs +name: smsa_distilbert_indo +date: 2023-11-19 +tags: [bert, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `smsa_distilbert_indo` is an Indonesian model originally trained by karuniaperjuangan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/smsa_distilbert_indo_id_5.2.0_3.0_1700394415851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/smsa_distilbert_indo_id_5.2.0_3.0_1700394415851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("smsa_distilbert_indo","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("smsa_distilbert_indo","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|smsa_distilbert_indo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|507.6 MB| + +## References + +https://huggingface.co/karuniaperjuangan/smsa-distilbert-indo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-spam_ham_classifier_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-spam_ham_classifier_distilbert_en.md new file mode 100644 index 000000000000..3084edb966c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-spam_ham_classifier_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English spam_ham_classifier_distilbert DistilBertForSequenceClassification from martin-bendik +author: John Snow Labs +name: spam_ham_classifier_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spam_ham_classifier_distilbert` is an English model originally trained by martin-bendik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spam_ham_classifier_distilbert_en_5.2.0_3.0_1700433407966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spam_ham_classifier_distilbert_en_5.2.0_3.0_1700433407966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_ham_classifier_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("spam_ham_classifier_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spam_ham_classifier_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/martin-bendik/spam_ham_classifier_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-startupclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-startupclassifier_en.md new file mode 100644 index 000000000000..af7131c16bd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-startupclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English startupclassifier DistilBertForSequenceClassification from erikacardenas300 +author: John Snow Labs +name: startupclassifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `startupclassifier` is an English model originally trained by erikacardenas300. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/startupclassifier_en_5.2.0_3.0_1700407140000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/startupclassifier_en_5.2.0_3.0_1700407140000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("startupclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("startupclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|startupclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/erikacardenas300/StartupClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-stockpredictor_en.md b/docs/_posts/ahmedlone127/2023-11-19-stockpredictor_en.md new file mode 100644 index 000000000000..2b39f8ea014d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-stockpredictor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stockpredictor DistilBertForSequenceClassification from icyGS +author: John Snow Labs +name: stockpredictor +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stockpredictor` is an English model originally trained by icyGS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stockpredictor_en_5.2.0_3.0_1700382773743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stockpredictor_en_5.2.0_3.0_1700382773743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("stockpredictor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("stockpredictor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stockpredictor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/icyGS/StockPredictor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-symptom_2_disease_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-symptom_2_disease_distilbert_en.md new file mode 100644 index 000000000000..28e6181f844e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-symptom_2_disease_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English symptom_2_disease_distilbert DistilBertForSequenceClassification from runaksh +author: John Snow Labs +name: symptom_2_disease_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `symptom_2_disease_distilbert` is an English model originally trained by runaksh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/symptom_2_disease_distilbert_en_5.2.0_3.0_1700352698501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/symptom_2_disease_distilbert_en_5.2.0_3.0_1700352698501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("symptom_2_disease_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("symptom_2_disease_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|symptom_2_disease_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/runaksh/Symptom-2-disease_distilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-tahniat_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-19-tahniat_classifier_en.md new file mode 100644 index 000000000000..07282689fea6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-tahniat_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tahniat_classifier DistilBertForSequenceClassification from Social-Media-Fairness +author: John Snow Labs +name: tahniat_classifier +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tahniat_classifier` is an English model originally trained by Social-Media-Fairness. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tahniat_classifier_en_5.2.0_3.0_1700394118650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tahniat_classifier_en_5.2.0_3.0_1700394118650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tahniat_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tahniat_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tahniat_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Social-Media-Fairness/Tahniat-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-telugu_movie_review_sentiment_distilbert_te.md b/docs/_posts/ahmedlone127/2023-11-19-telugu_movie_review_sentiment_distilbert_te.md new file mode 100644 index 000000000000..a369c43c054d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-telugu_movie_review_sentiment_distilbert_te.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Telugu telugu_movie_review_sentiment_distilbert DistilBertForSequenceClassification from Sanath369 +author: John Snow Labs +name: telugu_movie_review_sentiment_distilbert +date: 2023-11-19 +tags: [bert, te, open_source, sequence_classification, onnx] +task: Text Classification +language: te +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `telugu_movie_review_sentiment_distilbert` is a Telugu model originally trained by Sanath369. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/telugu_movie_review_sentiment_distilbert_te_5.2.0_3.0_1700352276519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/telugu_movie_review_sentiment_distilbert_te_5.2.0_3.0_1700352276519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("telugu_movie_review_sentiment_distilbert","te")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("telugu_movie_review_sentiment_distilbert","te") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|telugu_movie_review_sentiment_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|te| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Sanath369/Telugu_movie_review_sentiment_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-test_model_lliriknat_en.md b/docs/_posts/ahmedlone127/2023-11-19-test_model_lliriknat_en.md new file mode 100644 index 000000000000..47864fb2b7c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-test_model_lliriknat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English test_model_lliriknat DistilBertForSequenceClassification from LliriKnat +author: John Snow Labs +name: test_model_lliriknat +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_model_lliriknat` is an English model originally trained by LliriKnat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_model_lliriknat_en_5.2.0_3.0_1700355935599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_model_lliriknat_en_5.2.0_3.0_1700355935599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("test_model_lliriknat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("test_model_lliriknat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_model_lliriknat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LliriKnat/test_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-text_classification_model_1_pytorch_en.md b/docs/_posts/ahmedlone127/2023-11-19-text_classification_model_1_pytorch_en.md new file mode 100644 index 000000000000..be1e23cc5b03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-text_classification_model_1_pytorch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_classification_model_1_pytorch DistilBertForSequenceClassification from Hansaht +author: John Snow Labs +name: text_classification_model_1_pytorch +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_model_1_pytorch` is an English model originally trained by Hansaht. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_model_1_pytorch_en_5.2.0_3.0_1700366668907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_model_1_pytorch_en_5.2.0_3.0_1700366668907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_classification_model_1_pytorch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_classification_model_1_pytorch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_model_1_pytorch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Hansaht/Text_classification_model_1_pytorch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-text_classification_rayschwartz_en.md b/docs/_posts/ahmedlone127/2023-11-19-text_classification_rayschwartz_en.md new file mode 100644 index 000000000000..40f71e91422b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-text_classification_rayschwartz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_classification_rayschwartz DistilBertForSequenceClassification from rayschwartz +author: John Snow Labs +name: text_classification_rayschwartz +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_rayschwartz` is an English model originally trained by rayschwartz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_rayschwartz_en_5.2.0_3.0_1700356564287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_rayschwartz_en_5.2.0_3.0_1700356564287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_classification_rayschwartz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_classification_rayschwartz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_rayschwartz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/rayschwartz/text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-text_emotion_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-text_emotion_classification_en.md new file mode 100644 index 000000000000..036c2861ea27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-text_emotion_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_emotion_classification DistilBertForSequenceClassification from PrachiPatel +author: John Snow Labs +name: text_emotion_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_emotion_classification` is an English model originally trained by PrachiPatel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_emotion_classification_en_5.2.0_3.0_1700415343101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_emotion_classification_en_5.2.0_3.0_1700415343101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_emotion_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_emotion_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_emotion_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/PrachiPatel/text_emotion_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-tfs_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-tfs_distilbert_en.md new file mode 100644 index 000000000000..5a5cb78ca358 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-tfs_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tfs_distilbert DistilBertForSequenceClassification from chanret +author: John Snow Labs +name: tfs_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tfs_distilbert` is an English model originally trained by chanret. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tfs_distilbert_en_5.2.0_3.0_1700352279762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tfs_distilbert_en_5.2.0_3.0_1700352279762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tfs_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tfs_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tfs_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/chanret/tfs_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-theoffice_speaker_classification_en.md b/docs/_posts/ahmedlone127/2023-11-19-theoffice_speaker_classification_en.md new file mode 100644 index 000000000000..37340a876251 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-theoffice_speaker_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English theoffice_speaker_classification DistilBertForSequenceClassification from mo374z +author: John Snow Labs +name: theoffice_speaker_classification +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `theoffice_speaker_classification` is an English model originally trained by mo374z. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/theoffice_speaker_classification_en_5.2.0_3.0_1700364805390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/theoffice_speaker_classification_en_5.2.0_3.0_1700364805390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("theoffice_speaker_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("theoffice_speaker_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|theoffice_speaker_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mo374z/theoffice_speaker_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-todos_task_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-todos_task_model_en.md new file mode 100644 index 000000000000..d7f2f18750fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-todos_task_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English todos_task_model DistilBertForSequenceClassification from vagrawal787 +author: John Snow Labs +name: todos_task_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `todos_task_model` is an English model originally trained by vagrawal787. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/todos_task_model_en_5.2.0_3.0_1700425460303.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/todos_task_model_en_5.2.0_3.0_1700425460303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("todos_task_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("todos_task_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|todos_task_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/vagrawal787/todos_task_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter3_en.md b/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter3_en.md new file mode 100644 index 000000000000..f92cac505c32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English trainer_chapter3 DistilBertForSequenceClassification from osanseviero +author: John Snow Labs +name: trainer_chapter3 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trainer_chapter3` is an English model originally trained by osanseviero. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trainer_chapter3_en_5.2.0_3.0_1700354000136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trainer_chapter3_en_5.2.0_3.0_1700354000136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("trainer_chapter3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("trainer_chapter3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trainer_chapter3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/osanseviero/trainer-chapter3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter4_en.md b/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter4_en.md new file mode 100644 index 000000000000..6739eb79b486 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-trainer_chapter4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English trainer_chapter4 DistilBertForSequenceClassification from osanseviero +author: John Snow Labs +name: trainer_chapter4 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trainer_chapter4` is an English model originally trained by osanseviero. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trainer_chapter4_en_5.2.0_3.0_1700360576966.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trainer_chapter4_en_5.2.0_3.0_1700360576966.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("trainer_chapter4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("trainer_chapter4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trainer_chapter4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/osanseviero/trainer-chapter4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-trajectory_classifier2_en.md b/docs/_posts/ahmedlone127/2023-11-19-trajectory_classifier2_en.md new file mode 100644 index 000000000000..109c5fecc4ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-trajectory_classifier2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English trajectory_classifier2 DistilBertForSequenceClassification from alexamiredjibi +author: John Snow Labs +name: trajectory_classifier2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `trajectory_classifier2` is an English model originally trained by alexamiredjibi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/trajectory_classifier2_en_5.2.0_3.0_1700405306986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/trajectory_classifier2_en_5.2.0_3.0_1700405306986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("trajectory_classifier2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("trajectory_classifier2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|trajectory_classifier2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/alexamiredjibi/trajectory-classifier2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-treatment_recommendation_en.md b/docs/_posts/ahmedlone127/2023-11-19-treatment_recommendation_en.md new file mode 100644 index 000000000000..5bb567fc4af0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-treatment_recommendation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English treatment_recommendation DistilBertForSequenceClassification from Straiberry +author: John Snow Labs +name: treatment_recommendation +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `treatment_recommendation` is an English model originally trained by Straiberry. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/treatment_recommendation_en_5.2.0_3.0_1700427605394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/treatment_recommendation_en_5.2.0_3.0_1700427605394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("treatment_recommendation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("treatment_recommendation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|treatment_recommendation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Straiberry/Treatment_Recommendation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-truera_huggingface_monitoring_en.md b/docs/_posts/ahmedlone127/2023-11-19-truera_huggingface_monitoring_en.md new file mode 100644 index 000000000000..990131e0fe2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-truera_huggingface_monitoring_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English truera_huggingface_monitoring DistilBertForSequenceClassification from ebotwick +author: John Snow Labs +name: truera_huggingface_monitoring +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `truera_huggingface_monitoring` is an English model originally trained by ebotwick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/truera_huggingface_monitoring_en_5.2.0_3.0_1700437277025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/truera_huggingface_monitoring_en_5.2.0_3.0_1700437277025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("truera_huggingface_monitoring","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("truera_huggingface_monitoring","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|truera_huggingface_monitoring| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ebotwick/truera_huggingface_monitoring \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-tsc_finetuning_sentiment_movie_model2_en.md b/docs/_posts/ahmedlone127/2023-11-19-tsc_finetuning_sentiment_movie_model2_en.md new file mode 100644 index 000000000000..4c807b11226e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-tsc_finetuning_sentiment_movie_model2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tsc_finetuning_sentiment_movie_model2 DistilBertForSequenceClassification from malcolm +author: John Snow Labs +name: tsc_finetuning_sentiment_movie_model2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tsc_finetuning_sentiment_movie_model2` is an English model originally trained by malcolm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tsc_finetuning_sentiment_movie_model2_en_5.2.0_3.0_1700408079102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tsc_finetuning_sentiment_movie_model2_en_5.2.0_3.0_1700408079102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tsc_finetuning_sentiment_movie_model2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tsc_finetuning_sentiment_movie_model2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tsc_finetuning_sentiment_movie_model2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/malcolm/TSC_finetuning-sentiment-movie-model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-tst_resnet50_2_en.md b/docs/_posts/ahmedlone127/2023-11-19-tst_resnet50_2_en.md new file mode 100644 index 000000000000..d9013e374a30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-tst_resnet50_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tst_resnet50_2 DistilBertForSequenceClassification from laol777 +author: John Snow Labs +name: tst_resnet50_2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tst_resnet50_2` is an English model originally trained by laol777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tst_resnet50_2_en_5.2.0_3.0_1700361702648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tst_resnet50_2_en_5.2.0_3.0_1700361702648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tst_resnet50_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tst_resnet50_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tst_resnet50_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/laol777/tst_resnet50_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-tweet_toxicity_en.md b/docs/_posts/ahmedlone127/2023-11-19-tweet_toxicity_en.md new file mode 100644 index 000000000000..4590b9fb9259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-tweet_toxicity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweet_toxicity DistilBertForSequenceClassification from sachiniyer +author: John Snow Labs +name: tweet_toxicity +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_toxicity` is an English model originally trained by sachiniyer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_toxicity_en_5.2.0_3.0_1700353829412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_toxicity_en_5.2.0_3.0_1700353829412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("tweet_toxicity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("tweet_toxicity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweet_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sachiniyer/tweet_toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_en.md b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_en.md new file mode 100644 index 000000000000..3c34c781e4b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_data_distilbert_base_uncased_sentiment_finetuned_memes DistilBertForSequenceClassification from jayantapaul888 +author: John Snow Labs +name: twitter_data_distilbert_base_uncased_sentiment_finetuned_memes +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_data_distilbert_base_uncased_sentiment_finetuned_memes` is an English model originally trained by jayantapaul888. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_en_5.2.0_3.0_1700424047355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_en_5.2.0_3.0_1700424047355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_distilbert_base_uncased_sentiment_finetuned_memes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jayantapaul888/twitter-data-distilbert-base-uncased-sentiment-finetuned-memes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test_en.md b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test_en.md new file mode 100644 index 000000000000..e3c1ae226780 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test DistilBertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test` is an English model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test_en_5.2.0_3.0_1700435830888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test_en_5.2.0_3.0_1700435830888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SiddharthaM/twitter-data-distilbert-base-uncased-sentiment-finetuned-memes-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1_en.md b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1_en.md new file mode 100644 index 000000000000..39edcc13018f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1 DistilBertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1` is an English model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1_en_5.2.0_3.0_1700419845510.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1_en_5.2.0_3.0_1700419845510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SiddharthaM/twitter-data-distilbert-base-uncased-sentiment-finetuned-memes-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2_en.md b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2_en.md new file mode 100644 index 000000000000..c413f8bf80c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2 DistilBertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2 +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2` is an English model originally trained by SiddharthaM. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2_en_5.2.0_3.0_1700421500322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2_en_5.2.0_3.0_1700421500322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_data_distilbert_base_uncased_sentiment_finetuned_memes_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SiddharthaM/twitter-data-distilbert-base-uncased-sentiment-finetuned-memes-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-universityexerciseanothertry_en.md b/docs/_posts/ahmedlone127/2023-11-19-universityexerciseanothertry_en.md new file mode 100644 index 000000000000..cc951d5af80b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-universityexerciseanothertry_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English universityexerciseanothertry DistilBertForSequenceClassification from le1andonly +author: John Snow Labs +name: universityexerciseanothertry +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `universityexerciseanothertry` is an English model originally trained by le1andonly. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/universityexerciseanothertry_en_5.2.0_3.0_1700382207678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/universityexerciseanothertry_en_5.2.0_3.0_1700382207678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("universityexerciseanothertry","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("universityexerciseanothertry","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|universityexerciseanothertry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/le1andonly/universityexerciseanothertry \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-unwanted_content_detector_en.md b/docs/_posts/ahmedlone127/2023-11-19-unwanted_content_detector_en.md new file mode 100644 index 000000000000..abe05d1e465f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-unwanted_content_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English unwanted_content_detector DistilBertForSequenceClassification from JeanMachado +author: John Snow Labs +name: unwanted_content_detector +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unwanted_content_detector` is an English model originally trained by JeanMachado. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unwanted_content_detector_en_5.2.0_3.0_1700356414579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unwanted_content_detector_en_5.2.0_3.0_1700356414579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("unwanted_content_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("unwanted_content_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unwanted_content_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JeanMachado/unwanted_content_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-vba_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-19-vba_distilbert_en.md new file mode 100644 index 000000000000..8fa88d7a9eed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-vba_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English vba_distilbert DistilBertForSequenceClassification from sadickam +author: John Snow Labs +name: vba_distilbert +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vba_distilbert` is an English model originally trained by sadickam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vba_distilbert_en_5.2.0_3.0_1700371053081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vba_distilbert_en_5.2.0_3.0_1700371053081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("vba_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("vba_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vba_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sadickam/vba-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-website_classification_model_en.md b/docs/_posts/ahmedlone127/2023-11-19-website_classification_model_en.md new file mode 100644 index 000000000000..17a3803110a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-website_classification_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English website_classification_model DistilBertForSequenceClassification from Eitanli +author: John Snow Labs +name: website_classification_model +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `website_classification_model` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/website_classification_model_en_5.2.0_3.0_1700371053096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/website_classification_model_en_5.2.0_3.0_1700371053096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("website_classification_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("website_classification_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|website_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Eitanli/website_classification_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-19-yelp_distilbert_5e_en.md b/docs/_posts/ahmedlone127/2023-11-19-yelp_distilbert_5e_en.md new file mode 100644 index 000000000000..be1302144c8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-19-yelp_distilbert_5e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English yelp_distilbert_5e DistilBertForSequenceClassification from pig4431 +author: John Snow Labs +name: yelp_distilbert_5e +date: 2023-11-19 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yelp_distilbert_5e` is an English model originally trained by pig4431. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yelp_distilbert_5e_en_5.2.0_3.0_1700408959565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yelp_distilbert_5e_en_5.2.0_3.0_1700408959565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("yelp_distilbert_5e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("yelp_distilbert_5e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yelp_distilbert_5e| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/pig4431/YELP_DistilBERT_5E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-4_way_detection_prop_16_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-4_way_detection_prop_16_distilbert_en.md new file mode 100644 index 000000000000..e4e9833e8811 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-4_way_detection_prop_16_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 4_way_detection_prop_16_distilbert DistilBertForSequenceClassification from ultra-coder54732 +author: John Snow Labs +name: 4_way_detection_prop_16_distilbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `4_way_detection_prop_16_distilbert` is an English model originally trained by ultra-coder54732. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_distilbert_en_5.2.0_3.0_1700496105654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_distilbert_en_5.2.0_3.0_1700496105654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("4_way_detection_prop_16_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("4_way_detection_prop_16_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|4_way_detection_prop_16_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ultra-coder54732/4-way-detection-prop-16-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-agnews_distilbert_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-20-agnews_distilbert_finetuned_en.md new file mode 100644 index 000000000000..bebf3775613e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-agnews_distilbert_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English agnews_distilbert_finetuned DistilBertForSequenceClassification from billster45 +author: John Snow Labs +name: agnews_distilbert_finetuned +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `agnews_distilbert_finetuned` is an English model originally trained by billster45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/agnews_distilbert_finetuned_en_5.2.0_3.0_1700443333796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/agnews_distilbert_finetuned_en_5.2.0_3.0_1700443333796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("agnews_distilbert_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("agnews_distilbert_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|agnews_distilbert_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/billster45/agnews_distilbert_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-app_en.md b/docs/_posts/ahmedlone127/2023-11-20-app_en.md new file mode 100644 index 000000000000..396d06efda57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-app_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English app DistilBertForTokenClassification from pierrerappolt-okta +author: John Snow Labs +name: app +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `app` is an English model originally trained by pierrerappolt-okta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/app_en_5.2.0_3.0_1700519714968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/app_en_5.2.0_3.0_1700519714968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
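+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier consumes a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("app","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier consumes a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("app", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +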
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|app| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pierrerappolt-okta/app \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-autotrain_js_classification_6_cat_dist_bert_uncased_54424128043_en.md b/docs/_posts/ahmedlone127/2023-11-20-autotrain_js_classification_6_cat_dist_bert_uncased_54424128043_en.md new file mode 100644 index 000000000000..7fa5894e5922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-autotrain_js_classification_6_cat_dist_bert_uncased_54424128043_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_js_classification_6_cat_dist_bert_uncased_54424128043 DistilBertForSequenceClassification from bodik +author: John Snow Labs +name: autotrain_js_classification_6_cat_dist_bert_uncased_54424128043 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_js_classification_6_cat_dist_bert_uncased_54424128043` is an English model originally trained by bodik. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_js_classification_6_cat_dist_bert_uncased_54424128043_en_5.2.0_3.0_1700438751706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_js_classification_6_cat_dist_bert_uncased_54424128043_en_5.2.0_3.0_1700438751706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("autotrain_js_classification_6_cat_dist_bert_uncased_54424128043","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("autotrain_js_classification_6_cat_dist_bert_uncased_54424128043","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_js_classification_6_cat_dist_bert_uncased_54424128043| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bodik/autotrain-js-classification-6-cat-dist-bert-uncased-54424128043 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-background_distilebert_2023_02_21_19_08_en.md b/docs/_posts/ahmedlone127/2023-11-20-background_distilebert_2023_02_21_19_08_en.md new file mode 100644 index 000000000000..118819f24729 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-background_distilebert_2023_02_21_19_08_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English background_distilebert_2023_02_21_19_08 DistilBertForSequenceClassification from leeju +author: John Snow Labs +name: background_distilebert_2023_02_21_19_08 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `background_distilebert_2023_02_21_19_08` is an English model originally trained by leeju. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/background_distilebert_2023_02_21_19_08_en_5.2.0_3.0_1700483305055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/background_distilebert_2023_02_21_19_08_en_5.2.0_3.0_1700483305055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("background_distilebert_2023_02_21_19_08","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("background_distilebert_2023_02_21_19_08","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|background_distilebert_2023_02_21_19_08| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.9 MB| + +## References + +https://huggingface.co/leeju/background-distilebert_2023-02-21_19-08 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bacteria_lamp_network_en.md b/docs/_posts/ahmedlone127/2023-11-20-bacteria_lamp_network_en.md new file mode 100644 index 000000000000..c04c4328e565 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bacteria_lamp_network_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bacteria_lamp_network DistilBertForSequenceClassification from The-Data-Hound +author: John Snow Labs +name: bacteria_lamp_network +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bacteria_lamp_network` is an English model originally trained by The-Data-Hound. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bacteria_lamp_network_en_5.2.0_3.0_1700447198006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bacteria_lamp_network_en_5.2.0_3.0_1700447198006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bacteria_lamp_network","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bacteria_lamp_network","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bacteria_lamp_network| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/The-Data-Hound/bacteria_lamp_network \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_b07_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_b07_en.md new file mode 100644 index 000000000000..cf285666a098 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_b07_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_b07 DistilBertForTokenClassification from LazzeKappa +author: John Snow Labs +name: bert_b07 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_b07` is an English model originally trained by LazzeKappa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_b07_en_5.2.0_3.0_1700521668306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_b07_en_5.2.0_3.0_1700521668306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
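+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier consumes a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_b07","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier consumes a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_b07", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +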
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_b07| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/LazzeKappa/BERT_B07 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_concept_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_concept_extraction_en.md new file mode 100644 index 000000000000..4ce1c74e6ba1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_concept_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_concept_extraction DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: bert_concept_extraction +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_concept_extraction` is an English model originally trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_en_5.2.0_3.0_1700521534777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_en_5.2.0_3.0_1700521534777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
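+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier consumes a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_concept_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier consumes a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_concept_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +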
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_concept_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/bert_concept_extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_gabella_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_gabella_en.md new file mode 100644 index 000000000000..3b13edd15598 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_gabella_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_gabella DistilBertForSequenceClassification from gabella +author: John Snow Labs +name: bert_emotion_gabella +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_gabella` is an English model originally trained by gabella. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_gabella_en_5.2.0_3.0_1700456518620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_gabella_en_5.2.0_3.0_1700456518620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_gabella","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_gabella","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_gabella| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/gabella/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_garrett_vangilder_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_garrett_vangilder_en.md new file mode 100644 index 000000000000..17a88a1ac5ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_garrett_vangilder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_garrett_vangilder DistilBertForSequenceClassification from garrett-vangilder +author: John Snow Labs +name: bert_emotion_garrett_vangilder +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_garrett_vangilder` is an English model originally trained by garrett-vangilder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_garrett_vangilder_en_5.2.0_3.0_1700458882615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_garrett_vangilder_en_5.2.0_3.0_1700458882615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark`, e.g. spark = sparknlp.start() +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_garrett_vangilder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // for .toDS; assumes an active SparkSession named `spark` + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_garrett_vangilder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_garrett_vangilder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/garrett-vangilder/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_jnieus01_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_jnieus01_en.md new file mode 100644 index 000000000000..665bf4d6ac38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_jnieus01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_jnieus01 DistilBertForSequenceClassification from jnieus01 +author: John Snow Labs +name: bert_emotion_jnieus01 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_jnieus01` is an English model originally trained by jnieus01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_jnieus01_en_5.2.0_3.0_1700460307386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_jnieus01_en_5.2.0_3.0_1700460307386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_jnieus01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_jnieus01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_jnieus01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/jnieus01/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_lss8ak_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_lss8ak_en.md new file mode 100644 index 000000000000..a73dc4c5152e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_emotion_lss8ak_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_emotion_lss8ak DistilBertForSequenceClassification from lss8ak +author: John Snow Labs +name: bert_emotion_lss8ak +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_emotion_lss8ak` is an English model originally trained by lss8ak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_emotion_lss8ak_en_5.2.0_3.0_1700473302609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_emotion_lss8ak_en_5.2.0_3.0_1700473302609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_lss8ak","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_emotion_lss8ak","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_emotion_lss8ak| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/lss8ak/bert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_engonly_sentiment_test_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_engonly_sentiment_test_en.md new file mode 100644 index 000000000000..7b49a74f22b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_engonly_sentiment_test_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bert_engonly_sentiment_test DistilBertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: bert_engonly_sentiment_test +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_engonly_sentiment_test` is an English model originally trained by SiddharthaM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_engonly_sentiment_test_en_5.2.0_3.0_1700441164995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_engonly_sentiment_test_en_5.2.0_3.0_1700441164995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_engonly_sentiment_test","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bert_engonly_sentiment_test","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_engonly_sentiment_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SiddharthaM/bert-engonly-sentiment-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bert_medical_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-bert_medical_ner_en.md new file mode 100644 index 000000000000..454f14b7756a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bert_medical_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_medical_ner DistilBertForTokenClassification from silpakanneganti +author: John Snow Labs +name: bert_medical_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_medical_ner` is an English model originally trained by silpakanneganti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_medical_ner_en_5.2.0_3.0_1700521534836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_medical_ner_en_5.2.0_3.0_1700521534836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_medical_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_medical_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_medical_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/silpakanneganti/bert-medical-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_d4data_en.md b/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_d4data_en.md new file mode 100644 index 000000000000..d813d72c2aed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_d4data_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_all_d4data DistilBertForTokenClassification from d4data +author: John Snow Labs +name: biomedical_ner_all_d4data +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_d4data` is an English model originally trained by d4data. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_d4data_en_5.2.0_3.0_1700519723754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_d4data_en_5.2.0_3.0_1700519723754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_d4data","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_all_d4data", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_all_d4data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/d4data/biomedical-ner-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_sschet_en.md b/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_sschet_en.md new file mode 100644 index 000000000000..78867e0b35ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-biomedical_ner_all_sschet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_all_sschet DistilBertForTokenClassification from sschet +author: John Snow Labs +name: biomedical_ner_all_sschet +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_sschet` is an English model originally trained by sschet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_sschet_en_5.2.0_3.0_1700519889561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_sschet_en_5.2.0_3.0_1700519889561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_sschet","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_all_sschet", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_all_sschet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/sschet/biomedical-ner-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-blaze_italian_ner_it.md b/docs/_posts/ahmedlone127/2023-11-20-blaze_italian_ner_it.md new file mode 100644 index 000000000000..7bbf20a26c29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-blaze_italian_ner_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian blaze_italian_ner DistilBertForTokenClassification from osiria +author: John Snow Labs +name: blaze_italian_ner +date: 2023-11-20 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `blaze_italian_ner` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/blaze_italian_ner_it_5.2.0_3.0_1700520485268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/blaze_italian_ner_it_5.2.0_3.0_1700520485268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("blaze_italian_ner","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("blaze_italian_ner", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|blaze_italian_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|200.7 MB| + +## References + +https://huggingface.co/osiria/blaze-it-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-bookgenrepredictiondbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-bookgenrepredictiondbert_en.md new file mode 100644 index 000000000000..35f6c2395467 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-bookgenrepredictiondbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bookgenrepredictiondbert DistilBertForSequenceClassification from leireher +author: John Snow Labs +name: bookgenrepredictiondbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bookgenrepredictiondbert` is an English model originally trained by leireher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bookgenrepredictiondbert_en_5.2.0_3.0_1700442939418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bookgenrepredictiondbert_en_5.2.0_3.0_1700442939418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("bookgenrepredictiondbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("bookgenrepredictiondbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bookgenrepredictiondbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/leireher/BookGenrePredictionDBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_ali_issa_en.md b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_ali_issa_en.md new file mode 100644 index 000000000000..be814ade960c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_ali_issa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_ali_issa DistilBertForSequenceClassification from ali-issa +author: John Snow Labs +name: burmese_awesome_model_ali_issa +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_ali_issa` is an English model originally trained by ali-issa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_ali_issa_en_5.2.0_3.0_1700468201454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_ali_issa_en_5.2.0_3.0_1700468201454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_ali_issa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_ali_issa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_ali_issa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ali-issa/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_koreadaeil_en.md b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_koreadaeil_en.md new file mode 100644 index 000000000000..cd216750278b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_model_koreadaeil_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_koreadaeil DistilBertForSequenceClassification from koreadaeil +author: John Snow Labs +name: burmese_awesome_model_koreadaeil +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_koreadaeil` is an English model originally trained by koreadaeil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_koreadaeil_en_5.2.0_3.0_1700472264611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_koreadaeil_en_5.2.0_3.0_1700472264611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_koreadaeil","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("burmese_awesome_model_koreadaeil","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_koreadaeil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/koreadaeil/my_awesome_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_wnut_model_stevhliu_en.md b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_wnut_model_stevhliu_en.md new file mode 100644 index 000000000000..0ae8531de9cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-burmese_awesome_wnut_model_stevhliu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_stevhliu DistilBertForTokenClassification from stevhliu +author: John Snow Labs +name: burmese_awesome_wnut_model_stevhliu +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_stevhliu` is an English model originally trained by stevhliu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_stevhliu_en_5.2.0_3.0_1700519530687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_stevhliu_en_5.2.0_3.0_1700519530687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_stevhliu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_stevhliu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_stevhliu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/stevhliu/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-clinical_distilbert_i2b2_2010_en.md b/docs/_posts/ahmedlone127/2023-11-20-clinical_distilbert_i2b2_2010_en.md new file mode 100644 index 000000000000..3e914786a18c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-clinical_distilbert_i2b2_2010_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinical_distilbert_i2b2_2010 DistilBertForTokenClassification from nlpie +author: John Snow Labs +name: clinical_distilbert_i2b2_2010 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinical_distilbert_i2b2_2010` is an English model originally trained by nlpie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinical_distilbert_i2b2_2010_en_5.2.0_3.0_1700519913256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinical_distilbert_i2b2_2010_en_5.2.0_3.0_1700519913256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("clinical_distilbert_i2b2_2010","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("clinical_distilbert_i2b2_2010", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinical_distilbert_i2b2_2010| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.6 MB| + +## References + +https://huggingface.co/nlpie/clinical-distilbert-i2b2-2010 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-cybersecurity_ner_v2_en.md b/docs/_posts/ahmedlone127/2023-11-20-cybersecurity_ner_v2_en.md new file mode 100644 index 000000000000..ba3d89c29125 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-cybersecurity_ner_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybersecurity_ner_v2 DistilBertForTokenClassification from sudipadhikari +author: John Snow Labs +name: cybersecurity_ner_v2 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybersecurity_ner_v2` is an English model originally trained by sudipadhikari. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_v2_en_5.2.0_3.0_1700520525724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_v2_en_5.2.0_3.0_1700520525724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("cybersecurity_ner_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("cybersecurity_ner_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybersecurity_ner_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sudipadhikari/cybersecurity_ner-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-dappradar_categories_prediction_en.md b/docs/_posts/ahmedlone127/2023-11-20-dappradar_categories_prediction_en.md new file mode 100644 index 000000000000..8419f12a3c32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-dappradar_categories_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English dappradar_categories_prediction DistilBertForSequenceClassification from Mantas +author: John Snow Labs +name: dappradar_categories_prediction +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dappradar_categories_prediction` is an English model originally trained by Mantas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dappradar_categories_prediction_en_5.2.0_3.0_1700476658979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dappradar_categories_prediction_en_5.2.0_3.0_1700476658979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("dappradar_categories_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("dappradar_categories_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dappradar_categories_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Mantas/dappradar-categories-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-demo_emotion_31415_en.md b/docs/_posts/ahmedlone127/2023-11-20-demo_emotion_31415_en.md new file mode 100644 index 000000000000..51614b2e954e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-demo_emotion_31415_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_emotion_31415 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: demo_emotion_31415 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_emotion_31415` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_emotion_31415_en_5.2.0_3.0_1700439131880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_emotion_31415_en_5.2.0_3.0_1700439131880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_emotion_31415","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_emotion_31415","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_emotion_31415| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/demo_emotion_31415 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-demo_irony_42_en.md b/docs/_posts/ahmedlone127/2023-11-20-demo_irony_42_en.md new file mode 100644 index 000000000000..ea293e2137b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-demo_irony_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_irony_42 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: demo_irony_42 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_irony_42` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_irony_42_en_5.2.0_3.0_1700495100986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_irony_42_en_5.2.0_3.0_1700495100986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_irony_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_irony_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_irony_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/demo_irony_42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-demo_sentiment_1234567_en.md b/docs/_posts/ahmedlone127/2023-11-20-demo_sentiment_1234567_en.md new file mode 100644 index 000000000000..119f64d84426 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-demo_sentiment_1234567_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English demo_sentiment_1234567 DistilBertForSequenceClassification from aXhyra +author: John Snow Labs +name: demo_sentiment_1234567 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `demo_sentiment_1234567` is an English model originally trained by aXhyra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/demo_sentiment_1234567_en_5.2.0_3.0_1700453594049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/demo_sentiment_1234567_en_5.2.0_3.0_1700453594049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_sentiment_1234567","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("demo_sentiment_1234567","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|demo_sentiment_1234567| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aXhyra/demo_sentiment_1234567 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-destilbert_fever_nli_en.md b/docs/_posts/ahmedlone127/2023-11-20-destilbert_fever_nli_en.md new file mode 100644 index 000000000000..63001e788b98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-destilbert_fever_nli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English destilbert_fever_nli DistilBertForSequenceClassification from ernlavr +author: John Snow Labs +name: destilbert_fever_nli +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `destilbert_fever_nli` is an English model originally trained by ernlavr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/destilbert_fever_nli_en_5.2.0_3.0_1700492262379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/destilbert_fever_nli_en_5.2.0_3.0_1700492262379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("destilbert_fever_nli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("destilbert_fever_nli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|destilbert_fever_nli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/ernlavr/destilbert_fever_nli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distil_bert_1412_oriya_en.md b/docs/_posts/ahmedlone127/2023-11-20-distil_bert_1412_oriya_en.md new file mode 100644 index 000000000000..4798a3b35a65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distil_bert_1412_oriya_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distil_bert_1412_oriya DistilBertForSequenceClassification from gg-ai +author: John Snow Labs +name: distil_bert_1412_oriya +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_1412_oriya` is an English model originally trained by gg-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_1412_oriya_en_5.2.0_3.0_1700474098446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_1412_oriya_en_5.2.0_3.0_1700474098446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_1412_oriya","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distil_bert_1412_oriya","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_1412_oriya| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.4 MB| + +## References + +https://huggingface.co/gg-ai/distil-bert-1412-or \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_cola_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_cola_en.md new file mode 100644 index 000000000000..a84765bd77f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_cola DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_cola +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_cola` is an English model originally trained by gokuls. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_cola_en_5.2.0_3.0_1700444929920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_cola_en_5.2.0_3.0_1700444929920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_mnli_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_mnli_96_en.md new file mode 100644 index 000000000000..efb97128fbc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_mnli_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_mnli_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_mnli_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_mnli_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_mnli_96_en_5.2.0_3.0_1700470253575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_mnli_96_en_5.2.0_3.0_1700470253575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_mnli_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_mnli_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_mnli_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_mnli_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_384_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_384_en.md new file mode 100644 index 000000000000..c7be15498a06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_384_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_qnli_384 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_qnli_384 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_qnli_384` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qnli_384_en_5.2.0_3.0_1700502859167.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qnli_384_en_5.2.0_3.0_1700502859167.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qnli_384","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qnli_384","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_qnli_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|111.8 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_qnli_384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_96_en.md new file mode 100644 index 000000000000..fffa3d555496 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qnli_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_qnli_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_qnli_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_qnli_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qnli_96_en_5.2.0_3.0_1700452577428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qnli_96_en_5.2.0_3.0_1700452577428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qnli_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qnli_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_qnli_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_qnli_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_192_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_192_en.md new file mode 100644 index 000000000000..3e3320ce4532 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_192_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_qqp_192 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_qqp_192 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_qqp_192` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qqp_192_en_5.2.0_3.0_1700477806996.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qqp_192_en_5.2.0_3.0_1700477806996.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qqp_192","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qqp_192","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_qqp_192| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|52.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_qqp_192 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_96_en.md new file mode 100644 index 000000000000..0738b08eef3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_qqp_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_qqp_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_qqp_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_qqp_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qqp_96_en_5.2.0_3.0_1700448051909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_qqp_96_en_5.2.0_3.0_1700448051909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qqp_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_qqp_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_qqp_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_qqp_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_rte_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_rte_96_en.md new file mode 100644 index 000000000000..12dc2ad5a1cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_rte_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_rte_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_rte_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_rte_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_rte_96_en_5.2.0_3.0_1700455373951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_rte_96_en_5.2.0_3.0_1700455373951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_rte_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_rte_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_rte_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_rte_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_sst2_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_sst2_96_en.md new file mode 100644 index 000000000000..30e150a5e0d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_sst2_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_sst2_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_sst2_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_sst2_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_sst2_96_en_5.2.0_3.0_1700455373939.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_sst2_96_en_5.2.0_3.0_1700455373939.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_sst2_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_sst2_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_sst2_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_sst2_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_192_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_192_en.md new file mode 100644 index 000000000000..e086c8ee673a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_192_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_stsb_192 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_stsb_192 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_stsb_192` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_192_en_5.2.0_3.0_1700446139412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_192_en_5.2.0_3.0_1700446139412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_192","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_192","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_stsb_192| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|52.6 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_stsb_192 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_256_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_256_en.md new file mode 100644 index 000000000000..4af9e37581ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_256_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_stsb_256 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_stsb_256 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_stsb_256` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_256_en_5.2.0_3.0_1700474853831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_256_en_5.2.0_3.0_1700474853831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_256","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_256","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_stsb_256| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|71.6 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_stsb_256 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_96_en.md new file mode 100644 index 000000000000..6eda1bfeaa95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_stsb_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_stsb_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_stsb_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_stsb_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_96_en_5.2.0_3.0_1700481959924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_stsb_96_en_5.2.0_3.0_1700481959924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_stsb_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_stsb_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_stsb_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_wnli_256_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_wnli_256_en.md new file mode 100644 index 000000000000..8aa3e65c5f59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_logit_kd_wnli_256_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_logit_kd_wnli_256 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_logit_kd_wnli_256 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_logit_kd_wnli_256` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_wnli_256_en_5.2.0_3.0_1700488625906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_logit_kd_wnli_256_en_5.2.0_3.0_1700488625906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_wnli_256","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_logit_kd_wnli_256","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_logit_kd_wnli_256| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|71.6 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_logit_kd_wnli_256 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mnli_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mnli_96_en.md new file mode 100644 index 000000000000..8555618f2e93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mnli_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_mnli_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_mnli_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_mnli_96` is an English model originally trained by gokuls. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mnli_96_en_5.2.0_3.0_1700456486832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mnli_96_en_5.2.0_3.0_1700456486832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mnli_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mnli_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_mnli_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_mnli_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_192_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_192_en.md new file mode 100644 index 000000000000..8961e4f1192b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_192_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_mrpc_192 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_mrpc_192 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_mrpc_192` is an English model originally trained by gokuls.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mrpc_192_en_5.2.0_3.0_1700481342912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mrpc_192_en_5.2.0_3.0_1700481342912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mrpc_192","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mrpc_192","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_mrpc_192| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|52.6 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_mrpc_192 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_96_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_96_en.md new file mode 100644 index 000000000000..b21b31c613bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_mrpc_96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_mrpc_96 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_mrpc_96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_mrpc_96` is an English model originally trained by gokuls.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mrpc_96_en_5.2.0_3.0_1700478575849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_mrpc_96_en_5.2.0_3.0_1700478575849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mrpc_96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_mrpc_96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_mrpc_96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|25.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_mrpc_96 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_192_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_192_en.md new file mode 100644 index 000000000000..4ca405b90d31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_192_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_qnli_192 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_qnli_192 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_qnli_192` is an English model originally trained by gokuls.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_qnli_192_en_5.2.0_3.0_1700459251453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_qnli_192_en_5.2.0_3.0_1700459251453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_qnli_192","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_qnli_192","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_qnli_192| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|52.7 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_qnli_192 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_384_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_384_en.md new file mode 100644 index 000000000000..87c3cfe32fe7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_add_glue_experiment_qnli_384_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_add_glue_experiment_qnli_384 DistilBertForSequenceClassification from gokuls +author: John Snow Labs +name: distilbert_add_glue_experiment_qnli_384 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_add_glue_experiment_qnli_384` is an English model originally trained by gokuls.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_qnli_384_en_5.2.0_3.0_1700496919731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_add_glue_experiment_qnli_384_en_5.2.0_3.0_1700496919731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_qnli_384","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_add_glue_experiment_qnli_384","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_add_glue_experiment_qnli_384| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|111.8 MB| + +## References + +https://huggingface.co/gokuls/distilbert_add_GLUE_Experiment_qnli_384 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_amazon_shoe_reviews_tensorboard_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_amazon_shoe_reviews_tensorboard_en.md new file mode 100644 index 000000000000..5665fbfbcb94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_amazon_shoe_reviews_tensorboard_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_amazon_shoe_reviews_tensorboard DistilBertForSequenceClassification from juliensimon +author: John Snow Labs +name: distilbert_amazon_shoe_reviews_tensorboard +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_amazon_shoe_reviews_tensorboard` is an English model originally trained by juliensimon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_tensorboard_en_5.2.0_3.0_1700463072817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_amazon_shoe_reviews_tensorboard_en_5.2.0_3.0_1700463072817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_tensorboard","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_amazon_shoe_reviews_tensorboard","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_amazon_shoe_reviews_tensorboard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/juliensimon/distilbert-amazon-shoe-reviews-tensorboard \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_banking77_pt2_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_banking77_pt2_en.md new file mode 100644 index 000000000000..81b0ab017b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_banking77_pt2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_banking77_pt2 DistilBertForSequenceClassification from happytree09 +author: John Snow Labs +name: distilbert_base_banking77_pt2 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_banking77_pt2` is an English model originally trained by happytree09. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_banking77_pt2_en_5.2.0_3.0_1700471299982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_banking77_pt2_en_5.2.0_3.0_1700471299982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_banking77_pt2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_banking77_pt2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_banking77_pt2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/happytree09/distilbert-base-banking77-pt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md new file mode 100644 index 000000000000..87b0408b2284 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700522540972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700522540972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineDF = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.0-concept-extraction-wikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_fine_tuned_food_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_fine_tuned_food_ner_en.md new file mode 100644 index 000000000000..6ba1c7c81498 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_fine_tuned_food_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_fine_tuned_food_ner DistilBertForTokenClassification from davanstrien +author: John Snow Labs +name: distilbert_base_cased_fine_tuned_food_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_fine_tuned_food_ner` is an English model originally trained by davanstrien.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_fine_tuned_food_ner_en_5.2.0_3.0_1700524228519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_fine_tuned_food_ner_en_5.2.0_3.0_1700524228519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_fine_tuned_food_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineDF = pipeline.fit(data).transform(data) +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_fine_tuned_food_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineDF = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_fine_tuned_food_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/davanstrien/distilbert-base-cased_fine_tuned_food_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_financial_csv_gevis1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_financial_csv_gevis1_en.md new file mode 100644 index 000000000000..7dd41bfbf29d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_financial_csv_gevis1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_financial_csv_gevis1 DistilBertForSequenceClassification from gevis1 +author: John Snow Labs +name: distilbert_base_cased_finetuned_financial_csv_gevis1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_financial_csv_gevis1` is an English model originally trained by gevis1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_financial_csv_gevis1_en_5.2.0_3.0_1700468647248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_financial_csv_gevis1_en_5.2.0_3.0_1700468647248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_financial_csv_gevis1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import spark.implicits._ // required for .toDS + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_finetuned_financial_csv_gevis1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_financial_csv_gevis1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/gevis1/distilbert-base-cased-finetuned-financial-csv-gevis1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_en.md new file mode 100644 index 000000000000..0c7a176440d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_t1 DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_t1 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_t1` is an English model originally trained by RS7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t1_en_5.2.0_3.0_1700523567053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t1_en_5.2.0_3.0_1700523567053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_t1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_t1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_t1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-t1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_g1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_g1_en.md new file mode 100644 index 000000000000..79271fea1a64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_finetuned_ner_t1_g1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_t1_g1 DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_t1_g1 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_t1_g1` is an English model originally trained by RS7. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t1_g1_en_5.2.0_3.0_1700519886193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t1_g1_en_5.2.0_3.0_1700519886193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_t1_g1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_t1_g1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_t1_g1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-t1-g1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_hate_speech_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_hate_speech_en.md new file mode 100644 index 000000000000..f0a34726e141 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_hate_speech_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_cased_hate_speech DistilBertForSequenceClassification from morenolq +author: John Snow Labs +name: distilbert_base_cased_hate_speech +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_hate_speech` is an English model originally trained by morenolq. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_hate_speech_en_5.2.0_3.0_1700464065999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_hate_speech_en_5.2.0_3.0_1700464065999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_hate_speech","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_cased_hate_speech","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_hate_speech| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/morenolq/distilbert-base-cased-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_wikiann_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_wikiann_en.md new file mode 100644 index 000000000000..87e57522631b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_cased_wikiann_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_wikiann DistilBertForTokenClassification from Domino-ai +author: John Snow Labs +name: distilbert_base_cased_wikiann +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_wikiann` is an English model originally trained by Domino-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_wikiann_en_5.2.0_3.0_1700523880196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_wikiann_en_5.2.0_3.0_1700523880196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_wikiann","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_wikiann", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_wikiann| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Domino-ai/distilbert-base-cased-wikiann \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multiling_finetuned_emotion_bulgarian_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multiling_finetuned_emotion_bulgarian_en.md new file mode 100644 index 000000000000..e5eb96f148c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multiling_finetuned_emotion_bulgarian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_multiling_finetuned_emotion_bulgarian DistilBertForSequenceClassification from vladkolev +author: John Snow Labs +name: distilbert_base_multiling_finetuned_emotion_bulgarian +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multiling_finetuned_emotion_bulgarian` is an English model originally trained by vladkolev. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multiling_finetuned_emotion_bulgarian_en_5.2.0_3.0_1700495210268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multiling_finetuned_emotion_bulgarian_en_5.2.0_3.0_1700495210268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multiling_finetuned_emotion_bulgarian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multiling_finetuned_emotion_bulgarian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multiling_finetuned_emotion_bulgarian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/vladkolev/distilbert-base-multiling-finetuned-emotion-bg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds_xx.md new file mode 100644 index 000000000000..b78c8d42632c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds DistilBertForSequenceClassification from francisco-perez-sorrosal +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds +date: 2023-11-20 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds` is a Multilingual model originally trained by francisco-perez-sorrosal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds_xx_5.2.0_3.0_1700457572036.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds_xx_5.2.0_3.0_1700457572036.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_with_spanish_tweets_clf_cleaned_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/francisco-perez-sorrosal/distilbert-base-multilingual-cased-finetuned-with-spanish-tweets-clf-cleaned-ds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_mapa_coarse_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_mapa_coarse_ner_xx.md new file mode 100644 index 000000000000..34a793c7d320 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_mapa_coarse_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_mapa_coarse_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_multilingual_cased_mapa_coarse_ner +date: 2023-11-20 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_mapa_coarse_ner` is a Multilingual model originally trained by dmargutierrez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_mapa_coarse_ner_xx_5.2.0_3.0_1700519552240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_mapa_coarse_ner_xx_5.2.0_3.0_1700519552240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_mapa_coarse_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_mapa_coarse_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_mapa_coarse_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-multilingual-cased-mapa_coarse-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain_xx.md new file mode 100644 index 000000000000..d3ee89875e77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain DistilBertForSequenceClassification from annahaz +author: John Snow Labs +name: distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain +date: 2023-11-20 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain` is a Multilingual model originally trained by annahaz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain_xx_5.2.0_3.0_1700446231544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain_xx_5.2.0_3.0_1700446231544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_misogyny_sexism_decay0_01_french_outofdomain| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/annahaz/distilbert-base-multilingual-cased-misogyny-sexism-decay0.01-fr-outofdomain \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain_xx.md new file mode 100644 index 000000000000..d3ba1678fc38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain DistilBertForSequenceClassification from annahaz +author: John Snow Labs +name: distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain +date: 2023-11-20 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain` is a Multilingual model originally trained by annahaz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain_xx_5.2.0_3.0_1700485348786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain_xx_5.2.0_3.0_1700485348786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_misogyny_sexism_decay0_05_french_outofdomain| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/annahaz/distilbert-base-multilingual-cased-misogyny-sexism-decay0.05-fr-outofdomain \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_re_punctuate_unikei_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_re_punctuate_unikei_en.md new file mode 100644 index 000000000000..49e0d11d3680 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_re_punctuate_unikei_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_re_punctuate_unikei DistilBertForTokenClassification from unikei +author: John Snow Labs +name: distilbert_base_re_punctuate_unikei +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_re_punctuate_unikei` is an English model originally trained by unikei. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_re_punctuate_unikei_en_5.2.0_3.0_1700518952834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_re_punctuate_unikei_en_5.2.0_3.0_1700518952834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_re_punctuate_unikei","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_re_punctuate_unikei", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_re_punctuate_unikei| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/unikei/distilbert-base-re-punctuate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_turkish_cased_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_turkish_cased_finetuned_emotion_en.md new file mode 100644 index 000000000000..2e0544d9c0ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_turkish_cased_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_turkish_cased_finetuned_emotion DistilBertForSequenceClassification from BenTata-86 +author: John Snow Labs +name: distilbert_base_turkish_cased_finetuned_emotion +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_turkish_cased_finetuned_emotion` is an English model originally trained by BenTata-86. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_finetuned_emotion_en_5.2.0_3.0_1700450638630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_turkish_cased_finetuned_emotion_en_5.2.0_3.0_1700450638630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_turkish_cased_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_turkish_cased_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|254.0 MB| + +## References + +https://huggingface.co/BenTata-86/distilbert-base-turkish-cased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_banking77_classification_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_banking77_classification_en.md new file mode 100644 index 000000000000..05ad09685119 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_banking77_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_banking77_classification DistilBertForSequenceClassification from nickprock +author: John Snow Labs +name: distilbert_base_uncased_banking77_classification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_banking77_classification` is an English model originally trained by nickprock. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_banking77_classification_en_5.2.0_3.0_1700443506614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_banking77_classification_en_5.2.0_3.0_1700443506614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_banking77_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_banking77_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_banking77_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/nickprock/distilbert-base-uncased-banking77-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_abdelkader_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_abdelkader_en.md new file mode 100644 index 000000000000..40f86e3e13d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_abdelkader_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_abdelkader DistilBertForSequenceClassification from abdelkader +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_abdelkader +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_abdelkader` is an English model originally trained by abdelkader. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_abdelkader_en_5.2.0_3.0_1700445026621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_abdelkader_en_5.2.0_3.0_1700445026621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_abdelkader","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_abdelkader","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_abdelkader| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/abdelkader/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ashrielbrian_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ashrielbrian_en.md new file mode 100644 index 000000000000..b5b53e524ad0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ashrielbrian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_ashrielbrian DistilBertForSequenceClassification from ashrielbrian +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_ashrielbrian +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_ashrielbrian` is an English model originally trained by ashrielbrian. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_ashrielbrian_en_5.2.0_3.0_1700482595650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_ashrielbrian_en_5.2.0_3.0_1700482595650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_ashrielbrian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_ashrielbrian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_ashrielbrian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/ashrielbrian/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_bahushruth_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_bahushruth_en.md new file mode 100644 index 000000000000..ac844e6858be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_bahushruth_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_bahushruth DistilBertForSequenceClassification from Bahushruth +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_bahushruth +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_bahushruth` is an English model originally trained by Bahushruth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_bahushruth_en_5.2.0_3.0_1700463176225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_bahushruth_en_5.2.0_3.0_1700463176225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_bahushruth","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_bahushruth","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_bahushruth| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Bahushruth/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_clisi2000_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_clisi2000_en.md new file mode 100644 index 000000000000..bd1d6428724a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_clisi2000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_clisi2000 DistilBertForSequenceClassification from clisi2000 +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_clisi2000 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_clisi2000` is an English model originally trained by clisi2000. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_clisi2000_en_5.2.0_3.0_1700450728235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_clisi2000_en_5.2.0_3.0_1700450728235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_clisi2000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_clisi2000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_clisi2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/clisi2000/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ctojang_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ctojang_en.md new file mode 100644 index 000000000000..58e0169910a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_ctojang_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_ctojang DistilBertForSequenceClassification from ctojang +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_ctojang +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_ctojang` is an English model originally trained by ctojang. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_ctojang_en_5.2.0_3.0_1700497024654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_ctojang_en_5.2.0_3.0_1700497024654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_ctojang","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_ctojang","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_ctojang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/ctojang/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dfsj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dfsj_en.md new file mode 100644 index 000000000000..ae53c5b855d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dfsj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_dfsj DistilBertForSequenceClassification from dfsj +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_dfsj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_dfsj` is an English model originally trained by dfsj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_dfsj_en_5.2.0_3.0_1700497969438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_dfsj_en_5.2.0_3.0_1700497969438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_dfsj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_dfsj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_dfsj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/dfsj/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dkoh12_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dkoh12_en.md new file mode 100644 index 000000000000..d589f68697a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_dkoh12_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_dkoh12 DistilBertForSequenceClassification from dkoh12 +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_dkoh12 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_dkoh12` is an English model originally trained by dkoh12. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_dkoh12_en_5.2.0_3.0_1700502673303.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_dkoh12_en_5.2.0_3.0_1700502673303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_dkoh12","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_dkoh12","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_dkoh12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/dkoh12/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_fieldms_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_fieldms_en.md new file mode 100644 index 000000000000..c2fb0b08fad6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_fieldms_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_fieldms DistilBertForSequenceClassification from fieldms +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_fieldms +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_fieldms` is an English model originally trained by fieldms. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_fieldms_en_5.2.0_3.0_1700494088314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_fieldms_en_5.2.0_3.0_1700494088314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_fieldms","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_fieldms","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_fieldms| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/fieldms/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_hhffxx_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_hhffxx_en.md new file mode 100644 index 000000000000..533f9acd62df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_hhffxx_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_hhffxx DistilBertForSequenceClassification from hhffxx +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_hhffxx +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_hhffxx` is an English model originally trained by hhffxx.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_hhffxx_en_5.2.0_3.0_1700497762612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_hhffxx_en_5.2.0_3.0_1700497762612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_hhffxx","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_hhffxx","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_hhffxx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/hhffxx/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_miyagawaorj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_miyagawaorj_en.md new file mode 100644 index 000000000000..a1c9d85721cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_miyagawaorj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_miyagawaorj DistilBertForSequenceClassification from miyagawaorj +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_miyagawaorj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_miyagawaorj` is an English model originally trained by miyagawaorj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_miyagawaorj_en_5.2.0_3.0_1700450096507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_miyagawaorj_en_5.2.0_3.0_1700450096507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_miyagawaorj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_miyagawaorj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_miyagawaorj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/miyagawaorj/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_mj03_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_mj03_en.md new file mode 100644 index 000000000000..0bdae28f3691 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_clinc_mj03_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_clinc_mj03 DistilBertForSequenceClassification from MJ03 +author: John Snow Labs +name: distilbert_base_uncased_distilled_clinc_mj03 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_clinc_mj03` is an English model originally trained by MJ03.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_mj03_en_5.2.0_3.0_1700470594277.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_clinc_mj03_en_5.2.0_3.0_1700470594277.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_mj03","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_clinc_mj03","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_clinc_mj03| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/MJ03/distilbert-base-uncased-distilled-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_dtkd_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_dtkd_clinc_en.md new file mode 100644 index 000000000000..95dda530046b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_dtkd_clinc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_dtkd_clinc DistilBertForSequenceClassification from Mor1998 +author: John Snow Labs +name: distilbert_base_uncased_distilled_dtkd_clinc +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_dtkd_clinc` is an English model originally trained by Mor1998.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_dtkd_clinc_en_5.2.0_3.0_1700451944431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_dtkd_clinc_en_5.2.0_3.0_1700451944431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_dtkd_clinc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled_dtkd_clinc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_dtkd_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Mor1998/distilbert-base-uncased-distilled-dtkd-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_en.md new file mode 100644 index 000000000000..2af1844c06e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_distilled_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled DistilBertForSequenceClassification from roscoyoon +author: John Snow Labs +name: distilbert_base_uncased_distilled +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled` is an English model originally trained by roscoyoon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_en_5.2.0_3.0_1700497969467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_en_5.2.0_3.0_1700497969467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_distilled","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/roscoyoon/distilbert-base-uncased-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_finetuned_en.md new file mode 100644 index 000000000000..338452cd3f31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_finetuned DistilBertForSequenceClassification from sssingh +author: John Snow Labs +name: distilbert_base_uncased_emotion_finetuned +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_finetuned` is an English model originally trained by sssingh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_finetuned_en_5.2.0_3.0_1700462242299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_finetuned_en_5.2.0_3.0_1700462242299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sssingh/distilbert-base-uncased-emotion-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_ft_0416_land25_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_ft_0416_land25_en.md new file mode 100644 index 000000000000..9b5b3fdc11c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_ft_0416_land25_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_ft_0416_land25 DistilBertForSequenceClassification from land25 +author: John Snow Labs +name: distilbert_base_uncased_emotion_ft_0416_land25 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_ft_0416_land25` is an English model originally trained by land25.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0416_land25_en_5.2.0_3.0_1700491268353.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_ft_0416_land25_en_5.2.0_3.0_1700491268353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0416_land25","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_ft_0416_land25","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_ft_0416_land25| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/land25/distilbert-base-uncased_emotion_ft_0416 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_nlp_with_transformers_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_nlp_with_transformers_en.md new file mode 100644 index 000000000000..bf63bcda51a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_emotion_nlp_with_transformers_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_emotion_nlp_with_transformers DistilBertForSequenceClassification from pridaj +author: John Snow Labs +name: distilbert_base_uncased_emotion_nlp_with_transformers +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_emotion_nlp_with_transformers` is an English model originally trained by pridaj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_nlp_with_transformers_en_5.2.0_3.0_1700471403897.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_emotion_nlp_with_transformers_en_5.2.0_3.0_1700471403897.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_nlp_with_transformers","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_emotion_nlp_with_transformers","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_emotion_nlp_with_transformers| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/pridaj/distilbert-base-uncased-emotion-nlp-with-transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1_en.md new file mode 100644 index 000000000000..affede4faaab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1 DistilBertForSequenceClassification from hafidikhsan +author: John Snow Labs +name: distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1` is an English model originally trained by hafidikhsan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1_en_5.2.0_3.0_1700476229238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1_en_5.2.0_3.0_1700476229238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_english_cefr_lexical_evaluation_dt_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/hafidikhsan/distilbert-base-uncased-english-cefr-lexical-evaluation-dt-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_final2_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_final2_mnli_en.md new file mode 100644 index 000000000000..d681fd26fb5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_final2_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_final2_mnli DistilBertForSequenceClassification from charlemagne +author: John Snow Labs +name: distilbert_base_uncased_final2_mnli +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_final2_mnli` is an English model originally trained by charlemagne.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_final2_mnli_en_5.2.0_3.0_1700465899341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_final2_mnli_en_5.2.0_3.0_1700465899341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_final2_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_final2_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_final2_mnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/charlemagne/distilbert-base-uncased-final2-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fine_tuned_emotions_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fine_tuned_emotions_en.md new file mode 100644 index 000000000000..2edf2e9c5d47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fine_tuned_emotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fine_tuned_emotions DistilBertForSequenceClassification from adrianhenkel +author: John Snow Labs +name: distilbert_base_uncased_fine_tuned_emotions +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fine_tuned_emotions` is an English model originally trained by adrianhenkel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_emotions_en_5.2.0_3.0_1700464917148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_emotions_en_5.2.0_3.0_1700464917148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fine_tuned_emotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fine_tuned_emotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fine_tuned_emotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/adrianhenkel/distilbert-base-uncased-fine-tuned-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetunded_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetunded_emotion_en.md new file mode 100644 index 000000000000..1f3becec98a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetunded_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetunded_emotion DistilBertForSequenceClassification from mgoudarz +author: John Snow Labs +name: distilbert_base_uncased_finetunded_emotion +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetunded_emotion` is an English model originally trained by mgoudarz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetunded_emotion_en_5.2.0_3.0_1700498928141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetunded_emotion_en_5.2.0_3.0_1700498928141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetunded_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetunded_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetunded_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mgoudarz/distilbert-base-uncased-finetunded-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_amazon_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_amazon_reviews_en.md new file mode 100644 index 000000000000..c62d095dbe4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_amazon_reviews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_amazon_reviews DistilBertForSequenceClassification from amir7d0 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_amazon_reviews +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_amazon_reviews` is an English model originally trained by amir7d0.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_reviews_en_5.2.0_3.0_1700440209511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amazon_reviews_en_5.2.0_3.0_1700440209511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_reviews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_amazon_reviews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_amazon_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/amir7d0/distilbert-base-uncased-finetuned-amazon-reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_as_sentences_sarahflan_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_as_sentences_sarahflan_en.md new file mode 100644 index 000000000000..2a976948bead --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_as_sentences_sarahflan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_as_sentences_sarahflan DistilBertForSequenceClassification from sarahflan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_as_sentences_sarahflan +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_as_sentences_sarahflan` is an English model originally trained by sarahflan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_as_sentences_sarahflan_en_5.2.0_3.0_1700496171480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_as_sentences_sarahflan_en_5.2.0_3.0_1700496171480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_as_sentences_sarahflan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_as_sentences_sarahflan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_as_sentences_sarahflan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sarahflan/distilbert-base-uncased-finetuned-AS_sentences \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_bbc_news_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_bbc_news_en.md new file mode 100644 index 000000000000..4c6ac65cd206 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_bbc_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_bbc_news DistilBertForSequenceClassification from nypnop +author: John Snow Labs +name: distilbert_base_uncased_finetuned_bbc_news +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_bbc_news` is an English model originally trained by nypnop.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_bbc_news_en_5.2.0_3.0_1700459576889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_bbc_news_en_5.2.0_3.0_1700459576889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_bbc_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_bbc_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_bbc_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nypnop/distilbert-base-uncased-finetuned-bbc-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_akira10_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_akira10_en.md new file mode 100644 index 000000000000..3191945af1f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_akira10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_akira10 DistilBertForSequenceClassification from Akira10 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_akira10 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_akira10` is an English model originally trained by Akira10.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_akira10_en_5.2.0_3.0_1700492995856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_akira10_en_5.2.0_3.0_1700492995856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_akira10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_akira10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_akira10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Akira10/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_almondpeanuts_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_almondpeanuts_en.md new file mode 100644 index 000000000000..c6c32d8b689d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_almondpeanuts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_almondpeanuts DistilBertForSequenceClassification from Almondpeanuts +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_almondpeanuts +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_almondpeanuts` is an English model originally trained by Almondpeanuts.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_almondpeanuts_en_5.2.0_3.0_1700461339069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_almondpeanuts_en_5.2.0_3.0_1700461339069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_almondpeanuts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_almondpeanuts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_almondpeanuts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Almondpeanuts/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_amartyobanerjee_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_amartyobanerjee_en.md new file mode 100644 index 000000000000..28bf3d153519 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_amartyobanerjee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_amartyobanerjee DistilBertForSequenceClassification from amartyobanerjee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_amartyobanerjee +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_amartyobanerjee` is an English model originally trained by amartyobanerjee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_amartyobanerjee_en_5.2.0_3.0_1700484054476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_amartyobanerjee_en_5.2.0_3.0_1700484054476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_amartyobanerjee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_amartyobanerjee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_amartyobanerjee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/amartyobanerjee/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_ashishbalhara_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_ashishbalhara_en.md new file mode 100644 index 000000000000..81693c0bbf56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_ashishbalhara_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_ashishbalhara DistilBertForSequenceClassification from AshishBalhara +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_ashishbalhara +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_ashishbalhara` is an English model originally trained by AshishBalhara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_ashishbalhara_en_5.2.0_3.0_1700486592198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_ashishbalhara_en_5.2.0_3.0_1700486592198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_ashishbalhara","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_ashishbalhara","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_ashishbalhara| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/AshishBalhara/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cafbr_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cafbr_en.md new file mode 100644 index 000000000000..9304b56e36c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cafbr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_cafbr DistilBertForSequenceClassification from cafbr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_cafbr +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_cafbr` is an English model originally trained by cafbr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cafbr_en_5.2.0_3.0_1700474220442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cafbr_en_5.2.0_3.0_1700474220442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cafbr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cafbr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_cafbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/cafbr/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cataluna84_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cataluna84_en.md new file mode 100644 index 000000000000..10bc69684356 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cataluna84_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_cataluna84 DistilBertForSequenceClassification from cataluna84 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_cataluna84 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_cataluna84` is an English model originally trained by cataluna84.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cataluna84_en_5.2.0_3.0_1700447956985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cataluna84_en_5.2.0_3.0_1700447956985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cataluna84","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cataluna84","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_cataluna84| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/cataluna84/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_chris_santiago_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_chris_santiago_en.md new file mode 100644 index 000000000000..eed97206d321 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_chris_santiago_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_chris_santiago DistilBertForSequenceClassification from chris-santiago +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_chris_santiago +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_chris_santiago` is an English model originally trained by chris-santiago.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_chris_santiago_en_5.2.0_3.0_1700492160230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_chris_santiago_en_5.2.0_3.0_1700492160230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_chris_santiago","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_chris_santiago","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_chris_santiago| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/chris-santiago/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cj_mills_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cj_mills_en.md new file mode 100644 index 000000000000..ea124598ecc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_cj_mills_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_cj_mills DistilBertForSequenceClassification from cj-mills +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_cj_mills +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_cj_mills` is an English model originally trained by cj-mills.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cj_mills_en_5.2.0_3.0_1700469719432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_cj_mills_en_5.2.0_3.0_1700469719432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cj_mills","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_cj_mills","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_cj_mills| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/cj-mills/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_dongyeop_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_dongyeop_en.md new file mode 100644 index 000000000000..3703662ee3da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_dongyeop_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_dongyeop DistilBertForSequenceClassification from Dongyeop +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_dongyeop +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_dongyeop` is an English model originally trained by Dongyeop.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_dongyeop_en_5.2.0_3.0_1700462225465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_dongyeop_en_5.2.0_3.0_1700462225465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_dongyeop","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_dongyeop","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_dongyeop| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Dongyeop/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_frahman_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_frahman_en.md new file mode 100644 index 000000000000..77b204754fe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_frahman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_frahman DistilBertForSequenceClassification from frahman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_frahman +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_frahman` is an English model originally trained by frahman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_frahman_en_5.2.0_3.0_1700441707530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_frahman_en_5.2.0_3.0_1700441707530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_frahman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_frahman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_frahman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/frahman/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_hli_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_hli_en.md new file mode 100644 index 000000000000..00a983862eef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_hli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_hli DistilBertForSequenceClassification from hli +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_hli +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_hli` is an English model originally trained by hli.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_hli_en_5.2.0_3.0_1700490833270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_hli_en_5.2.0_3.0_1700490833270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_hli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_hli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_hli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/hli/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_isaacp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_isaacp_en.md new file mode 100644 index 000000000000..a4b9b74c80c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_isaacp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_isaacp DistilBertForSequenceClassification from Isaacp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_isaacp +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_isaacp` is an English model originally trained by Isaacp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_isaacp_en_5.2.0_3.0_1700456303230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_isaacp_en_5.2.0_3.0_1700456303230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_isaacp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_isaacp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_isaacp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Isaacp/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jamie613_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jamie613_en.md new file mode 100644 index 000000000000..c666d3f6411b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jamie613_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_jamie613 DistilBertForSequenceClassification from jamie613 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_jamie613 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_jamie613` is an English model originally trained by jamie613.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_jamie613_en_5.2.0_3.0_1700461339071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_jamie613_en_5.2.0_3.0_1700461339071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_jamie613","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_jamie613","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_jamie613| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/jamie613/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jupitercoder_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jupitercoder_en.md new file mode 100644 index 000000000000..3c2753ff9e80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_jupitercoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_jupitercoder DistilBertForSequenceClassification from jupitercoder +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_jupitercoder +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_jupitercoder` is an English model originally trained by jupitercoder.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_jupitercoder_en_5.2.0_3.0_1700466020225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_jupitercoder_en_5.2.0_3.0_1700466020225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_jupitercoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_jupitercoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_jupitercoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/jupitercoder/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_miyagawaorj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_miyagawaorj_en.md new file mode 100644 index 000000000000..092ecee873d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_miyagawaorj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_miyagawaorj DistilBertForSequenceClassification from miyagawaorj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_miyagawaorj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_miyagawaorj` is an English model originally trained by miyagawaorj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_miyagawaorj_en_5.2.0_3.0_1700448965977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_miyagawaorj_en_5.2.0_3.0_1700448965977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_miyagawaorj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_miyagawaorj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_miyagawaorj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/miyagawaorj/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_mj03_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_mj03_en.md new file mode 100644 index 000000000000..e7767fa014bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_mj03_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_mj03 DistilBertForSequenceClassification from MJ03 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_mj03 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_mj03` is an English model originally trained by MJ03.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_mj03_en_5.2.0_3.0_1700483004340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_mj03_en_5.2.0_3.0_1700483004340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_mj03","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_mj03","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_mj03| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/MJ03/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_reaverlee_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_reaverlee_en.md new file mode 100644 index 000000000000..d270a8587e8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_reaverlee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_reaverlee DistilBertForSequenceClassification from reaverlee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_reaverlee +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_reaverlee` is an English model originally trained by reaverlee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_reaverlee_en_5.2.0_3.0_1700444072454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_reaverlee_en_5.2.0_3.0_1700444072454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_reaverlee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_reaverlee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_reaverlee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/reaverlee/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_robkayinto_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_robkayinto_en.md new file mode 100644 index 000000000000..57e1da468d0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_robkayinto_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_robkayinto DistilBertForSequenceClassification from robkayinto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_robkayinto +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_robkayinto` is an English model originally trained by robkayinto.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_robkayinto_en_5.2.0_3.0_1700481640574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_robkayinto_en_5.2.0_3.0_1700481640574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_robkayinto","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_robkayinto","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_robkayinto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/robkayinto/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_rootacess_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_rootacess_en.md new file mode 100644 index 000000000000..86a5dd4079a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_rootacess_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_rootacess DistilBertForSequenceClassification from rootacess +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_rootacess +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_rootacess` is an English model originally trained by rootacess.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_rootacess_en_5.2.0_3.0_1700489103429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_rootacess_en_5.2.0_3.0_1700489103429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_rootacess","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_rootacess","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_rootacess| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/rootacess/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_songys_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_songys_en.md new file mode 100644 index 000000000000..35d92a20455a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_songys_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_songys DistilBertForSequenceClassification from songys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_songys +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_songys` is an English model originally trained by songys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_songys_en_5.2.0_3.0_1700471389702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_songys_en_5.2.0_3.0_1700471389702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_songys","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_songys","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_songys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/songys/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_sunoh_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_sunoh_en.md new file mode 100644 index 000000000000..f108dadce554 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_sunoh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_sunoh DistilBertForSequenceClassification from Sunoh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_sunoh +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_sunoh` is an English model originally trained by Sunoh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_sunoh_en_5.2.0_3.0_1700447084176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_sunoh_en_5.2.0_3.0_1700447084176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_sunoh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_sunoh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_sunoh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/Sunoh/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_yemoncad_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_yemoncad_en.md new file mode 100644 index 000000000000..f15025777ee0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_clinc_yemoncad_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_clinc_yemoncad DistilBertForSequenceClassification from yemoncad +author: John Snow Labs +name: distilbert_base_uncased_finetuned_clinc_yemoncad +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_clinc_yemoncad` is an English model originally trained by yemoncad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_yemoncad_en_5.2.0_3.0_1700475291246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_clinc_yemoncad_en_5.2.0_3.0_1700475291246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_yemoncad","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_clinc_yemoncad","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_clinc_yemoncad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/yemoncad/distilbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_anirudh21_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_anirudh21_en.md new file mode 100644 index 000000000000..e08e41bef3fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_anirudh21_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_anirudh21 DistilBertForSequenceClassification from anirudh21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_anirudh21 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_anirudh21` is an English model originally trained by anirudh21.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_anirudh21_en_5.2.0_3.0_1700442804540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_anirudh21_en_5.2.0_3.0_1700442804540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_anirudh21","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_anirudh21","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_anirudh21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/anirudh21/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arahmi6_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arahmi6_en.md new file mode 100644 index 000000000000..3e4af9024ceb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arahmi6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_arahmi6 DistilBertForSequenceClassification from arahmi6 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_arahmi6 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_arahmi6` is an English model originally trained by arahmi6.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_arahmi6_en_5.2.0_3.0_1700466030289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_arahmi6_en_5.2.0_3.0_1700466030289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_arahmi6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_arahmi6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_arahmi6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/arahmi6/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arbazk_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arbazk_en.md new file mode 100644 index 000000000000..31e4530ef078 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_arbazk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_arbazk DistilBertForSequenceClassification from arbazk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_arbazk +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_arbazk` is an English model originally trained by arbazk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_arbazk_en_5.2.0_3.0_1700475219225.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_arbazk_en_5.2.0_3.0_1700475219225.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_arbazk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_arbazk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_arbazk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/arbazk/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blacktree_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blacktree_en.md new file mode 100644 index 000000000000..f1c66f4f1527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blacktree_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_blacktree DistilBertForSequenceClassification from blacktree +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_blacktree +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_blacktree` is an English model originally trained by blacktree.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_blacktree_en_5.2.0_3.0_1700498725927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_blacktree_en_5.2.0_3.0_1700498725927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_blacktree","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_blacktree","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_blacktree| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/blacktree/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blizrys_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blizrys_en.md new file mode 100644 index 000000000000..41f32b85c706 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_blizrys_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_blizrys DistilBertForSequenceClassification from blizrys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_blizrys +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_blizrys` is an English model originally trained by blizrys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_blizrys_en_5.2.0_3.0_1700443333791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_blizrys_en_5.2.0_3.0_1700443333791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_blizrys","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_blizrys","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_blizrys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/blizrys/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_bmp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_bmp_en.md new file mode 100644 index 000000000000..3907a34bf4cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_bmp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_bmp DistilBertForSequenceClassification from BMP +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_bmp +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_bmp` is an English model originally trained by BMP.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bmp_en_5.2.0_3.0_1700482798424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_bmp_en_5.2.0_3.0_1700482798424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bmp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_bmp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_bmp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/BMP/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_charliemarx_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_charliemarx_en.md new file mode 100644 index 000000000000..fd7f3d818fc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_charliemarx_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_charliemarx DistilBertForSequenceClassification from charliemarx +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_charliemarx +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_charliemarx` is an English model originally trained by charliemarx.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_charliemarx_en_5.2.0_3.0_1700446117097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_charliemarx_en_5.2.0_3.0_1700446117097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_charliemarx","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_charliemarx","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_charliemarx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/charliemarx/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_cnu_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_cnu_en.md new file mode 100644 index 000000000000..97d1147feffb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_cnu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_cnu DistilBertForSequenceClassification from cnu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_cnu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_cnu` is an English model originally trained by cnu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_cnu_en_5.2.0_3.0_1700440000182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_cnu_en_5.2.0_3.0_1700440000182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_cnu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_cnu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_cnu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cnu/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_fznmhmmd_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_fznmhmmd_en.md new file mode 100644 index 000000000000..b3c22ec58c8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_fznmhmmd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_fznmhmmd DistilBertForSequenceClassification from fznmhmmd +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_fznmhmmd +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_fznmhmmd` is an English model originally trained by fznmhmmd.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fznmhmmd_en_5.2.0_3.0_1700454393138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_fznmhmmd_en_5.2.0_3.0_1700454393138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fznmhmmd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_fznmhmmd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_fznmhmmd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/fznmhmmd/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_gracevonoiste_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_gracevonoiste_en.md new file mode 100644 index 000000000000..4ca57e42aeb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_gracevonoiste_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_gracevonoiste DistilBertForSequenceClassification from Gracevonoiste +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_gracevonoiste +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_gracevonoiste` is an English model originally trained by Gracevonoiste.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_gracevonoiste_en_5.2.0_3.0_1700477712861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_gracevonoiste_en_5.2.0_3.0_1700477712861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_gracevonoiste","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_gracevonoiste","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_gracevonoiste| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Gracevonoiste/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_jimmyliao_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_jimmyliao_en.md new file mode 100644 index 000000000000..22e6253fd217 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_jimmyliao_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_jimmyliao DistilBertForSequenceClassification from jimmyliao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_jimmyliao +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_jimmyliao` is an English model originally trained by jimmyliao.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jimmyliao_en_5.2.0_3.0_1700474146882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_jimmyliao_en_5.2.0_3.0_1700474146882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jimmyliao","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_jimmyliao","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_jimmyliao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/jimmyliao/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_keruizhao_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_keruizhao_en.md new file mode 100644 index 000000000000..b9098a299b77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_keruizhao_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_keruizhao DistilBertForSequenceClassification from KeruiZhao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_keruizhao +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_keruizhao` is an English model originally trained by KeruiZhao. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_keruizhao_en_5.2.0_3.0_1700467054826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_keruizhao_en_5.2.0_3.0_1700467054826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_keruizhao","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_keruizhao","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_keruizhao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/KeruiZhao/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kgsteven_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kgsteven_en.md new file mode 100644 index 000000000000..0c35fc80f600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kgsteven_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_kgsteven DistilBertForSequenceClassification from KGsteven +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_kgsteven +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_kgsteven` is an English model originally trained by KGsteven. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kgsteven_en_5.2.0_3.0_1700483418247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kgsteven_en_5.2.0_3.0_1700483418247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kgsteven","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kgsteven","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_kgsteven| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.7 MB| + +## References + +https://huggingface.co/KGsteven/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kris666_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kris666_en.md new file mode 100644 index 000000000000..f3a19268ace3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_kris666_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_kris666 DistilBertForSequenceClassification from kris666 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_kris666 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_kris666` is an English model originally trained by kris666. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kris666_en_5.2.0_3.0_1700454722188.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_kris666_en_5.2.0_3.0_1700454722188.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kris666","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_kris666","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_kris666| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kris666/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_mohammedbriman_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_mohammedbriman_en.md new file mode 100644 index 000000000000..6b498e83f4ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_mohammedbriman_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_mohammedbriman DistilBertForSequenceClassification from mohammedbriman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_mohammedbriman +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_mohammedbriman` is an English model originally trained by mohammedbriman. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_mohammedbriman_en_5.2.0_3.0_1700473786223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_mohammedbriman_en_5.2.0_3.0_1700473786223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_mohammedbriman","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_mohammedbriman","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_mohammedbriman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mohammedbriman/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_omaralsaabi_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_omaralsaabi_en.md new file mode 100644 index 000000000000..9f25cdda4f81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_omaralsaabi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_omaralsaabi DistilBertForSequenceClassification from OmarAlsaabi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_omaralsaabi +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_omaralsaabi` is an English model originally trained by OmarAlsaabi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_omaralsaabi_en_5.2.0_3.0_1700464839630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_omaralsaabi_en_5.2.0_3.0_1700464839630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_omaralsaabi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_omaralsaabi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_omaralsaabi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/OmarAlsaabi/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pooyaphoenix_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pooyaphoenix_en.md new file mode 100644 index 000000000000..4f441d4b27d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pooyaphoenix_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_pooyaphoenix DistilBertForSequenceClassification from pooyaphoenix +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_pooyaphoenix +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_pooyaphoenix` is an English model originally trained by pooyaphoenix. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_pooyaphoenix_en_5.2.0_3.0_1700457422507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_pooyaphoenix_en_5.2.0_3.0_1700457422507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_pooyaphoenix","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_pooyaphoenix","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_pooyaphoenix| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/pooyaphoenix/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pranav1015_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pranav1015_en.md new file mode 100644 index 000000000000..82ba953f7cd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_pranav1015_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_pranav1015 DistilBertForSequenceClassification from pranav1015 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_pranav1015 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_pranav1015` is an English model originally trained by pranav1015. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_pranav1015_en_5.2.0_3.0_1700464142985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_pranav1015_en_5.2.0_3.0_1700464142985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_pranav1015","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_pranav1015","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_pranav1015| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/pranav1015/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_r10521708_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_r10521708_en.md new file mode 100644 index 000000000000..277b1298dfca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_r10521708_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_r10521708 DistilBertForSequenceClassification from r10521708 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_r10521708 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_r10521708` is an English model originally trained by r10521708. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_r10521708_en_5.2.0_3.0_1700472264579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_r10521708_en_5.2.0_3.0_1700472264579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_r10521708","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_r10521708","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_r10521708| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/r10521708/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_robby1421_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_robby1421_en.md new file mode 100644 index 000000000000..805491d2befa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_robby1421_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_robby1421 DistilBertForSequenceClassification from robby1421 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_robby1421 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_robby1421` is an English model originally trained by robby1421. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_robby1421_en_5.2.0_3.0_1700445112675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_robby1421_en_5.2.0_3.0_1700445112675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_robby1421","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_robby1421","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_robby1421| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/robby1421/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_rwang5688_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_rwang5688_en.md new file mode 100644 index 000000000000..b7075d62825b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_rwang5688_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_rwang5688 DistilBertForSequenceClassification from rwang5688 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_rwang5688 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_rwang5688` is an English model originally trained by rwang5688. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_rwang5688_en_5.2.0_3.0_1700447971646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_rwang5688_en_5.2.0_3.0_1700447971646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_rwang5688","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_rwang5688","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_rwang5688| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/rwang5688/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_shebna_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_shebna_en.md new file mode 100644 index 000000000000..9dd834142c08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_shebna_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_shebna DistilBertForSequenceClassification from Shebna +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_shebna +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_shebna` is an English model originally trained by Shebna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_shebna_en_5.2.0_3.0_1700496092839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_shebna_en_5.2.0_3.0_1700496092839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_shebna","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_shebna","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_shebna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Shebna/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_swang2000_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_swang2000_en.md new file mode 100644 index 000000000000..95273faf090f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_swang2000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_swang2000 DistilBertForSequenceClassification from swang2000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_swang2000 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_swang2000` is an English model originally trained by swang2000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_swang2000_en_5.2.0_3.0_1700449192705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_swang2000_en_5.2.0_3.0_1700449192705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_swang2000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_swang2000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_swang2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/swang2000/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_v3_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_v3_en.md new file mode 100644 index 000000000000..a4948bfdd077 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_v3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_v3 DistilBertForSequenceClassification from MGanesh29 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_v3 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_v3` is an English model originally trained by MGanesh29.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_v3_en_5.2.0_3.0_1700469185311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_v3_en_5.2.0_3.0_1700469185311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_v3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_v3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MGanesh29/distilbert-base-uncased-finetuned-cola-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_zhanglu_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_zhanglu_en.md new file mode 100644 index 000000000000..7d97bf36da91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_cola_zhanglu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cola_zhanglu DistilBertForSequenceClassification from zhanglu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cola_zhanglu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cola_zhanglu` is an English model originally trained by zhanglu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_zhanglu_en_5.2.0_3.0_1700492994585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cola_zhanglu_en_5.2.0_3.0_1700492994585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_zhanglu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_cola_zhanglu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cola_zhanglu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/zhanglu/distilbert-base-uncased-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_deepakrish_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_deepakrish_en.md new file mode 100644 index 000000000000..1844ef479da4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_deepakrish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_deepakrish DistilBertForSequenceClassification from DeepaKrish +author: John Snow Labs +name: distilbert_base_uncased_finetuned_deepakrish +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_deepakrish` is an English model originally trained by DeepaKrish.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_deepakrish_en_5.2.0_3.0_1700452577420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_deepakrish_en_5.2.0_3.0_1700452577420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_deepakrish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_deepakrish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_deepakrish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DeepaKrish/distilbert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion2_nickapch_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion2_nickapch_en.md new file mode 100644 index 000000000000..46a2474c105c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion2_nickapch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion2_nickapch DistilBertForSequenceClassification from nickapch +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion2_nickapch +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion2_nickapch` is an English model originally trained by nickapch.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion2_nickapch_en_5.2.0_3.0_1700450300408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion2_nickapch_en_5.2.0_3.0_1700450300408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion2_nickapch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion2_nickapch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion2_nickapch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nickapch/distilbert-base-uncased-finetuned-emotion2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_2_ronnybehrens_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_2_ronnybehrens_en.md new file mode 100644 index 000000000000..702324be5152 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_2_ronnybehrens_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_2_ronnybehrens DistilBertForSequenceClassification from ronnybehrens +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_2_ronnybehrens +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_2_ronnybehrens` is an English model originally trained by ronnybehrens.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_2_ronnybehrens_en_5.2.0_3.0_1700478683665.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_2_ronnybehrens_en_5.2.0_3.0_1700478683665.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_2_ronnybehrens","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_2_ronnybehrens","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_2_ronnybehrens| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ronnybehrens/distilbert-base-uncased-finetuned-emotion-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_abbas5253_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_abbas5253_en.md new file mode 100644 index 000000000000..d8529b13def9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_abbas5253_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_abbas5253 DistilBertForSequenceClassification from abbas5253 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_abbas5253 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_abbas5253` is an English model originally trained by abbas5253.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_abbas5253_en_5.2.0_3.0_1700486314965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_abbas5253_en_5.2.0_3.0_1700486314965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_abbas5253","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_abbas5253","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_abbas5253| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abbas5253/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_adsjklfsd_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_adsjklfsd_en.md new file mode 100644 index 000000000000..2b7dbb39536e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_adsjklfsd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_adsjklfsd DistilBertForSequenceClassification from adsjklfsd +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_adsjklfsd +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_adsjklfsd` is an English model originally trained by adsjklfsd.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_adsjklfsd_en_5.2.0_3.0_1700490337256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_adsjklfsd_en_5.2.0_3.0_1700490337256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_adsjklfsd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_adsjklfsd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_adsjklfsd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/adsjklfsd/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_akhild1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_akhild1_en.md new file mode 100644 index 000000000000..db5a841cd57f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_akhild1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_akhild1 DistilBertForSequenceClassification from AkhilD1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_akhild1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_akhild1` is an English model originally trained by AkhilD1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_akhild1_en_5.2.0_3.0_1700452883686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_akhild1_en_5.2.0_3.0_1700452883686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_akhild1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_akhild1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_akhild1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AkhilD1/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_al3ksandra_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_al3ksandra_en.md new file mode 100644 index 000000000000..f466a118d441 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_al3ksandra_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_al3ksandra DistilBertForSequenceClassification from Al3ksandra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_al3ksandra +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_al3ksandra` is an English model originally trained by Al3ksandra.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_al3ksandra_en_5.2.0_3.0_1700491711410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_al3ksandra_en_5.2.0_3.0_1700491711410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_al3ksandra","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_al3ksandra","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_al3ksandra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Al3ksandra/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_almondpeanuts_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_almondpeanuts_en.md new file mode 100644 index 000000000000..546f14b0d8c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_almondpeanuts_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_almondpeanuts DistilBertForSequenceClassification from Almondpeanuts +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_almondpeanuts +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_almondpeanuts` is an English model originally trained by Almondpeanuts.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_almondpeanuts_en_5.2.0_3.0_1700500993583.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_almondpeanuts_en_5.2.0_3.0_1700500993583.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_almondpeanuts","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_almondpeanuts","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_almondpeanuts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Almondpeanuts/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_amir36_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_amir36_en.md new file mode 100644 index 000000000000..67c01c9252dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_amir36_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_amir36 DistilBertForSequenceClassification from amir36 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_amir36 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_amir36` is an English model originally trained by amir36.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_amir36_en_5.2.0_3.0_1700484186246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_amir36_en_5.2.0_3.0_1700484186246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_amir36","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_amir36","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_amir36| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/amir36/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_andyrasika_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_andyrasika_en.md new file mode 100644 index 000000000000..ac162a3c4ba5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_andyrasika_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_andyrasika DistilBertForSequenceClassification from Andyrasika +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_andyrasika +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_andyrasika` is an English model originally trained by Andyrasika.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_andyrasika_en_5.2.0_3.0_1700448354467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_andyrasika_en_5.2.0_3.0_1700448354467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_andyrasika","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_andyrasika","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_andyrasika| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Andyrasika/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ankit93_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ankit93_en.md new file mode 100644 index 000000000000..74658de1f57f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ankit93_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ankit93 DistilBertForSequenceClassification from Ankit93 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ankit93 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ankit93` is an English model originally trained by Ankit93.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ankit93_en_5.2.0_3.0_1700467525422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ankit93_en_5.2.0_3.0_1700467525422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ankit93","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ankit93","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ankit93| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Ankit93/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arnaudlauer_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arnaudlauer_en.md new file mode 100644 index 000000000000..ba5f227f6884 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arnaudlauer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_arnaudlauer DistilBertForSequenceClassification from arnaudlauer +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_arnaudlauer +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_arnaudlauer` is an English model originally trained by arnaudlauer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_arnaudlauer_en_5.2.0_3.0_1700481960265.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_arnaudlauer_en_5.2.0_3.0_1700481960265.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_arnaudlauer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_arnaudlauer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_arnaudlauer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/arnaudlauer/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arned_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arned_en.md new file mode 100644 index 000000000000..72ba09d72229 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_arned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_arned DistilBertForSequenceClassification from ArneD +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_arned +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_arned` is an English model originally trained by ArneD.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_arned_en_5.2.0_3.0_1700472264528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_arned_en_5.2.0_3.0_1700472264528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_arned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_arned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_arned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ArneD/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ashrielbrian_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ashrielbrian_en.md new file mode 100644 index 000000000000..86b365008b78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ashrielbrian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ashrielbrian DistilBertForSequenceClassification from ashrielbrian +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ashrielbrian +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ashrielbrian` is an English model originally trained by ashrielbrian.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ashrielbrian_en_5.2.0_3.0_1700497969490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ashrielbrian_en_5.2.0_3.0_1700497969490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ashrielbrian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ashrielbrian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ashrielbrian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ashrielbrian/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_bobospark_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_bobospark_en.md new file mode 100644 index 000000000000..ddeb1ee7e818 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_bobospark_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_bobospark DistilBertForSequenceClassification from Bobospark +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_bobospark +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_bobospark` is an English model originally trained by Bobospark.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bobospark_en_5.2.0_3.0_1700493964980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_bobospark_en_5.2.0_3.0_1700493964980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bobospark","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_bobospark","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_bobospark| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Bobospark/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_butchland_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_butchland_en.md new file mode 100644 index 000000000000..0aecb7f6fc4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_butchland_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_butchland DistilBertForSequenceClassification from butchland +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_butchland +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_butchland` is an English model originally trained by butchland. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_butchland_en_5.2.0_3.0_1700499854811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_butchland_en_5.2.0_3.0_1700499854811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_butchland","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_butchland","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_butchland| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/butchland/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_carmeco_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_carmeco_en.md new file mode 100644 index 000000000000..2c90816a9e17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_carmeco_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_carmeco DistilBertForSequenceClassification from carmeco +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_carmeco +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_carmeco` is an English model originally trained by carmeco. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carmeco_en_5.2.0_3.0_1700502670818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_carmeco_en_5.2.0_3.0_1700502670818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carmeco","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_carmeco","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_carmeco| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/carmeco/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cdinh2022_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cdinh2022_en.md new file mode 100644 index 000000000000..c4c62da5972f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cdinh2022_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cdinh2022 DistilBertForSequenceClassification from cdinh2022 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cdinh2022 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cdinh2022` is an English model originally trained by cdinh2022. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cdinh2022_en_5.2.0_3.0_1700486213446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cdinh2022_en_5.2.0_3.0_1700486213446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cdinh2022","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cdinh2022","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cdinh2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cdinh2022/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_chaewonlee_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_chaewonlee_en.md new file mode 100644 index 000000000000..c53a5b4c597f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_chaewonlee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_chaewonlee DistilBertForSequenceClassification from chaewonlee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_chaewonlee +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_chaewonlee` is an English model originally trained by chaewonlee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_chaewonlee_en_5.2.0_3.0_1700480526393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_chaewonlee_en_5.2.0_3.0_1700480526393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_chaewonlee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_chaewonlee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_chaewonlee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/chaewonlee/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_charlieoneill_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_charlieoneill_en.md new file mode 100644 index 000000000000..c0904d8b71e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_charlieoneill_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_charlieoneill DistilBertForSequenceClassification from charlieoneill +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_charlieoneill +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_charlieoneill` is an English model originally trained by charlieoneill. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_charlieoneill_en_5.2.0_3.0_1700498928031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_charlieoneill_en_5.2.0_3.0_1700498928031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_charlieoneill","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_charlieoneill","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_charlieoneill| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/charlieoneill/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cj_mills_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cj_mills_en.md new file mode 100644 index 000000000000..12b644d51012 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cj_mills_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cj_mills DistilBertForSequenceClassification from cj-mills +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cj_mills +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cj_mills` is an English model originally trained by cj-mills. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cj_mills_en_5.2.0_3.0_1700487586667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cj_mills_en_5.2.0_3.0_1700487586667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cj_mills","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cj_mills","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cj_mills| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cj-mills/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjbarrie_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjbarrie_en.md new file mode 100644 index 000000000000..dd0c19ab5c73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjbarrie_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cjbarrie DistilBertForSequenceClassification from cjbarrie +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cjbarrie +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cjbarrie` is an English model originally trained by cjbarrie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cjbarrie_en_5.2.0_3.0_1700500295248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cjbarrie_en_5.2.0_3.0_1700500295248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cjbarrie","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cjbarrie","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cjbarrie| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cjbarrie/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjdentra_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjdentra_en.md new file mode 100644 index 000000000000..c3e9f28ad3d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cjdentra_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cjdentra DistilBertForSequenceClassification from cjdentra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cjdentra +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cjdentra` is an English model originally trained by cjdentra. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cjdentra_en_5.2.0_3.0_1700490165789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cjdentra_en_5.2.0_3.0_1700490165789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cjdentra","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cjdentra","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cjdentra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cjdentra/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cmdshiftenter_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cmdshiftenter_en.md new file mode 100644 index 000000000000..cf2a194087ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cmdshiftenter_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cmdshiftenter DistilBertForSequenceClassification from cmdshiftenter +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cmdshiftenter +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cmdshiftenter` is an English model originally trained by cmdshiftenter. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cmdshiftenter_en_5.2.0_3.0_1700486658016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cmdshiftenter_en_5.2.0_3.0_1700486658016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cmdshiftenter","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cmdshiftenter","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cmdshiftenter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cmdshiftenter/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_codefactory4791_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_codefactory4791_en.md new file mode 100644 index 000000000000..ce623512c538 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_codefactory4791_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_codefactory4791 DistilBertForSequenceClassification from codefactory4791 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_codefactory4791 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_codefactory4791` is an English model originally trained by codefactory4791. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_codefactory4791_en_5.2.0_3.0_1700488848231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_codefactory4791_en_5.2.0_3.0_1700488848231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_codefactory4791","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_codefactory4791","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_codefactory4791| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/codefactory4791/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cole_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cole_en.md new file mode 100644 index 000000000000..7da6f48b4618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_cole_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_cole DistilBertForSequenceClassification from Cole +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_cole +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_cole` is an English model originally trained by Cole. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cole_en_5.2.0_3.0_1700495089686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_cole_en_5.2.0_3.0_1700495089686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cole","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_cole","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_cole| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Cole/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_corgi777_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_corgi777_en.md new file mode 100644 index 000000000000..090e454012e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_corgi777_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_corgi777 DistilBertForSequenceClassification from corgi777 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_corgi777 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_corgi777` is an English model originally trained by corgi777. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_corgi777_en_5.2.0_3.0_1700460471006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_corgi777_en_5.2.0_3.0_1700460471006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_corgi777","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_corgi777","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_corgi777| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/corgi777/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dhehun_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dhehun_en.md new file mode 100644 index 000000000000..e0c2238bc27e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dhehun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_dhehun DistilBertForSequenceClassification from DheHun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_dhehun +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_dhehun` is an English model originally trained by DheHun. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_dhehun_en_5.2.0_3.0_1700480312018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_dhehun_en_5.2.0_3.0_1700480312018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_dhehun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_dhehun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_dhehun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DheHun/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dongyeop_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dongyeop_en.md new file mode 100644 index 000000000000..b4191ba51ba6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_dongyeop_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_dongyeop DistilBertForSequenceClassification from Dongyeop +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_dongyeop +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_dongyeop` is an English model originally trained by Dongyeop. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_dongyeop_en_5.2.0_3.0_1700459390726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_dongyeop_en_5.2.0_3.0_1700459390726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_dongyeop","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_dongyeop","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_dongyeop| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Dongyeop/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_duytuan_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_duytuan_en.md new file mode 100644 index 000000000000..e921966e47a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_duytuan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_duytuan DistilBertForSequenceClassification from DuyTuan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_duytuan +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_duytuan` is an English model originally trained by DuyTuan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_duytuan_en_5.2.0_3.0_1700479330683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_duytuan_en_5.2.0_3.0_1700479330683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_duytuan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_duytuan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_duytuan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DuyTuan/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ehanj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ehanj_en.md new file mode 100644 index 000000000000..863986b33fa3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ehanj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ehanj DistilBertForSequenceClassification from ehanJ +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ehanj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ehanj` is an English model originally trained by ehanJ. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ehanj_en_5.2.0_3.0_1700478193454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ehanj_en_5.2.0_3.0_1700478193454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ehanj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ehanj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ehanj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ehanJ/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_goldenk_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_goldenk_en.md new file mode 100644 index 000000000000..3ebdaf8fe6f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_goldenk_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_goldenk DistilBertForSequenceClassification from goldenk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_goldenk +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_goldenk` is an English model originally trained by goldenk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_goldenk_en_5.2.0_3.0_1700467054833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_goldenk_en_5.2.0_3.0_1700467054833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_goldenk","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_goldenk","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_goldenk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/goldenk/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_impesalobo431_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_impesalobo431_en.md new file mode 100644 index 000000000000..f4ba88172d43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_impesalobo431_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_impesalobo431 DistilBertForSequenceClassification from impesalobo431 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_impesalobo431 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_impesalobo431` is an English model originally trained by impesalobo431. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_impesalobo431_en_5.2.0_3.0_1700499991204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_impesalobo431_en_5.2.0_3.0_1700499991204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_impesalobo431","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_impesalobo431","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_impesalobo431| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/impesalobo431/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jamesg_337_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jamesg_337_en.md new file mode 100644 index 000000000000..567719367ef4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jamesg_337_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jamesg_337 DistilBertForSequenceClassification from JamesG-337 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jamesg_337 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jamesg_337` is an English model originally trained by JamesG-337. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jamesg_337_en_5.2.0_3.0_1700454606940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jamesg_337_en_5.2.0_3.0_1700454606940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jamesg_337","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jamesg_337","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jamesg_337| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JamesG-337/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jb723_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jb723_en.md new file mode 100644 index 000000000000..01670dfb7cee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jb723_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jb723 DistilBertForSequenceClassification from jb723 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jb723 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jb723` is an English model originally trained by jb723.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jb723_en_5.2.0_3.0_1700502675858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jb723_en_5.2.0_3.0_1700502675858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jb723","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jb723","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jb723| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jb723/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jhn9803_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jhn9803_en.md new file mode 100644 index 000000000000..d51d5aa4f361 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jhn9803_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jhn9803 DistilBertForSequenceClassification from jhn9803 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jhn9803 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jhn9803` is an English model originally trained by jhn9803.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jhn9803_en_5.2.0_3.0_1700487096432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jhn9803_en_5.2.0_3.0_1700487096432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jhn9803","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jhn9803","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jhn9803| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jhn9803/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jo_kwsm_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jo_kwsm_en.md new file mode 100644 index 000000000000..d043983fb0e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jo_kwsm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jo_kwsm DistilBertForSequenceClassification from jo-kwsm +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jo_kwsm +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jo_kwsm` is an English model originally trained by jo-kwsm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jo_kwsm_en_5.2.0_3.0_1700453594060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jo_kwsm_en_5.2.0_3.0_1700453594060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jo_kwsm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jo_kwsm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jo_kwsm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jo-kwsm/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_joys000_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_joys000_en.md new file mode 100644 index 000000000000..548887f23370 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_joys000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_joys000 DistilBertForSequenceClassification from joys000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_joys000 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_joys000` is an English model originally trained by joys000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_joys000_en_5.2.0_3.0_1700458005390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_joys000_en_5.2.0_3.0_1700458005390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_joys000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_joys000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_joys000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/joys000/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jslowik_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jslowik_en.md new file mode 100644 index 000000000000..4562718cf2df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_jslowik_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_jslowik DistilBertForSequenceClassification from jslowik +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_jslowik +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_jslowik` is an English model originally trained by jslowik.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jslowik_en_5.2.0_3.0_1700488435950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_jslowik_en_5.2.0_3.0_1700488435950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jslowik","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_jslowik","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_jslowik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/jslowik/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kawauso_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kawauso_en.md new file mode 100644 index 000000000000..a72717fdb509 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kawauso_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_kawauso DistilBertForSequenceClassification from kawauso +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_kawauso +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_kawauso` is an English model originally trained by kawauso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kawauso_en_5.2.0_3.0_1700484186179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kawauso_en_5.2.0_3.0_1700484186179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kawauso","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kawauso","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_kawauso| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kawauso/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kzk_kbys_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kzk_kbys_en.md new file mode 100644 index 000000000000..a142123e5db2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_kzk_kbys_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_kzk_kbys DistilBertForSequenceClassification from kzk-kbys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_kzk_kbys +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_kzk_kbys` is an English model originally trained by kzk-kbys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kzk_kbys_en_5.2.0_3.0_1700489328493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_kzk_kbys_en_5.2.0_3.0_1700489328493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kzk_kbys","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_kzk_kbys","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_kzk_kbys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/kzk-kbys/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_marcolin_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_marcolin_en.md new file mode 100644 index 000000000000..68be67570749 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_marcolin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_marcolin DistilBertForSequenceClassification from marcolin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_marcolin +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_marcolin` is an English model originally trained by marcolin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_marcolin_en_5.2.0_3.0_1700474098442.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_marcolin_en_5.2.0_3.0_1700474098442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_marcolin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_marcolin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_marcolin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/marcolin/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_matorus_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_matorus_en.md new file mode 100644 index 000000000000..0aca1b44c28c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_matorus_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_matorus DistilBertForSequenceClassification from matorus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_matorus +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_matorus` is an English model originally trained by matorus.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_matorus_en_5.2.0_3.0_1700464065980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_matorus_en_5.2.0_3.0_1700464065980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_matorus","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_matorus","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_matorus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/matorus/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_mattcalhoun1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_mattcalhoun1_en.md new file mode 100644 index 000000000000..593db9b9e81f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_mattcalhoun1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_mattcalhoun1 DistilBertForSequenceClassification from mattcalhoun1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_mattcalhoun1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_mattcalhoun1` is an English model originally trained by mattcalhoun1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_mattcalhoun1_en_5.2.0_3.0_1700500851427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_mattcalhoun1_en_5.2.0_3.0_1700500851427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_mattcalhoun1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_mattcalhoun1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_mattcalhoun1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/mattcalhoun1/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxbarshay_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxbarshay_en.md new file mode 100644 index 000000000000..9e5cda441802 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxbarshay_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_maxbarshay DistilBertForSequenceClassification from maxbarshay +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_maxbarshay +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_maxbarshay` is an English model originally trained by maxbarshay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_maxbarshay_en_5.2.0_3.0_1700460229236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_maxbarshay_en_5.2.0_3.0_1700460229236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_maxbarshay","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_maxbarshay","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_maxbarshay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/maxbarshay/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxhilsdorf_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxhilsdorf_en.md new file mode 100644 index 000000000000..f72c99f2b021 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_maxhilsdorf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_maxhilsdorf DistilBertForSequenceClassification from maxhilsdorf +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_maxhilsdorf +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_maxhilsdorf` is an English model originally trained by maxhilsdorf.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_maxhilsdorf_en_5.2.0_3.0_1700499380711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_maxhilsdorf_en_5.2.0_3.0_1700499380711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_maxhilsdorf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_maxhilsdorf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_maxhilsdorf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/maxhilsdorf/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_medium_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_medium_en.md new file mode 100644 index 000000000000..553668a1925f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_medium_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_medium DistilBertForSequenceClassification from grantsl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_medium +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_medium` is an English model originally trained by grantsl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_medium_en_5.2.0_3.0_1700477712871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_medium_en_5.2.0_3.0_1700477712871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_medium","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_medium","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_medium| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/grantsl/distilbert-base-uncased-finetuned-emotion-medium \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moghis_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moghis_en.md new file mode 100644 index 000000000000..1d1a3a8ac530 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moghis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_moghis DistilBertForSequenceClassification from moghis +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_moghis +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_moghis` is an English model originally trained by moghis.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_moghis_en_5.2.0_3.0_1700455374047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_moghis_en_5.2.0_3.0_1700455374047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_moghis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_moghis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_moghis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/moghis/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_momtaro_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_momtaro_en.md new file mode 100644 index 000000000000..89cd76a20469 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_momtaro_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_momtaro DistilBertForSequenceClassification from momtaro +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_momtaro +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_momtaro` is an English model originally trained by momtaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_momtaro_en_5.2.0_3.0_1700476395835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_momtaro_en_5.2.0_3.0_1700476395835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_momtaro","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_momtaro","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_momtaro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/momtaro/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_monaf3_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_monaf3_en.md new file mode 100644 index 000000000000..085bb431fe00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_monaf3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_monaf3 DistilBertForSequenceClassification from monaf3 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_monaf3 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_monaf3` is an English model originally trained by monaf3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_monaf3_en_5.2.0_3.0_1700439068410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_monaf3_en_5.2.0_3.0_1700439068410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_monaf3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_monaf3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_monaf3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/monaf3/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moutainjump_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moutainjump_en.md new file mode 100644 index 000000000000..bb61ba5d7370 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_moutainjump_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_moutainjump DistilBertForSequenceClassification from MoutainJump +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_moutainjump +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_moutainjump` is an English model originally trained by MoutainJump.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_moutainjump_en_5.2.0_3.0_1700473233535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_moutainjump_en_5.2.0_3.0_1700473233535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_moutainjump","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_moutainjump","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_moutainjump| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/MoutainJump/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nanunsaram_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nanunsaram_en.md new file mode 100644 index 000000000000..275bef3fd480 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nanunsaram_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_nanunsaram DistilBertForSequenceClassification from nanunsaram +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_nanunsaram +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_nanunsaram` is an English model originally trained by nanunsaram.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nanunsaram_en_5.2.0_3.0_1700440078297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nanunsaram_en_5.2.0_3.0_1700440078297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nanunsaram","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nanunsaram","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_nanunsaram| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nanunsaram/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_naomiyjchen_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_naomiyjchen_en.md new file mode 100644 index 000000000000..429f4b1407d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_naomiyjchen_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_naomiyjchen DistilBertForSequenceClassification from naomiyjchen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_naomiyjchen +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_naomiyjchen` is an English model originally trained by naomiyjchen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_naomiyjchen_en_5.2.0_3.0_1700485222712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_naomiyjchen_en_5.2.0_3.0_1700485222712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_naomiyjchen","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_naomiyjchen","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_naomiyjchen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/naomiyjchen/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nickovchinnikov_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nickovchinnikov_en.md new file mode 100644 index 000000000000..901a8bac31a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nickovchinnikov_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_nickovchinnikov DistilBertForSequenceClassification from nickovchinnikov +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_nickovchinnikov +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_nickovchinnikov` is an English model originally trained by nickovchinnikov.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nickovchinnikov_en_5.2.0_3.0_1700496977753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nickovchinnikov_en_5.2.0_3.0_1700496977753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nickovchinnikov","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nickovchinnikov","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_nickovchinnikov| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nickovchinnikov/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nisimura_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nisimura_en.md new file mode 100644 index 000000000000..47e2ab66714e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_nisimura_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_nisimura DistilBertForSequenceClassification from nisimura +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_nisimura +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_nisimura` is an English model originally trained by nisimura.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nisimura_en_5.2.0_3.0_1700464996076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_nisimura_en_5.2.0_3.0_1700464996076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nisimura","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_nisimura","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_nisimura| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nisimura/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_novarac23_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_novarac23_en.md new file mode 100644 index 000000000000..df64a6f74006 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_novarac23_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_novarac23 DistilBertForSequenceClassification from novarac23 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_novarac23 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_novarac23` is an English model originally trained by novarac23.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_novarac23_en_5.2.0_3.0_1700480146750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_novarac23_en_5.2.0_3.0_1700480146750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_novarac23","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_novarac23","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_novarac23| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/novarac23/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_occupy1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_occupy1_en.md new file mode 100644 index 000000000000..df180924a326 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_occupy1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_occupy1 DistilBertForSequenceClassification from occupy1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_occupy1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_occupy1` is an English model originally trained by occupy1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_occupy1_en_5.2.0_3.0_1700441857042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_occupy1_en_5.2.0_3.0_1700441857042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_occupy1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_occupy1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_occupy1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/occupy1/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_okep_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_okep_en.md new file mode 100644 index 000000000000..2bda8a5d1fdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_okep_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_okep DistilBertForSequenceClassification from okep +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_okep +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_okep` is an English model originally trained by okep.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_okep_en_5.2.0_3.0_1700495089466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_okep_en_5.2.0_3.0_1700495089466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_okep","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_okep","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_okep| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/okep/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_oknashar_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_oknashar_en.md new file mode 100644 index 000000000000..fce4a9e92b78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_oknashar_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_oknashar DistilBertForSequenceClassification from oknashar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_oknashar +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_oknashar` is an English model originally trained by oknashar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_oknashar_en_5.2.0_3.0_1700439106483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_oknashar_en_5.2.0_3.0_1700439106483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_oknashar","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_oknashar","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_oknashar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/oknashar/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_omogo_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_omogo_en.md new file mode 100644 index 000000000000..e4770c05c51e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_omogo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_omogo DistilBertForSequenceClassification from Omogo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_omogo +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_omogo` is an English model originally trained by Omogo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_omogo_en_5.2.0_3.0_1700451078739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_omogo_en_5.2.0_3.0_1700451078739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_omogo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_omogo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_omogo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Omogo/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_parnyanp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_parnyanp_en.md new file mode 100644 index 000000000000..ae470e839708 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_parnyanp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_parnyanp DistilBertForSequenceClassification from parnyanp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_parnyanp +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_parnyanp` is an English model originally trained by parnyanp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_parnyanp_en_5.2.0_3.0_1700496010721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_parnyanp_en_5.2.0_3.0_1700496010721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_parnyanp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_parnyanp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_parnyanp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/parnyanp/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_patnelt60_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_patnelt60_en.md new file mode 100644 index 000000000000..b5dd6e27fc81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_patnelt60_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_patnelt60 DistilBertForSequenceClassification from patnelt60 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_patnelt60 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_patnelt60` is an English model originally trained by patnelt60.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_patnelt60_en_5.2.0_3.0_1700501731189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_patnelt60_en_5.2.0_3.0_1700501731189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_patnelt60","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_patnelt60","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_patnelt60| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/patnelt60/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pattkopp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pattkopp_en.md new file mode 100644 index 000000000000..427194c12227 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pattkopp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_pattkopp DistilBertForSequenceClassification from Pattkopp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_pattkopp +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_pattkopp` is an English model originally trained by Pattkopp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_pattkopp_en_5.2.0_3.0_1700499279179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_pattkopp_en_5.2.0_3.0_1700499279179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_pattkopp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_pattkopp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_pattkopp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Pattkopp/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pjheslin_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pjheslin_en.md new file mode 100644 index 000000000000..f750dd17d3bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_pjheslin_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_pjheslin DistilBertForSequenceClassification from pjheslin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_pjheslin +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_pjheslin` is an English model originally trained by pjheslin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_pjheslin_en_5.2.0_3.0_1700456337917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_pjheslin_en_5.2.0_3.0_1700456337917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_pjheslin","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_pjheslin","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_pjheslin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/pjheslin/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_playdev_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_playdev_en.md new file mode 100644 index 000000000000..b6aa7b9ee80d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_playdev_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_playdev DistilBertForSequenceClassification from PlayDev +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_playdev +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_playdev` is an English model originally trained by PlayDev.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_playdev_en_5.2.0_3.0_1700448117009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_playdev_en_5.2.0_3.0_1700448117009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_playdev","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_playdev","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_playdev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/PlayDev/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_poojitha_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_poojitha_en.md new file mode 100644 index 000000000000..a950eb52f475 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_poojitha_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_poojitha DistilBertForSequenceClassification from Poojitha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_poojitha +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_poojitha` is an English model originally trained by Poojitha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_poojitha_en_5.2.0_3.0_1700494313676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_poojitha_en_5.2.0_3.0_1700494313676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_poojitha","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_poojitha","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_poojitha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Poojitha/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_prinernian_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_prinernian_en.md new file mode 100644 index 000000000000..179fd8c5eba9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_prinernian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_prinernian DistilBertForSequenceClassification from Prinernian +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_prinernian +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_prinernian` is an English model originally trained by Prinernian.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_prinernian_en_5.2.0_3.0_1700448822975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_prinernian_en_5.2.0_3.0_1700448822975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_prinernian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_prinernian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_prinernian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Prinernian/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_psrohith98_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_psrohith98_en.md new file mode 100644 index 000000000000..a854c7026ef1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_psrohith98_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_psrohith98 DistilBertForSequenceClassification from psrohith98 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_psrohith98 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_psrohith98` is an English model originally trained by psrohith98.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_psrohith98_en_5.2.0_3.0_1700476229579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_psrohith98_en_5.2.0_3.0_1700476229579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_psrohith98","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_psrohith98","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_psrohith98| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/psrohith98/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_r45289_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_r45289_en.md new file mode 100644 index 000000000000..dac84d2d3ff9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_r45289_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_r45289 DistilBertForSequenceClassification from r45289 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_r45289 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_r45289` is an English model originally trained by r45289.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_r45289_en_5.2.0_3.0_1700478716439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_r45289_en_5.2.0_3.0_1700478716439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_r45289","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_r45289","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_r45289| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/r45289/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_raghuramkol_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_raghuramkol_en.md new file mode 100644 index 000000000000..5fecb7391820 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_raghuramkol_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_raghuramkol DistilBertForSequenceClassification from RaghuramKol +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_raghuramkol +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_raghuramkol` is an English model originally trained by RaghuramKol.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_raghuramkol_en_5.2.0_3.0_1700500925741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_raghuramkol_en_5.2.0_3.0_1700500925741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_raghuramkol","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_raghuramkol","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_raghuramkol| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/RaghuramKol/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ramu_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ramu_en.md new file mode 100644 index 000000000000..d248546a012a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_ramu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_ramu DistilBertForSequenceClassification from Ramu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_ramu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_ramu` is an English model originally trained by Ramu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ramu_en_5.2.0_3.0_1700498928050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_ramu_en_5.2.0_3.0_1700498928050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ramu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_ramu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_ramu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Ramu/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_riho1710_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_riho1710_en.md new file mode 100644 index 000000000000..9bf3ce9b703a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_riho1710_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_riho1710 DistilBertForSequenceClassification from riho1710 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_riho1710 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_riho1710` is an English model originally trained by riho1710.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_riho1710_en_5.2.0_3.0_1700444218752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_riho1710_en_5.2.0_3.0_1700444218752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_riho1710","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_riho1710","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_riho1710| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/riho1710/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_roxanmlr_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_roxanmlr_en.md new file mode 100644 index 000000000000..24a430a94b06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_roxanmlr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_roxanmlr DistilBertForSequenceClassification from roxanmlr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_roxanmlr +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_roxanmlr` is an English model originally trained by roxanmlr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_roxanmlr_en_5.2.0_3.0_1700501730221.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_roxanmlr_en_5.2.0_3.0_1700501730221.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_roxanmlr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_roxanmlr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_roxanmlr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/roxanmlr/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sarunyusst_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sarunyusst_en.md new file mode 100644 index 000000000000..cee9ac468f6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sarunyusst_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sarunyusst DistilBertForSequenceClassification from sarunyusst +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sarunyusst +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sarunyusst` is an English model originally trained by sarunyusst.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sarunyusst_en_5.2.0_3.0_1700468202452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sarunyusst_en_5.2.0_3.0_1700468202452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sarunyusst","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sarunyusst","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sarunyusst| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sarunyusst/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_selimsametoglu_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_selimsametoglu_en.md new file mode 100644 index 000000000000..6d45c9a1b904 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_selimsametoglu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_selimsametoglu DistilBertForSequenceClassification from selimsametoglu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_selimsametoglu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_selimsametoglu` is an English model originally trained by selimsametoglu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_selimsametoglu_en_5.2.0_3.0_1700471301515.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_selimsametoglu_en_5.2.0_3.0_1700471301515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_selimsametoglu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_selimsametoglu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_selimsametoglu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/selimsametoglu/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_skr1125_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_skr1125_en.md new file mode 100644 index 000000000000..5cfc267a9259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_skr1125_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_skr1125 DistilBertForSequenceClassification from skr1125 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_skr1125 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_skr1125` is an English model originally trained by skr1125.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_skr1125_en_5.2.0_3.0_1700479598297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_skr1125_en_5.2.0_3.0_1700479598297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_skr1125","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_skr1125","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_skr1125| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/skr1125/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_smallsuper_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_smallsuper_en.md new file mode 100644 index 000000000000..ef136ca68f77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_smallsuper_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_smallsuper DistilBertForSequenceClassification from smallsuper +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_smallsuper +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_smallsuper` is an English model originally trained by smallsuper.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_smallsuper_en_5.2.0_3.0_1700485222689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_smallsuper_en_5.2.0_3.0_1700485222689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_smallsuper","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_smallsuper","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_smallsuper| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/smallsuper/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_srosy_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_srosy_en.md new file mode 100644 index 000000000000..47eee25a88f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_srosy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_srosy DistilBertForSequenceClassification from srosy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_srosy +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_srosy` is an English model originally trained by srosy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_srosy_en_5.2.0_3.0_1700441114688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_srosy_en_5.2.0_3.0_1700441114688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_srosy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_srosy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_srosy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/srosy/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sudheer997_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sudheer997_en.md new file mode 100644 index 000000000000..d4bd36d68bd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_sudheer997_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_sudheer997 DistilBertForSequenceClassification from sudheer997 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_sudheer997 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_sudheer997` is an English model originally trained by sudheer997.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sudheer997_en_5.2.0_3.0_1700470594261.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_sudheer997_en_5.2.0_3.0_1700470594261.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sudheer997","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_sudheer997","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_sudheer997| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/sudheer997/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_swetava_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_swetava_en.md new file mode 100644 index 000000000000..5f4be3cf4e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_swetava_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_swetava DistilBertForSequenceClassification from swetava +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_swetava +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_swetava` is an English model originally trained by swetava.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_swetava_en_5.2.0_3.0_1700470204358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_swetava_en_5.2.0_3.0_1700470204358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_swetava","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_swetava","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_swetava| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/swetava/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_the_fanta_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_the_fanta_en.md new file mode 100644 index 000000000000..9057fbfec371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_the_fanta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_the_fanta DistilBertForSequenceClassification from The-Fanta +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_the_fanta +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_the_fanta` is an English model originally trained by The-Fanta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_the_fanta_en_5.2.0_3.0_1700480525974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_the_fanta_en_5.2.0_3.0_1700480525974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_the_fanta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_the_fanta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_the_fanta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/The-Fanta/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_vedantam_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_vedantam_en.md new file mode 100644 index 000000000000..b9633e3c2b96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_vedantam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_vedantam DistilBertForSequenceClassification from vedantam +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_vedantam +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_vedantam` is an English model originally trained by vedantam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_vedantam_en_5.2.0_3.0_1700501666782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_vedantam_en_5.2.0_3.0_1700501666782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_vedantam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_vedantam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_vedantam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/vedantam/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_wypoon_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_wypoon_en.md new file mode 100644 index 000000000000..c2154b0bf31e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_wypoon_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_wypoon DistilBertForSequenceClassification from wypoon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_wypoon +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_wypoon` is an English model originally trained by wypoon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_wypoon_en_5.2.0_3.0_1700468647011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_wypoon_en_5.2.0_3.0_1700468647011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_wypoon","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_wypoon","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_wypoon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wypoon/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_xavixva_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_xavixva_en.md new file mode 100644 index 000000000000..13df3bc4f3c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_xavixva_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_xavixva DistilBertForSequenceClassification from XaviXva +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_xavixva +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_xavixva` is an English model originally trained by XaviXva.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_xavixva_en_5.2.0_3.0_1700488001875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_xavixva_en_5.2.0_3.0_1700488001875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_xavixva","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_xavixva","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_xavixva| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/XaviXva/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yasnunsal_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yasnunsal_en.md new file mode 100644 index 000000000000..6a4da9f6bac9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yasnunsal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_yasnunsal DistilBertForSequenceClassification from yasnunsal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_yasnunsal +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_yasnunsal` is an English model originally trained by yasnunsal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yasnunsal_en_5.2.0_3.0_1700491268271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yasnunsal_en_5.2.0_3.0_1700491268271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yasnunsal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yasnunsal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_yasnunsal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yasnunsal/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yokoe_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yokoe_en.md new file mode 100644 index 000000000000..bfefea830a96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yokoe_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_yokoe DistilBertForSequenceClassification from yokoe +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_yokoe +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_yokoe` is an English model originally trained by yokoe.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yokoe_en_5.2.0_3.0_1700488436007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yokoe_en_5.2.0_3.0_1700488436007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yokoe","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yokoe","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_yokoe| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/yokoe/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yumasaito_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yumasaito_en.md new file mode 100644 index 000000000000..8f5e4d4adabb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_yumasaito_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_yumasaito DistilBertForSequenceClassification from YumaSaito +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_yumasaito +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_yumasaito` is an English model originally trained by YumaSaito.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yumasaito_en_5.2.0_3.0_1700497844133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_yumasaito_en_5.2.0_3.0_1700497844133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yumasaito","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_yumasaito","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_yumasaito| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/YumaSaito/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_zia_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_zia_en.md new file mode 100644 index 000000000000..041e435712f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_emotion_zia_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_emotion_zia DistilBertForSequenceClassification from Zia +author: John Snow Labs +name: distilbert_base_uncased_finetuned_emotion_zia +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_emotion_zia` is an English model originally trained by Zia.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_zia_en_5.2.0_3.0_1700489498908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_emotion_zia_en_5.2.0_3.0_1700489498908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_zia","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_emotion_zia","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_emotion_zia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Zia/distilbert-base-uncased-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft650_10class_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft650_10class_en.md new file mode 100644 index 000000000000..a48291ac09ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft650_10class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ft650_10class DistilBertForSequenceClassification from dminiotas05 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ft650_10class +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ft650_10class` is an English model originally trained by dminiotas05.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ft650_10class_en_5.2.0_3.0_1700473323534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ft650_10class_en_5.2.0_3.0_1700473323534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_ft650_10class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_ft650_10class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ft650_10class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/dminiotas05/distilbert-base-uncased-finetuned-ft650_10class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft750_reg1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft750_reg1_en.md new file mode 100644 index 000000000000..fa58220fb454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ft750_reg1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ft750_reg1 DistilBertForSequenceClassification from dminiotas05 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ft750_reg1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ft750_reg1` is an English model originally trained by dminiotas05.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ft750_reg1_en_5.2.0_3.0_1700489327724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ft750_reg1_en_5.2.0_3.0_1700489327724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_ft750_reg1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_ft750_reg1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ft750_reg1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/dminiotas05/distilbert-base-uncased-finetuned-ft750_reg1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenpatent_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenpatent_en.md new file mode 100644 index 000000000000..b012f1b75f23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenpatent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_greenpatent DistilBertForSequenceClassification from cwinkler +author: John Snow Labs +name: distilbert_base_uncased_finetuned_greenpatent +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_greenpatent` is an English model originally trained by cwinkler.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenpatent_en_5.2.0_3.0_1700489813968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenpatent_en_5.2.0_3.0_1700489813968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenpatent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenpatent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_greenpatent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/cwinkler/distilbert-base-uncased-finetuned-greenpatent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenplastics_tiny_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenplastics_tiny_en.md new file mode 100644 index 000000000000..1b285b33ba4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_greenplastics_tiny_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_greenplastics_tiny DistilBertForSequenceClassification from cwinkler +author: John Snow Labs +name: distilbert_base_uncased_finetuned_greenplastics_tiny +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_greenplastics_tiny` is an English model originally trained by cwinkler. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenplastics_tiny_en_5.2.0_3.0_1700464916529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_greenplastics_tiny_en_5.2.0_3.0_1700464916529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenplastics_tiny","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_greenplastics_tiny","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_greenplastics_tiny| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/cwinkler/distilbert-base-uncased-finetuned-greenplastics-tiny \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_moral_ctx_action_conseq_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_moral_ctx_action_conseq_en.md new file mode 100644 index 000000000000..3ea54dc85e4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_moral_ctx_action_conseq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_moral_ctx_action_conseq DistilBertForSequenceClassification from agi-css +author: John Snow Labs +name: distilbert_base_uncased_finetuned_moral_ctx_action_conseq +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_moral_ctx_action_conseq` is an English model originally trained by agi-css. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_moral_ctx_action_conseq_en_5.2.0_3.0_1700447197990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_moral_ctx_action_conseq_en_5.2.0_3.0_1700447197990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_moral_ctx_action_conseq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_moral_ctx_action_conseq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_moral_ctx_action_conseq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/agi-css/distilbert-base-uncased-finetuned-moral-ctx-action-conseq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_1_en.md new file mode 100644 index 000000000000..c467c9d4e6ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_1 DistilBertForTokenClassification from clarissa-koh-chope +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_1 +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_1` is an English model originally trained by clarissa-koh-chope. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_1_en_5.2.0_3.0_1700524235530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_1_en_5.2.0_3.0_1700524235530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/clarissa-koh-chope/distilbert-base-uncased-finetuned-ner_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_brettlin_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_brettlin_en.md new file mode 100644 index 000000000000..c25613468543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_brettlin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_brettlin DistilBertForTokenClassification from brettlin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_brettlin +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_brettlin` is an English model originally trained by brettlin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_brettlin_en_5.2.0_3.0_1700520584132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_brettlin_en_5.2.0_3.0_1700520584132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_brettlin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_brettlin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_brettlin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.6 MB| + +## References + +https://huggingface.co/brettlin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_issifuamajeed_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_issifuamajeed_en.md new file mode 100644 index 000000000000..16a07f113d28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_issifuamajeed_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_issifuamajeed DistilBertForTokenClassification from issifuamajeed +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_issifuamajeed +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_issifuamajeed` is an English model originally trained by issifuamajeed. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_issifuamajeed_en_5.2.0_3.0_1700519243660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_issifuamajeed_en_5.2.0_3.0_1700519243660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_issifuamajeed","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_issifuamajeed", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_issifuamajeed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/issifuamajeed/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_malduwais_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_malduwais_en.md new file mode 100644 index 000000000000..e83325a8873c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_malduwais_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_malduwais DistilBertForTokenClassification from malduwais +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_malduwais +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_malduwais` is an English model originally trained by malduwais. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_malduwais_en_5.2.0_3.0_1700519388994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_malduwais_en_5.2.0_3.0_1700519388994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_malduwais","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_malduwais", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_malduwais| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/malduwais/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_ui_chope_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_ui_chope_en.md new file mode 100644 index 000000000000..6085881547f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_ner_ui_chope_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ui_chope DistilBertForTokenClassification from ui-chope +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ui_chope +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ui_chope` is an English model originally trained by ui-chope. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ui_chope_en_5.2.0_3.0_1700519717661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ui_chope_en_5.2.0_3.0_1700519717661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ui_chope","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ui_chope", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ui_chope| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ui-chope/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_olid_a_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_olid_a_en.md new file mode 100644 index 000000000000..2afdc530d5c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_olid_a_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_olid_a DistilBertForSequenceClassification from pigeon-phobia +author: John Snow Labs +name: distilbert_base_uncased_finetuned_olid_a +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_olid_a` is an English model originally trained by pigeon-phobia. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_olid_a_en_5.2.0_3.0_1700455128998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_olid_a_en_5.2.0_3.0_1700455128998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_olid_a","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_olid_a","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_olid_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/pigeon-phobia/distilbert-base-uncased_finetuned_olid_a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sprint_meds_sarahflan_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sprint_meds_sarahflan_en.md new file mode 100644 index 000000000000..d99cdfd34c1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sprint_meds_sarahflan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sprint_meds_sarahflan DistilBertForSequenceClassification from sarahflan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sprint_meds_sarahflan +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sprint_meds_sarahflan` is an English model originally trained by sarahflan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sprint_meds_sarahflan_en_5.2.0_3.0_1700481041025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sprint_meds_sarahflan_en_5.2.0_3.0_1700481041025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sprint_meds_sarahflan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sprint_meds_sarahflan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sprint_meds_sarahflan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/sarahflan/distilbert-base-uncased-finetuned-sprint-meds \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_ag_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_ag_en.md new file mode 100644 index 000000000000..b68b9ad40901 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_ag_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_ag DistilBertForSequenceClassification from AG6019 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_ag +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_ag` is an English model originally trained by AG6019. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_ag_en_5.2.0_3.0_1700476658976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_ag_en_5.2.0_3.0_1700476658976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_ag","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_ag","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_ag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/AG6019/distilbert-base-uncased-finetuned-sst2-ag \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_tillschwoerer_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_tillschwoerer_en.md new file mode 100644 index 000000000000..8c4e30fee92c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_tillschwoerer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_tillschwoerer DistilBertForSequenceClassification from tillschwoerer +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_tillschwoerer +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_tillschwoerer` is an English model originally trained by tillschwoerer. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_tillschwoerer_en_5.2.0_3.0_1700490337406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_tillschwoerer_en_5.2.0_3.0_1700490337406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_tillschwoerer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_tillschwoerer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_tillschwoerer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/tillschwoerer/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_winegarj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_winegarj_en.md new file mode 100644 index 000000000000..da594e8856bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_sst2_winegarj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst2_winegarj DistilBertForSequenceClassification from winegarj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst2_winegarj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst2_winegarj` is an English model originally trained by winegarj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_winegarj_en_5.2.0_3.0_1700447163037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst2_winegarj_en_5.2.0_3.0_1700447163037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_winegarj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_sst2_winegarj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst2_winegarj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/winegarj/distilbert-base-uncased-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_stsb_trinadutta_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_stsb_trinadutta_en.md new file mode 100644 index 000000000000..fee6e36fd77d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_stsb_trinadutta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_stsb_trinadutta DistilBertForSequenceClassification from trinadutta +author: John Snow Labs +name: distilbert_base_uncased_finetuned_stsb_trinadutta +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_stsb_trinadutta` is an English model originally trained by trinadutta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_stsb_trinadutta_en_5.2.0_3.0_1700483163482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_stsb_trinadutta_en_5.2.0_3.0_1700483163482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_stsb_trinadutta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_stsb_trinadutta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_stsb_trinadutta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/trinadutta/distilbert-base-uncased-finetuned-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_subreddit_classification_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_subreddit_classification_en.md new file mode 100644 index 000000000000..52a7f48b785e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_subreddit_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_subreddit_classification DistilBertForSequenceClassification from nillo36 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_subreddit_classification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_subreddit_classification` is an English model originally trained by nillo36. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_subreddit_classification_en_5.2.0_3.0_1700452441528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_subreddit_classification_en_5.2.0_3.0_1700452441528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_subreddit_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_subreddit_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_subreddit_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nillo36/distilbert-base-uncased-finetuned-subreddit_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_switchboard_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_switchboard_en.md new file mode 100644 index 000000000000..c5ef2bc20aba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_switchboard_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_switchboard DistilBertForSequenceClassification from goldenk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_switchboard +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_switchboard` is an English model originally trained by goldenk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_switchboard_en_5.2.0_3.0_1700487067280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_switchboard_en_5.2.0_3.0_1700487067280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_switchboard","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_switchboard","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_switchboard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/goldenk/distilbert-base-uncased-finetuned-switchboard \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_en.md new file mode 100644 index 000000000000..584ca9603a8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_text_2_disease_celtic_languages DistilBertForSequenceClassification from celikmus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_text_2_disease_celtic_languages +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_text_2_disease_celtic_languages` is an English model originally trained by celikmus. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_en_5.2.0_3.0_1700446117090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_en_5.2.0_3.0_1700446117090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_text_2_disease_celtic_languages| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/celikmus/distilbert-base-uncased_finetuned_text_2_disease_cel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1_en.md new file mode 100644 index 000000000000..047a7a2c0b0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1 DistilBertForSequenceClassification from celikmus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1` is an English model originally trained by celikmus. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1_en_5.2.0_3.0_1700457422501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1_en_5.2.0_3.0_1700457422501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/celikmus/distilbert-base-uncased_finetuned_text_2_disease_cel_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2_en.md new file mode 100644 index 000000000000..fba0b67514f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2 DistilBertForSequenceClassification from celikmus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2` is an English model originally trained by celikmus. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2_en_5.2.0_3.0_1700456003367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2_en_5.2.0_3.0_1700456003367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_text_2_disease_celtic_languages_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/celikmus/distilbert-base-uncased_finetuned_text_2_disease_cel_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_tweets_emoji_dataset_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_tweets_emoji_dataset_en.md new file mode 100644 index 000000000000..a78e001d7760 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_finetuned_tweets_emoji_dataset_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_tweets_emoji_dataset DistilBertForSequenceClassification from JNK789 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_tweets_emoji_dataset +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_tweets_emoji_dataset` is an English model originally trained by JNK789. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tweets_emoji_dataset_en_5.2.0_3.0_1700450096605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tweets_emoji_dataset_en_5.2.0_3.0_1700450096605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_tweets_emoji_dataset","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_finetuned_tweets_emoji_dataset","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_tweets_emoji_dataset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/JNK789/distilbert-base-uncased-finetuned-tweets-emoji-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_1_binary_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_1_binary_v1_en.md new file mode 100644 index 000000000000..9b377d18bb90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_1_binary_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_1_binary_v1 DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_1_binary_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_1_binary_v1` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_1_binary_v1_en_5.2.0_3.0_1700484054435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_1_binary_v1_en_5.2.0_3.0_1700484054435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_1_binary_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_1_binary_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_1_binary_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_1_binary_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_3_binary_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_3_binary_en.md new file mode 100644 index 000000000000..7a245cd2ed5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_3_binary_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_3_binary DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_3_binary +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_3_binary` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_3_binary_en_5.2.0_3.0_1700487586671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_3_binary_en_5.2.0_3.0_1700487586671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_3_binary","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_3_binary","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_3_binary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_3_binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_binary_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_binary_v1_en.md new file mode 100644 index 000000000000..c2e29e70d207 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_binary_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_4_binary_v1 DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_4_binary_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_4_binary_v1` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_4_binary_v1_en_5.2.0_3.0_1700495172288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_4_binary_v1_en_5.2.0_3.0_1700495172288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_4_binary_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_4_binary_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_4_binary_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_4_binary_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_ternary_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_ternary_v1_en.md new file mode 100644 index 000000000000..f89cf32d004b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_4_ternary_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_4_ternary_v1 DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_4_ternary_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_4_ternary_v1` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_4_ternary_v1_en_5.2.0_3.0_1700491113706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_4_ternary_v1_en_5.2.0_3.0_1700491113706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_4_ternary_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_4_ternary_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_4_ternary_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_4_ternary_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_5_binary_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_5_binary_en.md new file mode 100644 index 000000000000..a1b00f1c9a69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_5_binary_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_5_binary DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_5_binary +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_5_binary` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_5_binary_en_5.2.0_3.0_1700493457893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_5_binary_en_5.2.0_3.0_1700493457893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_5_binary","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_5_binary","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_5_binary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_5_binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_7_binary_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_7_binary_v1_en.md new file mode 100644 index 000000000000..5241730bf572 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_fold_7_binary_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_fold_7_binary_v1 DistilBertForSequenceClassification from elopezlopez +author: John Snow Labs +name: distilbert_base_uncased_fold_7_binary_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fold_7_binary_v1` is an English model originally trained by elopezlopez. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_7_binary_v1_en_5.2.0_3.0_1700502399141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fold_7_binary_v1_en_5.2.0_3.0_1700502399141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_7_binary_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_fold_7_binary_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fold_7_binary_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/elopezlopez/distilbert-base-uncased_fold_7_binary_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ft_ncbi_disease_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ft_ncbi_disease_en.md new file mode 100644 index 000000000000..31fd3de1f826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ft_ncbi_disease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ft_ncbi_disease DistilBertForTokenClassification from sarahmiller137 +author: John Snow Labs +name: distilbert_base_uncased_ft_ncbi_disease +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ft_ncbi_disease` is an English model originally trained by sarahmiller137. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ft_ncbi_disease_en_5.2.0_3.0_1700521534769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ft_ncbi_disease_en_5.2.0_3.0_1700521534769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ft_ncbi_disease","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ft_ncbi_disease", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ft_ncbi_disease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sarahmiller137/distilbert-base-uncased-ft-ncbi-disease \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_gc_indep_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_gc_indep_en.md new file mode 100644 index 000000000000..cfb9af96aa97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_gc_indep_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_gc_indep DistilBertForSequenceClassification from waynedsouza +author: John Snow Labs +name: distilbert_base_uncased_gc_indep +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_gc_indep` is an English model originally trained by waynedsouza. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_gc_indep_en_5.2.0_3.0_1700492262428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_gc_indep_en_5.2.0_3.0_1700492262428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_gc_indep","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_gc_indep","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_gc_indep| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/waynedsouza/distilbert-base-uncased-gc-indep \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_1000_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_1000_en.md new file mode 100644 index 000000000000..78ae9b9c3bfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_1000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_1000 DistilBertForSequenceClassification from romainf +author: John Snow Labs +name: distilbert_base_uncased_imdb_1000 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_1000` is an English model originally trained by romainf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_1000_en_5.2.0_3.0_1700500851713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_1000_en_5.2.0_3.0_1700500851713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_1000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_1000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_1000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/romainf/distilbert-base-uncased-imdb-1000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_3000_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_3000_en.md new file mode 100644 index 000000000000..99407e3f24c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_imdb_3000_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_imdb_3000 DistilBertForSequenceClassification from romainf +author: John Snow Labs +name: distilbert_base_uncased_imdb_3000 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_imdb_3000` is an English model originally trained by romainf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_3000_en_5.2.0_3.0_1700475758149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_imdb_3000_en_5.2.0_3.0_1700475758149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_3000","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_imdb_3000","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_imdb_3000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/romainf/distilbert-base-uncased-imdb-3000 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ner_speaker_diarization_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ner_speaker_diarization_en.md new file mode 100644 index 000000000000..db03cbf9a4cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_ner_speaker_diarization_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_speaker_diarization DistilBertForTokenClassification from asanoop24 +author: John Snow Labs +name: distilbert_base_uncased_ner_speaker_diarization +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_speaker_diarization` is an English model originally trained by asanoop24.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_speaker_diarization_en_5.2.0_3.0_1700519892260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_speaker_diarization_en_5.2.0_3.0_1700519892260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_speaker_diarization","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineDF = pipeline.fit(data).transform(data) + +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_speaker_diarization", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineDF = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_speaker_diarization| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/asanoop24/distilbert-base-uncased-ner-speaker-diarization \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_new2_cola_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_new2_cola_en.md new file mode 100644 index 000000000000..e1ed0e564aa1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_new2_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_new2_cola DistilBertForSequenceClassification from charlemagne +author: John Snow Labs +name: distilbert_base_uncased_new2_cola +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_new2_cola` is an English model originally trained by charlemagne. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_new2_cola_en_5.2.0_3.0_1700492546567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_new2_cola_en_5.2.0_3.0_1700492546567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_new2_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_new2_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_new2_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/charlemagne/distilbert-base-uncased-new2-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_newsmodelclassification_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_newsmodelclassification_en.md new file mode 100644 index 000000000000..b2fbbe159dae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_newsmodelclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_newsmodelclassification DistilBertForSequenceClassification from aatmasidha +author: John Snow Labs +name: distilbert_base_uncased_newsmodelclassification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_newsmodelclassification` is an English model originally trained by aatmasidha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_newsmodelclassification_en_5.2.0_3.0_1700475219205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_newsmodelclassification_en_5.2.0_3.0_1700475219205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_newsmodelclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_newsmodelclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_newsmodelclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/aatmasidha/distilbert-base-uncased-newsmodelclassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_output_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_output_en.md new file mode 100644 index 000000000000..8d5fe5f30c7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_output_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_output DistilBertForSequenceClassification from yasser-kaddoura +author: John Snow Labs +name: distilbert_base_uncased_output +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_output` is an English model originally trained by yasser-kaddoura. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_output_en_5.2.0_3.0_1700477890725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_output_en_5.2.0_3.0_1700477890725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_output","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_output","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_output| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/yasser-kaddoura/distilbert-base-uncased_output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_pina_dfnew_tuning_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_pina_dfnew_tuning_en.md new file mode 100644 index 000000000000..2eb2f9524391 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_pina_dfnew_tuning_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_pina_dfnew_tuning DistilBertForSequenceClassification from GhifSmile +author: John Snow Labs +name: distilbert_base_uncased_pina_dfnew_tuning +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_pina_dfnew_tuning` is an English model originally trained by GhifSmile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pina_dfnew_tuning_en_5.2.0_3.0_1700494088295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pina_dfnew_tuning_en_5.2.0_3.0_1700494088295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_pina_dfnew_tuning","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_pina_dfnew_tuning","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_pina_dfnew_tuning| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/GhifSmile/distilbert-base-uncased-PINA-dfnew-tuning \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_qnli_en.md new file mode 100644 index 000000000000..b07d283def48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_qnli DistilBertForSequenceClassification from textattack +author: John Snow Labs +name: distilbert_base_uncased_qnli +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_qnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qnli_en_5.2.0_3.0_1700453433323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qnli_en_5.2.0_3.0_1700453433323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_qnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/textattack/distilbert-base-uncased-QNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_sexist_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_sexist_en.md new file mode 100644 index 000000000000..b5920a7b443d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_sexist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_sexist DistilBertForSequenceClassification from mwrob +author: John Snow Labs +name: distilbert_base_uncased_sexist +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_sexist` is an English model originally trained by mwrob. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sexist_en_5.2.0_3.0_1700471301920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_sexist_en_5.2.0_3.0_1700471301920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sexist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_sexist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_sexist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/mwrob/distilbert-base-uncased-sexist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_spamfilter_samoan_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_spamfilter_samoan_en.md new file mode 100644 index 000000000000..17671ff545e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_spamfilter_samoan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_spamfilter_samoan DistilBertForSequenceClassification from DunnBC22 +author: John Snow Labs +name: distilbert_base_uncased_spamfilter_samoan +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_spamfilter_samoan` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_spamfilter_samoan_en_5.2.0_3.0_1700443943758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_spamfilter_samoan_en_5.2.0_3.0_1700443943758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_spamfilter_samoan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_spamfilter_samoan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_spamfilter_samoan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/DunnBC22/distilbert-base-uncased-SpamFilter-sm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_suggestion_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_suggestion_finetuned_en.md new file mode 100644 index 000000000000..9b526e085323 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_base_uncased_suggestion_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_base_uncased_suggestion_finetuned DistilBertForSequenceClassification from wuyue1987 +author: John Snow Labs +name: distilbert_base_uncased_suggestion_finetuned +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_suggestion_finetuned` is an English model originally trained by wuyue1987.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_suggestion_finetuned_en_5.2.0_3.0_1700441098769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_suggestion_finetuned_en_5.2.0_3.0_1700441098769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_suggestion_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_base_uncased_suggestion_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_suggestion_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/wuyue1987/distilbert-base-uncased-suggestion-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_terms_v2_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_terms_v2_en.md new file mode 100644 index 000000000000..88df8a6de9ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_terms_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_fine_tuned_terms_v2 DistilBertForSequenceClassification from alexskrn +author: John Snow Labs +name: distilbert_fine_tuned_terms_v2 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fine_tuned_terms_v2` is an English model originally trained by alexskrn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_terms_v2_en_5.2.0_3.0_1700460229264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_terms_v2_en_5.2.0_3.0_1700460229264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_terms_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_terms_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fine_tuned_terms_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/alexskrn/distilbert-fine-tuned-terms_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_text_classification_sl_data_augmentation_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_text_classification_sl_data_augmentation_en.md new file mode 100644 index 000000000000..b49a1407f101 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_fine_tuned_text_classification_sl_data_augmentation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_fine_tuned_text_classification_sl_data_augmentation DistilBertForSequenceClassification from Sleoruiz +author: John Snow Labs +name: distilbert_fine_tuned_text_classification_sl_data_augmentation +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fine_tuned_text_classification_sl_data_augmentation` is an English model originally trained by Sleoruiz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_text_classification_sl_data_augmentation_en_5.2.0_3.0_1700461500259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fine_tuned_text_classification_sl_data_augmentation_en_5.2.0_3.0_1700461500259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_text_classification_sl_data_augmentation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_fine_tuned_text_classification_sl_data_augmentation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fine_tuned_text_classification_sl_data_augmentation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.8 MB| + +## References + +https://huggingface.co/Sleoruiz/distilbert-fine-tuned-text-classification-SL-data-augmentation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ai4privacy_isotonic_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ai4privacy_isotonic_en.md new file mode 100644 index 000000000000..1b7b45f1fdf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ai4privacy_isotonic_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ai4privacy_isotonic DistilBertForTokenClassification from Isotonic +author: John Snow Labs +name: distilbert_finetuned_ai4privacy_isotonic +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ai4privacy_isotonic` is an English model originally trained by Isotonic.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_isotonic_en_5.2.0_3.0_1700519894965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_isotonic_en_5.2.0_3.0_1700519894965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
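+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ai4privacy_isotonic","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ai4privacy_isotonic", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +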
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ai4privacy_isotonic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.6 MB| + +## References + +https://huggingface.co/Isotonic/distilbert_finetuned_ai4privacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_claimdecomp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_claimdecomp_en.md new file mode 100644 index 000000000000..520d2a47a9a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_claimdecomp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_finetuned_claimdecomp DistilBertForSequenceClassification from gavulsim +author: John Snow Labs +name: distilbert_finetuned_claimdecomp +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_claimdecomp` is an English model originally trained by gavulsim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_claimdecomp_en_5.2.0_3.0_1700468093563.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_claimdecomp_en_5.2.0_3.0_1700468093563.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_claimdecomp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_finetuned_claimdecomp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_claimdecomp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.5 GB| + +## References + +https://huggingface.co/gavulsim/distilbert_finetuned_claimdecomp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ner_neelgokhale_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ner_neelgokhale_en.md new file mode 100644 index 000000000000..0a04e29089e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_finetuned_ner_neelgokhale_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_neelgokhale DistilBertForTokenClassification from neelgokhale +author: John Snow Labs +name: distilbert_finetuned_ner_neelgokhale +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_neelgokhale` is an English model originally trained by neelgokhale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_neelgokhale_en_5.2.0_3.0_1700522541524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_neelgokhale_en_5.2.0_3.0_1700522541524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
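+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_neelgokhale","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_neelgokhale", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +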
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_neelgokhale| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/neelgokhale/distilbert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_for_food_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_for_food_extraction_en.md new file mode 100644 index 000000000000..a71035b062cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_for_food_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_for_food_extraction DistilBertForTokenClassification from chambliss +author: John Snow Labs +name: distilbert_for_food_extraction +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_for_food_extraction` is an English model originally trained by chambliss. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_for_food_extraction_en_5.2.0_3.0_1700519078894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_for_food_extraction_en_5.2.0_3.0_1700519078894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
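+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_for_food_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_for_food_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +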
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_for_food_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/chambliss/distilbert-for-food-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_genre_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_genre_classifier_en.md new file mode 100644 index 000000000000..2f59f94da5aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_genre_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_genre_classifier DistilBertForSequenceClassification from 50stars +author: John Snow Labs +name: distilbert_imdb_genre_classifier +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_genre_classifier` is an English model originally trained by 50stars. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_genre_classifier_en_5.2.0_3.0_1700499855776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_genre_classifier_en_5.2.0_3.0_1700499855776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_genre_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_genre_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_genre_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|250.5 MB| + +## References + +https://huggingface.co/50stars/distilbert_imdb_genre_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_gr00t16_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_gr00t16_en.md new file mode 100644 index 000000000000..fc76acc72189 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_gr00t16_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_gr00t16 DistilBertForSequenceClassification from Gr00t16 +author: John Snow Labs +name: distilbert_imdb_gr00t16 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_gr00t16` is an English model originally trained by Gr00t16. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_gr00t16_en_5.2.0_3.0_1700439001858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_gr00t16_en_5.2.0_3.0_1700439001858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_gr00t16","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_gr00t16","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_gr00t16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Gr00t16/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_medivvv_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_medivvv_en.md new file mode 100644 index 000000000000..033e64e9fb39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_medivvv_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_medivvv DistilBertForSequenceClassification from Medivvv +author: John Snow Labs +name: distilbert_imdb_medivvv +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_medivvv` is an English model originally trained by Medivvv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_medivvv_en_5.2.0_3.0_1700482136898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_medivvv_en_5.2.0_3.0_1700482136898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_medivvv","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_medivvv","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_medivvv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Medivvv/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_yuzhi_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_yuzhi_en.md new file mode 100644 index 000000000000..168285d57a6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_imdb_yuzhi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_imdb_yuzhi DistilBertForSequenceClassification from yuzhi +author: John Snow Labs +name: distilbert_imdb_yuzhi +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_imdb_yuzhi` is an English model originally trained by yuzhi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_imdb_yuzhi_en_5.2.0_3.0_1700462266155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_imdb_yuzhi_en_5.2.0_3.0_1700462266155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_yuzhi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_imdb_yuzhi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_imdb_yuzhi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/yuzhi/distilbert-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_infoextract_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_infoextract_en.md new file mode 100644 index 000000000000..c44864fc1bf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_infoextract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_infoextract DistilBertForTokenClassification from tony4194 +author: John Snow Labs +name: distilbert_infoextract +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_infoextract` is an English model originally trained by tony4194. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_infoextract_en_5.2.0_3.0_1700519223312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_infoextract_en_5.2.0_3.0_1700519223312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
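+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Start a Spark session with Spark NLP loaded +spark = sparknlp.start() + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is required +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_infoextract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +// Assumes an active SparkSession named spark (e.g. in spark-shell); needed for toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is required +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_infoextract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +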
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_infoextract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/tony4194/distilBERT-infoExtract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_italian_cased_ner_it.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_italian_cased_ner_it.md new file mode 100644 index 000000000000..17725bff2af6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_italian_cased_ner_it.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Italian distilbert_italian_cased_ner DistilBertForTokenClassification from osiria +author: John Snow Labs +name: distilbert_italian_cased_ner +date: 2023-11-20 +tags: [bert, it, open_source, token_classification, onnx] +task: Named Entity Recognition +language: it +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_italian_cased_ner` is an Italian model originally trained by osiria. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_italian_cased_ner_it_5.2.0_3.0_1700519393775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_italian_cased_ner_it_5.2.0_3.0_1700519393775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_italian_cased_ner","it") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_italian_cased_ner", "it") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_italian_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|it| +|Size:|249.4 MB| + +## References + +https://huggingface.co/osiria/distilbert-italian-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_multilingual_uncased_english_german_oct_15_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_multilingual_uncased_english_german_oct_15_xx.md new file mode 100644 index 000000000000..6e1ca62b0c97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_multilingual_uncased_english_german_oct_15_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_uncased_english_german_oct_15 DistilBertForSequenceClassification from SmilestheSad +author: John Snow Labs +name: distilbert_multilingual_uncased_english_german_oct_15 +date: 2023-11-20 +tags: [bert, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_uncased_english_german_oct_15` is a Multilingual model originally trained by SmilestheSad. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_english_german_oct_15_xx_5.2.0_3.0_1700459684047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_uncased_english_german_oct_15_xx_5.2.0_3.0_1700459684047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_english_german_oct_15","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_multilingual_uncased_english_german_oct_15","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_uncased_english_german_oct_15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|507.6 MB| + +## References + +https://huggingface.co/SmilestheSad/distilbert-multilingual-uncased-en-de-oct-15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_base_multi_cased_finetuned_typo_detection_xx.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_base_multi_cased_finetuned_typo_detection_xx.md new file mode 100644 index 000000000000..7a42e2ee9a05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_base_multi_cased_finetuned_typo_detection_xx.md @@ -0,0 +1,112 @@ +--- +layout: model +title: Multilingual DistilBertForTokenClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: distilbert_ner_base_multi_cased_finetuned_typo_detection +date: 2023-11-20 +tags: [open_source, distilbert, ner, typo, multilingual, xx, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBERT NER model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-multi-cased-finetuned-typo-detection` is a Multilingual model originally trained by `mrm8488`. 
+ +## Predicted Entities + +`ok`, `typo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_base_multi_cased_finetuned_typo_detection_xx_5.2.0_3.0_1700517582006.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_base_multi_cased_finetuned_typo_detection_xx_5.2.0_3.0_1700517582006.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetector()\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +ner = DistilBertForTokenClassification.pretrained("distilbert_ner_base_multi_cased_finetuned_typo_detection","xx") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, ner]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE."]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = new SentenceDetector() + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val ner = DistilBertForTokenClassification.pretrained("distilbert_ner_base_multi_cased_finetuned_typo_detection","xx") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, ner)) + +val data = Seq("PUT YOUR STRING HERE.").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("xx.ner.distil_bert.cased_base_finetuned").predict("""PUT YOUR STRING HERE.""") +``` +
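Once the pipeline above has run, the `ok` / `typo` labels still have to be matched back to their tokens. A minimal sketch of that step in plain Python, assuming the `token.result` and `ner.result` arrays of one row have already been collected into two equal-length lists; the helper name `flag_typos` and the sample values are hypothetical, not real model output:

```python
from typing import List, Tuple


def flag_typos(tokens: List[str], labels: List[str],
               typo_label: str = "typo") -> List[Tuple[int, str]]:
    """Return (position, token) pairs whose predicted label equals typo_label."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [(i, tok)
            for i, (tok, lab) in enumerate(zip(tokens, labels))
            if lab == typo_label]


# Hypothetical example values, standing in for collected pipeline output
tokens = ["He", "had", "also", "stgruggled", "with", "addiction"]
labels = ["ok", "ok", "ok", "typo", "ok", "ok"]
flagged = flag_typos(tokens, labels)
print(flagged)
```

Keeping the token position alongside the token makes it easy to highlight or correct the flagged word in the original sentence.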
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_base_multi_cased_finetuned_typo_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/mrm8488/distilbert-base-multi-cased-finetuned-typo-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner_nl.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner_nl.md new file mode 100644 index 000000000000..e6e70e256d6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner_nl.md @@ -0,0 +1,115 @@ +--- +layout: model +title: Dutch Named Entity Recognition (from gunghio) +author: John Snow Labs +name: distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner +date: 2023-11-20 +tags: [distilbert, ner, token_classification, nl, open_source, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-multilingual-cased-finetuned-conll2003-ner` is a Dutch model originally trained by `gunghio`. 
+ +## Predicted Entities + +`ORG`, `MISC`, `PER`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner_nl_5.2.0_3.0_1700518198687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner_nl_5.2.0_3.0_1700518198687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.distil_bert.conll.cased_multilingual_base_finetuned").predict("""Ik hou van Spark NLP""") +``` +
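The `ner` column produced above holds one label per token. If those labels follow the IOB scheme (`B-PER`, `I-PER`, `O`), which is an assumption not stated in this card, adjacent tokens can be merged into entity chunks. In a pipeline, Spark NLP's `NerConverter` annotator is the usual way to do this; the plain-Python sketch below only illustrates the grouping logic, and the function name and sample sentence are hypothetical:

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB-tagged tokens into (chunk_text, entity_type) pairs."""
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, label in zip(tokens, labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts: close the open chunk, if any
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = entity_type
        elif prefix == "I":
            # Continuation of the entity that is currently open
            current_tokens.append(token)
        else:
            # "O" (or an unrecognized label) closes the open chunk
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    if current_tokens:
        chunks.append((" ".join(current_tokens), current_type))
    return chunks


# Hypothetical labels, standing in for collected pipeline output
tokens = ["Jan", "Jansen", "werkt", "bij", "Philips", "in", "Eindhoven"]
labels = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC"]
entities = group_entities(tokens, labels)
print(entities)
```

An `I-` tag that does not continue an open entity of the same type is treated as the start of a new chunk, so malformed tag sequences still yield usable output.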
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_distilbert_base_multilingual_cased_finetuned_conll2003_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|505.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/gunghio/distilbert-base-multilingual-cased-finetuned-conll2003-ner +- https://paperswithcode.com/sota?task=Named+Entity+Recognition&dataset=ConLL+2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_masakhaner_ig.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_masakhaner_ig.md new file mode 100644 index 000000000000..be07e881bbd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_masakhaner_ig.md @@ -0,0 +1,111 @@ +--- +layout: model +title: Igbo Named Entity Recognition (from Davlan) +author: John Snow Labs +name: distilbert_ner_distilbert_base_multilingual_cased_masakhaner +date: 2023-11-20 +tags: [distilbert, ner, token_classification, ig, open_source, onnx] +task: Named Entity Recognition +language: ig +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-multilingual-cased-masakhaner` is an Igbo model originally trained by `Davlan`. 
+ +## Predicted Entities + +`DATE`, `PER`, `LOC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_masakhaner_ig_5.2.0_3.0_1700518105787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_masakhaner_ig_5.2.0_3.0_1700518105787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_masakhaner","ig") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ahụrụ m n'anya na-atọ m ụtọ"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_masakhaner","ig") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ahụrụ m n'anya na-atọ m ụtọ").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_distilbert_base_multilingual_cased_masakhaner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|ig| +|Size:|505.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/Davlan/distilbert-base-multilingual-cased-masakhaner +- https://github.com/masakhane-io/masakhane-ner +- https://arxiv.org/abs/2103.11811 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl.md new file mode 100644 index 000000000000..7497f1e2f650 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl.md @@ -0,0 +1,118 @@ +--- +layout: model +title: Dutch Named Entity Recognition (from Davlan) +author: John Snow Labs +name: distilbert_ner_distilbert_base_multilingual_cased_ner_hrl +date: 2023-11-20 +tags: [distilbert, ner, token_classification, nl, open_source, onnx] +task: Named Entity Recognition +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-multilingual-cased-ner-hrl` is a Dutch model originally trained by `Davlan`. 
+ +## Predicted Entities + +`DATE`, `PER`, `LOC`, `ORG` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl_5.2.0_3.0_1700518187018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl_5.2.0_3.0_1700518187018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_ner_hrl","nl") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["Ik hou van Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_distilbert_base_multilingual_cased_ner_hrl","nl") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("Ik hou van Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.ner.distil_bert.cased_multilingual_base").predict("""Ik hou van Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_distilbert_base_multilingual_cased_ner_hrl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|nl| +|Size:|505.4 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/Davlan/distilbert-base-multilingual-cased-ner-hrl +- https://camel.abudhabi.nyu.edu/anercorp/ +- https://www.clips.uantwerpen.be/conll2003/ner/ +- https://www.clips.uantwerpen.be/conll \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_kptimes_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_kptimes_en.md new file mode 100644 index 000000000000..b4e8c70d2e07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_kptimes_en.md @@ -0,0 +1,115 @@ +--- +layout: model +title: English Named Entity Recognition (from DeDeckerThomas) +author: John Snow Labs +name: distilbert_ner_keyphrase_extraction_distilbert_kptimes +date: 2023-11-20 +tags: [distilbert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyphrase-extraction-distilbert-kptimes` is an English model originally trained by `DeDeckerThomas`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_keyphrase_extraction_distilbert_kptimes_en_5.2.0_3.0_1700518427217.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_keyphrase_extraction_distilbert_kptimes_en_5.2.0_3.0_1700518427217.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_keyphrase_extraction_distilbert_kptimes","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_keyphrase_extraction_distilbert_kptimes","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.distil_bert.keyphrase.kptimes.by_dedeckerthomas").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_keyphrase_extraction_distilbert_kptimes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/DeDeckerThomas/keyphrase-extraction-distilbert-kptimes +- https://paperswithcode.com/sota?task=Keyphrase+Extraction&dataset=kptimes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_openkp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_openkp_en.md new file mode 100644 index 000000000000..8407bedb41aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_keyphrase_extraction_distilbert_openkp_en.md @@ -0,0 +1,116 @@ +--- +layout: model +title: English Named Entity Recognition (from DeDeckerThomas) +author: John Snow Labs +name: distilbert_ner_keyphrase_extraction_distilbert_openkp +date: 2023-11-20 +tags: [distilbert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyphrase-extraction-distilbert-openkp` is an English model originally trained by `DeDeckerThomas`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_keyphrase_extraction_distilbert_openkp_en_5.2.0_3.0_1700518274457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_keyphrase_extraction_distilbert_openkp_en_5.2.0_3.0_1700518274457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_keyphrase_extraction_distilbert_openkp","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_keyphrase_extraction_distilbert_openkp","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.distil_bert.keyphrase.openkp.by_dedeckerthomas").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_keyphrase_extraction_distilbert_openkp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +- https://huggingface.co/DeDeckerThomas/keyphrase-extraction-distilbert-openkp +- https://github.com/microsoft/OpenKP +- https://paperswithcode.com/sota?task=Keyphrase+Extraction&dataset=openkp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_ma_ner_v7_distil_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_ma_ner_v7_distil_en.md new file mode 100644 index 000000000000..f36d8c001b14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_ner_ma_ner_v7_distil_en.md @@ -0,0 +1,114 @@ +--- +layout: model +title: English Named Entity Recognition (from CouchCat) +author: John Snow Labs +name: distilbert_ner_ma_ner_v7_distil +date: 2023-11-20 +tags: [distilbert, ner, token_classification, en, open_source, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Named Entity Recognition model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ma_ner_v7_distil` is an English model originally trained by `CouchCat`. 
+ +## Predicted Entities + +`MATR`, `PERS`, `TIME`, `MISC`, `PAD`, `PROD`, `BRND` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_ma_ner_v7_distil_en_5.2.0_3.0_1700518633863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_ma_ner_v7_distil_en_5.2.0_3.0_1700518633863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")\ + .setInputCols(["document"])\ + .setOutputCol("sentence") + +tokenizer = Tokenizer() \ + .setInputCols("sentence") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_ma_ner_v7_distil","en") \ + .setInputCols(["sentence", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") + .setInputCols(Array("document")) + .setOutputCol("sentence") + +val tokenizer = new Tokenizer() + .setInputCols(Array("sentence")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_ma_ner_v7_distil","en") + .setInputCols(Array("sentence", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler,sentenceDetector, tokenizer, tokenClassifier)) + +val data = Seq("I love Spark NLP").toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.ner.distil_bert.by_couchcat").predict("""I love Spark NLP""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_ma_ner_v7_distil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/CouchCat/ma_ner_v7_distil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_profane_final_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_profane_final_en.md new file mode 100644 index 000000000000..a7dd2555b56c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_profane_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_profane_final DistilBertForSequenceClassification from SiddharthaM +author: John Snow Labs +name: distilbert_profane_final +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_profane_final` is an English model originally trained by SiddharthaM. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_profane_final_en_5.2.0_3.0_1700472379707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_profane_final_en_5.2.0_3.0_1700472379707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_profane_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_profane_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_profane_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|507.6 MB| + +## References + +https://huggingface.co/SiddharthaM/distilbert-profane-final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_chinese_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_chinese_en.md new file mode 100644 index 000000000000..728ad2767f29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_chinese_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_punctuator_chinese DistilBertForTokenClassification from Qishuai +author: John Snow Labs +name: distilbert_punctuator_chinese +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_punctuator_chinese` is an English model originally trained by Qishuai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_punctuator_chinese_en_5.2.0_3.0_1700523075003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_punctuator_chinese_en_5.2.0_3.0_1700523075003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_punctuator_chinese","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_punctuator_chinese", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
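Spark NLP stages are wired together by column name: every input column of an annotator has to be produced by an earlier stage or already exist in the input data. The sketch below is a plain-Python illustration of checking that wiring before building a pipeline. It does not use the Spark NLP API; the `Stage` tuples and the `find_unwired_inputs` helper are hypothetical names introduced only for this example, and the column names mirror the ones used in the snippet above.

```python
from typing import Iterable, List, Tuple

# Hypothetical stage description: (stage name, input columns, output column).
# This is not a Spark NLP type, only an illustration of column wiring.
Stage = Tuple[str, List[str], str]


def find_unwired_inputs(stages: Iterable[Stage], source_columns: Iterable[str]) -> List[str]:
    """Return one message per input column that no earlier stage provides."""
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        for column in inputs:
            if column not in available:
                problems.append(f"{name}: missing input column '{column}'")
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


stages = [
    ("DocumentAssembler", ["text"], "documents"),
    ("Tokenizer", ["documents"], "token"),
    ("DistilBertForTokenClassification", ["documents", "token"], "ner"),
]

# An empty list means every input column is produced upstream.
print(find_unwired_inputs(stages, ["text"]))
```

Dropping the tokenizer entry from `stages` makes the helper report that the classifier's `token` input is missing, which is the kind of mismatch that otherwise only surfaces when the pipeline is fitted.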
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_punctuator_chinese| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|221.8 MB| + +## References + +https://huggingface.co/Qishuai/distilbert_punctuator_zh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_english_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_english_en.md new file mode 100644 index 000000000000..4127c7ba1705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_punctuator_english_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_punctuator_english DistilBertForTokenClassification from Qishuai +author: John Snow Labs +name: distilbert_punctuator_english +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_punctuator_english` is an English model originally trained by Qishuai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_punctuator_english_en_5.2.0_3.0_1700519342970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_punctuator_english_en_5.2.0_3.0_1700519342970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_punctuator_english","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_punctuator_english", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_punctuator_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Qishuai/distilbert_punctuator_en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_sentiment_analysis_model_40k_samples_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_sentiment_analysis_model_40k_samples_en.md new file mode 100644 index 000000000000..b28d401ffb8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_sentiment_analysis_model_40k_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_sentiment_analysis_model_40k_samples DistilBertForSequenceClassification from Camelia7v +author: John Snow Labs +name: distilbert_sentiment_analysis_model_40k_samples +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sentiment_analysis_model_40k_samples` is an English model originally trained by Camelia7v. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_analysis_model_40k_samples_en_5.2.0_3.0_1700451432185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sentiment_analysis_model_40k_samples_en_5.2.0_3.0_1700451432185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment_analysis_model_40k_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_sentiment_analysis_model_40k_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sentiment_analysis_model_40k_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Camelia7v/distilbert-sentiment-analysis-model-40k-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_text_classification_en.md new file mode 100644 index 000000000000..449534da08be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_text_classification DistilBertForSequenceClassification from ptamm +author: John Snow Labs +name: distilbert_text_classification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_text_classification` is an English model originally trained by ptamm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_text_classification_en_5.2.0_3.0_1700449825564.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_text_classification_en_5.2.0_3.0_1700449825564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|252.5 MB| + +## References + +https://huggingface.co/ptamm/distilbert-text_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_tiln_proj_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_tiln_proj_en.md new file mode 100644 index 000000000000..449aee03d918 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_tiln_proj_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbert_tiln_proj DistilBertForSequenceClassification from cataremix15 +author: John Snow Labs +name: distilbert_tiln_proj +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_tiln_proj` is an English model originally trained by cataremix15. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_tiln_proj_en_5.2.0_3.0_1700462107800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_tiln_proj_en_5.2.0_3.0_1700462107800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_tiln_proj","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilbert_tiln_proj","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_tiln_proj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.2 MB| + +## References + +https://huggingface.co/cataremix15/distilbert-tiln-proj \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429540_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429540_en.md new file mode 100644 index 000000000000..352941316ce6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429540_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_company_all_903429540 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-company_all-903429540` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Company` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_all_903429540_en_5.2.0_3.0_1700518424699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_all_903429540_en_5.2.0_3.0_1700518424699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_all_903429540","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_all_903429540","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_company_all_903429540| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-company_all-903429540 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429548_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429548_en.md new file mode 100644 index 000000000000..d8de865dbfde --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_all_903429548_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_company_all_903429548 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-company_all-903429548` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Company` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_all_903429548_en_5.2.0_3.0_1700518019739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_all_903429548_en_5.2.0_3.0_1700518019739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_all_903429548","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_all_903429548","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_company_all_903429548| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-company_all-903429548 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_vs_all_902129475_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_vs_all_902129475_en.md new file mode 100644 index 000000000000..7ebd4fc2bcfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_company_vs_all_902129475_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_company_vs_all_902129475 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-company_vs_all-902129475` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Company` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_vs_all_902129475_en_5.2.0_3.0_1700518841640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_company_vs_all_902129475_en_5.2.0_3.0_1700518841640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_vs_all_902129475","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_company_vs_all_902129475","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_company_vs_all_902129475| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-company_vs_all-902129475 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824209_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824209_en.md new file mode 100644 index 000000000000..225e4b92ad75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824209_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from Lucifermorningstar011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_final_784824209 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-final-784824209` is an English model originally trained by `Lucifermorningstar011`. 
+ +## Predicted Entities + +`0`, `9` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_final_784824209_en_5.2.0_3.0_1700518201128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_final_784824209_en_5.2.0_3.0_1700518201128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_final_784824209","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_final_784824209","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_final_784824209| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lucifermorningstar011/autotrain-final-784824209 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824211_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824211_en.md new file mode 100644 index 000000000000..1695e4fc7dc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_final_784824211_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from Lucifermorningstar011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_final_784824211 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-final-784824211` is an English model originally trained by `Lucifermorningstar011`. 
+ +## Predicted Entities + +`0`, `9` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_final_784824211_en_5.2.0_3.0_1700518628443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_final_784824211_en_5.2.0_3.0_1700518628443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_final_784824211","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_final_784824211","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_final_784824211| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lucifermorningstar011/autotrain-final-784824211 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_job_all_903929564_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_job_all_903929564_en.md new file mode 100644 index 000000000000..15c8a86af1df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_job_all_903929564_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_job_all_903929564 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-job_all-903929564` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Job` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_job_all_903929564_en_5.2.0_3.0_1700519035746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_job_all_903929564_en_5.2.0_3.0_1700519035746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_job_all_903929564","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_job_all_903929564","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_job_all_903929564| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-job_all-903929564 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_lucy_light_control_3122788375_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_lucy_light_control_3122788375_en.md new file mode 100644 index 000000000000..22304b0522d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_lucy_light_control_3122788375_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ankleBowl) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_lucy_light_control_3122788375 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-lucy-light-control-3122788375` is an English model originally trained by `ankleBowl`. 
+ +## Predicted Entities + +`PER`, `OFF`, `BRI`, `EMP`, `ONN`, `DIM`, `COL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_lucy_light_control_3122788375_en_5.2.0_3.0_1700518826278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_lucy_light_control_3122788375_en_5.2.0_3.0_1700518826278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_lucy_light_control_3122788375","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_lucy_light_control_3122788375","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_lucy_light_control_3122788375| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ankleBowl/autotrain-lucy-light-control-3122788375 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029569_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029569_en.md new file mode 100644 index 000000000000..2a219c154a18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029569_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_name_all_904029569 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-name_all-904029569` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Name` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_all_904029569_en_5.2.0_3.0_1700518463150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_all_904029569_en_5.2.0_3.0_1700518463150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_all_904029569","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_all_904029569","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_name_all_904029569| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-name_all-904029569 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029577_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029577_en.md new file mode 100644 index 000000000000..7aa8109e649a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_all_904029577_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_name_all_904029577 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-name_all-904029577` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`Name`, `OOV` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_all_904029577_en_5.2.0_3.0_1700519016979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_all_904029577_en_5.2.0_3.0_1700519016979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_all_904029577","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_all_904029577","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_name_all_904029577| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-name_all-904029577 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_vsv_all_901529445_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_vsv_all_901529445_en.md new file mode 100644 index 000000000000..b7d60c4cd6fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_name_vsv_all_901529445_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ismail-lucifer011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_name_vsv_all_901529445 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-name_vsv_all-901529445` is an English model originally trained by `ismail-lucifer011`. 
+ +## Predicted Entities + +`OOV`, `Name` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_vsv_all_901529445_en_5.2.0_3.0_1700518216851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_name_vsv_all_901529445_en_5.2.0_3.0_1700518216851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_vsv_all_901529445","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_name_vsv_all_901529445","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_name_vsv_all_901529445| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ismail-lucifer011/autotrain-name_vsv_all-901529445 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_ner_778023879_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_ner_778023879_en.md new file mode 100644 index 000000000000..f13572c2861f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_autotrain_ner_778023879_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from Lucifermorningstar011) +author: John Snow Labs +name: distilbert_token_classifier_autotrain_ner_778023879 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-ner-778023879` is an English model originally trained by `Lucifermorningstar011`. 
+ +## Predicted Entities + +`0`, `9` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_ner_778023879_en_5.2.0_3.0_1700518430574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_autotrain_ner_778023879_en_5.2.0_3.0_1700518430574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_ner_778023879","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_autotrain_ner_778023879","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_autotrain_ner_778023879| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Lucifermorningstar011/autotrain-ner-778023879 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_cased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_cased_finetuned_conll03_english_en.md new file mode 100644 index 000000000000..1c1f49184c08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_cased_finetuned_conll03_english_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English DistilBertForTokenClassification Base Cased model (from elastic) +author: John Snow Labs +name: distilbert_token_classifier_base_cased_finetuned_conll03_english +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-finetuned-conll03-english` is an English model originally trained by `elastic`. 
+ +## Predicted Entities + +`PER`, `ORG`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_cased_finetuned_conll03_english_en_5.2.0_3.0_1700519197184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_cased_finetuned_conll03_english_en_5.2.0_3.0_1700519197184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_cased_finetuned_conll03_english","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_cased_finetuned_conll03_english","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_base_cased_finetuned_conll03_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_ner_en.md new file mode 100644 index 000000000000..b4c62b8529da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_ner_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English DistilBertForTokenClassification Base Cased model (from 51la5) +author: John Snow Labs +name: distilbert_token_classifier_base_ner +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-NER` is an English model originally trained by `51la5`. 
+ +## Predicted Entities + +`PER`, `ORG`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_ner_en_5.2.0_3.0_1700518661626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_ner_en_5.2.0_3.0_1700518661626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_ner","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_ner","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/51la5/distilbert-base-NER +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll03_english_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll03_english_en.md new file mode 100644 index 000000000000..f20c2f52cea6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll03_english_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English DistilBertForTokenClassification Base Uncased model (from elastic) +author: John Snow Labs +name: distilbert_token_classifier_base_uncased_finetuned_conll03_english +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-conll03-english` is an English model originally trained by `elastic`. 
+ +## Predicted Entities + +`PER`, `ORG`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_finetuned_conll03_english_en_5.2.0_3.0_1700518841770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_finetuned_conll03_english_en_5.2.0_3.0_1700518841770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_finetuned_conll03_english","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_finetuned_conll03_english","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_base_uncased_finetuned_conll03_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/elastic/distilbert-base-uncased-finetuned-conll03-english +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll2003_en.md new file mode 100644 index 000000000000..1e37e91dba79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_finetuned_conll2003_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Base Uncased model (from Datasaur) +author: John Snow Labs +name: distilbert_token_classifier_base_uncased_finetuned_conll2003 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-conll2003` is an English model originally trained by `Datasaur`. 
+ +## Predicted Entities + +`PER`, `ORG`, `LOC`, `MISC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_finetuned_conll2003_en_5.2.0_3.0_1700519401089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_finetuned_conll2003_en_5.2.0_3.0_1700519401089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_finetuned_conll2003","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_finetuned_conll2003","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_base_uncased_finetuned_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Datasaur/distilbert-base-uncased-finetuned-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_ft_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_ft_conll2003_en.md new file mode 100644 index 000000000000..5bd2c01610a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_base_uncased_ft_conll2003_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English DistilBertForTokenClassification Base Uncased model (from sarahmiller137) +author: John Snow Labs +name: distilbert_token_classifier_base_uncased_ft_conll2003 +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-ft-conll2003` is an English model originally trained by `sarahmiller137`. 
+ +## Predicted Entities + +`PER`, `ORG`, `MISC`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_ft_conll2003_en_5.2.0_3.0_1700518633839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_base_uncased_ft_conll2003_en_5.2.0_3.0_1700518633839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_ft_conll2003","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_base_uncased_ft_conll2003","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_base_uncased_ft_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/sarahmiller137/distilbert-base-uncased-ft-conll2003 +- https://aclanthology.org/W03-0419 +- https://paperswithcode.com/sota?task=Token+Classification&dataset=conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_cpener_test_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_cpener_test_en.md new file mode 100644 index 000000000000..bac28e0b130e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_cpener_test_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from Neurona) +author: John Snow Labs +name: distilbert_token_classifier_cpener_test +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cpener-test` is an English model originally trained by `Neurona`. 
+ +## Predicted Entities + +`cpe_vendor`, `cpe_version`, `cpe_product` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_cpener_test_en_5.2.0_3.0_1700519589689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_cpener_test_en_5.2.0_3.0_1700519589689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_cpener_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_cpener_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_cpener_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/Neurona/cpener-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_directquote_sentlevel_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_directquote_sentlevel_distilbert_en.md new file mode 100644 index 000000000000..38f7cd74a21b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_directquote_sentlevel_distilbert_en.md @@ -0,0 +1,111 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from whispAI) +author: John Snow Labs +name: distilbert_token_classifier_directquote_sentlevel_distilbert +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `DirectQuote-SentLevel-DistilBERT` is an English model originally trained by `whispAI`. 
+ +## Predicted Entities + +`LeftSpeaker`, `Out`, `Speaker`, `RightSpeaker`, `Unknown` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_directquote_sentlevel_distilbert_en_5.2.0_3.0_1700518836292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_directquote_sentlevel_distilbert_en_5.2.0_3.0_1700518836292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_directquote_sentlevel_distilbert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_directquote_sentlevel_distilbert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_directquote_sentlevel_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/whispAI/DirectQuote-SentLevel-DistilBERT +- #quote-extraction--attribution-on-directquotehttpsarxivorgabs211007827-dataset-with-bert-based-token-classification-💬 +- https://arxiv.org/abs/2110.07827 +- https://arxiv.org/abs/2110.07827 +- https://arxiv.org/abs/2110.07827 +- https://www.theguardian.com/info/2021/nov/25/talking-sense-using-machine-learning-to-understand-quotes +- https://arxiv.org/abs/2110.07827 +- https://stanfordnlp.github.io/CoreNLP/quote.html +- https://textacy.readthedocs.io/en/latest/api_reference/extract.html#textacy.extract.triples.direct_quotations +- https://stanfordnlp.github.io/CoreNLP/quote.html +- https://arxiv.org/abs/2110.07827 +- https://textacy.readthedocs.io/en/latest/api_reference/extract.html#textacy.extract.triples.direct_quotations \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_fastpdn_distiluse_pl.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_fastpdn_distiluse_pl.md new file mode 100644 index 000000000000..430bc73023e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_fastpdn_distiluse_pl.md @@ -0,0 +1,103 @@ +--- +layout: model +title: Polish DistilBertForTokenClassification Cased model (from clarin-pl) +author: John Snow Labs +name: distilbert_token_classifier_fastpdn_distiluse +date: 2023-11-20 +tags: [pl, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: pl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx 
+annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `FastPDN-distiluse` is a Polish model originally trained by `clarin-pl`. + +## Predicted Entities + +`nam_fac_road`, `nam_pro_title_article`, `nam_fac_goe`, `nam_eve`, `nam_adj_country`, `nam_eve_human_holiday`, `nam_num_house`, `nam_org_company`, `nam_oth_currency`, `nam_fac_bridge`, `nam_liv_god`, `nam_fac_goe_stop`, `nam_pro_media_tv`, `nam_loc_gpe_admin3`, `nam_org_political_party`, `nam_oth`, `nam_pro_brand`, `nam_fac_park`, `nam_loc_gpe_city`, `nam_loc_hydronym_sea`, `nam_pro_media_web`, `nam_loc_gpe_conurbation`, `nam_loc_land_peak`, `nam_fac_system`, `nam_loc_gpe_district`, `nam_loc_land_island`, `nam_org_organization_sub`, `nam_loc_gpe_admin2`, `nam_adj_city`, `nam_liv_character`, `nam_pro_title_book`, `nam_loc_hydronym_lake`, `nam_loc_astronomical`, `nam_pro_award`, `nam_pro_title_tv`, `nam_loc`, `nam_loc_hydronym_river`, `nam_oth_position`, `nam_pro_vehicle`, `nam_org_institution`, `nam_pro_media`, `nam_pro_model_car`, `nam_org_group_team`, `nam_pro_software_game`, `nam_loc_land`, `nam_oth_tech`, `nam_loc_gpe_admin1`, `nam_adj_person`, `nam_loc_land_mountain`, `nam_liv_person`, `nam_eve_human_sport`, `nam_liv_animal`, `nam_oth_license`, `nam_oth_www`, `nam_loc_hydronym_ocean`, `nam_liv_habitant`, `nam_eve_human`, `nam_loc_land_continent`, `nam_org_nation`, `nam_pro_title_document`, `nam_pro_media_radio`, `nam_loc_country_region`, `nam_eve_human_cultural`, `nam_loc_hydronym`, `nam_loc_gpe_country`, `nam_oth_data_format`, `nam_num_phone`, `nam_loc_historical_region`, `nam_adj`, `nam_org_group_band`, `nam_pro_software`, `nam_pro_title_song`, `nam_loc_land_region`, `nam_pro`, `nam_org_organization`, `nam_pro_title_album`, `nam_org_group`, 
`nam_loc_gpe_subdivision`, `nam_pro_title_treaty`, `nam_fac_square`, `nam_pro_media_periodic`, `nam_pro_title` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_fastpdn_distiluse_pl_5.2.0_3.0_1700519105296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_fastpdn_distiluse_pl_5.2.0_3.0_1700519105296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_fastpdn_distiluse","pl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_fastpdn_distiluse","pl") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_fastpdn_distiluse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|pl| +|Size:|508.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/clarin-pl/FastPDN-distiluse +- https://gitlab.clarin-pl.eu/information-extraction/poldeepner2 +- https://gitlab.clarin-pl.eu/grupa-wieszcz/ner/fast-pdn +- https://clarin-pl.eu/dspace/bitstream/handle/11321/294/WytyczneKPWr-jednostkiidentyfikacyjne.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_icelandic_ner_is.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_icelandic_ner_is.md new file mode 100644 index 000000000000..886ceae4344c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_icelandic_ner_is.md @@ -0,0 +1,101 @@ +--- +layout: model +title: Icelandic DistilBertForTokenClassification Cased model (from m3hrdadfi) +author: John Snow Labs +name: distilbert_token_classifier_icelandic_ner +date: 2023-11-20 +tags: [is, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: is +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icelandic-ner-distilbert` is an Icelandic model originally trained by `m3hrdadfi`. 
+ +## Predicted Entities + +`Money`, `Date`, `Time`, `Percent`, `Miscellaneous`, `Location`, `Person`, `Organization` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_icelandic_ner_is_5.2.0_3.0_1700518398506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_icelandic_ner_is_5.2.0_3.0_1700518398506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_icelandic_ner","is") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_icelandic_ner","is") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_icelandic_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|is| +|Size:|505.4 MB| + +## References + +References + +- https://huggingface.co/m3hrdadfi/icelandic-ner-distilbert +- http://hdl.handle.net/20.500.12537/42 +- https://en.ru.is/ +- https://github.com/m3hrdadfi/icelandic-ner/issues \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_inspec_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_inspec_en.md new file mode 100644 index 000000000000..7855823cd8ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_inspec_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ml6team) +author: John Snow Labs +name: distilbert_token_classifier_keyphrase_extraction_inspec +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyphrase-extraction-distilbert-inspec` is an English model originally trained by `ml6team`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_inspec_en_5.2.0_3.0_1700519032290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_inspec_en_5.2.0_3.0_1700519032290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_inspec","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_inspec","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_keyphrase_extraction_inspec| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/keyphrase-extraction-distilbert-inspec +- https://dl.acm.org/doi/10.3115/1119355.1119383 +- https://paperswithcode.com/sota?task=Keyphrase+Extraction&dataset=inspec \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_kptimes_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_kptimes_en.md new file mode 100644 index 000000000000..1ae4122105db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_kptimes_en.md @@ -0,0 +1,102 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ml6team) +author: John Snow Labs +name: distilbert_token_classifier_keyphrase_extraction_kptimes +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyphrase-extraction-distilbert-kptimes` is an English model originally trained by `ml6team`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_kptimes_en_5.2.0_3.0_1700518595417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_kptimes_en_5.2.0_3.0_1700518595417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_kptimes","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_kptimes","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_keyphrase_extraction_kptimes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/keyphrase-extraction-distilbert-kptimes +- https://arxiv.org/abs/1911.12559 +- https://paperswithcode.com/sota?task=Keyphrase+Extraction&dataset=kptimes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_openkp_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_openkp_en.md new file mode 100644 index 000000000000..5f3bd83982ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_keyphrase_extraction_openkp_en.md @@ -0,0 +1,103 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from ml6team) +author: John Snow Labs +name: distilbert_token_classifier_keyphrase_extraction_openkp +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `keyphrase-extraction-distilbert-openkp` is an English model originally trained by `ml6team`. 
+ +## Predicted Entities + +`KEY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_openkp_en_5.2.0_3.0_1700519246259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_keyphrase_extraction_openkp_en_5.2.0_3.0_1700519246259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_openkp","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_keyphrase_extraction_openkp","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
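The download links on this card end in a file name that appears to encode the model name, language, Spark NLP version, Spark version, and a timestamp. A small sketch of splitting such a name apart follows. The field layout is inferred from the URLs and front matter in this card rather than from any documentation, and `parse_model_filename` is a hypothetical helper.

```python
import re
from typing import Dict

# Inferred layout (an assumption):
# <name>_<lang>_<spark nlp version>_<spark version>_<timestamp>.zip
_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<sparknlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<timestamp>\d+)\.zip$"
)


def parse_model_filename(filename: str) -> Dict[str, str]:
    """Split a model archive name into its inferred fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return match.groupdict()


parts = parse_model_filename(
    "distilbert_token_classifier_keyphrase_extraction_openkp_en_5.2.0_3.0_1700519246259.zip"
)
print(parts)
```

The `name` and `lang` fields correspond to the two arguments passed to `pretrained(...)` in the snippet above.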
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_keyphrase_extraction_openkp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/ml6team/keyphrase-extraction-distilbert-openkp +- https://github.com/microsoft/OpenKP +- https://arxiv.org/abs/1911.02671 +- https://paperswithcode.com/sota?task=Keyphrase+Extraction&dataset=openkp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_ner_roles_openapi_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_ner_roles_openapi_en.md new file mode 100644 index 000000000000..8137de73a736 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_ner_roles_openapi_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from f2io) +author: John Snow Labs +name: distilbert_token_classifier_ner_roles_openapi +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner-roles-openapi` is an English model originally trained by `f2io`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_ner_roles_openapi_en_5.2.0_3.0_1700518800507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_ner_roles_openapi_en_5.2.0_3.0_1700518800507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_ner_roles_openapi","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_ner_roles_openapi","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_ner_roles_openapi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/f2io/ner-roles-openapi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_sec_example_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_sec_example_en.md new file mode 100644 index 000000000000..0c4f2c17505b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_sec_example_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForTokenClassification Cased model (from TomUdale) +author: John Snow Labs +name: distilbert_token_classifier_sec_example +date: 2023-11-20 +tags: [en, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sec_example` is an English model originally trained by `TomUdale`. 
+ +## Predicted Entities + +`PER`, `ORG`, `MISC`, `LOC` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_sec_example_en_5.2.0_3.0_1700519241417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_sec_example_en_5.2.0_3.0_1700519241417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_sec_example","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_sec_example","en") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_sec_example| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/TomUdale/sec_example \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_typo_detector_is.md b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_typo_detector_is.md new file mode 100644 index 000000000000..5e54103e3f12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilbert_token_classifier_typo_detector_is.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Icelandic DistilBertForTokenClassification Cased model (from m3hrdadfi) +author: John Snow Labs +name: distilbert_token_classifier_typo_detector +date: 2023-11-20 +tags: [is, open_source, distilbert, token_classification, ner, onnx] +task: Named Entity Recognition +language: is +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `typo-detector-distilbert-is` is an Icelandic model originally trained by `m3hrdadfi`. 
+ +## Predicted Entities + +`TYPO` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_typo_detector_is_5.2.0_3.0_1700519754448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_classifier_typo_detector_is_5.2.0_3.0_1700519754448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_typo_detector","is") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_classifier_typo_detector","is") + .setInputCols(Array("document", "token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("is.ner.distil_bert").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_classifier_typo_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[sentence, token]| +|Output Labels:|[ner]| +|Language:|is| +|Size:|505.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/m3hrdadfi/typo-detector-distilbert-is +- https://github.com/m3hrdadfi/typo-detector/issues \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distiled_flip_model_emotion_alpha_0_8_epoch7_v1_en.md b/docs/_posts/ahmedlone127/2023-11-20-distiled_flip_model_emotion_alpha_0_8_epoch7_v1_en.md new file mode 100644 index 000000000000..523e522f46c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distiled_flip_model_emotion_alpha_0_8_epoch7_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distiled_flip_model_emotion_alpha_0_8_epoch7_v1 DistilBertForSequenceClassification from ArafatBHossain +author: John Snow Labs +name: distiled_flip_model_emotion_alpha_0_8_epoch7_v1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distiled_flip_model_emotion_alpha_0_8_epoch7_v1` is an English model originally trained by ArafatBHossain. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distiled_flip_model_emotion_alpha_0_8_epoch7_v1_en_5.2.0_3.0_1700478693854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distiled_flip_model_emotion_alpha_0_8_epoch7_v1_en_5.2.0_3.0_1700478693854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distiled_flip_model_emotion_alpha_0_8_epoch7_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distiled_flip_model_emotion_alpha_0_8_epoch7_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
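After `transform`, each row carries the predicted label in `class.result`. A minimal sketch of picking the highest-scoring label from per-label scores follows. It assumes the scores are available as a mapping from label to a string-encoded probability containing only label entries (for example, taken from the annotation metadata), which this card does not document. The label names and the helper `top_label` are hypothetical.

```python
from typing import Dict, Tuple


def top_label(scores: Dict[str, str]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Scores are assumed to arrive as strings, so convert before comparing.
    parsed = {label: float(value) for label, value in scores.items()}
    best = max(parsed, key=parsed.get)
    return best, parsed[best]


# Hypothetical scores for one document; the label names are illustrative only.
example_scores = {"joy": "0.81", "sadness": "0.12", "anger": "0.07"}
label, score = top_label(example_scores)
print(label, score)
```

Comparing the winning score against a threshold is one way to flag low-confidence predictions for review; the threshold value would need to be chosen per dataset.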
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distiled_flip_model_emotion_alpha_0_8_epoch7_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ArafatBHossain/distiled_flip_model_emotion_alpha_0.8_epoch7_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distillbert_base_uncased_finetuned_clinc_mrpark97_en.md b/docs/_posts/ahmedlone127/2023-11-20-distillbert_base_uncased_finetuned_clinc_mrpark97_en.md new file mode 100644 index 000000000000..0bbeac406bbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distillbert_base_uncased_finetuned_clinc_mrpark97_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distillbert_base_uncased_finetuned_clinc_mrpark97 DistilBertForSequenceClassification from MrPark97 +author: John Snow Labs +name: distillbert_base_uncased_finetuned_clinc_mrpark97 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_finetuned_clinc_mrpark97` is an English model originally trained by MrPark97. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_finetuned_clinc_mrpark97_en_5.2.0_3.0_1700477227337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_finetuned_clinc_mrpark97_en_5.2.0_3.0_1700477227337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_finetuned_clinc_mrpark97","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distillbert_base_uncased_finetuned_clinc_mrpark97","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_finetuned_clinc_mrpark97| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/MrPark97/distillbert-base-uncased-finetuned-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilled_indobert_classification_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilled_indobert_classification_en.md new file mode 100644 index 000000000000..f1165dafd68f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilled_indobert_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilled_indobert_classification DistilBertForSequenceClassification from afbudiman +author: John Snow Labs +name: distilled_indobert_classification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilled_indobert_classification` is an English model originally trained by afbudiman. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilled_indobert_classification_en_5.2.0_3.0_1700465899249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilled_indobert_classification_en_5.2.0_3.0_1700465899249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilled_indobert_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilled_indobert_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilled_indobert_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/afbudiman/distilled-indobert-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-distilrubert_tiny_2nd_finetune_epru_mmillet_en.md b/docs/_posts/ahmedlone127/2023-11-20-distilrubert_tiny_2nd_finetune_epru_mmillet_en.md new file mode 100644 index 000000000000..2a200cf4cb95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-distilrubert_tiny_2nd_finetune_epru_mmillet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilrubert_tiny_2nd_finetune_epru_mmillet DistilBertForSequenceClassification from mmillet +author: John Snow Labs +name: distilrubert_tiny_2nd_finetune_epru_mmillet +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilrubert_tiny_2nd_finetune_epru_mmillet` is an English model originally trained by mmillet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_2nd_finetune_epru_mmillet_en_5.2.0_3.0_1700480146864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_2nd_finetune_epru_mmillet_en_5.2.0_3.0_1700480146864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_2nd_finetune_epru_mmillet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("distilrubert_tiny_2nd_finetune_epru_mmillet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilrubert_tiny_2nd_finetune_epru_mmillet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|39.2 MB| + +## References + +https://huggingface.co/mmillet/distilrubert_tiny-2nd-finetune-epru \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-ecobert_powo_lifecycle_scratch_en.md b/docs/_posts/ahmedlone127/2023-11-20-ecobert_powo_lifecycle_scratch_en.md new file mode 100644 index 000000000000..3305488a87ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-ecobert_powo_lifecycle_scratch_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ecobert_powo_lifecycle_scratch DistilBertForSequenceClassification from ViktorDo +author: John Snow Labs +name: ecobert_powo_lifecycle_scratch +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ecobert_powo_lifecycle_scratch` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ecobert_powo_lifecycle_scratch_en_5.2.0_3.0_1700441029440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ecobert_powo_lifecycle_scratch_en_5.2.0_3.0_1700441029440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("ecobert_powo_lifecycle_scratch","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("ecobert_powo_lifecycle_scratch","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ecobert_powo_lifecycle_scratch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/ViktorDo/EcoBERT-POWO_Lifecycle_Scratch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-entity_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-20-entity_extraction_en.md new file mode 100644 index 000000000000..79ed64315763 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-entity_extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English entity_extraction DistilBertForTokenClassification from autoevaluate +author: John Snow Labs +name: entity_extraction +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `entity_extraction` is an English model originally trained by autoevaluate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_extraction_en_5.2.0_3.0_1700519734116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/entity_extraction_en_5.2.0_3.0_1700519734116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("entity_extraction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("entity_extraction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|entity_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/autoevaluate/entity-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-eyy_categorisation_en.md b/docs/_posts/ahmedlone127/2023-11-20-eyy_categorisation_en.md new file mode 100644 index 000000000000..b55db64f2f44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-eyy_categorisation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English eyy_categorisation DistilBertForSequenceClassification from ICFNext +author: John Snow Labs +name: eyy_categorisation +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eyy_categorisation` is an English model originally trained by ICFNext. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eyy_categorisation_en_5.2.0_3.0_1700456303469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eyy_categorisation_en_5.2.0_3.0_1700456303469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_categorisation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("eyy_categorisation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eyy_categorisation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ICFNext/EYY-Categorisation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-fake_news_bert_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-20-fake_news_bert_classifier_en.md new file mode 100644 index 000000000000..3792716e4f12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-fake_news_bert_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_bert_classifier DistilBertForSequenceClassification from ungjus +author: John Snow Labs +name: fake_news_bert_classifier +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_bert_classifier` is an English model originally trained by ungjus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_bert_classifier_en_5.2.0_3.0_1700441820847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_bert_classifier_en_5.2.0_3.0_1700441820847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_bert_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_bert_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_bert_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ungjus/Fake_News_BERT_Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-fake_news_detector_josumsc_en.md b/docs/_posts/ahmedlone127/2023-11-20-fake_news_detector_josumsc_en.md new file mode 100644 index 000000000000..586eb8af0c15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-fake_news_detector_josumsc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_detector_josumsc DistilBertForSequenceClassification from JosuMSC +author: John Snow Labs +name: fake_news_detector_josumsc +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_detector_josumsc` is an English model originally trained by JosuMSC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_detector_josumsc_en_5.2.0_3.0_1700458535896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_detector_josumsc_en_5.2.0_3.0_1700458535896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_detector_josumsc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("fake_news_detector_josumsc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_detector_josumsc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/JosuMSC/fake-news-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_distil_bert_depression_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_distil_bert_depression_en.md new file mode 100644 index 000000000000..b76141cb79d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_distil_bert_depression_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distil_bert_depression DistilBertForSequenceClassification from ShreyaR +author: John Snow Labs +name: finetuned_distil_bert_depression +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distil_bert_depression` is an English model originally trained by ShreyaR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distil_bert_depression_en_5.2.0_3.0_1700464066066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distil_bert_depression_en_5.2.0_3.0_1700464066066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distil_bert_depression","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distil_bert_depression","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distil_bert_depression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ShreyaR/finetuned-distil-bert-depression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_distilbert_multi_label_emotion_headline_3_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_distilbert_multi_label_emotion_headline_3_en.md new file mode 100644 index 000000000000..08bc040c593f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_distilbert_multi_label_emotion_headline_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_distilbert_multi_label_emotion_headline_3 DistilBertForSequenceClassification from abdulmatinomotoso +author: John Snow Labs +name: finetuned_distilbert_multi_label_emotion_headline_3 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_multi_label_emotion_headline_3` is an English model originally trained by abdulmatinomotoso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_headline_3_en_5.2.0_3.0_1700441857034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_multi_label_emotion_headline_3_en_5.2.0_3.0_1700441857034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_headline_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_distilbert_multi_label_emotion_headline_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_multi_label_emotion_headline_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/abdulmatinomotoso/finetuned-distilbert-multi-label-emotion_headline_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_iitp_pdt_review_distilbert_hinglish_big_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_iitp_pdt_review_distilbert_hinglish_big_en.md new file mode 100644 index 000000000000..e24c570d20ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_iitp_pdt_review_distilbert_hinglish_big_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_iitp_pdt_review_distilbert_hinglish_big DistilBertForSequenceClassification from aditeyabaral +author: John Snow Labs +name: finetuned_iitp_pdt_review_distilbert_hinglish_big +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_iitp_pdt_review_distilbert_hinglish_big` is an English model originally trained by aditeyabaral.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_iitp_pdt_review_distilbert_hinglish_big_en_5.2.0_3.0_1700484238592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_iitp_pdt_review_distilbert_hinglish_big_en_5.2.0_3.0_1700484238592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_iitp_pdt_review_distilbert_hinglish_big","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_iitp_pdt_review_distilbert_hinglish_big","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_iitp_pdt_review_distilbert_hinglish_big| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|251.2 MB| + +## References + +https://huggingface.co/aditeyabaral/finetuned-iitp_pdt_review-distilbert-hinglish-big \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36_en.md new file mode 100644 index 000000000000..de7ff9b7d8c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36_en_5.2.0_3.0_1700469293103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36_en_5.2.0_3.0_1700469293103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr0_0_0002_editorials_27_02_2022_19_42_36| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr0_0.0002_editorials_27_02_2022-19_42_36 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03_en.md new file mode 100644 index 000000000000..35a4fa0b0136 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03_en_5.2.0_3.0_1700458445863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03_en_5.2.0_3.0_1700458445863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr0_2e_05_all_01_03_2022_05_32_03| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr0_2e-05_all_01_03_2022-05_32_03 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41_en.md new file mode 100644 index 000000000000..db84a5423edc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41_en_5.2.0_3.0_1700479529875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41_en_5.2.0_3.0_1700479529875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr0_3e_05_webdiscourse_27_02_2022_19_27_41| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr0_3e-05_webDiscourse_27_02_2022-19_27_41 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22_en.md new file mode 100644 index 000000000000..2271bb936593 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22_en_5.2.0_3.0_1700463141984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22_en_5.2.0_3.0_1700463141984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr1_0_0002_all_27_02_2022_18_01_22| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr1_0.0002_all_27_02_2022-18_01_22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24_en.md new file mode 100644 index 000000000000..c3d61ffa4de5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24_en_5.2.0_3.0_1700470242241.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24_en_5.2.0_3.0_1700470242241.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr1_3e_05_all_27_02_2022_18_29_24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr1_3e-05_all_27_02_2022-18_29_24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01_en.md new file mode 100644 index 000000000000..4355f1813b30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01_en_5.2.0_3.0_1700493135650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01_en_5.2.0_3.0_1700493135650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr2_2e_05_all_26_02_2022_04_09_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr2_2e-05_all_26_02_2022-04_09_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05_en.md new file mode 100644 index 000000000000..2bd34dfb98e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05_en_5.2.0_3.0_1700501735973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05_en_5.2.0_3.0_1700501735973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr3_2e_05_webdiscourse_27_02_2022_18_59_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr3_2e-05_webDiscourse_27_02_2022-18_59_05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05_en.md new file mode 100644 index 000000000000..edad735a0d08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05_en_5.2.0_3.0_1700480940588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05_en_5.2.0_3.0_1700480940588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr4_2e_05_all_27_02_2022_17_50_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr4_2e-05_all_27_02_2022-17_50_05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45_en.md new file mode 100644 index 000000000000..dc5ab6bad2b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45 DistilBertForSequenceClassification from ali2066 +author: John Snow Labs +name: finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45_en_5.2.0_3.0_1700446117157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45_en_5.2.0_3.0_1700446117157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentence_itr7_2e_05_all_26_02_2022_04_36_45| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/ali2066/finetuned_sentence_itr7_2e-05_all_26_02_2022-04_36_45 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_imdb_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_imdb_en.md new file mode 100644 index 000000000000..a1f9061f1e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_imdb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_distilbert_base_uncased_on_imdb DistilBertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_distilbert_base_uncased_on_imdb +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_distilbert_base_uncased_on_imdb` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_base_uncased_on_imdb_en_5.2.0_3.0_1700458445845.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_base_uncased_on_imdb_en_5.2.0_3.0_1700458445845.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_base_uncased_on_imdb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_base_uncased_on_imdb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_distilbert_base_uncased_on_imdb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-distilbert-base-uncased-on-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_sst2_en.md new file mode 100644 index 000000000000..36da166ce448 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_distilbert_base_uncased_on_sst2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_distilbert_base_uncased_on_sst2 DistilBertForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_distilbert_base_uncased_on_sst2 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_distilbert_base_uncased_on_sst2` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_base_uncased_on_sst2_en_5.2.0_3.0_1700477128779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_distilbert_base_uncased_on_sst2_en_5.2.0_3.0_1700477128779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_base_uncased_on_sst2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_distilbert_base_uncased_on_sst2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_distilbert_base_uncased_on_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-distilbert-base-uncased-on-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_esg_sentiment_model_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_esg_sentiment_model_distilbert_en.md new file mode 100644 index 000000000000..86cb54c14a22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_esg_sentiment_model_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_esg_sentiment_model_distilbert DistilBertForSequenceClassification from Bennet1996 +author: John Snow Labs +name: finetuning_esg_sentiment_model_distilbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_esg_sentiment_model_distilbert` is an English model originally trained by Bennet1996.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_esg_sentiment_model_distilbert_en_5.2.0_3.0_1700488001943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_esg_sentiment_model_distilbert_en_5.2.0_3.0_1700488001943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_esg_sentiment_model_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_esg_sentiment_model_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_esg_sentiment_model_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/Bennet1996/finetuning-ESG-sentiment-model-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_ad7_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_ad7_en.md new file mode 100644 index 000000000000..f5133d20509a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_ad7_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_ad7 DistilBertForSequenceClassification from ad7 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_ad7 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_ad7` is an English model originally trained by ad7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_ad7_en_5.2.0_3.0_1700463176091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_ad7_en_5.2.0_3.0_1700463176091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_ad7","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_ad7","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_ad7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ad7/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_becharabouabdo_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_becharabouabdo_en.md new file mode 100644 index 000000000000..0e2e3567dd5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_becharabouabdo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_becharabouabdo DistilBertForSequenceClassification from BecharaBouAbdo +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_becharabouabdo +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_becharabouabdo` is an English model originally trained by BecharaBouAbdo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_becharabouabdo_en_5.2.0_3.0_1700493135628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_becharabouabdo_en_5.2.0_3.0_1700493135628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_becharabouabdo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_becharabouabdo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_becharabouabdo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/BecharaBouAbdo/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_edvinkxs_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_edvinkxs_en.md new file mode 100644 index 000000000000..d3a667bd4b14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_edvinkxs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_edvinkxs DistilBertForSequenceClassification from edvinkxs +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_edvinkxs +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_edvinkxs` is an English model originally trained by edvinkxs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_edvinkxs_en_5.2.0_3.0_1700464839668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_edvinkxs_en_5.2.0_3.0_1700464839668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_edvinkxs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_edvinkxs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_edvinkxs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/edvinkxs/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_geniusguy777_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_geniusguy777_en.md new file mode 100644 index 000000000000..1e89da5df80b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_geniusguy777_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_geniusguy777 DistilBertForSequenceClassification from geniusguy777 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_geniusguy777 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_geniusguy777` is an English model originally trained by geniusguy777. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_geniusguy777_en_5.2.0_3.0_1700455462055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_geniusguy777_en_5.2.0_3.0_1700455462055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_geniusguy777","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_geniusguy777","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_geniusguy777| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/geniusguy777/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_hejjo_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_hejjo_en.md new file mode 100644 index 000000000000..e3af0c81a71a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_hejjo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_hejjo DistilBertForSequenceClassification from hejjo +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_hejjo +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_hejjo` is an English model originally trained by hejjo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_hejjo_en_5.2.0_3.0_1700447084007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_hejjo_en_5.2.0_3.0_1700447084007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_hejjo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_hejjo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_hejjo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/hejjo/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_karimkhalil_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_karimkhalil_en.md new file mode 100644 index 000000000000..295d186e50c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_karimkhalil_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_karimkhalil DistilBertForSequenceClassification from KarimKhalil +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_karimkhalil +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_karimkhalil` is an English model originally trained by KarimKhalil. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_karimkhalil_en_5.2.0_3.0_1700461422630.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_karimkhalil_en_5.2.0_3.0_1700461422630.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_karimkhalil","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_karimkhalil","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_karimkhalil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/KarimKhalil/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_luttufuttu_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_luttufuttu_en.md new file mode 100644 index 000000000000..2a8511380b9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_luttufuttu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_luttufuttu DistilBertForSequenceClassification from Luttufuttu +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_luttufuttu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_luttufuttu` is an English model originally trained by Luttufuttu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_luttufuttu_en_5.2.0_3.0_1700491325853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_luttufuttu_en_5.2.0_3.0_1700491325853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_luttufuttu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_luttufuttu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_luttufuttu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Luttufuttu/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_nastorian_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_nastorian_en.md new file mode 100644 index 000000000000..11edd6026443 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_nastorian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_nastorian DistilBertForSequenceClassification from nastorian +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_nastorian +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_nastorian` is an English model originally trained by nastorian. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_nastorian_en_5.2.0_3.0_1700481775691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_nastorian_en_5.2.0_3.0_1700481775691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_nastorian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_nastorian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_nastorian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/nastorian/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_saptarshidatta96_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_saptarshidatta96_en.md new file mode 100644 index 000000000000..1ffbdc1ec6f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_saptarshidatta96_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_saptarshidatta96 DistilBertForSequenceClassification from saptarshidatta96 +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_saptarshidatta96 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_saptarshidatta96` is an English model originally trained by saptarshidatta96. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_saptarshidatta96_en_5.2.0_3.0_1700451211680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_saptarshidatta96_en_5.2.0_3.0_1700451211680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_saptarshidatta96","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_saptarshidatta96","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_saptarshidatta96| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/saptarshidatta96/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_soldierofgod_rick_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_soldierofgod_rick_en.md new file mode 100644 index 000000000000..5681dca357d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_soldierofgod_rick_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_soldierofgod_rick DistilBertForSequenceClassification from SoldierOfGod-Rick +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_soldierofgod_rick +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_soldierofgod_rick` is an English model originally trained by SoldierOfGod-Rick. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_soldierofgod_rick_en_5.2.0_3.0_1700440099192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_soldierofgod_rick_en_5.2.0_3.0_1700440099192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_soldierofgod_rick","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_soldierofgod_rick","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_soldierofgod_rick| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/SoldierOfGod-Rick/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_sukhendrasingh_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_sukhendrasingh_en.md new file mode 100644 index 000000000000..2e2edf1ea91a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_3000_samples_sukhendrasingh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_3000_samples_sukhendrasingh DistilBertForSequenceClassification from sukhendrasingh +author: John Snow Labs +name: finetuning_sentiment_model_3000_samples_sukhendrasingh +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_3000_samples_sukhendrasingh` is an English model originally trained by sukhendrasingh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sukhendrasingh_en_5.2.0_3.0_1700479198846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_3000_samples_sukhendrasingh_en_5.2.0_3.0_1700479198846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sukhendrasingh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_3000_samples_sukhendrasingh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_3000_samples_sukhendrasingh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/sukhendrasingh/finetuning-sentiment-model-3000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_4500_lyrics_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_4500_lyrics_en.md new file mode 100644 index 000000000000..71b4c9b901ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_4500_lyrics_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_4500_lyrics DistilBertForSequenceClassification from amanda-cristina +author: John Snow Labs +name: finetuning_sentiment_model_4500_lyrics +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_4500_lyrics` is an English model originally trained by amanda-cristina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_4500_lyrics_en_5.2.0_3.0_1700448946349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_4500_lyrics_en_5.2.0_3.0_1700448946349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_4500_lyrics","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_4500_lyrics","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_4500_lyrics| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/amanda-cristina/finetuning-sentiment-model-4500-lyrics \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_9000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_9000_samples_en.md new file mode 100644 index 000000000000..e93ca0cf57a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_9000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_9000_samples DistilBertForSequenceClassification from SentiAnal +author: John Snow Labs +name: finetuning_sentiment_model_9000_samples +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_9000_samples` is an English model originally trained by SentiAnal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_9000_samples_en_5.2.0_3.0_1700445106516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_9000_samples_en_5.2.0_3.0_1700445106516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_9000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_9000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_9000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/SentiAnal/finetuning-sentiment-model-9000-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_tweet_bert_en.md b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_tweet_bert_en.md new file mode 100644 index 000000000000..57275e53c9b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-finetuning_sentiment_model_tweet_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_tweet_bert DistilBertForSequenceClassification from LYTinn +author: John Snow Labs +name: finetuning_sentiment_model_tweet_bert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_tweet_bert` is an English model originally trained by LYTinn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_tweet_bert_en_5.2.0_3.0_1700496105637.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_tweet_bert_en_5.2.0_3.0_1700496105637.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_tweet_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("finetuning_sentiment_model_tweet_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_tweet_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/LYTinn/finetuning-sentiment-model-tweet-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-flood_detection_yasser_kaddoura_en.md b/docs/_posts/ahmedlone127/2023-11-20-flood_detection_yasser_kaddoura_en.md new file mode 100644 index 000000000000..0a3eea43e0d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-flood_detection_yasser_kaddoura_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English flood_detection_yasser_kaddoura DistilBertForSequenceClassification from yasser-kaddoura +author: John Snow Labs +name: flood_detection_yasser_kaddoura +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `flood_detection_yasser_kaddoura` is an English model originally trained by yasser-kaddoura. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/flood_detection_yasser_kaddoura_en_5.2.0_3.0_1700494006832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/flood_detection_yasser_kaddoura_en_5.2.0_3.0_1700494006832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("flood_detection_yasser_kaddoura","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("flood_detection_yasser_kaddoura","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|flood_detection_yasser_kaddoura| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/yasser-kaddoura/flood_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-food_dbert_multiling_en.md b/docs/_posts/ahmedlone127/2023-11-20-food_dbert_multiling_en.md new file mode 100644 index 000000000000..632d405b53f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-food_dbert_multiling_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English food_dbert_multiling DistilBertForTokenClassification from Dev-DGT +author: John Snow Labs +name: food_dbert_multiling +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `food_dbert_multiling` is an English model originally trained by Dev-DGT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/food_dbert_multiling_en_5.2.0_3.0_1700523152969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/food_dbert_multiling_en_5.2.0_3.0_1700523152969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("food_dbert_multiling","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("food_dbert_multiling", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|food_dbert_multiling| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Dev-DGT/food-dbert-multiling \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-genres_distill_en.md b/docs/_posts/ahmedlone127/2023-11-20-genres_distill_en.md new file mode 100644 index 000000000000..4feee22c662d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-genres_distill_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English genres_distill DistilBertForSequenceClassification from ssharoff +author: John Snow Labs +name: genres_distill +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `genres_distill` is an English model originally trained by ssharoff. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/genres_distill_en_5.2.0_3.0_1700441097418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/genres_distill_en_5.2.0_3.0_1700441097418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("genres_distill","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("genres_distill","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|genres_distill| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/ssharoff/genres-distill \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-hraf_event_demo_en.md b/docs/_posts/ahmedlone127/2023-11-20-hraf_event_demo_en.md new file mode 100644 index 000000000000..ba8de4c31336 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-hraf_event_demo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hraf_event_demo DistilBertForSequenceClassification from Chantland +author: John Snow Labs +name: hraf_event_demo +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hraf_event_demo` is an English model originally trained by Chantland. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hraf_event_demo_en_5.2.0_3.0_1700477128708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hraf_event_demo_en_5.2.0_3.0_1700477128708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("hraf_event_demo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("hraf_event_demo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hraf_event_demo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/Chantland/HRAF_EVENT_Demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-kd_distilbert_clinc_davidaponte_en.md b/docs/_posts/ahmedlone127/2023-11-20-kd_distilbert_clinc_davidaponte_en.md new file mode 100644 index 000000000000..e38df1612e0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-kd_distilbert_clinc_davidaponte_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English kd_distilbert_clinc_davidaponte DistilBertForSequenceClassification from davidaponte +author: John Snow Labs +name: kd_distilbert_clinc_davidaponte +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kd_distilbert_clinc_davidaponte` is an English model originally trained by davidaponte. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kd_distilbert_clinc_davidaponte_en_5.2.0_3.0_1700451432324.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kd_distilbert_clinc_davidaponte_en_5.2.0_3.0_1700451432324.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("kd_distilbert_clinc_davidaponte","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("kd_distilbert_clinc_davidaponte","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kd_distilbert_clinc_davidaponte| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.9 MB| + +## References + +https://huggingface.co/davidaponte/kd-distilBERT-clinc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-knowledge_graph_nlp_en.md b/docs/_posts/ahmedlone127/2023-11-20-knowledge_graph_nlp_en.md new file mode 100644 index 000000000000..3ca00c361d87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-knowledge_graph_nlp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English knowledge_graph_nlp DistilBertForTokenClassification from vishnun +author: John Snow Labs +name: knowledge_graph_nlp +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `knowledge_graph_nlp` is an English model originally trained by vishnun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/knowledge_graph_nlp_en_5.2.0_3.0_1700519488893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/knowledge_graph_nlp_en_5.2.0_3.0_1700519488893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("knowledge_graph_nlp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("knowledge_graph_nlp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|knowledge_graph_nlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vishnun/knowledge-graph-nlp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-lm_ner_linkedin_skills_recognition_en.md b/docs/_posts/ahmedlone127/2023-11-20-lm_ner_linkedin_skills_recognition_en.md new file mode 100644 index 000000000000..bae9fc475203 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-lm_ner_linkedin_skills_recognition_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English lm_ner_linkedin_skills_recognition DistilBertForTokenClassification from algiraldohe +author: John Snow Labs +name: lm_ner_linkedin_skills_recognition +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lm_ner_linkedin_skills_recognition` is an English model originally trained by algiraldohe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lm_ner_linkedin_skills_recognition_en_5.2.0_3.0_1700520525717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lm_ner_linkedin_skills_recognition_en_5.2.0_3.0_1700520525717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("lm_ner_linkedin_skills_recognition","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("lm_ner_linkedin_skills_recognition", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lm_ner_linkedin_skills_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/algiraldohe/lm-ner-linkedin-skills-recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-mdistilbert_base_cased_ukrainian_toxicity_uk.md b/docs/_posts/ahmedlone127/2023-11-20-mdistilbert_base_cased_ukrainian_toxicity_uk.md new file mode 100644 index 000000000000..dcba2a8d6fd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-mdistilbert_base_cased_ukrainian_toxicity_uk.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Ukrainian mdistilbert_base_cased_ukrainian_toxicity DistilBertForSequenceClassification from dardem +author: John Snow Labs +name: mdistilbert_base_cased_ukrainian_toxicity +date: 2023-11-20 +tags: [bert, uk, open_source, sequence_classification, onnx] +task: Text Classification +language: uk +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mdistilbert_base_cased_ukrainian_toxicity` is a Ukrainian model originally trained by dardem. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mdistilbert_base_cased_ukrainian_toxicity_uk_5.2.0_3.0_1700454236766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mdistilbert_base_cased_ukrainian_toxicity_uk_5.2.0_3.0_1700454236766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mdistilbert_base_cased_ukrainian_toxicity","uk")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mdistilbert_base_cased_ukrainian_toxicity","uk") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mdistilbert_base_cased_ukrainian_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|uk| +|Size:|507.6 MB| + +## References + +https://huggingface.co/dardem/mdistilbert-base-cased-uk-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-media_bias_ukraine_dataset_all_minus_ukraine_en.md b/docs/_posts/ahmedlone127/2023-11-20-media_bias_ukraine_dataset_all_minus_ukraine_en.md new file mode 100644 index 000000000000..a580c834abed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-media_bias_ukraine_dataset_all_minus_ukraine_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English media_bias_ukraine_dataset_all_minus_ukraine DistilBertForSequenceClassification from franfj +author: John Snow Labs +name: media_bias_ukraine_dataset_all_minus_ukraine +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `media_bias_ukraine_dataset_all_minus_ukraine` is an English model originally trained by franfj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_minus_ukraine_en_5.2.0_3.0_1700445026732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/media_bias_ukraine_dataset_all_minus_ukraine_en_5.2.0_3.0_1700445026732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all_minus_ukraine","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("media_bias_ukraine_dataset_all_minus_ukraine","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|media_bias_ukraine_dataset_all_minus_ukraine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/franfj/media-bias-ukraine-dataset-all-minus-ukraine \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-medicine_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-medicine_ner_en.md new file mode 100644 index 000000000000..fe727e33d698 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-medicine_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medicine_ner DistilBertForTokenClassification from jarvisx17 +author: John Snow Labs +name: medicine_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medicine_ner` is an English model originally trained by jarvisx17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medicine_ner_en_5.2.0_3.0_1700524755802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medicine_ner_en_5.2.0_3.0_1700524755802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("medicine_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("medicine_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medicine_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jarvisx17/medicine-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-mnli_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-20-mnli_distilbert_base_cased_en.md new file mode 100644 index 000000000000..54eabf944950 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-mnli_distilbert_base_cased_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mnli_distilbert_base_cased DistilBertForSequenceClassification from boychaboy +author: John Snow Labs +name: mnli_distilbert_base_cased +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mnli_distilbert_base_cased` is an English model originally trained by boychaboy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mnli_distilbert_base_cased_en_5.2.0_3.0_1700454722372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mnli_distilbert_base_cased_en_5.2.0_3.0_1700454722372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("mnli_distilbert_base_cased","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("mnli_distilbert_base_cased","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mnli_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|246.0 MB| + +## References + +https://huggingface.co/boychaboy/MNLI_distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-news_classifier_distilbert_base_uncased_subject_only_en.md b/docs/_posts/ahmedlone127/2023-11-20-news_classifier_distilbert_base_uncased_subject_only_en.md new file mode 100644 index 000000000000..9bc3ad640b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-news_classifier_distilbert_base_uncased_subject_only_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English news_classifier_distilbert_base_uncased_subject_only DistilBertForSequenceClassification from andypyc +author: John Snow Labs +name: news_classifier_distilbert_base_uncased_subject_only +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `news_classifier_distilbert_base_uncased_subject_only` is an English model originally trained by andypyc. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/news_classifier_distilbert_base_uncased_subject_only_en_5.2.0_3.0_1700458445886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/news_classifier_distilbert_base_uncased_subject_only_en_5.2.0_3.0_1700458445886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_classifier_distilbert_base_uncased_subject_only","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("news_classifier_distilbert_base_uncased_subject_only","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|news_classifier_distilbert_base_uncased_subject_only| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/andypyc/news_classifier-distilbert-base-uncased-subject-only \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-nlp_deep_project_en.md b/docs/_posts/ahmedlone127/2023-11-20-nlp_deep_project_en.md new file mode 100644 index 000000000000..905d3308d207 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-nlp_deep_project_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_deep_project DistilBertForSequenceClassification from bamertl +author: John Snow Labs +name: nlp_deep_project +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_deep_project` is an English model originally trained by bamertl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_deep_project_en_5.2.0_3.0_1700486122586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_deep_project_en_5.2.0_3.0_1700486122586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_deep_project","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_deep_project","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_deep_project| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/bamertl/nlp_deep_project \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-nlp_sentiment_project_2001_samples_en.md b/docs/_posts/ahmedlone127/2023-11-20-nlp_sentiment_project_2001_samples_en.md new file mode 100644 index 000000000000..695506b89e60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-nlp_sentiment_project_2001_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_sentiment_project_2001_samples DistilBertForSequenceClassification from bwhite5311 +author: John Snow Labs +name: nlp_sentiment_project_2001_samples +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_sentiment_project_2001_samples` is an English model originally trained by bwhite5311. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_sentiment_project_2001_samples_en_5.2.0_3.0_1700466618319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_sentiment_project_2001_samples_en_5.2.0_3.0_1700466618319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_sentiment_project_2001_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("nlp_sentiment_project_2001_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_sentiment_project_2001_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/bwhite5311/NLP-sentiment-project-2001-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-output_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-20-output_text_classification_en.md new file mode 100644 index 000000000000..4d5fb8ba48a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-output_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English output_text_classification DistilBertForSequenceClassification from vihaim +author: John Snow Labs +name: output_text_classification +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output_text_classification` is an English model originally trained by vihaim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/output_text_classification_en_5.2.0_3.0_1700457422498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/output_text_classification_en_5.2.0_3.0_1700457422498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("output_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("output_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|output_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/vihaim/output_text_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-pat_classifier_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-pat_classifier_distilbert_en.md new file mode 100644 index 000000000000..79959a5f8600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-pat_classifier_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pat_classifier_distilbert DistilBertForSequenceClassification from leoliu +author: John Snow Labs +name: pat_classifier_distilbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pat_classifier_distilbert` is an English model originally trained by leoliu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pat_classifier_distilbert_en_5.2.0_3.0_1700444218820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pat_classifier_distilbert_en_5.2.0_3.0_1700444218820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("pat_classifier_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("pat_classifier_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pat_classifier_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/leoliu/pat_classifier_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-predict_perception_bertino_cause_object_en.md b/docs/_posts/ahmedlone127/2023-11-20-predict_perception_bertino_cause_object_en.md new file mode 100644 index 000000000000..115aff22aef9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-predict_perception_bertino_cause_object_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English predict_perception_bertino_cause_object DistilBertForSequenceClassification from gossminn +author: John Snow Labs +name: predict_perception_bertino_cause_object +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `predict_perception_bertino_cause_object` is an English model originally trained by gossminn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_cause_object_en_5.2.0_3.0_1700479598661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/predict_perception_bertino_cause_object_en_5.2.0_3.0_1700479598661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_cause_object","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("predict_perception_bertino_cause_object","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|predict_perception_bertino_cause_object| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|255.2 MB| + +## References + +https://huggingface.co/gossminn/predict-perception-bertino-cause-object \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-products_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-products_ner_en.md new file mode 100644 index 000000000000..4f1d74e7429a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-products_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English products_ner DistilBertForTokenClassification from rowdy-store +author: John Snow Labs +name: products_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `products_ner` is an English model originally trained by rowdy-store. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/products_ner_en_5.2.0_3.0_1700522257521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/products_ner_en_5.2.0_3.0_1700522257521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("products_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("products_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|products_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rowdy-store/products-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-reddit_comment_sentiment_changed_en.md b/docs/_posts/ahmedlone127/2023-11-20-reddit_comment_sentiment_changed_en.md new file mode 100644 index 000000000000..a060ad81d1e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-reddit_comment_sentiment_changed_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reddit_comment_sentiment_changed DistilBertForSequenceClassification from AG6019 +author: John Snow Labs +name: reddit_comment_sentiment_changed +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reddit_comment_sentiment_changed` is an English model originally trained by AG6019. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reddit_comment_sentiment_changed_en_5.2.0_3.0_1700452442085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reddit_comment_sentiment_changed_en_5.2.0_3.0_1700452442085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("reddit_comment_sentiment_changed","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("reddit_comment_sentiment_changed","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reddit_comment_sentiment_changed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/AG6019/reddit-comment-sentiment-changed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-resume_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-resume_ner_en.md new file mode 100644 index 000000000000..a6b23b13f9a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-resume_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English resume_ner DistilBertForTokenClassification from manishiitg +author: John Snow Labs +name: resume_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `resume_ner` is an English model originally trained by manishiitg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/resume_ner_en_5.2.0_3.0_1700520525713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/resume_ner_en_5.2.0_3.0_1700520525713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("resume_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("resume_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|resume_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/manishiitg/resume-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_anindabitm_en.md b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_anindabitm_en.md new file mode 100644 index 000000000000..75e8e289c513 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_anindabitm_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sagemaker_distilbert_emotion_anindabitm DistilBertForSequenceClassification from anindabitm +author: John Snow Labs +name: sagemaker_distilbert_emotion_anindabitm +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_distilbert_emotion_anindabitm` is an English model originally trained by anindabitm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_anindabitm_en_5.2.0_3.0_1700441834950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_anindabitm_en_5.2.0_3.0_1700441834950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_anindabitm","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_anindabitm","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_distilbert_emotion_anindabitm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/anindabitm/sagemaker-distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_lewtun_en.md b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_lewtun_en.md new file mode 100644 index 000000000000..75a12d8884e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_lewtun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sagemaker_distilbert_emotion_lewtun DistilBertForSequenceClassification from lewtun +author: John Snow Labs +name: sagemaker_distilbert_emotion_lewtun +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_distilbert_emotion_lewtun` is an English model originally trained by lewtun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_lewtun_en_5.2.0_3.0_1700472264456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_lewtun_en_5.2.0_3.0_1700472264456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_lewtun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_lewtun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_distilbert_emotion_lewtun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/lewtun/sagemaker-distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_philschmid_en.md b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_philschmid_en.md new file mode 100644 index 000000000000..2bf038d88ba8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sagemaker_distilbert_emotion_philschmid_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sagemaker_distilbert_emotion_philschmid DistilBertForSequenceClassification from philschmid +author: John Snow Labs +name: sagemaker_distilbert_emotion_philschmid +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker_distilbert_emotion_philschmid` is an English model originally trained by philschmid. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_philschmid_en_5.2.0_3.0_1700453813216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sagemaker_distilbert_emotion_philschmid_en_5.2.0_3.0_1700453813216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_philschmid","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sagemaker_distilbert_emotion_philschmid","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sagemaker_distilbert_emotion_philschmid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/philschmid/sagemaker-distilbert-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sentence_compression_en.md b/docs/_posts/ahmedlone127/2023-11-20-sentence_compression_en.md new file mode 100644 index 000000000000..1a58fab95266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sentence_compression_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sentence_compression DistilBertForTokenClassification from AlexMaclean +author: John Snow Labs +name: sentence_compression +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_compression` is an English model originally trained by AlexMaclean. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_compression_en_5.2.0_3.0_1700522541345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_compression_en_5.2.0_3.0_1700522541345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("sentence_compression","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sentence_compression", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_compression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/AlexMaclean/sentence-compression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sentiment_analysis_danlupu_en.md b/docs/_posts/ahmedlone127/2023-11-20-sentiment_analysis_danlupu_en.md new file mode 100644 index 000000000000..3e8799102e32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sentiment_analysis_danlupu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_danlupu DistilBertForSequenceClassification from danlupu +author: John Snow Labs +name: sentiment_analysis_danlupu +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_danlupu` is an English model originally trained by danlupu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_danlupu_en_5.2.0_3.0_1700442939687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_danlupu_en_5.2.0_3.0_1700442939687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_danlupu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_analysis_danlupu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_danlupu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/danlupu/sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-sentiment_model_amazon_reviews_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-sentiment_model_amazon_reviews_distilbert_en.md new file mode 100644 index 000000000000..2b7760c3e838 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-sentiment_model_amazon_reviews_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_model_amazon_reviews_distilbert DistilBertForSequenceClassification from PabloAMC +author: John Snow Labs +name: sentiment_model_amazon_reviews_distilbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_model_amazon_reviews_distilbert` is an English model originally trained by PabloAMC. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_model_amazon_reviews_distilbert_en_5.2.0_3.0_1700440162459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_model_amazon_reviews_distilbert_en_5.2.0_3.0_1700440162459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_amazon_reviews_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("sentiment_model_amazon_reviews_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_model_amazon_reviews_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/PabloAMC/sentiment-model-amazon-reviews-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-shopping_list_ner_en.md b/docs/_posts/ahmedlone127/2023-11-20-shopping_list_ner_en.md new file mode 100644 index 000000000000..0ea884e26096 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-shopping_list_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English shopping_list_ner DistilBertForTokenClassification from progg +author: John Snow Labs +name: shopping_list_ner +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shopping_list_ner` is an English model originally trained by progg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shopping_list_ner_en_5.2.0_3.0_1700523907474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shopping_list_ner_en_5.2.0_3.0_1700523907474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("shopping_list_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("shopping_list_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|shopping_list_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/progg/shopping-list-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-skills_description_julian_big_en.md b/docs/_posts/ahmedlone127/2023-11-20-skills_description_julian_big_en.md new file mode 100644 index 000000000000..1c6120a01966 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-skills_description_julian_big_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English skills_description_julian_big DistilBertForSequenceClassification from joblift-julian +author: John Snow Labs +name: skills_description_julian_big +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `skills_description_julian_big` is an English model originally trained by joblift-julian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/skills_description_julian_big_en_5.2.0_3.0_1700464144819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/skills_description_julian_big_en_5.2.0_3.0_1700464144819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_description_julian_big","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("skills_description_julian_big","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|skills_description_julian_big| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.4 MB| + +## References + +https://huggingface.co/joblift-julian/skills_description_julian_big \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-text_tonga_tonga_islands_symptom_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-20-text_tonga_tonga_islands_symptom_distilbert_en.md new file mode 100644 index 000000000000..3bb4ab2c71e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-text_tonga_tonga_islands_symptom_distilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_tonga_tonga_islands_symptom_distilbert DistilBertForSequenceClassification from DinaSalama +author: John Snow Labs +name: text_tonga_tonga_islands_symptom_distilbert +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_tonga_tonga_islands_symptom_distilbert` is an English model originally trained by DinaSalama. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_symptom_distilbert_en_5.2.0_3.0_1700485222696.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_tonga_tonga_islands_symptom_distilbert_en_5.2.0_3.0_1700485222696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_tonga_tonga_islands_symptom_distilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("text_tonga_tonga_islands_symptom_distilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_tonga_tonga_islands_symptom_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/DinaSalama/text_to_symptom_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-tiny_random_distilbertfortokenclassification_hf_internal_testing_en.md b/docs/_posts/ahmedlone127/2023-11-20-tiny_random_distilbertfortokenclassification_hf_internal_testing_en.md new file mode 100644 index 000000000000..9a22a9719c14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-tiny_random_distilbertfortokenclassification_hf_internal_testing_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_random_distilbertfortokenclassification_hf_internal_testing DistilBertForTokenClassification from hf-internal-testing +author: John Snow Labs +name: tiny_random_distilbertfortokenclassification_hf_internal_testing +date: 2023-11-20 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbertfortokenclassification_hf_internal_testing` is an English model originally trained by hf-internal-testing. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertfortokenclassification_hf_internal_testing_en_5.2.0_3.0_1700521136604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertfortokenclassification_hf_internal_testing_en_5.2.0_3.0_1700521136604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("tiny_random_distilbertfortokenclassification_hf_internal_testing","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tiny_random_distilbertfortokenclassification_hf_internal_testing", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbertfortokenclassification_hf_internal_testing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|347.3 KB| + +## References + +https://huggingface.co/hf-internal-testing/tiny-random-DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-20-uk_energy_industry_complaints_identifier_ver1_en.md b/docs/_posts/ahmedlone127/2023-11-20-uk_energy_industry_complaints_identifier_ver1_en.md new file mode 100644 index 000000000000..4cd8bfd2eb7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-20-uk_energy_industry_complaints_identifier_ver1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English uk_energy_industry_complaints_identifier_ver1 DistilBertForSequenceClassification from CalamitousVisibility +author: John Snow Labs +name: uk_energy_industry_complaints_identifier_ver1 +date: 2023-11-20 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uk_energy_industry_complaints_identifier_ver1` is an English model originally trained by CalamitousVisibility. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uk_energy_industry_complaints_identifier_ver1_en_5.2.0_3.0_1700449685328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uk_energy_industry_complaints_identifier_ver1_en_5.2.0_3.0_1700449685328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = DistilBertForSequenceClassification.pretrained("uk_energy_industry_complaints_identifier_ver1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = DistilBertForSequenceClassification.pretrained("uk_energy_industry_complaints_identifier_ver1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uk_energy_industry_complaints_identifier_ver1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|249.5 MB| + +## References + +https://huggingface.co/CalamitousVisibility/UK_Energy_Industry_Complaints_Identifier_ver1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-address_ner_german_de.md b/docs/_posts/ahmedlone127/2023-11-21-address_ner_german_de.md new file mode 100644 index 000000000000..c6c45921b11f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-address_ner_german_de.md @@ -0,0 +1,93 @@ +--- +layout: model +title: German address_ner_german DistilBertForTokenClassification from dswah +author: John Snow Labs +name: address_ner_german +date: 2023-11-21 +tags: [bert, de, open_source, token_classification, onnx] +task: Named Entity Recognition +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `address_ner_german` is a German model originally trained by dswah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/address_ner_german_de_5.2.0_3.0_1700526670428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/address_ner_german_de_5.2.0_3.0_1700526670428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("address_ner_german","de") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("address_ner_german", "de") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|address_ner_german| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|de| +|Size:|505.4 MB| + +## References + +https://huggingface.co/dswah/address-ner-de \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-affilgood_ner_test_v2_en.md b/docs/_posts/ahmedlone127/2023-11-21-affilgood_ner_test_v2_en.md new file mode 100644 index 000000000000..7b82d1fae1a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-affilgood_ner_test_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English affilgood_ner_test_v2 DistilBertForTokenClassification from nicolauduran45 +author: John Snow Labs +name: affilgood_ner_test_v2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `affilgood_ner_test_v2` is an English model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/affilgood_ner_test_v2_en_5.2.0_3.0_1700524902951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/affilgood_ner_test_v2_en_5.2.0_3.0_1700524902951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("affilgood_ner_test_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("affilgood_ner_test_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|affilgood_ner_test_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/nicolauduran45/affilgood-ner-test-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-ai4all_ucsf_reddit_2023_age_en.md b/docs/_posts/ahmedlone127/2023-11-21-ai4all_ucsf_reddit_2023_age_en.md new file mode 100644 index 000000000000..93a46d3afd24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-ai4all_ucsf_reddit_2023_age_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ai4all_ucsf_reddit_2023_age DistilBertForTokenClassification from kc928 +author: John Snow Labs +name: ai4all_ucsf_reddit_2023_age +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ai4all_ucsf_reddit_2023_age` is an English model originally trained by kc928. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai4all_ucsf_reddit_2023_age_en_5.2.0_3.0_1700573537295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai4all_ucsf_reddit_2023_age_en_5.2.0_3.0_1700573537295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ai4all_ucsf_reddit_2023_age","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ai4all_ucsf_reddit_2023_age", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai4all_ucsf_reddit_2023_age| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/kc928/AI4ALL-UCSF-Reddit-2023-Age \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-akai_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-akai_ner_en.md new file mode 100644 index 000000000000..a42d479f90f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-akai_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English akai_ner DistilBertForTokenClassification from GautamR +author: John Snow Labs +name: akai_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `akai_ner` is an English model originally trained by GautamR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/akai_ner_en_5.2.0_3.0_1700535603477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/akai_ner_en_5.2.0_3.0_1700535603477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("akai_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("akai_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|akai_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/GautamR/akai_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-arabic2023_ner_model_en.md b/docs/_posts/ahmedlone127/2023-11-21-arabic2023_ner_model_en.md new file mode 100644 index 000000000000..3775dd7534bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-arabic2023_ner_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English arabic2023_ner_model DistilBertForTokenClassification from Falah +author: John Snow Labs +name: arabic2023_ner_model +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabic2023_ner_model` is an English model originally trained by Falah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabic2023_ner_model_en_5.2.0_3.0_1700577444085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabic2023_ner_model_en_5.2.0_3.0_1700577444085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("arabic2023_ner_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("arabic2023_ner_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arabic2023_ner_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Falah/arabic2023_ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_laptop_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_laptop_reviews_en.md new file mode 100644 index 000000000000..26b1696586c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_laptop_reviews_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English aspect_extraction_laptop_reviews DistilBertForTokenClassification from jannikseus +author: John Snow Labs +name: aspect_extraction_laptop_reviews +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aspect_extraction_laptop_reviews` is an English model originally trained by jannikseus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aspect_extraction_laptop_reviews_en_5.2.0_3.0_1700581685469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aspect_extraction_laptop_reviews_en_5.2.0_3.0_1700581685469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("aspect_extraction_laptop_reviews","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("aspect_extraction_laptop_reviews", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aspect_extraction_laptop_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jannikseus/aspect_extraction_laptop_reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_restaurant_reviews_en.md b/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_restaurant_reviews_en.md new file mode 100644 index 000000000000..47082ea9845d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-aspect_extraction_restaurant_reviews_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English aspect_extraction_restaurant_reviews DistilBertForTokenClassification from jannikseus +author: John Snow Labs +name: aspect_extraction_restaurant_reviews +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aspect_extraction_restaurant_reviews` is an English model originally trained by jannikseus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aspect_extraction_restaurant_reviews_en_5.2.0_3.0_1700566233686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aspect_extraction_restaurant_reviews_en_5.2.0_3.0_1700566233686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("aspect_extraction_restaurant_reviews","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("aspect_extraction_restaurant_reviews", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aspect_extraction_restaurant_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jannikseus/aspect_extraction_restaurant_reviews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-aspect_extractor_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-21-aspect_extractor_distilbert_en.md new file mode 100644 index 000000000000..48916fd92fd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-aspect_extractor_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English aspect_extractor_distilbert DistilBertForTokenClassification from Joshwabail +author: John Snow Labs +name: aspect_extractor_distilbert +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `aspect_extractor_distilbert` is an English model originally trained by Joshwabail. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/aspect_extractor_distilbert_en_5.2.0_3.0_1700541903422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/aspect_extractor_distilbert_en_5.2.0_3.0_1700541903422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("aspect_extractor_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data)  # data: a DataFrame with a "text" column + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("aspect_extractor_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|aspect_extractor_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Joshwabail/aspect_extractor_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_b08_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_b08_en.md new file mode 100644 index 000000000000..c0bad2fc28f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_b08_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_b08 DistilBertForTokenClassification from LazzeKappa +author: John Snow Labs +name: bert_b08 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_b08` is an English model originally trained by LazzeKappa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_b08_en_5.2.0_3.0_1700529116297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_b08_en_5.2.0_3.0_1700529116297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_b08","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_b08", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_b08| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/LazzeKappa/BERT_B08 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_base_dutch_cased_finetuned_mbert_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_base_dutch_cased_finetuned_mbert_en.md new file mode 100644 index 000000000000..ff96a22a9c41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_base_dutch_cased_finetuned_mbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_dutch_cased_finetuned_mbert DistilBertForTokenClassification from Matthijsvanhof +author: John Snow Labs +name: bert_base_dutch_cased_finetuned_mbert +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_dutch_cased_finetuned_mbert` is an English model originally trained by Matthijsvanhof. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_mbert_en_5.2.0_3.0_1700536744253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_dutch_cased_finetuned_mbert_en_5.2.0_3.0_1700536744253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_base_dutch_cased_finetuned_mbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_base_dutch_cased_finetuned_mbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_dutch_cased_finetuned_mbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Matthijsvanhof/bert-base-dutch-cased-finetuned-mBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_base_ner_058_10_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_base_ner_058_10_en.md new file mode 100644 index 000000000000..638b9494f038 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_base_ner_058_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_base_ner_058_10 DistilBertForTokenClassification from NguyenVanHieu1605 +author: John Snow Labs +name: bert_base_ner_058_10 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_base_ner_058_10` is an English model originally trained by NguyenVanHieu1605. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_base_ner_058_10_en_5.2.0_3.0_1700553049282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_base_ner_058_10_en_5.2.0_3.0_1700553049282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_base_ner_058_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_base_ner_058_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_base_ner_058_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NguyenVanHieu1605/bert-base-ner-058-10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1_en.md new file mode 100644 index 000000000000..f7c40c55dce7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1_en_5.2.0_3.0_1700535482568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1_en_5.2.0_3.0_1700535482568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_concept_extraction_indoiranian_languages_from_kp20k_v1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/bert_concept_extraction_iir_from_kp20k_v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1_en.md new file mode 100644 index 000000000000..07a73a776cf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1_en_5.2.0_3.0_1700571721291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1_en_5.2.0_3.0_1700571721291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_concept_extraction_kp20k_from_indoiranian_languages_v1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/bert_concept_extraction_kp20k_from_iir_v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_cybersecurity_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_cybersecurity_ner_en.md new file mode 100644 index 000000000000..0f1202cea1d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_cybersecurity_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_cybersecurity_ner DistilBertForTokenClassification from danitamayo +author: John Snow Labs +name: bert_cybersecurity_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_cybersecurity_ner` is an English model originally trained by danitamayo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_cybersecurity_ner_en_5.2.0_3.0_1700525254778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_cybersecurity_ner_en_5.2.0_3.0_1700525254778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_cybersecurity_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_cybersecurity_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_cybersecurity_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/danitamayo/bert-cybersecurity-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_finetuned_ner_illiadagil_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_finetuned_ner_illiadagil_en.md new file mode 100644 index 000000000000..8123e46f42da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_finetuned_ner_illiadagil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_illiadagil DistilBertForTokenClassification from illiadagil +author: John Snow Labs +name: bert_finetuned_ner_illiadagil +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_ner_illiadagil` is an English model originally trained by illiadagil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_illiadagil_en_5.2.0_3.0_1700590208408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_illiadagil_en_5.2.0_3.0_1700590208408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_illiadagil","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_ner_illiadagil", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_illiadagil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/illiadagil/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-bert_ner_en.md new file mode 100644 index 000000000000..5e204801282d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_ner DistilBertForTokenClassification from Kriyans +author: John Snow Labs +name: bert_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_ner` is an English model originally trained by Kriyans. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_ner_en_5.2.0_3.0_1700525169497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_ner_en_5.2.0_3.0_1700525169497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Kriyans/Bert-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-bertoslav_limited_ner_sk.md b/docs/_posts/ahmedlone127/2023-11-21-bertoslav_limited_ner_sk.md new file mode 100644 index 000000000000..c209a63a8b51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-bertoslav_limited_ner_sk.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Slovak bertoslav_limited_ner DistilBertForTokenClassification from crabz +author: John Snow Labs +name: bertoslav_limited_ner +date: 2023-11-21 +tags: [bert, sk, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sk +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertoslav_limited_ner` is a Slovak model originally trained by crabz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertoslav_limited_ner_sk_5.2.0_3.0_1700549562073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertoslav_limited_ner_sk_5.2.0_3.0_1700549562073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bertoslav_limited_ner","sk") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bertoslav_limited_ner", "sk") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertoslav_limited_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sk| +|Size:|247.3 MB| + +## References + +https://huggingface.co/crabz/bertoslav-limited-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_all_anonimization_try_4_en.md b/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_all_anonimization_try_4_en.md new file mode 100644 index 000000000000..04a42349f455 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_all_anonimization_try_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_all_anonimization_try_4 DistilBertForTokenClassification from Juan281992 +author: John Snow Labs +name: biomedical_ner_all_anonimization_try_4 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_anonimization_try_4` is an English model originally trained by Juan281992. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_4_en_5.2.0_3.0_1700586705926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_4_en_5.2.0_3.0_1700586705926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_all_anonimization_try_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_all_anonimization_try_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_maccrobat_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_maccrobat_distilbert_en.md new file mode 100644 index 000000000000..fb708a738bec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-biomedical_ner_maccrobat_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_maccrobat_distilbert DistilBertForTokenClassification from vineetsharma +author: John Snow Labs +name: biomedical_ner_maccrobat_distilbert +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_maccrobat_distilbert` is an English model originally trained by vineetsharma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_maccrobat_distilbert_en_5.2.0_3.0_1700526530519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_maccrobat_distilbert_en_5.2.0_3.0_1700526530519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_maccrobat_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_maccrobat_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_maccrobat_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/vineetsharma/BioMedical_NER-maccrobat-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_en.md new file mode 100644 index 000000000000..9af87232f8a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_en_5.2.0_3.0_1700529513538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_en_5.2.0_3.0_1700529513538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v1_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v1_en.md new file mode 100644 index 000000000000..d048d6a6223f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model_v1 DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model_v1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v1` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v1_en_5.2.0_3.0_1700538467788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v1_en_5.2.0_3.0_1700538467788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v2_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v2_en.md new file mode 100644 index 000000000000..5f37445bc7c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model_v2 DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model_v2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v2` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v2_en_5.2.0_3.0_1700535482560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v2_en_5.2.0_3.0_1700535482560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v4_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v4_en.md new file mode 100644 index 000000000000..bb7718e72d52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model_v4 DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model_v4 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v4` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v4_en_5.2.0_3.0_1700526422149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v4_en_5.2.0_3.0_1700526422149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model_v4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model_v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v5_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v5_en.md new file mode 100644 index 000000000000..9bd0297490b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model_v5 DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model_v5 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v5` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v5_en_5.2.0_3.0_1700533873359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v5_en_5.2.0_3.0_1700533873359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model_v5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model_v5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v8_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v8_en.md new file mode 100644 index 000000000000..9e1ae2157565 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_address_tokenizer_model_v8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_address_tokenizer_model_v8 DistilBertForTokenClassification from bhattronak +author: John Snow Labs +name: burmese_awesome_address_tokenizer_model_v8 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v8` is an English model originally trained by bhattronak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v8_en_5.2.0_3.0_1700532783120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v8_en_5.2.0_3.0_1700532783120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v8","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_address_tokenizer_model_v8", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_address_tokenizer_model_v8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_ukrsynth_model_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_ukrsynth_model_en.md new file mode 100644 index 000000000000..b91ed45a8e55 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_ukrsynth_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_ukrsynth_model DistilBertForTokenClassification from Olehlpnu +author: John Snow Labs +name: burmese_awesome_ukrsynth_model +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_ukrsynth_model` is an English model originally trained by Olehlpnu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_ukrsynth_model_en_5.2.0_3.0_1700585581410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_ukrsynth_model_en_5.2.0_3.0_1700585581410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_ukrsynth_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_ukrsynth_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_ukrsynth_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Olehlpnu/my_awesome_UkrSynth_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_bobbyw_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_bobbyw_en.md new file mode 100644 index 000000000000..de59c643dbe8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_bobbyw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_bobbyw DistilBertForTokenClassification from bobbyw +author: John Snow Labs +name: burmese_awesome_wnut_model_bobbyw +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_bobbyw` is an English model originally trained by bobbyw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_bobbyw_en_5.2.0_3.0_1700533873234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_bobbyw_en_5.2.0_3.0_1700533873234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_bobbyw","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_bobbyw", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_bobbyw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bobbyw/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_olivermueller_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_olivermueller_en.md new file mode 100644 index 000000000000..f16efc7cf3c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_olivermueller_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_olivermueller DistilBertForTokenClassification from olivermueller +author: John Snow Labs +name: burmese_awesome_wnut_model_olivermueller +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_olivermueller` is an English model originally trained by olivermueller.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_olivermueller_en_5.2.0_3.0_1700540822793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_olivermueller_en_5.2.0_3.0_1700540822793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_olivermueller","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_olivermueller", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_olivermueller| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/olivermueller/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_quinta6728_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_quinta6728_en.md new file mode 100644 index 000000000000..71a987c531af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_awesome_wnut_model_quinta6728_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_quinta6728 DistilBertForTokenClassification from Quinta6728 +author: John Snow Labs +name: burmese_awesome_wnut_model_quinta6728 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_quinta6728` is an English model originally trained by Quinta6728. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_quinta6728_en_5.2.0_3.0_1700540709865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_quinta6728_en_5.2.0_3.0_1700540709865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_quinta6728","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_quinta6728", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_quinta6728| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Quinta6728/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_model_arunasaraswathy_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_model_arunasaraswathy_en.md new file mode 100644 index 000000000000..34d856091fac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_model_arunasaraswathy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_model_arunasaraswathy DistilBertForTokenClassification from ArunaSaraswathy +author: John Snow Labs +name: burmese_model_arunasaraswathy +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_model_arunasaraswathy` is an English model originally trained by ArunaSaraswathy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_model_arunasaraswathy_en_5.2.0_3.0_1700576885555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_model_arunasaraswathy_en_5.2.0_3.0_1700576885555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_model_arunasaraswathy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_model_arunasaraswathy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_model_arunasaraswathy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/ArunaSaraswathy/my_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-burmese_ner_model_tirendaz_en.md b/docs/_posts/ahmedlone127/2023-11-21-burmese_ner_model_tirendaz_en.md new file mode 100644 index 000000000000..16a6ae3b7e2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-burmese_ner_model_tirendaz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_ner_model_tirendaz DistilBertForTokenClassification from Tirendaz +author: John Snow Labs +name: burmese_ner_model_tirendaz +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_ner_model_tirendaz` is an English model originally trained by Tirendaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_ner_model_tirendaz_en_5.2.0_3.0_1700525744479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_ner_model_tirendaz_en_5.2.0_3.0_1700525744479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_ner_model_tirendaz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_ner_model_tirendaz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_ner_model_tirendaz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Tirendaz/my_ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-claims_data_model_lunerwalker2_en.md b/docs/_posts/ahmedlone127/2023-11-21-claims_data_model_lunerwalker2_en.md new file mode 100644 index 000000000000..bc2516506ba3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-claims_data_model_lunerwalker2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English claims_data_model_lunerwalker2 DistilBertForTokenClassification from Lunerwalker2 +author: John Snow Labs +name: claims_data_model_lunerwalker2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claims_data_model_lunerwalker2` is an English model originally trained by Lunerwalker2. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claims_data_model_lunerwalker2_en_5.2.0_3.0_1700569700057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claims_data_model_lunerwalker2_en_5.2.0_3.0_1700569700057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("claims_data_model_lunerwalker2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("claims_data_model_lunerwalker2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claims_data_model_lunerwalker2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Lunerwalker2/claims-data-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47_en.md b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47_en.md new file mode 100644 index 000000000000..0593e54c3156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47_en_5.2.0_3.0_1700572576182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47_en_5.2.0_3.0_1700572576182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|correct_distilbert_token_itr0_1e_05_all_01_03_2022_15_43_47| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/correct_distilBERT_token_itr0_1e-05_all_01_03_2022-15_43_47 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32_en.md b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32_en.md new file mode 100644 index 000000000000..30399924d4aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32_en_5.2.0_3.0_1700545428933.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32_en_5.2.0_3.0_1700545428933.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|correct_distilbert_token_itr0_1e_05_editorials_01_03_2022_15_42_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/correct_distilBERT_token_itr0_1e-05_editorials_01_03_2022-15_42_32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29_en.md b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29_en.md new file mode 100644 index 000000000000..5729fce26f9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29_en_5.2.0_3.0_1700543060483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29_en_5.2.0_3.0_1700543060483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|correct_distilbert_token_itr0_1e_05_essays_01_03_2022_15_41_29| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/correct_distilBERT_token_itr0_1e-05_essays_01_03_2022-15_41_29 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24_en.md b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24_en.md new file mode 100644 index 000000000000..b83ffacf0502 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24_en_5.2.0_3.0_1700567866172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24_en_5.2.0_3.0_1700567866172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|correct_distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_40_24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/correct_distilBERT_token_itr0_1e-05_webDiscourse_01_03_2022-15_40_24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-custom_ner_model_viajes_en.md b/docs/_posts/ahmedlone127/2023-11-21-custom_ner_model_viajes_en.md new file mode 100644 index 000000000000..13eb7a28b2b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-custom_ner_model_viajes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English custom_ner_model_viajes DistilBertForTokenClassification from hucruz +author: John Snow Labs +name: custom_ner_model_viajes +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `custom_ner_model_viajes` is an English model originally trained by hucruz. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/custom_ner_model_viajes_en_5.2.0_3.0_1700570535355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/custom_ner_model_viajes_en_5.2.0_3.0_1700570535355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("custom_ner_model_viajes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("custom_ner_model_viajes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|custom_ner_model_viajes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/hucruz/custom-ner-model-viajes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_bnsapa_en.md b/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_bnsapa_en.md new file mode 100644 index 000000000000..799b45b9f452 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_bnsapa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybersecurity_ner_bnsapa DistilBertForTokenClassification from bnsapa +author: John Snow Labs +name: cybersecurity_ner_bnsapa +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybersecurity_ner_bnsapa` is an English model originally trained by bnsapa. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_bnsapa_en_5.2.0_3.0_1700548582368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_bnsapa_en_5.2.0_3.0_1700548582368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("cybersecurity_ner_bnsapa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("cybersecurity_ner_bnsapa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybersecurity_ner_bnsapa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bnsapa/cybersecurity-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_sudipadhikari_en.md b/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_sudipadhikari_en.md new file mode 100644 index 000000000000..92cde17c3d2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-cybersecurity_ner_sudipadhikari_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybersecurity_ner_sudipadhikari DistilBertForTokenClassification from sudipadhikari +author: John Snow Labs +name: cybersecurity_ner_sudipadhikari +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybersecurity_ner_sudipadhikari` is an English model originally trained by sudipadhikari. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_sudipadhikari_en_5.2.0_3.0_1700530787903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybersecurity_ner_sudipadhikari_en_5.2.0_3.0_1700530787903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("cybersecurity_ner_sudipadhikari","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("cybersecurity_ner_sudipadhikari", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybersecurity_ner_sudipadhikari| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sudipadhikari/cybersecurity_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-cybonto_distilbert_base_uncased_finetuned_ner_wnut17_en.md b/docs/_posts/ahmedlone127/2023-11-21-cybonto_distilbert_base_uncased_finetuned_ner_wnut17_en.md new file mode 100644 index 000000000000..6a06d4b6e03d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-cybonto_distilbert_base_uncased_finetuned_ner_wnut17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybonto_distilbert_base_uncased_finetuned_ner_wnut17 DistilBertForTokenClassification from theResearchNinja +author: John Snow Labs +name: cybonto_distilbert_base_uncased_finetuned_ner_wnut17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybonto_distilbert_base_uncased_finetuned_ner_wnut17` is an English model originally trained by theResearchNinja.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_wnut17_en_5.2.0_3.0_1700570732176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_wnut17_en_5.2.0_3.0_1700570732176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("cybonto_distilbert_base_uncased_finetuned_ner_wnut17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("cybonto_distilbert_base_uncased_finetuned_ner_wnut17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybonto_distilbert_base_uncased_finetuned_ner_wnut17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/theResearchNinja/Cybonto-distilbert-base-uncased-finetuned-ner-Wnut17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-dir_distilbert_base_ner_058_en.md b/docs/_posts/ahmedlone127/2023-11-21-dir_distilbert_base_ner_058_en.md new file mode 100644 index 000000000000..ab9e0ed9166b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-dir_distilbert_base_ner_058_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dir_distilbert_base_ner_058 DistilBertForTokenClassification from NguyenVanHieu1605 +author: John Snow Labs +name: dir_distilbert_base_ner_058 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dir_distilbert_base_ner_058` is an English model originally trained by NguyenVanHieu1605. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dir_distilbert_base_ner_058_en_5.2.0_3.0_1700563690803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dir_distilbert_base_ner_058_en_5.2.0_3.0_1700563690803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("dir_distilbert_base_ner_058","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("dir_distilbert_base_ner_058", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dir_distilbert_base_ner_058| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NguyenVanHieu1605/dir-distilbert-base-ner-058 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md new file mode 100644 index 000000000000..1daab3d8c658 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700553893749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700553893749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.0-concept-extraction-kp20k-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md new file mode 100644 index 000000000000..2a844b8da183 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4` is an English model originally trained by 
HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en_5.2.0_3.0_1700550517628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en_5.2.0_3.0_1700550517628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.0-concept-extraction-kp20k-v1.4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_en.md new file mode 100644 index 000000000000..28fe5e185051 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700560117872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700560117872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md new file mode 100644 index 000000000000..89109c91530a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700576937642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700576937642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.2-concept-extraction-kp20k-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md new file mode 100644 index 000000000000..979b4926e0fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5` is an English model originally trained by 
HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en_5.2.0_3.0_1700585581514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en_5.2.0_3.0_1700585581514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.2-concept-extraction-kp20k-v1.5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_en.md new file mode 100644 index 000000000000..1e5e1f14a208 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700562817705.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700562817705.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3_en.md new file mode 100644 index 000000000000..d3d4cf309006 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700565903963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700565903963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_indoiranian_languages_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-iir-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md new file mode 100644 index 000000000000..7d575fb08116 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en_5.2.0_3.0_1700574438032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en_5.2.0_3.0_1700574438032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.0-concept-extraction-wikipedia-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_en.md new file mode 100644 index 000000000000..2a7c5b259ffb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700565355521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700565355521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md new file mode 100644 index 000000000000..e30ceb0bbaf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en_5.2.0_3.0_1700553888684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en_5.2.0_3.0_1700553888684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.2-concept-extraction-allwikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md new file mode 100644 index 000000000000..5995331be8c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700554866418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700554866418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.2-concept-extraction-wikipedia-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_en.md new file mode 100644 index 000000000000..8cdcb3838555 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_kp20k_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_kp20k_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_kp20k_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_kp20k_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700566331314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700566331314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_kp20k_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_kp20k_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-kp20k-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md new file mode 100644 index 000000000000..ad835bf9c216 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700543972476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700543972476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-wikipedia-v1.0-concept-extraction-iir-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_en.md new file mode 100644 index 000000000000..0011dcb16b74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_wikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_wikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_wikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700570728839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700570728839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_wikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-wikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md new file mode 100644 index 000000000000..1a4611052be9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700568491945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700568491945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-wikipedia-v1.2-concept-extraction-iir-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_en.md new file mode 100644 index 000000000000..a44e7347e00f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_concept_extraction_wikipedia_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_concept_extraction_wikipedia_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_cased_concept_extraction_wikipedia_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_concept_extraction_wikipedia_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700526734591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700526734591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_concept_extraction_wikipedia_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_concept_extraction_wikipedia_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-cased-concept-extraction-wikipedia-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_3_en.md new file mode 100644 index 000000000000..680755591f61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_chunk_3 DistilBertForTokenClassification from RobW +author: John Snow Labs +name: distilbert_base_cased_finetuned_chunk_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_chunk_3` is an English model originally trained by RobW.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_chunk_3_en_5.2.0_3.0_1700563690844.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_chunk_3_en_5.2.0_3.0_1700563690844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_chunk_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_chunk_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_chunk_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RobW/distilbert-base-cased-finetuned-chunk-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_en.md new file mode 100644 index 000000000000..1367ddf410c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_chunk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_chunk DistilBertForTokenClassification from RobW +author: John Snow Labs +name: distilbert_base_cased_finetuned_chunk +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_chunk` is an English model originally trained by RobW. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_chunk_en_5.2.0_3.0_1700539962560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_chunk_en_5.2.0_3.0_1700539962560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_chunk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_chunk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_chunk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RobW/distilbert-base-cased-finetuned-chunk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_conll2003_en.md new file mode 100644 index 000000000000..c788489364a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_conll2003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_conll2003 DistilBertForTokenClassification from EulerianKnight +author: John Snow Labs +name: distilbert_base_cased_finetuned_conll2003 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_conll2003` is an English model originally trained by EulerianKnight.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_conll2003_en_5.2.0_3.0_1700568053930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_conll2003_en_5.2.0_3.0_1700568053930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the sparknlp imports are in scope and `data` is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_conll2003","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports are in scope and `data` is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_conll2003", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/EulerianKnight/distilbert-base-cased-finetuned-CONLL2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_cv2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_cv2_en.md new file mode 100644 index 000000000000..3f8991ad1d63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_cv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_cv2 DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_cv2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_cv2` is an English model originally trained by reyhanemyr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_cv2_en_5.2.0_3.0_1700557574142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_cv2_en_5.2.0_3.0_1700557574142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_cv2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_cv2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_cv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-cv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_0301_j_data_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_0301_j_data_en.md new file mode 100644 index 000000000000..6386b3667fa4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_0301_j_data_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_0301_j_data DistilBertForTokenClassification from morganchen1007 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_0301_j_data +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_0301_j_data` is an English model originally trained by morganchen1007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_0301_j_data_en_5.2.0_3.0_1700582465529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_0301_j_data_en_5.2.0_3.0_1700582465529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_0301_j_data","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_0301_j_data", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_0301_j_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/morganchen1007/distilbert-base-cased-finetuned-ner_0301_J_DATA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_ssiyer_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_ssiyer_en.md new file mode 100644 index 000000000000..4c68e126fbef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_ssiyer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_ssiyer DistilBertForTokenClassification from ssiyer +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_ssiyer +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_ssiyer` is an English model originally trained by ssiyer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_ssiyer_en_5.2.0_3.0_1700537632053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_ssiyer_en_5.2.0_3.0_1700537632053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_ssiyer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_ssiyer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_ssiyer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/ssiyer/distilbert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_swardiantara_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_swardiantara_en.md new file mode 100644 index 000000000000..9ded59893b70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_swardiantara_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_swardiantara DistilBertForTokenClassification from swardiantara +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_swardiantara +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_swardiantara` is an English model originally trained by swardiantara.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_swardiantara_en_5.2.0_3.0_1700589776101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_swardiantara_en_5.2.0_3.0_1700589776101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_swardiantara","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_swardiantara", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_swardiantara| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/swardiantara/distilbert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t2_en.md new file mode 100644 index 000000000000..b80fce2246bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_t2 DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_t2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_t2` is an English model originally trained by RS7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t2_en_5.2.0_3.0_1700527546992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t2_en_5.2.0_3.0_1700527546992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_t2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_t2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_t2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-t2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t3_en.md new file mode 100644 index 000000000000..648766f45677 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_finetuned_ner_t3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_t3 DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_t3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_t3` is an English model originally trained by RS7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t3_en_5.2.0_3.0_1700526531687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t3_en_5.2.0_3.0_1700526531687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_t3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_t3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_t3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-t3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_maple_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_maple_en.md new file mode 100644 index 000000000000..03deb3178e40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_maple_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_maple DistilBertForTokenClassification from maple +author: John Snow Labs +name: distilbert_base_cased_maple +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_maple` is an English model originally trained by maple. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_maple_en_5.2.0_3.0_1700590466632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_maple_en_5.2.0_3.0_1700590466632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_maple","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_maple", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_maple| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/maple/distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_ner_en.md new file mode 100644 index 000000000000..e7d223074ed2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_cased_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_ner DistilBertForTokenClassification from alvarobartt +author: John Snow Labs +name: distilbert_base_cased_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_ner` is an English model originally trained by alvarobartt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_ner_en_5.2.0_3.0_1700532911571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_ner_en_5.2.0_3.0_1700532911571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/alvarobartt/distilbert-base-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_comma_derstandard_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_comma_derstandard_en.md new file mode 100644 index 000000000000..8f47c5ff28f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_comma_derstandard_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_german_cased_comma_derstandard DistilBertForTokenClassification from aseifert +author: John Snow Labs +name: distilbert_base_german_cased_comma_derstandard +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_german_cased_comma_derstandard` is an English model originally trained by aseifert.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_german_cased_comma_derstandard_en_5.2.0_3.0_1700558240115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_german_cased_comma_derstandard_en_5.2.0_3.0_1700558240115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_german_cased_comma_derstandard","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_german_cased_comma_derstandard", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_german_cased_comma_derstandard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/aseifert/distilbert-base-german-cased-comma-derstandard \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_finetuned_ner_en.md new file mode 100644 index 000000000000..6081ab223761 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_german_cased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_german_cased_finetuned_ner DistilBertForTokenClassification from FabianWillner +author: John Snow Labs +name: distilbert_base_german_cased_finetuned_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_german_cased_finetuned_ner` is an English model originally trained by FabianWillner.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_german_cased_finetuned_ner_en_5.2.0_3.0_1700531928611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_german_cased_finetuned_ner_en_5.2.0_3.0_1700531928611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_german_cased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_german_cased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_german_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/FabianWillner/distilbert-base-german-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_indic_glue_xx.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_indic_glue_xx.md new file mode 100644 index 000000000000..45b5c5dce5e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_indic_glue_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_indic_glue DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: distilbert_base_multilingual_cased_indic_glue +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_indic_glue` is a Multilingual model originally trained by AnanthZeke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_indic_glue_xx_5.2.0_3.0_1700591577598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_indic_glue_xx_5.2.0_3.0_1700591577598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_indic_glue","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_indic_glue", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_indic_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/AnanthZeke/distilbert-base-multilingual-cased-indic_glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_mapa_fine_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_mapa_fine_ner_xx.md new file mode 100644 index 000000000000..42dc02827c5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_cased_mapa_fine_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_mapa_fine_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_multilingual_cased_mapa_fine_ner +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_mapa_fine_ner` is a Multilingual model originally trained by dmargutierrez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_mapa_fine_ner_xx_5.2.0_3.0_1700588322443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_mapa_fine_ner_xx_5.2.0_3.0_1700588322443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
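+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_mapa_fine_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_mapa_fine_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +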
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_mapa_fine_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-multilingual-cased-mapa_fine-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_ner_naamapdam_fine_tuned_xx.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_ner_naamapdam_fine_tuned_xx.md new file mode 100644 index 000000000000..0822e81dbea5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_multilingual_ner_naamapdam_fine_tuned_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_ner_naamapdam_fine_tuned DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: distilbert_base_multilingual_ner_naamapdam_fine_tuned +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_ner_naamapdam_fine_tuned` is a multilingual model originally trained by livinNector. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_ner_naamapdam_fine_tuned_xx_5.2.0_3.0_1700586218111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_ner_naamapdam_fine_tuned_xx_5.2.0_3.0_1700586218111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
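+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_ner_naamapdam_fine_tuned","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_ner_naamapdam_fine_tuned", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +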
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_ner_naamapdam_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/livinNector/distilbert-base-multilingual-NER-naamapdam-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_nguyenvanhieu1605_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_nguyenvanhieu1605_en.md new file mode 100644 index 000000000000..2e54ca893d68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_nguyenvanhieu1605_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_nguyenvanhieu1605 DistilBertForTokenClassification from NguyenVanHieu1605 +author: John Snow Labs +name: distilbert_base_nguyenvanhieu1605 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_nguyenvanhieu1605` is an English model originally trained by NguyenVanHieu1605. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_nguyenvanhieu1605_en_5.2.0_3.0_1700528345050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_nguyenvanhieu1605_en_5.2.0_3.0_1700528345050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
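+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_nguyenvanhieu1605","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_nguyenvanhieu1605", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +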
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_nguyenvanhieu1605| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NguyenVanHieu1605/distilbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_ner_en.md new file mode 100644 index 000000000000..3c3f6647207f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_ner DistilBertForTokenClassification from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_ner` is an English model originally trained by dccuchile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_ner_en_5.2.0_3.0_1700540909738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_ner_en_5.2.0_3.0_1700540909738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
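+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_spanish_uncased_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_spanish_uncased_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +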
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.2 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_sayula_popoluca_en.md new file mode 100644 index 000000000000..26638ba3d438 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_spanish_uncased_finetuned_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_sayula_popoluca DistilBertForTokenClassification from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_sayula_popoluca +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_sayula_popoluca` is an English model originally trained by dccuchile. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_sayula_popoluca_en_5.2.0_3.0_1700532030317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_sayula_popoluca_en_5.2.0_3.0_1700532030317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
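+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_spanish_uncased_finetuned_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_spanish_uncased_finetuned_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +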
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_token_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_token_en.md new file mode 100644 index 000000000000..10ee526cbf07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_token_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_token DistilBertForTokenClassification from PabloGuinea +author: John Snow Labs +name: distilbert_base_token +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_token` is an English model originally trained by PabloGuinea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_token_en_5.2.0_3.0_1700583329616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_token_en_5.2.0_3.0_1700583329616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
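+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_token","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_token", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +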
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_token| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert-base-token \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md new file mode 100644 index 000000000000..0d7a563d3f15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0` is an English model originally trained by HungChau. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700555765404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700555765404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
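+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +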
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0-concept-extraction-kp20k-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3_en.md new file mode 100644 index 000000000000..e1d8b41dafe4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3` is an English model originally 
trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3_en_5.2.0_3.0_1700552160239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3_en_5.2.0_3.0_1700552160239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
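+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +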
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0-concept-extraction-kp20k-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md new file mode 100644 index 000000000000..bfda6ebd3a38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4` is an English model originally 
trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en_5.2.0_3.0_1700582544233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4_en_5.2.0_3.0_1700582544233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
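+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +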
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_kp20k_v1_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0-concept-extraction-kp20k-v1.4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc_en.md new file mode 100644 index 000000000000..fd539f03d62e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc` is an English model originally trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc_en_5.2.0_3.0_1700566233727.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc_en_5.2.0_3.0_1700566233727.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
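+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data: a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.DocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data: a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +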
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_truncated_3edbbc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0-concept-extraction-truncated-3edbbc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0_en.md new file mode 100644 index 000000000000..4b88b3139767 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0` is an English model originally trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700550219535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700550219535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_concept_extraction_wikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0-concept-extraction-wikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_en.md new file mode 100644 index 000000000000..68b160de9e98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700576069080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700576069080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md new file mode 100644 index 000000000000..7b8f376b0555 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700566618439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700566618439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.2-concept-extraction-kp20k-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md new file mode 100644 index 000000000000..23ecaefbed37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5` is an English model originally
trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en_5.2.0_3.0_1700579901457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5_en_5.2.0_3.0_1700579901457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_concept_extraction_kp20k_v1_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.2-concept-extraction-kp20k-v1.5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_en.md new file mode 100644 index 000000000000..2ba42a3410e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700580816103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700580816103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3_en.md new file mode 100644 index 000000000000..61c7094a4702 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700552160267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700552160267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_indoiranian_languages_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-iir-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523_en.md new file mode 100644 index 000000000000..938ca555e455 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523_en_5.2.0_3.0_1700586706106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523_en_5.2.0_3.0_1700586706106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_435523| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extracti-truncated-435523 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33_en.md new file mode 100644 index 000000000000..6353dfb7530d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33_en_5.2.0_3.0_1700573537330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33_en_5.2.0_3.0_1700573537330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extracti_truncated_7d1e33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extracti-truncated-7d1e33 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md new file mode 100644 index 000000000000..0e6fcb185057 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0` is an English model originally
trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700577971328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700577971328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# `data` is assumed to be a DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the pipeline needs a Tokenizer stage +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// `data` is assumed to be a DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_indoiranian_languages_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extraction-iir-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md new file mode 100644 index 000000000000..5407c8710a56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700578795206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700578795206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extraction-wikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1_en.md new file mode 100644 index 000000000000..db64bd27a095 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1_en_5.2.0_3.0_1700554884934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1_en_5.2.0_3.0_1700554884934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extraction-wikipedia-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md new file mode 100644 index 000000000000..3503dcae0ddb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en_5.2.0_3.0_1700538653430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3_en_5.2.0_3.0_1700538653430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0_concept_extraction_wikipedia_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0-concept-extraction-wikipedia-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_en.md new file mode 100644 index 000000000000..05006aee72fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700577085144.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700577085144.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md new file mode 100644 index 000000000000..88f1af367dd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en_5.2.0_3.0_1700549306061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0_en_5.2.0_3.0_1700549306061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_allwikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.2-concept-extraction-allwikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md new file mode 100644 index 000000000000..d26b4961def3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700541783214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700541783214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_2_concept_extraction_wikipedia_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.2-concept-extraction-wikipedia-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_en.md new file mode 100644 index 000000000000..bd50b1e61b37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_kp20k_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_kp20k_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_kp20k_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_kp20k_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700548581737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_kp20k_v1_2_en_5.2.0_3.0_1700548581737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_kp20k_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_kp20k_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-kp20k-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md new file mode 100644 index 000000000000..fab198fcb591 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700555689618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700555689618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.0-concept-extraction-iir-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md new file mode 100644 index 000000000000..908c7f6062c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark 
NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3` is an English model originally trained by HungChau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700561966330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3_en_5.2.0_3.0_1700561966330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_indoiranian_languages_v1_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.0-concept-extraction-iir-v1.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0_en.md new file mode 100644 index 000000000000..837d1fab77e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700559302900.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0_en_5.2.0_3.0_1700559302900.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_0_concept_extraction_kp20k_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.0-concept-extraction-kp20k-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_en.md new file mode 100644 index 000000000000..29cbfe72836d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700570036191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_0_en_5.2.0_3.0_1700570036191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0_en.md new file mode 100644 index 000000000000..e4a3d5aee526 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700557157755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0_en_5.2.0_3.0_1700557157755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_1_concept_extraction_indoiranian_languages_v1_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.1-concept-extraction-iir-v1.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_en.md new file mode 100644 index 000000000000..9c83b41893fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_1 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_1` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_1_en_5.2.0_3.0_1700553894121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_1_en_5.2.0_3.0_1700553894121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md new file mode 100644 index 000000000000..cbf12fba9f5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700538826852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2_en_5.2.0_3.0_1700538826852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_2_concept_extraction_indoiranian_languages_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.2-concept-extraction-iir-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_en.md new file mode 100644 index 000000000000..a37fac28e0f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_concept_extraction_wikipedia_v1_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_concept_extraction_wikipedia_v1_2 DistilBertForTokenClassification from HungChau +author: John Snow Labs +name: distilbert_base_uncased_concept_extraction_wikipedia_v1_2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_concept_extraction_wikipedia_v1_2` is an English model originally trained by HungChau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700540017047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_concept_extraction_wikipedia_v1_2_en_5.2.0_3.0_1700540017047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_concept_extraction_wikipedia_v1_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_concept_extraction_wikipedia_v1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HungChau/distilbert-base-uncased-concept-extraction-wikipedia-v1.2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_date_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_date_en.md new file mode 100644 index 000000000000..ecc675d4f471 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_date_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_date DistilBertForTokenClassification from alicenkbaytop +author: John Snow Labs +name: distilbert_base_uncased_date +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_date` is an English model originally trained by alicenkbaytop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_date_en_5.2.0_3.0_1700536591291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_date_en_5.2.0_3.0_1700536591291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_date","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_date", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_date| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/alicenkbaytop/distilbert-base-uncased-date \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_fine_tuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_fine_tuned_ner_en.md new file mode 100644 index 000000000000..8149de19e245 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_fine_tuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_fine_tuned_ner DistilBertForTokenClassification from geckos +author: John Snow Labs +name: distilbert_base_uncased_fine_tuned_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fine_tuned_ner` is an English model originally trained by geckos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_ner_en_5.2.0_3.0_1700536591316.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_ner_en_5.2.0_3.0_1700536591316.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_fine_tuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_fine_tuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fine_tuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/geckos/distilbert-base-uncased-fine-tuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetune_dhlanm_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetune_dhlanm_en.md new file mode 100644 index 000000000000..5740f580489f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetune_dhlanm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetune_dhlanm DistilBertForTokenClassification from dhlanm +author: John Snow Labs +name: distilbert_base_uncased_finetune_dhlanm +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetune_dhlanm` is an English model originally trained by dhlanm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetune_dhlanm_en_5.2.0_3.0_1700579983081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetune_dhlanm_en_5.2.0_3.0_1700579983081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetune_dhlanm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetune_dhlanm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetune_dhlanm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dhlanm/distilbert-base-uncased-finetune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud1_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud1_ner_en.md new file mode 100644 index 000000000000..2cc7ba79ccaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud1_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cloud1_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cloud1_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cloud1_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud1_ner_en_5.2.0_3.0_1700541783294.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud1_ner_en_5.2.0_3.0_1700541783294.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_cloud1_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_cloud1_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cloud1_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-cloud1-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud2_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud2_ner_en.md new file mode 100644 index 000000000000..3599d5a5922c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud2_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cloud2_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cloud2_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cloud2_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud2_ner_en_5.2.0_3.0_1700535482536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud2_ner_en_5.2.0_3.0_1700535482536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_cloud2_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_cloud2_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cloud2_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-cloud2-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud_ner_en.md new file mode 100644 index 000000000000..b11dd0cade2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_cloud_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cloud_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cloud_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cloud_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud_ner_en_5.2.0_3.0_1700545818827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cloud_ner_en_5.2.0_3.0_1700545818827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_cloud_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_cloud_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cloud_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-cloud-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_hypertuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_hypertuned_ner_en.md new file mode 100644 index 000000000000..e2e7ac0ef1b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_hypertuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_hypertuned_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_hypertuned_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_hypertuned_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_hypertuned_ner_en_5.2.0_3.0_1700536709631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_hypertuned_ner_en_5.2.0_3.0_1700536709631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_hypertuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_hypertuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_hypertuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-hypertuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ingredients_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ingredients_en.md new file mode 100644 index 000000000000..903c17c24339 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ingredients_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ingredients DistilBertForTokenClassification from harr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ingredients +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ingredients` is an English model originally trained by harr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ingredients_en_5.2.0_3.0_1700525270768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ingredients_en_5.2.0_3.0_1700525270768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ingredients","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ingredients", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ingredients| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harr/distilbert-base-uncased-finetuned-ingredients \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_0212_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_0212_en.md new file mode 100644 index 000000000000..56029f4a273b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_0212_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_0212 DistilBertForTokenClassification from morganchen1007 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_0212 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_0212` is an English model originally trained by morganchen1007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_0212_en_5.2.0_3.0_1700582668097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_0212_en_5.2.0_3.0_1700582668097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_0212","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_0212", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_0212| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/morganchen1007/distilbert-base-uncased-finetuned-ner_0212 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_adityavithaldas_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_adityavithaldas_en.md new file mode 100644 index 000000000000..4db8721e9af8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_adityavithaldas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_adityavithaldas DistilBertForTokenClassification from adityavithaldas +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_adityavithaldas +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_adityavithaldas` is an English model originally trained by adityavithaldas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_adityavithaldas_en_5.2.0_3.0_1700570851612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_adityavithaldas_en_5.2.0_3.0_1700570851612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_adityavithaldas","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_adityavithaldas", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_adityavithaldas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/adityavithaldas/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_aidj_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_aidj_en.md new file mode 100644 index 000000000000..9ecffed86ad5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_aidj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_aidj DistilBertForTokenClassification from aidj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_aidj +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_aidj` is an English model originally trained by aidj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aidj_en_5.2.0_3.0_1700545428613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aidj_en_5.2.0_3.0_1700545428613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_aidj","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_aidj", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_aidj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/aidj/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_airparadox_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_airparadox_en.md new file mode 100644 index 000000000000..6f47573b97fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_airparadox_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_airparadox DistilBertForTokenClassification from airparadox +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_airparadox +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_airparadox` is an English model originally trained by airparadox.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_airparadox_en_5.2.0_3.0_1700574003012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_airparadox_en_5.2.0_3.0_1700574003012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_airparadox","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_airparadox", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_airparadox| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/airparadox/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_akshaychaudhary_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_akshaychaudhary_en.md new file mode 100644 index 000000000000..e61a5bd468ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_akshaychaudhary_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_akshaychaudhary DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_akshaychaudhary +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_akshaychaudhary` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_akshaychaudhary_en_5.2.0_3.0_1700545428947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_akshaychaudhary_en_5.2.0_3.0_1700545428947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_akshaychaudhary","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_akshaychaudhary", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_akshaychaudhary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_al00014_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_al00014_en.md new file mode 100644 index 000000000000..03f037eca6ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_al00014_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_al00014 DistilBertForTokenClassification from al00014 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_al00014 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_al00014` is an English model originally trained by al00014.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_al00014_en_5.2.0_3.0_1700547530241.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_al00014_en_5.2.0_3.0_1700547530241.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_al00014","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_al00014", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_al00014| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/al00014/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_algiraldohe_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_algiraldohe_en.md new file mode 100644 index 000000000000..bca72684cab6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_algiraldohe_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_algiraldohe DistilBertForTokenClassification from algiraldohe +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_algiraldohe +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_algiraldohe` is an English model originally trained by algiraldohe.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_algiraldohe_en_5.2.0_3.0_1700553894087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_algiraldohe_en_5.2.0_3.0_1700553894087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_algiraldohe","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_algiraldohe", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_algiraldohe| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/algiraldohe/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_andreasostling_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_andreasostling_en.md new file mode 100644 index 000000000000..4382b3339389 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_andreasostling_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_andreasostling DistilBertForTokenClassification from andreasostling +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_andreasostling +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_andreasostling` is an English model originally trained by andreasostling.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_andreasostling_en_5.2.0_3.0_1700584516914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_andreasostling_en_5.2.0_3.0_1700584516914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_andreasostling","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_andreasostling", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_andreasostling| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/andreasostling/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ann2020_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ann2020_en.md new file mode 100644 index 000000000000..d0fba0bd05a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ann2020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ann2020 DistilBertForTokenClassification from Ann2020 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ann2020 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ann2020` is an English model originally trained by Ann2020.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ann2020_en_5.2.0_3.0_1700531928789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ann2020_en_5.2.0_3.0_1700531928789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ann2020","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ann2020", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ann2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Ann2020/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ayush414_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ayush414_en.md new file mode 100644 index 000000000000..1b972c860044 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ayush414_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ayush414 DistilBertForTokenClassification from Ayush414 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ayush414 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ayush414` is an English model originally trained by Ayush414.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ayush414_en_5.2.0_3.0_1700573885122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ayush414_en_5.2.0_3.0_1700573885122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ayush414","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ayush414", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ayush414| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Ayush414/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_backpack30_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_backpack30_en.md new file mode 100644 index 000000000000..1149aedd1946 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_backpack30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_backpack30 DistilBertForTokenClassification from backpack30 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_backpack30 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_backpack30` is an English model originally trained by backpack30.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_backpack30_en_5.2.0_3.0_1700571774333.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_backpack30_en_5.2.0_3.0_1700571774333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_backpack30","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_backpack30", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_backpack30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/backpack30/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_bwfyyy_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_bwfyyy_en.md new file mode 100644 index 000000000000..2d314bf409dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_bwfyyy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_bwfyyy DistilBertForTokenClassification from bwfyyy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_bwfyyy +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_bwfyyy` is an English model originally trained by bwfyyy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bwfyyy_en_5.2.0_3.0_1700590392948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bwfyyy_en_5.2.0_3.0_1700590392948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_bwfyyy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_bwfyyy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_bwfyyy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bwfyyy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cfisicaro_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cfisicaro_en.md new file mode 100644 index 000000000000..bf434ad8c64b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cfisicaro_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_cfisicaro DistilBertForTokenClassification from cfisicaro +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_cfisicaro +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_cfisicaro` is an English model originally trained by cfisicaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cfisicaro_en_5.2.0_3.0_1700546511214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cfisicaro_en_5.2.0_3.0_1700546511214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_cfisicaro","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_cfisicaro", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_cfisicaro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/cfisicaro/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_chanaa_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_chanaa_en.md new file mode 100644 index 000000000000..4e6d9fb72728 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_chanaa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_chanaa DistilBertForTokenClassification from chanaa +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_chanaa +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_chanaa` is an English model originally trained by chanaa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chanaa_en_5.2.0_3.0_1700555765421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chanaa_en_5.2.0_3.0_1700555765421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_chanaa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_chanaa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_chanaa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chanaa/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_charlecheng_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_charlecheng_en.md new file mode 100644 index 000000000000..a6f8fcf0cf4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_charlecheng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_charlecheng DistilBertForTokenClassification from charlecheng +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_charlecheng +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_charlecheng` is an English model originally trained by charlecheng.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_charlecheng_en_5.2.0_3.0_1700547532015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_charlecheng_en_5.2.0_3.0_1700547532015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_charlecheng","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_charlecheng", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_charlecheng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/charlecheng/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ckandemir_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ckandemir_en.md new file mode 100644 index 000000000000..e5567931e156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ckandemir_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ckandemir DistilBertForTokenClassification from ckandemir +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ckandemir +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ckandemir` is an English model originally trained by ckandemir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ckandemir_en_5.2.0_3.0_1700540870260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ckandemir_en_5.2.0_3.0_1700540870260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ckandemir","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ckandemir", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ckandemir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ckandemir/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_codingjacob_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_codingjacob_en.md new file mode 100644 index 000000000000..4308a734e420 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_codingjacob_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_codingjacob DistilBertForTokenClassification from codingJacob +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_codingjacob +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_codingjacob` is an English model originally trained by codingJacob.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_codingjacob_en_5.2.0_3.0_1700530468237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_codingjacob_en_5.2.0_3.0_1700530468237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_codingjacob","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_codingjacob", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_codingjacob| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/codingJacob/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cogito233_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cogito233_en.md new file mode 100644 index 000000000000..41d580dee8cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_cogito233_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_cogito233 DistilBertForTokenClassification from cogito233 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_cogito233 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_cogito233` is an English model originally trained by cogito233.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cogito233_en_5.2.0_3.0_1700549562083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_cogito233_en_5.2.0_3.0_1700549562083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_cogito233","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_cogito233", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_cogito233| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/cogito233/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_dbsamu_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_dbsamu_en.md new file mode 100644 index 000000000000..b6701d0b27cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_dbsamu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_dbsamu DistilBertForTokenClassification from dbsamu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_dbsamu +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_dbsamu` is an English model originally trained by dbsamu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_dbsamu_en_5.2.0_3.0_1700552160504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_dbsamu_en_5.2.0_3.0_1700552160504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_dbsamu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_dbsamu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_dbsamu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dbsamu/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_delpart_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_delpart_en.md new file mode 100644 index 000000000000..e82b59a9acd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_delpart_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_delpart DistilBertForTokenClassification from delpart +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_delpart +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_delpart` is an English model originally trained by delpart.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_delpart_en_5.2.0_3.0_1700543507860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_delpart_en_5.2.0_3.0_1700543507860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_delpart","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_delpart", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_delpart| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/delpart/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_deval_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_deval_en.md new file mode 100644 index 000000000000..2e8f4e64a146 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_deval_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_deval DistilBertForTokenClassification from deval +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_deval +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_deval` is an English model originally trained by deval.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_deval_en_5.2.0_3.0_1700544586876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_deval_en_5.2.0_3.0_1700544586876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_deval","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_deval", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_deval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/deval/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_duc_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_duc_en.md new file mode 100644 index 000000000000..16f30a4f4ae9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_duc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_duc DistilBertForTokenClassification from Duc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_duc +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_duc` is an English model originally trained by Duc.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_duc_en_5.2.0_3.0_1700536591243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_duc_en_5.2.0_3.0_1700536591243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_duc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_duc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_duc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Duc/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_evgeneus_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_evgeneus_en.md new file mode 100644 index 000000000000..4e9be72b0d63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_evgeneus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_evgeneus DistilBertForTokenClassification from Evgeneus +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_evgeneus +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_evgeneus` is an English model originally trained by Evgeneus.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_evgeneus_en_5.2.0_3.0_1700546614670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_evgeneus_en_5.2.0_3.0_1700546614670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_evgeneus","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_evgeneus", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_evgeneus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Evgeneus/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fahaddilib_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fahaddilib_en.md new file mode 100644 index 000000000000..6c5f04fc0fea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fahaddilib_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_fahaddilib DistilBertForTokenClassification from fahaddilib +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_fahaddilib +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_fahaddilib` is an English model originally trained by fahaddilib.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fahaddilib_en_5.2.0_3.0_1700589054102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fahaddilib_en_5.2.0_3.0_1700589054102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_fahaddilib","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_fahaddilib", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_fahaddilib| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/fahaddilib/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fiddi_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fiddi_en.md new file mode 100644 index 000000000000..f306b31087c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fiddi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_fiddi DistilBertForTokenClassification from Fiddi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_fiddi +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_fiddi` is an English model originally trained by Fiddi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fiddi_en_5.2.0_3.0_1700543972490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fiddi_en_5.2.0_3.0_1700543972490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_fiddi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_fiddi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_fiddi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Fiddi/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..d00f7c0bd988 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_finetuned_ner DistilBertForTokenClassification from Maxaontrix +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_finetuned_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_finetuned_ner` is an English model originally trained by Maxaontrix.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1700590392945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1700590392945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Maxaontrix/distilbert-base-uncased-finetuned-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fredmath_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fredmath_en.md new file mode 100644 index 000000000000..f11a3777c3c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_fredmath_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_fredmath DistilBertForTokenClassification from FredMath +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_fredmath +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_fredmath` is an English model originally trained by FredMath.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fredmath_en_5.2.0_3.0_1700575094459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fredmath_en_5.2.0_3.0_1700575094459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_fredmath","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_fredmath", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_fredmath| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/FredMath/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_gagan3012_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_gagan3012_en.md new file mode 100644 index 000000000000..9665a8e3eb6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_gagan3012_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_gagan3012 DistilBertForTokenClassification from gagan3012 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_gagan3012 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_gagan3012` is an English model originally trained by gagan3012.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gagan3012_en_5.2.0_3.0_1700544483674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gagan3012_en_5.2.0_3.0_1700544483674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_gagan3012","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_gagan3012", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_gagan3012| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/gagan3012/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hank_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hank_en.md new file mode 100644 index 000000000000..df2a612344e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hank_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_hank DistilBertForTokenClassification from Hank +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_hank +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_hank` is an English model originally trained by Hank.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hank_en_5.2.0_3.0_1700545004755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hank_en_5.2.0_3.0_1700545004755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_hank","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_hank", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_hank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Hank/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hannahbillo_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hannahbillo_en.md new file mode 100644 index 000000000000..f45b1f0506e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hannahbillo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_hannahbillo DistilBertForTokenClassification from hannahbillo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_hannahbillo +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_hannahbillo` is an English model originally trained by hannahbillo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hannahbillo_en_5.2.0_3.0_1700591252402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hannahbillo_en_5.2.0_3.0_1700591252402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_hannahbillo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_hannahbillo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_hannahbillo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hannahbillo/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_histinct7002_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_histinct7002_en.md new file mode 100644 index 000000000000..0a23785a0581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_histinct7002_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_histinct7002 DistilBertForTokenClassification from histinct7002 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_histinct7002 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_histinct7002` is an English model originally trained by histinct7002.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_histinct7002_en_5.2.0_3.0_1700548420829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_histinct7002_en_5.2.0_3.0_1700548420829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_histinct7002","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_histinct7002", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_histinct7002| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/histinct7002/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hkoll2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hkoll2_en.md new file mode 100644 index 000000000000..72d1da0fccfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hkoll2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_hkoll2 DistilBertForTokenClassification from hkoll2 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_hkoll2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_hkoll2` is an English model originally trained by hkoll2.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hkoll2_en_5.2.0_3.0_1700583446283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hkoll2_en_5.2.0_3.0_1700583446283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_hkoll2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_hkoll2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_hkoll2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hkoll2/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hyerim_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hyerim_en.md new file mode 100644 index 000000000000..0691a2d7d645 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_hyerim_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_hyerim DistilBertForTokenClassification from hyerim +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_hyerim +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_hyerim` is an English model originally trained by hyerim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hyerim_en_5.2.0_3.0_1700565363816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hyerim_en_5.2.0_3.0_1700565363816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_hyerim","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_hyerim", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_hyerim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hyerim/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_indridinn_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_indridinn_en.md new file mode 100644 index 000000000000..df3bb2a723a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_indridinn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_indridinn DistilBertForTokenClassification from indridinn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_indridinn +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_indridinn` is an English model originally trained by indridinn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_indridinn_en_5.2.0_3.0_1700556265218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_indridinn_en_5.2.0_3.0_1700556265218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_indridinn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_indridinn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_indridinn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/indridinn/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_invoicesendername_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_invoicesendername_en.md new file mode 100644 index 000000000000..916d99636939 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_invoicesendername_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_invoicesendername DistilBertForTokenClassification from Lilya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_invoicesendername +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_invoicesendername` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_invoicesendername_en_5.2.0_3.0_1700569802390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_invoicesendername_en_5.2.0_3.0_1700569802390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_invoicesendername","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_invoicesendername", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_invoicesendername| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-finetuned-ner-invoiceSenderName \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kamsut_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kamsut_en.md new file mode 100644 index 000000000000..028b49f30ae6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kamsut_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_kamsut DistilBertForTokenClassification from KamSut +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_kamsut +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_kamsut` is an English model originally trained by KamSut.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kamsut_en_5.2.0_3.0_1700543407620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kamsut_en_5.2.0_3.0_1700543407620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_kamsut","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_kamsut", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_kamsut| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/KamSut/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kizunasunhy_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kizunasunhy_en.md new file mode 100644 index 000000000000..d21d7c308a04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_kizunasunhy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_kizunasunhy DistilBertForTokenClassification from kizunasunhy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_kizunasunhy +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_kizunasunhy` is an English model originally trained by kizunasunhy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kizunasunhy_en_5.2.0_3.0_1700587775444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kizunasunhy_en_5.2.0_3.0_1700587775444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_kizunasunhy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_kizunasunhy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_kizunasunhy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/kizunasunhy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lbw_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lbw_en.md new file mode 100644 index 000000000000..3c1b6a64770b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lbw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_lbw DistilBertForTokenClassification from lbw +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_lbw +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_lbw` is an English model originally trained by lbw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_lbw_en_5.2.0_3.0_1700574847694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_lbw_en_5.2.0_3.0_1700574847694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_lbw","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_lbw", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_lbw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/lbw/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_leonadase_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_leonadase_en.md new file mode 100644 index 000000000000..c047c2d2c905 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_leonadase_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_leonadase DistilBertForTokenClassification from leonadase +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_leonadase +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_leonadase` is an English model originally trained by leonadase.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_leonadase_en_5.2.0_3.0_1700569316651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_leonadase_en_5.2.0_3.0_1700569316651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_leonadase","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_leonadase", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_leonadase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/leonadase/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_linasaba_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_linasaba_en.md new file mode 100644 index 000000000000..fa9c7600c31b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_linasaba_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_linasaba DistilBertForTokenClassification from LinaSaba +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_linasaba +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_linasaba` is an English model originally trained by LinaSaba.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_linasaba_en_5.2.0_3.0_1700577871323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_linasaba_en_5.2.0_3.0_1700577871323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_linasaba","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_linasaba", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_linasaba| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/LinaSaba/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lucasmtz_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lucasmtz_en.md new file mode 100644 index 000000000000..88aedb52a757 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_lucasmtz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_lucasmtz DistilBertForTokenClassification from lucasmtz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_lucasmtz +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_lucasmtz` is an English model originally trained by lucasmtz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_lucasmtz_en_5.2.0_3.0_1700568889129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_lucasmtz_en_5.2.0_3.0_1700568889129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_lucasmtz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_lucasmtz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_lucasmtz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/lucasmtz/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mackseem_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mackseem_en.md new file mode 100644 index 000000000000..73cd1f6230ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mackseem_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mackseem DistilBertForTokenClassification from mackseem +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mackseem +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mackseem` is an English model originally trained by mackseem.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mackseem_en_5.2.0_3.0_1700554884507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mackseem_en_5.2.0_3.0_1700554884507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mackseem","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mackseem", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mackseem| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mackseem/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mcdzwil_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mcdzwil_en.md new file mode 100644 index 000000000000..f5d42beb899f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mcdzwil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mcdzwil DistilBertForTokenClassification from mcdzwil +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mcdzwil +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mcdzwil` is an English model originally trained by mcdzwil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mcdzwil_en_5.2.0_3.0_1700564510715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mcdzwil_en_5.2.0_3.0_1700564510715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mcdzwil","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mcdzwil", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mcdzwil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mcdzwil/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mdroth_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mdroth_en.md new file mode 100644 index 000000000000..b7bfdb4bbf4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mdroth_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mdroth DistilBertForTokenClassification from mdroth +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mdroth +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mdroth` is an English model originally trained by mdroth.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mdroth_en_5.2.0_3.0_1700585581414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mdroth_en_5.2.0_3.0_1700585581414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mdroth","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mdroth", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mdroth| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mdroth/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mikhailgalperin_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mikhailgalperin_en.md new file mode 100644 index 000000000000..301474aa22db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mikhailgalperin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mikhailgalperin DistilBertForTokenClassification from MikhailGalperin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mikhailgalperin +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mikhailgalperin` is an English model originally trained by MikhailGalperin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mikhailgalperin_en_5.2.0_3.0_1700591577756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mikhailgalperin_en_5.2.0_3.0_1700591577756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mikhailgalperin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mikhailgalperin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mikhailgalperin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/MikhailGalperin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_minowa_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_minowa_en.md new file mode 100644 index 000000000000..7325586bfcd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_minowa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_minowa DistilBertForTokenClassification from Minowa +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_minowa +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_minowa` is an English model originally trained by Minowa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_minowa_en_5.2.0_3.0_1700552218019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_minowa_en_5.2.0_3.0_1700552218019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_minowa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_minowa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_minowa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Minowa/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mohammedhb_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mohammedhb_en.md new file mode 100644 index 000000000000..69d16527cb71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mohammedhb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mohammedhb DistilBertForTokenClassification from MohammedHB +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mohammedhb +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mohammedhb` is an English model originally trained by MohammedHB.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mohammedhb_en_5.2.0_3.0_1700589054075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mohammedhb_en_5.2.0_3.0_1700589054075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mohammedhb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mohammedhb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mohammedhb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/MohammedHB/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_momo_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_momo_en.md new file mode 100644 index 000000000000..0c4870dc651f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_momo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_momo DistilBertForTokenClassification from momo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_momo +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_momo` is an English model originally trained by momo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_momo_en_5.2.0_3.0_1700564548578.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_momo_en_5.2.0_3.0_1700564548578.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_momo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_momo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_momo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/momo/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mood_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mood_en.md new file mode 100644 index 000000000000..ab6311488e88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_mood_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mood DistilBertForTokenClassification from Mood +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mood +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mood` is an English model originally trained by Mood.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mood_en_5.2.0_3.0_1700539869048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mood_en_5.2.0_3.0_1700539869048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mood","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mood", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mood| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Mood/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ncduy_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ncduy_en.md new file mode 100644 index 000000000000..83494bb26874 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ncduy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ncduy DistilBertForTokenClassification from ncduy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ncduy +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ncduy` is an English model originally trained by ncduy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ncduy_en_5.2.0_3.0_1700556619627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ncduy_en_5.2.0_3.0_1700556619627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ncduy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ncduy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ncduy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ncduy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_nishmithaur_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_nishmithaur_en.md new file mode 100644 index 000000000000..ba5d7bdd69e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_nishmithaur_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nishmithaur DistilBertForTokenClassification from nishmithaur +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nishmithaur +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nishmithaur` is an English model originally trained by nishmithaur.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nishmithaur_en_5.2.0_3.0_1700537631990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nishmithaur_en_5.2.0_3.0_1700537631990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nishmithaur","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nishmithaur", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nishmithaur| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/nishmithaur/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_only_actions_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_only_actions_en.md new file mode 100644 index 000000000000..59fbd4fbd3b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_only_actions_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_only_actions DistilBertForTokenClassification from nestoralvaro +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_only_actions +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_only_actions` is an English model originally trained by nestoralvaro.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_only_actions_en_5.2.0_3.0_1700532919291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_only_actions_en_5.2.0_3.0_1700532919291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_only_actions","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_only_actions", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_only_actions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nestoralvaro/distilbert-base-uncased-finetuned-ner_only_actions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pabloguinea_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pabloguinea_en.md new file mode 100644 index 000000000000..cca9e0b5a4c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pabloguinea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pabloguinea DistilBertForTokenClassification from PabloGuinea +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pabloguinea +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_pabloguinea` is an English model originally trained by PabloGuinea.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pabloguinea_en_5.2.0_3.0_1700580816211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pabloguinea_en_5.2.0_3.0_1700580816211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pabloguinea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_pabloguinea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pabloguinea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_prao_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_prao_en.md new file mode 100644 index 000000000000..364606573828 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_prao_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_prao DistilBertForTokenClassification from prao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_prao +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_prao` is an English model originally trained by prao.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prao_en_5.2.0_3.0_1700560967314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_prao_en_5.2.0_3.0_1700560967314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_prao","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_prao", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_prao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prao/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pytest_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pytest_en.md new file mode 100644 index 000000000000..f3d058370dc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_pytest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pytest DistilBertForTokenClassification from pytest +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pytest +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_pytest` is an English model originally trained by pytest.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pytest_en_5.2.0_3.0_1700572576176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pytest_en_5.2.0_3.0_1700572576176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pytest","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_pytest", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pytest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pytest/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saiteja_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saiteja_en.md new file mode 100644 index 000000000000..7bd79a43bc48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saiteja_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_saiteja DistilBertForTokenClassification from Saiteja +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_saiteja +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_saiteja` is an English model originally trained by Saiteja.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saiteja_en_5.2.0_3.0_1700574532118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saiteja_en_5.2.0_3.0_1700574532118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_saiteja","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_saiteja", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_saiteja| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Saiteja/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saurabhkaushik_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saurabhkaushik_en.md new file mode 100644 index 000000000000..fab5e89dbd60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_saurabhkaushik_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_saurabhkaushik DistilBertForTokenClassification from SaurabhKaushik +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_saurabhkaushik +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_saurabhkaushik` is an English model originally trained by SaurabhKaushik.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saurabhkaushik_en_5.2.0_3.0_1700578796272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saurabhkaushik_en_5.2.0_3.0_1700578796272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_saurabhkaushik","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_saurabhkaushik", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_saurabhkaushik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SaurabhKaushik/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_seishin_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_seishin_en.md new file mode 100644 index 000000000000..50a715d26a5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_seishin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_seishin DistilBertForTokenClassification from SEISHIN +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_seishin +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_seishin` is an English model originally trained by SEISHIN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_seishin_en_5.2.0_3.0_1700533895069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_seishin_en_5.2.0_3.0_1700533895069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_seishin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_seishin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_seishin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SEISHIN/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_shenyancheng_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_shenyancheng_en.md new file mode 100644 index 000000000000..f852b0769ed0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_shenyancheng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_shenyancheng DistilBertForTokenClassification from Shenyancheng +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_shenyancheng +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_shenyancheng` is an English model originally trained by Shenyancheng.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_shenyancheng_en_5.2.0_3.0_1700551327460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_shenyancheng_en_5.2.0_3.0_1700551327460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_shenyancheng","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_shenyancheng", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_shenyancheng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Shenyancheng/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_songrb_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_songrb_en.md new file mode 100644 index 000000000000..7e36c02488af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_songrb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_songrb DistilBertForTokenClassification from SongRb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_songrb +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_songrb` is an English model originally trained by SongRb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_songrb_en_5.2.0_3.0_1700537632024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_songrb_en_5.2.0_3.0_1700537632024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_songrb","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_songrb", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_songrb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SongRb/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sschangi_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sschangi_en.md new file mode 100644 index 000000000000..fbf3a227978a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sschangi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_sschangi DistilBertForTokenClassification from sschangi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_sschangi +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_sschangi` is an English model originally trained by sschangi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sschangi_en_5.2.0_3.0_1700575372766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sschangi_en_5.2.0_3.0_1700575372766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_sschangi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_sschangi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_sschangi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sschangi/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ssiyer_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ssiyer_en.md new file mode 100644 index 000000000000..59583e156a64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_ssiyer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ssiyer DistilBertForTokenClassification from ssiyer +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ssiyer +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ssiyer` is an English model originally trained by ssiyer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ssiyer_en_5.2.0_3.0_1700530377084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ssiyer_en_5.2.0_3.0_1700530377084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ssiyer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ssiyer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ssiyer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ssiyer/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sunwei2011_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sunwei2011_en.md new file mode 100644 index 000000000000..6ae28c1398b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_sunwei2011_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_sunwei2011 DistilBertForTokenClassification from sunwei2011 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_sunwei2011 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_sunwei2011` is an English model originally trained by sunwei2011.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sunwei2011_en_5.2.0_3.0_1700580739543.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sunwei2011_en_5.2.0_3.0_1700580739543.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_sunwei2011","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_sunwei2011", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_sunwei2011| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sunwei2011/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_swang2000_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_swang2000_en.md new file mode 100644 index 000000000000..9291334b1301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_swang2000_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_swang2000 DistilBertForTokenClassification from swang2000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_swang2000 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_swang2000` is an English model originally trained by swang2000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_swang2000_en_5.2.0_3.0_1700581688358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_swang2000_en_5.2.0_3.0_1700581688358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_swang2000","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_swang2000", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_swang2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/swang2000/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_tnavin_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_tnavin_en.md new file mode 100644 index 000000000000..1be0a742c7f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_tnavin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_tnavin DistilBertForTokenClassification from tnavin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_tnavin +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_tnavin` is an English model originally trained by tnavin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tnavin_en_5.2.0_3.0_1700575942730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tnavin_en_5.2.0_3.0_1700575942730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_tnavin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_tnavin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_tnavin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/tnavin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_trans_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_trans_en.md new file mode 100644 index 000000000000..62090c50e938 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_trans_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_trans DistilBertForTokenClassification from Lilya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_trans +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_trans` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_trans_en_5.2.0_3.0_1700539993619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_trans_en_5.2.0_3.0_1700539993619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_trans","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_trans", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_trans| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-finetuned-ner-TRANS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_udi_aharon_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_udi_aharon_en.md new file mode 100644 index 000000000000..08d402570032 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_udi_aharon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_udi_aharon DistilBertForTokenClassification from Udi-Aharon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_udi_aharon +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_udi_aharon` is an English model originally trained by Udi-Aharon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_udi_aharon_en_5.2.0_3.0_1700578859404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_udi_aharon_en_5.2.0_3.0_1700578859404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_udi_aharon","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_udi_aharon", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_udi_aharon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Udi-Aharon/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_v3rx2000_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_v3rx2000_en.md new file mode 100644 index 000000000000..766260b16563 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_v3rx2000_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_v3rx2000 DistilBertForTokenClassification from V3RX2000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_v3rx2000 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_v3rx2000` is an English model originally trained by V3RX2000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_v3rx2000_en_5.2.0_3.0_1700532913964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_v3rx2000_en_5.2.0_3.0_1700532913964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_v3rx2000","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_v3rx2000", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_v3rx2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/V3RX2000/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_vibharkchauhan_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_vibharkchauhan_en.md new file mode 100644 index 000000000000..c96965fb9573 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_vibharkchauhan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vibharkchauhan DistilBertForTokenClassification from Vibharkchauhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vibharkchauhan +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vibharkchauhan` is an English model originally trained by Vibharkchauhan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vibharkchauhan_en_5.2.0_3.0_1700533873232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vibharkchauhan_en_5.2.0_3.0_1700533873232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vibharkchauhan","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vibharkchauhan", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vibharkchauhan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Vibharkchauhan/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_yam1ke_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_yam1ke_en.md new file mode 100644 index 000000000000..129361b604ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_yam1ke_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_yam1ke DistilBertForTokenClassification from yam1ke +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_yam1ke +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_yam1ke` is an English model originally trained by yam1ke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yam1ke_en_5.2.0_3.0_1700581616896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yam1ke_en_5.2.0_3.0_1700581616896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_yam1ke","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_yam1ke", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_yam1ke| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yam1ke/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zarinah_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zarinah_en.md new file mode 100644 index 000000000000..64e8cbd48bfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zarinah_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_zarinah DistilBertForTokenClassification from Zarinah +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_zarinah +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_zarinah` is an English model originally trained by Zarinah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zarinah_en_5.2.0_3.0_1700584443501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zarinah_en_5.2.0_3.0_1700584443501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_zarinah","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_zarinah", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_zarinah| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Zarinah/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zboinek_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zboinek_en.md new file mode 100644 index 000000000000..3cadd9bc4491 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zboinek_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_zboinek DistilBertForTokenClassification from zboinek +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_zboinek +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_zboinek` is an English model originally trained by zboinek. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zboinek_en_5.2.0_3.0_1700573072232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zboinek_en_5.2.0_3.0_1700573072232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_zboinek","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_zboinek", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_zboinek| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/zboinek/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zhihao_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zhihao_en.md new file mode 100644 index 000000000000..b348a4ab7498 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_ner_zhihao_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_zhihao DistilBertForTokenClassification from zhihao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_zhihao +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_zhihao` is an English model originally trained by zhihao. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zhihao_en_5.2.0_3.0_1700576195607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zhihao_en_5.2.0_3.0_1700576195607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_zhihao","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_zhihao", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_zhihao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/zhihao/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_pos_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_pos_en.md new file mode 100644 index 000000000000..46bd7cecfd75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_pos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pos DistilBertForTokenClassification from AymenKallala +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pos +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pos` is an English model originally trained by AymenKallala. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pos_en_5.2.0_3.0_1700591393001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pos_en_5.2.0_3.0_1700591393001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_pos","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_pos", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/AymenKallala/distilbert-base-uncased-finetuned-POS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_floressullon_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_floressullon_en.md new file mode 100644 index 000000000000..53b425c01bdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_floressullon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sayula_popoluca_floressullon DistilBertForTokenClassification from floressullon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sayula_popoluca_floressullon +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sayula_popoluca_floressullon` is an English model originally trained by floressullon. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_floressullon_en_5.2.0_3.0_1700528343911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_floressullon_en_5.2.0_3.0_1700528343911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_floressullon","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_floressullon", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sayula_popoluca_floressullon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/floressullon/distilbert-base-uncased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_tbosse_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_tbosse_en.md new file mode 100644 index 000000000000..e1c4f11a40c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sayula_popoluca_tbosse_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sayula_popoluca_tbosse DistilBertForTokenClassification from tbosse +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sayula_popoluca_tbosse +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sayula_popoluca_tbosse` is an English model originally trained by tbosse. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_tbosse_en_5.2.0_3.0_1700562855930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_tbosse_en_5.2.0_3.0_1700562855930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_tbosse","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_tbosse", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sayula_popoluca_tbosse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/tbosse/distilbert-base-uncased-finetuned-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_scientific_eval_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_scientific_eval_en.md new file mode 100644 index 000000000000..3bb31d32349c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_scientific_eval_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_scientific_eval DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_scientific_eval +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_scientific_eval` is an English model originally trained by reyhanemyr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_scientific_eval_en_5.2.0_3.0_1700588166377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_scientific_eval_en_5.2.0_3.0_1700588166377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_scientific_eval","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_scientific_eval", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_scientific_eval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-uncased-finetuned-scientific-eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative_en.md new file mode 100644 index 000000000000..df53cc9a53dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative_en_5.2.0_3.0_1700527547025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative_en_5.2.0_3.0_1700527547025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sst_2_english_finetuned_argumentative| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilbert-base-uncased-finetuned-sst-2-english-finetuned-argumentative \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tokenclassification_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tokenclassification_en.md new file mode 100644 index 000000000000..93d5e86790ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tokenclassification_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_tokenclassification DistilBertForTokenClassification from Thi-Thu-Huong +author: John Snow Labs +name: distilbert_base_uncased_finetuned_tokenclassification +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_tokenclassification` is an English model originally trained by Thi-Thu-Huong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tokenclassification_en_5.2.0_3.0_1700588680107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tokenclassification_en_5.2.0_3.0_1700588680107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_tokenclassification","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_tokenclassification", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_tokenclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Thi-Thu-Huong/distilbert-base-uncased-finetuned-tokenclassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tt2_exam_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tt2_exam_en.md new file mode 100644 index 000000000000..078d6cafffc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_tt2_exam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_tt2_exam DistilBertForTokenClassification from roschmid +author: John Snow Labs +name: distilbert_base_uncased_finetuned_tt2_exam +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_tt2_exam` is an English model originally trained by roschmid. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tt2_exam_en_5.2.0_3.0_1700582668092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_tt2_exam_en_5.2.0_3.0_1700582668092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_tt2_exam","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_tt2_exam", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_tt2_exam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/roschmid/distilbert-base-uncased-finetuned-TT2-exam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_wnut_17_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_wnut_17_ner_en.md new file mode 100644 index 000000000000..1fd80ed648c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_finetuned_wnut_17_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_wnut_17_ner DistilBertForTokenClassification from maksim2000153 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_wnut_17_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_wnut_17_ner` is an English model originally trained by maksim2000153. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_wnut_17_ner_en_5.2.0_3.0_1700589176863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_wnut_17_ner_en_5.2.0_3.0_1700589176863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_wnut_17_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_wnut_17_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_wnut_17_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/maksim2000153/distilbert-base-uncased-finetuned-wnut_17-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_mapa_ner_coarse_grained_v2_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_mapa_ner_coarse_grained_v2_en.md new file mode 100644 index 000000000000..d755a2faf110 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_mapa_ner_coarse_grained_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_mapa_ner_coarse_grained_v2 DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_uncased_mapa_ner_coarse_grained_v2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mapa_ner_coarse_grained_v2` is an English model originally trained by dmargutierrez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mapa_ner_coarse_grained_v2_en_5.2.0_3.0_1700553052565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mapa_ner_coarse_grained_v2_en_5.2.0_3.0_1700553052565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_mapa_ner_coarse_grained_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_mapa_ner_coarse_grained_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_mapa_ner_coarse_grained_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-uncased-mapa-ner-coarse_grained-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_chuvash_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_chuvash_en.md new file mode 100644 index 000000000000..fb6189592d8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_chuvash_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_chuvash DistilBertForTokenClassification from jhonparra18 +author: John Snow Labs +name: distilbert_base_uncased_ner_chuvash +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_chuvash` is an English model originally trained by jhonparra18.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_chuvash_en_5.2.0_3.0_1700558317339.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_chuvash_en_5.2.0_3.0_1700558317339.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_chuvash","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_chuvash", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_chuvash| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jhonparra18/distilbert-base-uncased-ner_cv \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_andi611_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_andi611_en.md new file mode 100644 index 000000000000..dbc1623bb983 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_andi611_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_conll2003_andi611 DistilBertForTokenClassification from andi611 +author: John Snow Labs +name: distilbert_base_uncased_ner_conll2003_andi611 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_conll2003_andi611` is an English model originally trained by andi611.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_conll2003_andi611_en_5.2.0_3.0_1700549562055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_conll2003_andi611_en_5.2.0_3.0_1700549562055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_conll2003_andi611","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_conll2003_andi611", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_conll2003_andi611| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/andi611/distilbert-base-uncased-ner-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_gladiator_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_gladiator_en.md new file mode 100644 index 000000000000..bd3f37370f47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_conll2003_gladiator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_conll2003_gladiator DistilBertForTokenClassification from Gladiator +author: John Snow Labs +name: distilbert_base_uncased_ner_conll2003_gladiator +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_conll2003_gladiator` is an English model originally trained by Gladiator.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_conll2003_gladiator_en_5.2.0_3.0_1700567688331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_conll2003_gladiator_en_5.2.0_3.0_1700567688331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_conll2003_gladiator","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_conll2003_gladiator", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_conll2003_gladiator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gladiator/distilbert-base-uncased_ner_conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02_en.md new file mode 100644 index 000000000000..5fd8653a777a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02 DistilBertForTokenClassification from Lilya +author: John Snow Labs +name: distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02_en_5.2.0_3.0_1700578259961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02_en_5.2.0_3.0_1700578259961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_27_02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-ner-invoiceSenderRecipient_clean_inv_27_02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_mit_restaurant_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_mit_restaurant_en.md new file mode 100644 index 000000000000..eadbfd7bef1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_mit_restaurant_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_mit_restaurant DistilBertForTokenClassification from andi611 +author: John Snow Labs +name: distilbert_base_uncased_ner_mit_restaurant +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_mit_restaurant` is an English model originally trained by andi611.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_mit_restaurant_en_5.2.0_3.0_1700528445202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_mit_restaurant_en_5.2.0_3.0_1700528445202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_mit_restaurant","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_mit_restaurant", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_mit_restaurant| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/andi611/distilbert-base-uncased-ner-mit-restaurant \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_visbank_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_visbank_en.md new file mode 100644 index 000000000000..5454f4666c81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_visbank_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_visbank DistilBertForTokenClassification from Yamei +author: John Snow Labs +name: distilbert_base_uncased_ner_visbank +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_visbank` is an English model originally trained by Yamei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_visbank_en_5.2.0_3.0_1700577870369.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_visbank_en_5.2.0_3.0_1700577870369.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_visbank","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_visbank", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_visbank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Yamei/distilbert-base-uncased_NER_VISBank \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wikiann_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wikiann_en.md new file mode 100644 index 000000000000..7e0f8ceb1c67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wikiann_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_wikiann DistilBertForTokenClassification from Gladiator +author: John Snow Labs +name: distilbert_base_uncased_ner_wikiann +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_wikiann` is an English model originally trained by Gladiator. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_wikiann_en_5.2.0_3.0_1700558378341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_wikiann_en_5.2.0_3.0_1700558378341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_wikiann","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_wikiann", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_wikiann| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gladiator/distilbert-base-uncased_ner_wikiann \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wnut_17_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wnut_17_en.md new file mode 100644 index 000000000000..ccf9250b9658 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_ner_wnut_17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_wnut_17 DistilBertForTokenClassification from Gladiator +author: John Snow Labs +name: distilbert_base_uncased_ner_wnut_17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_wnut_17` is an English model originally trained by Gladiator. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_wnut_17_en_5.2.0_3.0_1700529351007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_wnut_17_en_5.2.0_3.0_1700529351007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_wnut_17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_wnut_17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_wnut_17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gladiator/distilbert-base-uncased_ner_wnut_17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_tasteset_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_tasteset_ner_en.md new file mode 100644 index 000000000000..2f34e9c2e635 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_tasteset_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_tasteset_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_uncased_tasteset_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_tasteset_ner` is an English model originally trained by dmargutierrez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_tasteset_ner_en_5.2.0_3.0_1700548582498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_tasteset_ner_en_5.2.0_3.0_1700548582498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_tasteset_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_tasteset_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_tasteset_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-uncased-TASTESet-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_test2_oyvindgrutle_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_test2_oyvindgrutle_en.md new file mode 100644 index 000000000000..6581995c66d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_test2_oyvindgrutle_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_test2_oyvindgrutle DistilBertForTokenClassification from oyvindgrutle +author: John Snow Labs +name: distilbert_base_uncased_test2_oyvindgrutle +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_test2_oyvindgrutle` is an English model originally trained by oyvindgrutle.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_test2_oyvindgrutle_en_5.2.0_3.0_1700578795647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_test2_oyvindgrutle_en_5.2.0_3.0_1700578795647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_test2_oyvindgrutle","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_test2_oyvindgrutle", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_test2_oyvindgrutle| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/oyvindgrutle/distilbert-base-uncased-test2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_wnut_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_wnut_ner_en.md new file mode 100644 index 000000000000..47d9ff56fd8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_base_uncased_wnut_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_wnut_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_uncased_wnut_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_wnut_ner` is an English model originally trained by dmargutierrez. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_wnut_ner_en_5.2.0_3.0_1700583446502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_wnut_ner_en_5.2.0_3.0_1700583446502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_wnut_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_wnut_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_wnut_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-uncased-WNUT-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_based_german_cased_ler_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_based_german_cased_ler_en.md new file mode 100644 index 000000000000..eeaf2d87ba7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_based_german_cased_ler_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_based_german_cased_ler DistilBertForTokenClassification from joelniklaus +author: John Snow Labs +name: distilbert_based_german_cased_ler +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_based_german_cased_ler` is an English model originally trained by joelniklaus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_based_german_cased_ler_en_5.2.0_3.0_1700561966367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_based_german_cased_ler_en_5.2.0_3.0_1700561966367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_based_german_cased_ler","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_based_german_cased_ler", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_based_german_cased_ler| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.4 MB| + +## References + +https://huggingface.co/joelniklaus/distilbert-based-german-cased-ler \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_carpentries_restaurant_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_carpentries_restaurant_ner_en.md new file mode 100644 index 000000000000..0ea7cd898f41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_carpentries_restaurant_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_carpentries_restaurant_ner DistilBertForTokenClassification from karlholten +author: John Snow Labs +name: distilbert_carpentries_restaurant_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_carpentries_restaurant_ner` is an English model originally trained by karlholten. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_carpentries_restaurant_ner_en_5.2.0_3.0_1700580817347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_carpentries_restaurant_ner_en_5.2.0_3.0_1700580817347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_carpentries_restaurant_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_carpentries_restaurant_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_carpentries_restaurant_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/karlholten/distilbert-carpentries-restaurant-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_casing_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_casing_en.md new file mode 100644 index 000000000000..4d696ba94623 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_casing_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_casing DistilBertForTokenClassification from aseifert +author: John Snow Labs +name: distilbert_casing +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_casing` is an English model originally trained by aseifert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_casing_en_5.2.0_3.0_1700531928609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_casing_en_5.2.0_3.0_1700531928609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_casing","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_casing", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_casing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/aseifert/distilbert-casing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_50k_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_50k_en.md new file mode 100644 index 000000000000..2acd8399e714 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_50k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ai4privacy_50k DistilBertForTokenClassification from Isotonic +author: John Snow Labs +name: distilbert_finetuned_ai4privacy_50k +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ai4privacy_50k` is an English model originally trained by Isotonic. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_50k_en_5.2.0_3.0_1700576951848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_50k_en_5.2.0_3.0_1700576951848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ai4privacy_50k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ai4privacy_50k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ai4privacy_50k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.6 MB| + +## References + +https://huggingface.co/Isotonic/distilbert_finetuned_ai4privacy_50k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_singhtanmay6735_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_singhtanmay6735_en.md new file mode 100644 index 000000000000..c12834ae911b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ai4privacy_singhtanmay6735_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ai4privacy_singhtanmay6735 DistilBertForTokenClassification from singhtanmay6735 +author: John Snow Labs +name: distilbert_finetuned_ai4privacy_singhtanmay6735 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ai4privacy_singhtanmay6735` is an English model originally trained by singhtanmay6735.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_singhtanmay6735_en_5.2.0_3.0_1700540909754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ai4privacy_singhtanmay6735_en_5.2.0_3.0_1700540909754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ai4privacy_singhtanmay6735","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ai4privacy_singhtanmay6735", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ai4privacy_singhtanmay6735| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.6 MB| + +## References + +https://huggingface.co/singhtanmay6735/distilbert_finetuned_ai4privacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_disaster_entity_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_disaster_entity_en.md new file mode 100644 index 000000000000..adb221df6ed0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_disaster_entity_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_disaster_entity DistilBertForTokenClassification from DipeshY +author: John Snow Labs +name: distilbert_finetuned_disaster_entity +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_disaster_entity` is an English model originally trained by DipeshY. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_disaster_entity_en_5.2.0_3.0_1700527391044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_disaster_entity_en_5.2.0_3.0_1700527391044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_disaster_entity","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_disaster_entity", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_disaster_entity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DipeshY/distilbert-finetuned-disaster-entity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_gesture_prediction_9_classes_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_gesture_prediction_9_classes_en.md new file mode 100644 index 000000000000..0a2c730d2435 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_gesture_prediction_9_classes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_gesture_prediction_9_classes DistilBertForTokenClassification from qfrodicio +author: John Snow Labs +name: distilbert_finetuned_gesture_prediction_9_classes +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_gesture_prediction_9_classes` is an English model originally trained by qfrodicio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_9_classes_en_5.2.0_3.0_1700589962924.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_9_classes_en_5.2.0_3.0_1700589962924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_gesture_prediction_9_classes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_gesture_prediction_9_classes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_gesture_prediction_9_classes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/qfrodicio/distilbert-finetuned-gesture-prediction-9-classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ner_ontonotes_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ner_ontonotes_en.md new file mode 100644 index 000000000000..91b44749bf66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_ner_ontonotes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_ontonotes DistilBertForTokenClassification from nickprock +author: John Snow Labs +name: distilbert_finetuned_ner_ontonotes +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_ontonotes` is an English model originally trained by nickprock. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_ontonotes_en_5.2.0_3.0_1700546592804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_ontonotes_en_5.2.0_3.0_1700546592804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_ontonotes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_ontonotes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_ontonotes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/nickprock/distilbert-finetuned-ner-ontonotes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_sayula_popoluca_tag_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_sayula_popoluca_tag_en.md new file mode 100644 index 000000000000..d2ca362aab63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_sayula_popoluca_tag_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_sayula_popoluca_tag DistilBertForTokenClassification from Suraj-Yadav +author: John Snow Labs +name: distilbert_finetuned_sayula_popoluca_tag +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_sayula_popoluca_tag` is an English model originally trained by Suraj-Yadav.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_sayula_popoluca_tag_en_5.2.0_3.0_1700538653260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_sayula_popoluca_tag_en_5.2.0_3.0_1700538653260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_sayula_popoluca_tag","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_sayula_popoluca_tag", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_sayula_popoluca_tag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Suraj-Yadav/distilbert-finetuned-pos-tag \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_wnut17_wandb_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_wnut17_wandb_ner_en.md new file mode 100644 index 000000000000..afe2f232304e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_finetuned_wnut17_wandb_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_wnut17_wandb_ner DistilBertForTokenClassification from anudeepvanjavakam +author: John Snow Labs +name: distilbert_finetuned_wnut17_wandb_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_wnut17_wandb_ner` is an English model originally trained by anudeepvanjavakam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_wnut17_wandb_ner_en_5.2.0_3.0_1700576042886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_wnut17_wandb_ner_en_5.2.0_3.0_1700576042886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_wnut17_wandb_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_wnut17_wandb_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_wnut17_wandb_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/anudeepvanjavakam/distilbert_finetuned_wnut17_wandb_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_chunk_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_chunk_en.md new file mode 100644 index 000000000000..42d3dc452ad5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_chunk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_legal_chunk DistilBertForTokenClassification from SpectaclesLLC +author: John Snow Labs +name: distilbert_legal_chunk +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_legal_chunk` is an English model originally trained by SpectaclesLLC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_legal_chunk_en_5.2.0_3.0_1700547400127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_legal_chunk_en_5.2.0_3.0_1700547400127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_legal_chunk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_legal_chunk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_legal_chunk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/SpectaclesLLC/distilbert-legal-chunk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_definitions_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_definitions_en.md new file mode 100644 index 000000000000..e641819b8af7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_legal_definitions_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_legal_definitions DistilBertForTokenClassification from SpectaclesLLC +author: John Snow Labs +name: distilbert_legal_definitions +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_legal_definitions` is an English model originally trained by SpectaclesLLC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_legal_definitions_en_5.2.0_3.0_1700543003113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_legal_definitions_en_5.2.0_3.0_1700543003113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_legal_definitions","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_legal_definitions", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_legal_definitions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/SpectaclesLLC/distilbert-legal-definitions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_multilingual_base_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_multilingual_base_ner_xx.md new file mode 100644 index 000000000000..48571de94fe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_multilingual_base_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_multilingual_base_ner DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: distilbert_multilingual_base_ner +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual_base_ner` is a Multilingual model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_base_ner_xx_5.2.0_3.0_1700588173139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_base_ner_xx_5.2.0_3.0_1700588173139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_multilingual_base_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_multilingual_base_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/livinNector/distilbert-multilingual-base-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_ner_wnut17_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_ner_wnut17_en.md new file mode 100644 index 000000000000..1d37f67eb88d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_ner_wnut17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_ner_wnut17 DistilBertForTokenClassification from NouRed +author: John Snow Labs +name: distilbert_ner_wnut17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_ner_wnut17` is an English model originally trained by NouRed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_ner_wnut17_en_5.2.0_3.0_1700547473223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_ner_wnut17_en_5.2.0_3.0_1700547473223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_ner_wnut17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_ner_wnut17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_ner_wnut17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/NouRed/distilbert_ner_wnut17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_english_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_english_judgements_laws_en.md new file mode 100644 index 000000000000..6c1d2a75a426 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_english_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_english_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_english_judgements_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_english_judgements_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_english_judgements_laws_en_5.2.0_3.0_1700527817695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_english_judgements_laws_en_5.2.0_3.0_1700527817695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_english_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_english_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_english_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-en-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_judgements_laws_en.md new file mode 100644 index 000000000000..7a79789e4a80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_judgements_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_judgements_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_judgements_laws_en_5.2.0_3.0_1700544676846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_judgements_laws_en_5.2.0_3.0_1700544676846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_judgements_laws_en.md new file mode 100644 index 000000000000..76dc2181bae0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_spanish_italian_english_german_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_spanish_italian_english_german_judgements_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_spanish_italian_english_german_judgements_laws` is an English model originally trained by rcds.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_judgements_laws_en_5.2.0_3.0_1700535633054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_judgements_laws_en_5.2.0_3.0_1700535633054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_spanish_italian_english_german_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_spanish_italian_english_german_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_spanish_italian_english_german_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-es-it-en-de-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_laws_en.md new file mode 100644 index 000000000000..785ba9cc9fe9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_french_spanish_italian_english_german_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_spanish_italian_english_german_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_spanish_italian_english_german_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_spanish_italian_english_german_laws` is an English model originally trained by rcds.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_laws_en_5.2.0_3.0_1700587019658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_laws_en_5.2.0_3.0_1700587019658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_spanish_italian_english_german_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_spanish_italian_english_german_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_spanish_italian_english_german_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-es-it-en-de-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_judgements_laws_en.md new file mode 100644 index 000000000000..21264f309009 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_german_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_german_judgements_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_german_judgements_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_judgements_laws_en_5.2.0_3.0_1700542404811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_judgements_laws_en_5.2.0_3.0_1700542404811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_german_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_german_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_german_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-de-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_laws_en.md new file mode 100644 index 000000000000..f2b8a1ae9337 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_german_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_german_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_german_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_german_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_laws_en_5.2.0_3.0_1700586849001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_laws_en_5.2.0_3.0_1700586849001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_german_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_german_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_german_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-de-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_spanish_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_spanish_judgements_laws_en.md new file mode 100644 index 000000000000..52b093c7e325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_sbd_spanish_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_spanish_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_spanish_judgements_laws +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_spanish_judgements_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_judgements_laws_en_5.2.0_3.0_1700580018326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_judgements_laws_en_5.2.0_3.0_1700580018326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_spanish_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_spanish_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_spanish_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-es-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_setimes_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_setimes_en.md new file mode 100644 index 000000000000..60f1d233d912 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_setimes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_srb_ner_setimes DistilBertForTokenClassification from Aleksandar +author: John Snow Labs +name: distilbert_srb_ner_setimes +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_srb_ner_setimes` is an English model originally trained by Aleksandar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_srb_ner_setimes_en_5.2.0_3.0_1700533923372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_srb_ner_setimes_en_5.2.0_3.0_1700533923372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_srb_ner_setimes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_srb_ner_setimes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_srb_ner_setimes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|306.0 MB| + +## References + +https://huggingface.co/Aleksandar/distilbert-srb-ner-setimes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_sr.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_sr.md new file mode 100644 index 000000000000..b626046f6d6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_srb_ner_sr.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Serbian distilbert_srb_ner DistilBertForTokenClassification from Aleksandar +author: John Snow Labs +name: distilbert_srb_ner +date: 2023-11-21 +tags: [bert, sr, open_source, token_classification, onnx] +task: Named Entity Recognition +language: sr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_srb_ner` is a Serbian model originally trained by Aleksandar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_srb_ner_sr_5.2.0_3.0_1700529816033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_srb_ner_sr_5.2.0_3.0_1700529816033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_srb_ner","sr") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_srb_ner", "sr") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_srb_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|sr| +|Size:|306.0 MB| + +## References + +https://huggingface.co/Aleksandar/distilbert-srb-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_class_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_class_en.md new file mode 100644 index 000000000000..710b978710a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_class_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_class DistilBertForTokenClassification from Varunreddy +author: John Snow Labs +name: distilbert_token_class +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_class` is an English model originally trained by Varunreddy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_class_en_5.2.0_3.0_1700591090678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_class_en_5.2.0_3.0_1700591090678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_class","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_class", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Varunreddy/distilbert-token-class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58_en.md new file mode 100644 index 000000000000..4f1ab64dfd5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58_en_5.2.0_3.0_1700545528224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58_en_5.2.0_3.0_1700545528224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_0_0001_all_01_03_2022_14_30_58| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilbert_token_itr0_0.0001_all_01_03_2022-14_30_58 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33_en.md new file mode 100644 index 000000000000..150db9fcd9e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33_en_5.2.0_3.0_1700551286786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33_en_5.2.0_3.0_1700551286786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_1e_05_all_01_03_2022_14_33_33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilbert_token_itr0_1e-05_all_01_03_2022-14_33_33 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04_en.md new file mode 100644 index 000000000000..df6d24fb01b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04_en_5.2.0_3.0_1700570731354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04_en_5.2.0_3.0_1700570731354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_1e_05_all_01_03_2022_15_14_04| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilBERT_token_itr0_1e-05_all_01_03_2022-15_14_04 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47_en.md new file mode 100644 index 000000000000..acb641c8352f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47_en_5.2.0_3.0_1700575279883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47_en_5.2.0_3.0_1700575279883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_1e_05_editorials_01_03_2022_15_12_47| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilBERT_token_itr0_1e-05_editorials_01_03_2022-15_12_47 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44_en.md new file mode 100644 index 000000000000..ef16f0dcd373 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44_en_5.2.0_3.0_1700579901301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44_en_5.2.0_3.0_1700579901301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_1e_05_essays_01_03_2022_15_11_44| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilBERT_token_itr0_1e-05_essays_01_03_2022-15_11_44 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39_en.md new file mode 100644 index 000000000000..c0ddde27bc5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39_en_5.2.0_3.0_1700558378399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39_en_5.2.0_3.0_1700558378399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token_itr0_1e_05_webdiscourse_01_03_2022_15_10_39| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/distilBERT_token_itr0_1e-05_webDiscourse_01_03_2022-15_10_39 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-distilbert_uncased_finetuned_wnut17_en.md b/docs/_posts/ahmedlone127/2023-11-21-distilbert_uncased_finetuned_wnut17_en.md new file mode 100644 index 000000000000..c7ec3acb1857 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-distilbert_uncased_finetuned_wnut17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_uncased_finetuned_wnut17 DistilBertForTokenClassification from anudeepvanjavakam +author: John Snow Labs +name: distilbert_uncased_finetuned_wnut17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_uncased_finetuned_wnut17` is an English model originally trained by anudeepvanjavakam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_uncased_finetuned_wnut17_en_5.2.0_3.0_1700571721440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_uncased_finetuned_wnut17_en_5.2.0_3.0_1700571721440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_uncased_finetuned_wnut17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_uncased_finetuned_wnut17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_uncased_finetuned_wnut17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/anudeepvanjavakam/distilbert_uncased_finetuned_wnut17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-entity_extract_en.md b/docs/_posts/ahmedlone127/2023-11-21-entity_extract_en.md new file mode 100644 index 000000000000..0ff27c0a38db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-entity_extract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English entity_extract DistilBertForTokenClassification from abcdda +author: John Snow Labs +name: entity_extract +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `entity_extract` is an English model originally trained by abcdda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_extract_en_5.2.0_3.0_1700546511189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/entity_extract_en_5.2.0_3.0_1700546511189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("entity_extract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("entity_extract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|entity_extract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abcdda/entity_extract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_en.md new file mode 100644 index 000000000000..07a4b3a7d551 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English evaluating_student_writing_distibert_ner DistilBertForTokenClassification from NahedAbdelgaber +author: John Snow Labs +name: evaluating_student_writing_distibert_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `evaluating_student_writing_distibert_ner` is an English model originally trained by NahedAbdelgaber. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/evaluating_student_writing_distibert_ner_en_5.2.0_3.0_1700552972148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/evaluating_student_writing_distibert_ner_en_5.2.0_3.0_1700552972148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("evaluating_student_writing_distibert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("evaluating_student_writing_distibert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|evaluating_student_writing_distibert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NahedAbdelgaber/evaluating-student-writing-distibert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_with_metric_en.md b/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_with_metric_en.md new file mode 100644 index 000000000000..ca2e9568acc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-evaluating_student_writing_distibert_ner_with_metric_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English evaluating_student_writing_distibert_ner_with_metric DistilBertForTokenClassification from NahedAbdelgaber +author: John Snow Labs +name: evaluating_student_writing_distibert_ner_with_metric +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `evaluating_student_writing_distibert_ner_with_metric` is an English model originally trained by NahedAbdelgaber.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/evaluating_student_writing_distibert_ner_with_metric_en_5.2.0_3.0_1700548734472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/evaluating_student_writing_distibert_ner_with_metric_en_5.2.0_3.0_1700548734472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("evaluating_student_writing_distibert_ner_with_metric","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("evaluating_student_writing_distibert_ner_with_metric", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|evaluating_student_writing_distibert_ner_with_metric| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NahedAbdelgaber/evaluating-student-writing-distibert-ner-with-metric \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-fine_tuned_cybersecurity_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-fine_tuned_cybersecurity_ner_en.md new file mode 100644 index 000000000000..d9c455e0b27a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-fine_tuned_cybersecurity_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fine_tuned_cybersecurity_ner DistilBertForTokenClassification from Abiral7 +author: John Snow Labs +name: fine_tuned_cybersecurity_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_cybersecurity_ner` is an English model originally trained by Abiral7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_cybersecurity_ner_en_5.2.0_3.0_1700538826843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_cybersecurity_ner_en_5.2.0_3.0_1700538826843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("fine_tuned_cybersecurity_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("fine_tuned_cybersecurity_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_cybersecurity_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Abiral7/fine-tuned-cybersecurity-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_distilbert_persian_farsi_zwnj_base_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_distilbert_persian_farsi_zwnj_base_ner_en.md new file mode 100644 index 000000000000..a063d11c3158 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_distilbert_persian_farsi_zwnj_base_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_distilbert_persian_farsi_zwnj_base_ner DistilBertForTokenClassification from mehdidn +author: John Snow Labs +name: finetuned_distilbert_persian_farsi_zwnj_base_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_distilbert_persian_farsi_zwnj_base_ner` is an English model originally trained by mehdidn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_persian_farsi_zwnj_base_ner_en_5.2.0_3.0_1700553059634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_distilbert_persian_farsi_zwnj_base_ner_en_5.2.0_3.0_1700553059634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_distilbert_persian_farsi_zwnj_base_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_distilbert_persian_farsi_zwnj_base_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_distilbert_persian_farsi_zwnj_base_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|282.3 MB| + +## References + +https://huggingface.co/mehdidn/finetuned_distilbert_fa_zwnj_base_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_30_30_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_30_30_en.md new file mode 100644 index 000000000000..11a0b2187c26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_30_30_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_01_30_30 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_01_30_30 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_01_30_30` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_01_30_30_en_5.2.0_3.0_1700567866276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_01_30_30_en_5.2.0_3.0_1700567866276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_01_30_30","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_01_30_30", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_01_30_30| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-01_30_30 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_55_54_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_55_54_en.md new file mode 100644 index 000000000000..d9276d6b8b15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_01_55_54_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_01_55_54 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_01_55_54 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_01_55_54` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_01_55_54_en_5.2.0_3.0_1700559280542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_01_55_54_en_5.2.0_3.0_1700559280542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_01_55_54","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_01_55_54", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_01_55_54| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-01_55_54 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_15_41_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_15_41_en.md new file mode 100644 index 000000000000..34f7ec05fb10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_15_41_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_15_41 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_15_41 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_15_41` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_15_41_en_5.2.0_3.0_1700558378304.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_15_41_en_5.2.0_3.0_1700558378304.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_15_41","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_15_41", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_15_41| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_15_41 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_18_19_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_18_19_en.md new file mode 100644 index 000000000000..7b9e5ae87aa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_18_19_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_18_19 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_18_19 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_18_19` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_18_19_en_5.2.0_3.0_1700559302880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_18_19_en_5.2.0_3.0_1700559302880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_18_19","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_18_19", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_18_19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_18_19 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_20_41_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_20_41_en.md new file mode 100644 index 000000000000..8cf48df92cc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_20_41_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_20_41 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_20_41 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_20_41` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_20_41_en_5.2.0_3.0_1700555369530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_20_41_en_5.2.0_3.0_1700555369530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_20_41","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_20_41", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_20_41| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_20_41 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_23_23_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_23_23_en.md new file mode 100644 index 000000000000..410bcfd7984e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_23_23_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_23_23 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_23_23 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_23_23` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_23_23_en_5.2.0_3.0_1700567093573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_23_23_en_5.2.0_3.0_1700567093573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_23_23","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_23_23", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_23_23| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_23_23 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_25_47_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_25_47_en.md new file mode 100644 index 000000000000..bc333fcf2d0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_25_47_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_25_47 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_25_47 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_25_47` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_25_47_en_5.2.0_3.0_1700578794940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_25_47_en_5.2.0_3.0_1700578794940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_25_47","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_25_47", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_25_47| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_25_47 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_28_10_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_28_10_en.md new file mode 100644 index 000000000000..32afa9224600 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_28_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_28_10 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_28_10 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_28_10` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_28_10_en_5.2.0_3.0_1700559186488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_28_10_en_5.2.0_3.0_1700559186488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_28_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_28_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_28_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_28_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_30_32_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_30_32_en.md new file mode 100644 index 000000000000..f6fe1f883162 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_30_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_30_32 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_30_32 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_30_32` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_30_32_en_5.2.0_3.0_1700559186155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_30_32_en_5.2.0_3.0_1700559186155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_30_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_30_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_30_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_30_32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_32_56_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_32_56_en.md new file mode 100644 index 000000000000..6890f88456fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_32_56_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_32_56 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_32_56 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_32_56` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_32_56_en_5.2.0_3.0_1700547517640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_32_56_en_5.2.0_3.0_1700547517640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_32_56","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_32_56", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_32_56| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_32_56 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_35_19_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_35_19_en.md new file mode 100644 index 000000000000..593105dff059 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_35_19_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_35_19 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_35_19 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_35_19` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_35_19_en_5.2.0_3.0_1700560117980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_35_19_en_5.2.0_3.0_1700560117980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_35_19","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_35_19", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_35_19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_35_19 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_37_42_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_37_42_en.md new file mode 100644 index 000000000000..7983af691313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_16_02_2022_14_37_42_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_16_02_2022_14_37_42 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_16_02_2022_14_37_42 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_16_02_2022_14_37_42` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_37_42_en_5.2.0_3.0_1700583446512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_16_02_2022_14_37_42_en_5.2.0_3.0_1700583446512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_16_02_2022_14_37_42","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_16_02_2022_14_37_42", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_16_02_2022_14_37_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_16_02_2022-14_37_42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_41_15_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_41_15_en.md new file mode 100644 index 000000000000..cb3b98b2de6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_41_15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_41_15 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_41_15 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_41_15` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_41_15_en_5.2.0_3.0_1700573537289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_41_15_en_5.2.0_3.0_1700573537289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_41_15","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_41_15", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_41_15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_41_15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_43_42_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_43_42_en.md new file mode 100644 index 000000000000..f8ecdc370b0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_43_42_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_43_42 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_43_42 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_43_42` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_43_42_en_5.2.0_3.0_1700551327531.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_43_42_en_5.2.0_3.0_1700551327531.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_43_42","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_43_42", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_43_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_43_42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_46_07_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_46_07_en.md new file mode 100644 index 000000000000..29f11077d660 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_46_07_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_46_07 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_46_07 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_46_07` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_46_07_en_5.2.0_3.0_1700540009769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_46_07_en_5.2.0_3.0_1700540009769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_46_07","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_46_07", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_46_07| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_46_07 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_48_32_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_48_32_en.md new file mode 100644 index 000000000000..091a15148992 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_48_32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_48_32 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_48_32 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_48_32` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_48_32_en_5.2.0_3.0_1700564714798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_48_32_en_5.2.0_3.0_1700564714798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_48_32","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_48_32", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_48_32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_48_32 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_53_17_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_53_17_en.md new file mode 100644 index 000000000000..64e4664646b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_53_17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_53_17 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_53_17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_53_17` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_53_17_en_5.2.0_3.0_1700575279900.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_53_17_en_5.2.0_3.0_1700575279900.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_53_17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_53_17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_53_17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_53_17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_56_33_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_56_33_en.md new file mode 100644 index 000000000000..3ac3b50c7cbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_56_33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_56_33 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_56_33 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_56_33` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_56_33_en_5.2.0_3.0_1700556619374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_56_33_en_5.2.0_3.0_1700556619374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_56_33","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_56_33", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_56_33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_56_33 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_59_50_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_59_50_en.md new file mode 100644 index 000000000000..29579a89b1b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_15_59_50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_15_59_50 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_15_59_50 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_15_59_50` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_59_50_en_5.2.0_3.0_1700577869653.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_15_59_50_en_5.2.0_3.0_1700577869653.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_15_59_50","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_15_59_50", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_15_59_50| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-15_59_50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_03_05_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_03_05_en.md new file mode 100644 index 000000000000..b6118934d397 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_03_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_16_03_05 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_16_03_05 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_16_03_05` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_16_03_05_en_5.2.0_3.0_1700557518248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_16_03_05_en_5.2.0_3.0_1700557518248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_16_03_05","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_16_03_05", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_16_03_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-16_03_05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_06_20_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_06_20_en.md new file mode 100644 index 000000000000..dffc8ceee58f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_2e_05_all_16_02_2022_16_06_20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_2e_05_all_16_02_2022_16_06_20 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_2e_05_all_16_02_2022_16_06_20 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_2e_05_all_16_02_2022_16_06_20` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_16_06_20_en_5.2.0_3.0_1700581676877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_2e_05_all_16_02_2022_16_06_20_en_5.2.0_3.0_1700581676877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_2e_05_all_16_02_2022_16_06_20","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_2e_05_all_16_02_2022_16_06_20", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_2e_05_all_16_02_2022_16_06_20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_2e-05_all_16_02_2022-16_06_20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_09_36_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_09_36_en.md new file mode 100644 index 000000000000..241d5e656091 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_09_36_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_09_36 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_09_36 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_09_36` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_09_36_en_5.2.0_3.0_1700574437819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_09_36_en_5.2.0_3.0_1700574437819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_09_36","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_09_36", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_09_36| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_09_36 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_12_51_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_12_51_en.md new file mode 100644 index 000000000000..1e18ffd95a28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_12_51_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_12_51 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_12_51 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_12_51` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_12_51_en_5.2.0_3.0_1700563577743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_12_51_en_5.2.0_3.0_1700563577743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_12_51","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_12_51", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_12_51| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_12_51 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_16_08_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_16_08_en.md new file mode 100644 index 000000000000..7fda4c08ccf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_16_08_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_16_08 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_16_08 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_16_08` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_16_08_en_5.2.0_3.0_1700560215633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_16_08_en_5.2.0_3.0_1700560215633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_16_08","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_16_08", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_16_08| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_16_08 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_19_24_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_19_24_en.md new file mode 100644 index 000000000000..d047155b906c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_19_24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_19_24 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_19_24 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_19_24` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_19_24_en_5.2.0_3.0_1700561949278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_19_24_en_5.2.0_3.0_1700561949278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_19_24","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_19_24", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_19_24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_19_24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_22_39_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_22_39_en.md new file mode 100644 index 000000000000..de356320f5ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_22_39_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_22_39 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_22_39 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_22_39` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_22_39_en_5.2.0_3.0_1700563577699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_22_39_en_5.2.0_3.0_1700563577699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_22_39","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_22_39", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +

+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_22_39| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_22_39 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_25_56_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_25_56_en.md new file mode 100644 index 000000000000..c4c726db9b00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_25_56_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_25_56 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_25_56 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_25_56` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_25_56_en_5.2.0_3.0_1700567688477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_25_56_en_5.2.0_3.0_1700567688477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_25_56","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_25_56", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_25_56| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_25_56 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_29_13_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_29_13_en.md new file mode 100644 index 000000000000..f32858d62160 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_3e_05_all_16_02_2022_16_29_13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_3e_05_all_16_02_2022_16_29_13 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_3e_05_all_16_02_2022_16_29_13 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_3e_05_all_16_02_2022_16_29_13` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_29_13_en_5.2.0_3.0_1700556450824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_3e_05_all_16_02_2022_16_29_13_en_5.2.0_3.0_1700556450824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_3e_05_all_16_02_2022_16_29_13","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_3e_05_all_16_02_2022_16_29_13", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_3e_05_all_16_02_2022_16_29_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_3e-05_all_16_02_2022-16_29_13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_argumentative_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_argumentative_en.md new file mode 100644 index 000000000000..29861de08be4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_argumentative_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_argumentative DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_argumentative +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_argumentative` is an English model originally trained by ali2066. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_argumentative_en_5.2.0_3.0_1700546571148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_argumentative_en_5.2.0_3.0_1700546571148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_argumentative","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_argumentative", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_argumentative| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned-token-argumentative \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27_en.md new file mode 100644 index 000000000000..c835b6898919 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27_en_5.2.0_3.0_1700562740771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27_en_5.2.0_3.0_1700562740771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_all_16_02_2022_20_14_27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_all_16_02_2022-20_14_27 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01_en.md new file mode 100644 index 000000000000..204b84e137ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01_en_5.2.0_3.0_1700571720386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01_en_5.2.0_3.0_1700571720386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_all_16_02_2022_20_30_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_all_16_02_2022-20_30_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27_en.md new file mode 100644 index 000000000000..e706ca9da5df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27_en_5.2.0_3.0_1700565525570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27_en_5.2.0_3.0_1700565525570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_all_16_02_2022_20_45_27| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_all_16_02_2022-20_45_27 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10_en.md new file mode 100644 index 000000000000..acf39f379718 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10_en_5.2.0_3.0_1700582668087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10_en_5.2.0_3.0_1700582668087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_all_16_02_2022_21_13_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_all_16_02_2022-21_13_10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38_en.md new file mode 100644 index 000000000000..6ab9337a1b05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38_en_5.2.0_3.0_1700563654467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38_en_5.2.0_3.0_1700563654467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_editorials_16_02_2022_21_07_38| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_editorials_16_02_2022-21_07_38 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02_en.md new file mode 100644 index 000000000000..9fbf4b0094d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02_en_5.2.0_3.0_1700554017967.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02_en_5.2.0_3.0_1700554017967.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_essays_16_02_2022_21_04_02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_essays_16_02_2022-21_04_02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50_en.md new file mode 100644 index 000000000000..7e553cf6711c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50_en_5.2.0_3.0_1700568882591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50_en_5.2.0_3.0_1700568882591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_0_0002_webdiscourse_16_02_2022_21_00_50| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_0.0002_webDiscourse_16_02_2022-21_00_50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36_en.md new file mode 100644 index 000000000000..9193adee3e30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36_en_5.2.0_3.0_1700554884936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36_en_5.2.0_3.0_1700554884936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_all_16_02_2022_20_09_36| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_all_16_02_2022-20_09_36 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06_en.md new file mode 100644 index 000000000000..d618b0f65f6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06_en_5.2.0_3.0_1700581682974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06_en_5.2.0_3.0_1700581682974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_all_16_02_2022_20_25_06| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_all_16_02_2022-20_25_06 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28_en.md new file mode 100644 index 000000000000..b8314ff630fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28_en_5.2.0_3.0_1700567094956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28_en_5.2.0_3.0_1700567094956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_all_16_02_2022_20_40_28| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_all_16_02_2022-20_40_28 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55_en.md new file mode 100644 index 000000000000..6f4ab931d5a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55_en_5.2.0_3.0_1700553047545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55_en_5.2.0_3.0_1700553047545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_all_16_02_2022_21_08_55| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_all_16_02_2022-21_08_55 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05_en.md new file mode 100644 index 000000000000..b8e96ec21048 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05_en_5.2.0_3.0_1700550517651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05_en_5.2.0_3.0_1700550517651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_editorials_16_02_2022_21_05_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_editorials_16_02_2022-21_05_05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51_en.md new file mode 100644 index 000000000000..c6deb96cb8cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51_en_5.2.0_3.0_1700571721382.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51_en_5.2.0_3.0_1700571721382.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_essays_16_02_2022_21_01_51| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_essays_16_02_2022-21_01_51 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45_en.md new file mode 100644 index 000000000000..02f19e918f8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45_en_5.2.0_3.0_1700572576278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45_en_5.2.0_3.0_1700572576278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_2e_05_webdiscourse_16_02_2022_20_58_45| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_2e-05_webDiscourse_16_02_2022-20_58_45 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04_en.md new file mode 100644 index 000000000000..4bec6ada8be7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04_en_5.2.0_3.0_1700566233738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04_en_5.2.0_3.0_1700566233738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_all_16_02_2022_20_12_04| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_all_16_02_2022-20_12_04 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36_en.md new file mode 100644 index 000000000000..7081b3ca89c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36_en_5.2.0_3.0_1700561871114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36_en_5.2.0_3.0_1700561871114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_all_16_02_2022_20_27_36| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_all_16_02_2022-20_27_36 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00_en.md new file mode 100644 index 000000000000..311c77e64fe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00_en_5.2.0_3.0_1700561966411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00_en_5.2.0_3.0_1700561966411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_all_16_02_2022_20_43_00| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_all_16_02_2022-20_43_00 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08_en.md new file mode 100644 index 000000000000..4f6c59a04f8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08_en_5.2.0_3.0_1700557577237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08_en_5.2.0_3.0_1700557577237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_all_16_02_2022_21_11_08| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_all_16_02_2022-21_11_08 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22_en.md new file mode 100644 index 000000000000..4761fc8f0a3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22_en_5.2.0_3.0_1700557577223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22_en_5.2.0_3.0_1700557577223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_editorials_16_02_2022_21_06_22| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_editorials_16_02_2022-21_06_22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59_en.md b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59_en.md new file mode 100644 index 000000000000..1f5829a1f0a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59 DistilBertForTokenClassification from ali2066 +author: John Snow Labs +name: finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59_en_5.2.0_3.0_1700549562095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59_en_5.2.0_3.0_1700549562095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_token_itr0_3e_05_essays_16_02_2022_21_02_59| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ali2066/finetuned_token_itr0_3e-05_essays_16_02_2022-21_02_59 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-furniture_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-furniture_ner_en.md new file mode 100644 index 000000000000..15225fa0bf00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-furniture_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English furniture_ner DistilBertForTokenClassification from apnd +author: John Snow Labs +name: furniture_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `furniture_ner` is an English model originally trained by apnd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/furniture_ner_en_5.2.0_3.0_1700527817576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/furniture_ner_en_5.2.0_3.0_1700527817576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("furniture_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("furniture_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|furniture_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/apnd/furniture-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-hindi_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-hindi_distilbert_ner_en.md new file mode 100644 index 000000000000..729b2a17d6ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-hindi_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hindi_distilbert_ner DistilBertForTokenClassification from mirfan899 +author: John Snow Labs +name: hindi_distilbert_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hindi_distilbert_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hindi_distilbert_ner_en_5.2.0_3.0_1700550800164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hindi_distilbert_ner_en_5.2.0_3.0_1700550800164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("hindi_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hindi_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hindi_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/mirfan899/hindi-distilbert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-hl7_fhir_model_v1_en.md b/docs/_posts/ahmedlone127/2023-11-21-hl7_fhir_model_v1_en.md new file mode 100644 index 000000000000..2a4c568ab683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-hl7_fhir_model_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hl7_fhir_model_v1 DistilBertForTokenClassification from SandeepKanao +author: John Snow Labs +name: hl7_fhir_model_v1 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hl7_fhir_model_v1` is an English model originally trained by SandeepKanao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hl7_fhir_model_v1_en_5.2.0_3.0_1700567188438.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hl7_fhir_model_v1_en_5.2.0_3.0_1700567188438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("hl7_fhir_model_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hl7_fhir_model_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hl7_fhir_model_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/SandeepKanao/HL7-FHIR-Model-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-hw1_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-21-hw1_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..1ee4c4b45d6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-hw1_distilbert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hw1_distilbert_base_uncased DistilBertForTokenClassification from KudriashovSS +author: John Snow Labs +name: hw1_distilbert_base_uncased +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hw1_distilbert_base_uncased` is an English model originally trained by KudriashovSS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hw1_distilbert_base_uncased_en_5.2.0_3.0_1700579901279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hw1_distilbert_base_uncased_en_5.2.0_3.0_1700579901279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("hw1_distilbert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hw1_distilbert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hw1_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/KudriashovSS/HW1_distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-indic_transformers_telugu_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-21-indic_transformers_telugu_distilbert_en.md new file mode 100644 index 000000000000..f7acea94bcdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-indic_transformers_telugu_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English indic_transformers_telugu_distilbert DistilBertForTokenClassification from durgaamma2005 +author: John Snow Labs +name: indic_transformers_telugu_distilbert +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indic_transformers_telugu_distilbert` is an English model originally trained by durgaamma2005. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indic_transformers_telugu_distilbert_en_5.2.0_3.0_1700541810759.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indic_transformers_telugu_distilbert_en_5.2.0_3.0_1700541810759.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("indic_transformers_telugu_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("indic_transformers_telugu_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indic_transformers_telugu_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|249.1 MB| + +## References + +https://huggingface.co/durgaamma2005/indic-transformers-te-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-initial_dq_model_en.md b/docs/_posts/ahmedlone127/2023-11-21-initial_dq_model_en.md new file mode 100644 index 000000000000..fd4d026e0f18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-initial_dq_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English initial_dq_model DistilBertForTokenClassification from lucafrost +author: John Snow Labs +name: initial_dq_model +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `initial_dq_model` is an English model originally trained by lucafrost. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/initial_dq_model_en_5.2.0_3.0_1700568906348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/initial_dq_model_en_5.2.0_3.0_1700568906348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("initial_dq_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("initial_dq_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|initial_dq_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/lucafrost/initial-dq-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-ma_ner_v6_distil_en.md b/docs/_posts/ahmedlone127/2023-11-21-ma_ner_v6_distil_en.md new file mode 100644 index 000000000000..7eb0c5a68c14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-ma_ner_v6_distil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ma_ner_v6_distil DistilBertForTokenClassification from CouchCat +author: John Snow Labs +name: ma_ner_v6_distil +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ma_ner_v6_distil` is an English model originally trained by CouchCat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ma_ner_v6_distil_en_5.2.0_3.0_1700569802239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ma_ner_v6_distil_en_5.2.0_3.0_1700569802239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ma_ner_v6_distil","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ma_ner_v6_distil", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ma_ner_v6_distil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/CouchCat/ma_ner_v6_distil \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-med_qa_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-med_qa_ner_en.md new file mode 100644 index 000000000000..d0d353a2054a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-med_qa_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English med_qa_ner DistilBertForTokenClassification from GEDISA +author: John Snow Labs +name: med_qa_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `med_qa_ner` is an English model originally trained by GEDISA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/med_qa_ner_en_5.2.0_3.0_1700589475452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/med_qa_ner_en_5.2.0_3.0_1700589475452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("med_qa_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("med_qa_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|med_qa_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/GEDISA/med-qa-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-mongolian_cased_bert_base_named_entity_mn.md b/docs/_posts/ahmedlone127/2023-11-21-mongolian_cased_bert_base_named_entity_mn.md new file mode 100644 index 000000000000..e4584619c1af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-mongolian_cased_bert_base_named_entity_mn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Mongolian mongolian_cased_bert_base_named_entity DistilBertForTokenClassification from 2rtl3 +author: John Snow Labs +name: mongolian_cased_bert_base_named_entity +date: 2023-11-21 +tags: [bert, mn, open_source, token_classification, onnx] +task: Named Entity Recognition +language: mn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_cased_bert_base_named_entity` is a Mongolian model originally trained by 2rtl3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_cased_bert_base_named_entity_mn_5.2.0_3.0_1700529721626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_cased_bert_base_named_entity_mn_5.2.0_3.0_1700529721626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("mongolian_cased_bert_base_named_entity","mn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("mongolian_cased_bert_base_named_entity", "mn") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_cased_bert_base_named_entity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|mn| +|Size:|505.4 MB| + +## References + +https://huggingface.co/2rtl3/mn-cased-bert-base-named-entity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl_xx.md b/docs/_posts/ahmedlone127/2023-11-21-mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl_xx.md new file mode 100644 index 000000000000..87c50c4a31d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl DistilBertForTokenClassification from Blgn94 +author: John Snow Labs +name: mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl` is a Multilingual model originally trained by Blgn94. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl_xx_5.2.0_3.0_1700543750215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl_xx_5.2.0_3.0_1700543750215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
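+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 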
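Spark NLP annotators are wired together by column name: every column a stage reads must already exist in the input DataFrame or be written by an earlier stage. As a rough sanity check before running a pipeline like the one above, the wiring can be modeled with plain Python data. This is only a sketch; the `find_unwired_inputs` helper and the tuple layout are hypothetical and not part of Spark NLP.

```python
def find_unwired_inputs(stages, initial_columns):
    """Return (stage_name, missing_column) pairs for inputs that no earlier stage produces.

    `stages` is an ordered list of (name, input_columns, output_column) tuples.
    This helper is illustrative only; it is not a Spark NLP API.
    """
    available = set(initial_columns)
    problems = []
    for name, inputs, output in stages:
        for column in inputs:
            if column not in available:
                problems.append((name, column))
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


# Column names follow the card's labels: text -> documents -> token -> ner.
wired = [
    ("documentAssembler", ["text"], "documents"),
    ("tokenizer", ["documents"], "token"),
    ("tokenClassifier", ["documents", "token"], "ner"),
]

# Same stages without a tokenizer: nothing writes the "token" column.
unwired = [
    ("documentAssembler", ["text"], "documents"),
    ("tokenClassifier", ["documents", "token"], "ner"),
]

print(find_unwired_inputs(wired, ["text"]))
print(find_unwired_inputs(unwired, ["text"]))
```

The first call reports no problems, while the second flags the classifier's `token` input. The same check would catch an assembler that writes to a column name the classifier never reads.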
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_davlan_distilbert_base_multilingual_cased_ner_hrl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Blgn94/mongolian-Davlan-distilbert-base-multilingual-cased-ner-hrl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_ner_xx.md new file mode 100644 index 000000000000..112b5651a2f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual mongolian_distilbert_base_multilingual_cased_ner DistilBertForTokenClassification from srglnjmb +author: John Snow Labs +name: mongolian_distilbert_base_multilingual_cased_ner +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_distilbert_base_multilingual_cased_ner` is a multilingual model originally trained by srglnjmb. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_ner_xx_5.2.0_3.0_1700587681567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_ner_xx_5.2.0_3.0_1700587681567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
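+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("mongolian_distilbert_base_multilingual_cased_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("mongolian_distilbert_base_multilingual_cased_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 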
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_distilbert_base_multilingual_cased_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/srglnjmb/mongolian-distilbert-base-multilingual-cased-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_xx.md new file mode 100644 index 000000000000..6a90d4be459c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-mongolian_distilbert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual mongolian_distilbert_base_multilingual_cased DistilBertForTokenClassification from Dakie +author: John Snow Labs +name: mongolian_distilbert_base_multilingual_cased +date: 2023-11-21 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_distilbert_base_multilingual_cased` is a multilingual model originally trained by Dakie. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700573072378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700573072378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
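+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("mongolian_distilbert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("mongolian_distilbert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 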
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_distilbert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Dakie/mongolian-distilbert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-moviehunt3_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-moviehunt3_ner_en.md new file mode 100644 index 000000000000..316b9f32f8b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-moviehunt3_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English moviehunt3_ner DistilBertForTokenClassification from AbidHasan95 +author: John Snow Labs +name: moviehunt3_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviehunt3_ner` is an English model originally trained by AbidHasan95. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviehunt3_ner_en_5.2.0_3.0_1700564853561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviehunt3_ner_en_5.2.0_3.0_1700564853561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1"> 
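+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("moviehunt3_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("moviehunt3_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 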
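The archive name in this card's Download link appears to pack the card's metadata into one string: model name, language, Spark NLP version, Spark version, and a millisecond timestamp. Assuming that convention holds (it is inferred from the cards in this changeset, not from a documented specification), the fields can be recovered with a short helper. `parse_archive_name` is a hypothetical name used only for illustration.

```python
from datetime import datetime, timezone


def parse_archive_name(filename):
    """Split '<name>_<lang>_<sparknlp>_<spark>_<millis>.zip' into its fields.

    The layout is an assumption inferred from the download links on these cards.
    """
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    # The model name may itself contain underscores, so split from the right.
    name, lang, sparknlp_version, spark_version, millis = stem.rsplit("_", 4)
    uploaded = datetime.fromtimestamp(int(millis) // 1000, tz=timezone.utc)
    return {
        "name": name,
        "language": lang,
        "sparknlp_version": sparknlp_version,
        "spark_version": spark_version,
        "uploaded_utc": uploaded,
    }


info = parse_archive_name("moviehunt3_ner_en_5.2.0_3.0_1700564853561.zip")
print(info["name"], info["language"], info["sparknlp_version"], info["spark_version"])
print(info["uploaded_utc"].date())
```

Splitting from the right is the key design choice here: only the last four fields have a fixed shape, so the model name is whatever remains on the left.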
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviehunt3_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AbidHasan95/movieHunt3-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-moviehunt4_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-moviehunt4_ner_en.md new file mode 100644 index 000000000000..d86d1050fe2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-moviehunt4_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English moviehunt4_ner DistilBertForTokenClassification from AbidHasan95 +author: John Snow Labs +name: moviehunt4_ner +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moviehunt4_ner` is an English model originally trained by AbidHasan95. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moviehunt4_ner_en_5.2.0_3.0_1700530787898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moviehunt4_ner_en_5.2.0_3.0_1700530787898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1"> 
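+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("moviehunt4_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("moviehunt4_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 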
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moviehunt4_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AbidHasan95/movieHunt4-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_ontonotesv5_englishv4_en.md b/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_ontonotesv5_englishv4_en.md new file mode 100644 index 000000000000..5587d1a49fa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_ontonotesv5_englishv4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_distilbert_base_uncased_ontonotesv5_englishv4 DistilBertForTokenClassification from djagatiya +author: John Snow Labs +name: ner_distilbert_base_uncased_ontonotesv5_englishv4 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_distilbert_base_uncased_ontonotesv5_englishv4` is an English model originally trained by djagatiya. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_distilbert_base_uncased_ontonotesv5_englishv4_en_5.2.0_3.0_1700530377026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_distilbert_base_uncased_ontonotesv5_englishv4_en_5.2.0_3.0_1700530377026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
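+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_distilbert_base_uncased_ontonotesv5_englishv4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_distilbert_base_uncased_ontonotesv5_englishv4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 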
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_distilbert_base_uncased_ontonotesv5_englishv4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/djagatiya/ner-distilbert-base-uncased-ontonotesv5-englishv4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_wnut_17_en.md b/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_wnut_17_en.md new file mode 100644 index 000000000000..84ba875f3ca8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-ner_distilbert_base_uncased_wnut_17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_distilbert_base_uncased_wnut_17 DistilBertForTokenClassification from Waleed-bin-Qamar +author: John Snow Labs +name: ner_distilbert_base_uncased_wnut_17 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_distilbert_base_uncased_wnut_17` is an English model originally trained by Waleed-bin-Qamar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_distilbert_base_uncased_wnut_17_en_5.2.0_3.0_1700556616573.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_distilbert_base_uncased_wnut_17_en_5.2.0_3.0_1700556616573.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
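+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_distilbert_base_uncased_wnut_17","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_distilbert_base_uncased_wnut_17", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 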
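The `ner` output column holds one label per token. Many token-classification models emit IOB tags (`B-…`, `I-…`, `O`); whether this particular model does is an assumption here, so check its label set first. Under that assumption, adjacent tokens can be merged into entity spans with a small helper. `group_iob_entities` and the sample tokens and tags below are made up for illustration and are not model output.

```python
def group_iob_entities(tokens, tags):
    """Merge token-level IOB tags into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
            continue
        # Anything else closes the open entity, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
        if prefix in ("B", "I"):
            # A stray I- tag without a matching B- is treated as a new start.
            current_tokens, current_type = [token], label
        else:
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical tokens and tags, only to exercise the helper.
sample_tokens = ["Acme", "Corp", "opened", "in", "New", "York"]
sample_tags = ["B-corporation", "I-corporation", "O", "O", "B-location", "I-location"]
print(group_iob_entities(sample_tokens, sample_tags))
```

In a Spark pipeline the two lists would come from the `result` fields of the `token` and `ner` columns for a single row.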
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_distilbert_base_uncased_wnut_17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Waleed-bin-Qamar/NER-distilbert-base-uncased-wnut_17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model_en.md b/docs/_posts/ahmedlone127/2023-11-21-nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model_en.md new file mode 100644 index 000000000000..21017f67aa5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model DistilBertForTokenClassification from GuCuChiara +author: John Snow Labs +name: nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model` is an English model originally trained by GuCuChiara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model_en_5.2.0_3.0_1700526010632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model_en_5.2.0_3.0_1700526010632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
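+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 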
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_hiba2_distemist_fine_tuned_distilbert_pretrained_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/GuCuChiara/NLP-HIBA2_DisTEMIST_fine_tuned_DistilBERT-pretrained-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-old_church_slavonic_sayula_popoluca_en.md b/docs/_posts/ahmedlone127/2023-11-21-old_church_slavonic_sayula_popoluca_en.md new file mode 100644 index 000000000000..8ccd6b30dd1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-old_church_slavonic_sayula_popoluca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English old_church_slavonic_sayula_popoluca DistilBertForTokenClassification from annadmitrieva +author: John Snow Labs +name: old_church_slavonic_sayula_popoluca +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `old_church_slavonic_sayula_popoluca` is an English model originally trained by annadmitrieva. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/old_church_slavonic_sayula_popoluca_en_5.2.0_3.0_1700537620236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/old_church_slavonic_sayula_popoluca_en_5.2.0_3.0_1700537620236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
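+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("old_church_slavonic_sayula_popoluca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("old_church_slavonic_sayula_popoluca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 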
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|old_church_slavonic_sayula_popoluca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/annadmitrieva/old-church-slavonic-pos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-products_ner2_en.md b/docs/_posts/ahmedlone127/2023-11-21-products_ner2_en.md new file mode 100644 index 000000000000..11422fc2add2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-products_ner2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English products_ner2 DistilBertForTokenClassification from Atheer174 +author: John Snow Labs +name: products_ner2 +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `products_ner2` is an English model originally trained by Atheer174. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/products_ner2_en_5.2.0_3.0_1700569802668.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/products_ner2_en_5.2.0_3.0_1700569802668.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1"> 
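+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("products_ner2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("products_ner2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 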
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|products_ner2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Atheer174/Products_NER2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-rg_20k_fake_signatures_en.md b/docs/_posts/ahmedlone127/2023-11-21-rg_20k_fake_signatures_en.md new file mode 100644 index 000000000000..0415a176e55b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-rg_20k_fake_signatures_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_20k_fake_signatures DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_20k_fake_signatures +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_20k_fake_signatures` is an English model originally trained by chilliadgl. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_20k_fake_signatures_en_5.2.0_3.0_1700584529402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_20k_fake_signatures_en_5.2.0_3.0_1700584529402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1"> 
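+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text". +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so the documents must be tokenized first. +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("rg_20k_fake_signatures","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with a string column named "text". +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so the documents must be tokenized first. +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("rg_20k_fake_signatures", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +</div> 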
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rg_20k_fake_signatures| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chilliadgl/RG_20k_fake_signatures \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-21-rg_fake_signatures_en.md b/docs/_posts/ahmedlone127/2023-11-21-rg_fake_signatures_en.md new file mode 100644 index 000000000000..97005e45c57d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-21-rg_fake_signatures_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_fake_signatures DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_fake_signatures +date: 2023-11-21 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_fake_signatures` is an English model originally trained by chilliadgl. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_fake_signatures_en_5.2.0_3.0_1700562855971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_fake_signatures_en_5.2.0_3.0_1700562855971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1"> 
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("rg_fake_signatures","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("rg_fake_signatures", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|rg_fake_signatures|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/chilliadgl/RG_fake_signatures
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-tab_anonymizer_en.md b/docs/_posts/ahmedlone127/2023-11-21-tab_anonymizer_en.md
new file mode 100644
index 000000000000..f36d021223ad
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-tab_anonymizer_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tab_anonymizer DistilBertForTokenClassification from madaanpulkit
+author: John Snow Labs
+name: tab_anonymizer
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tab_anonymizer` is an English model originally trained by madaanpulkit.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tab_anonymizer_en_5.2.0_3.0_1700555765348.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tab_anonymizer_en_5.2.0_3.0_1700555765348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("tab_anonymizer","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("tab_anonymizer", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tab_anonymizer|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/madaanpulkit/tab-anonymizer
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-tabert_4k_naamapadam_en.md b/docs/_posts/ahmedlone127/2023-11-21-tabert_4k_naamapadam_en.md
new file mode 100644
index 000000000000..a34c47dfbfbf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-tabert_4k_naamapadam_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tabert_4k_naamapadam DistilBertForTokenClassification from AnanthZeke
+author: John Snow Labs
+name: tabert_4k_naamapadam
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tabert_4k_naamapadam` is an English model originally trained by AnanthZeke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tabert_4k_naamapadam_en_5.2.0_3.0_1700568733212.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tabert_4k_naamapadam_en_5.2.0_3.0_1700568733212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("tabert_4k_naamapadam","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("tabert_4k_naamapadam", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tabert_4k_naamapadam|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|172.5 MB|
+
+## References
+
+https://huggingface.co/AnanthZeke/tabert-4k-naamapadam
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-tabert_500_naamapadam_en.md b/docs/_posts/ahmedlone127/2023-11-21-tabert_500_naamapadam_en.md
new file mode 100644
index 000000000000..cf20683ad026
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-tabert_500_naamapadam_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tabert_500_naamapadam DistilBertForTokenClassification from AnanthZeke
+author: John Snow Labs
+name: tabert_500_naamapadam
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tabert_500_naamapadam` is an English model originally trained by AnanthZeke.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tabert_500_naamapadam_en_5.2.0_3.0_1700580816218.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tabert_500_naamapadam_en_5.2.0_3.0_1700580816218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("tabert_500_naamapadam","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("tabert_500_naamapadam", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tabert_500_naamapadam|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|162.4 MB|
+
+## References
+
+https://huggingface.co/AnanthZeke/tabert-500-naamapadam
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-tokenclassificationtest_en.md b/docs/_posts/ahmedlone127/2023-11-21-tokenclassificationtest_en.md
new file mode 100644
index 000000000000..09fce60a9a92
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-tokenclassificationtest_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tokenclassificationtest DistilBertForTokenClassification from adzcodez
+author: John Snow Labs
+name: tokenclassificationtest
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tokenclassificationtest` is an English model originally trained by adzcodez.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tokenclassificationtest_en_5.2.0_3.0_1700537694247.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tokenclassificationtest_en_5.2.0_3.0_1700537694247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("tokenclassificationtest","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("tokenclassificationtest", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tokenclassificationtest|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/adzcodez/TokenClassificationTest
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-tori_namesplitter_en.md b/docs/_posts/ahmedlone127/2023-11-21-tori_namesplitter_en.md
new file mode 100644
index 000000000000..59c3302380c7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-tori_namesplitter_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English tori_namesplitter DistilBertForTokenClassification from ittailup
+author: John Snow Labs
+name: tori_namesplitter
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tori_namesplitter` is an English model originally trained by ittailup.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tori_namesplitter_en_5.2.0_3.0_1700532912550.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tori_namesplitter_en_5.2.0_3.0_1700532912550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("tori_namesplitter","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("tori_namesplitter", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tori_namesplitter|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ittailup/tori-namesplitter
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-twitter_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-twitter_ner_en.md
new file mode 100644
index 000000000000..e368160b5253
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-twitter_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English twitter_ner DistilBertForTokenClassification from dayvidwang
+author: John Snow Labs
+name: twitter_ner
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_ner` is an English model originally trained by dayvidwang.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_ner_en_5.2.0_3.0_1700576383014.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_ner_en_5.2.0_3.0_1700576383014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("twitter_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("twitter_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/dayvidwang/twitter_ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35_en.md b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35_en.md
new file mode 100644
index 000000000000..555c6c089ccd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35 DistilBertForTokenClassification from ali2066
+author: John Snow Labs
+name: twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35` is an English model originally trained by ali2066.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35_en_5.2.0_3.0_1700583446750.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35_en_5.2.0_3.0_1700583446750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_token_itr0_1e_05_all_01_03_2022_14_37_35|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ali2066/twitter_RoBERTa_token_itr0_1e-05_all_01_03_2022-14_37_35
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21_en.md b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21_en.md
new file mode 100644
index 000000000000..592721986bdd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21 DistilBertForTokenClassification from ali2066
+author: John Snow Labs
+name: twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21` is an English model originally trained by ali2066.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21_en_5.2.0_3.0_1700562855876.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21_en_5.2.0_3.0_1700562855876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_token_itr0_1e_05_editorials_01_03_2022_14_43_21|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ali2066/twitter_RoBERTa_token_itr0_1e-05_editorials_01_03_2022-14_43_21
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24_en.md b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24_en.md
new file mode 100644
index 000000000000..dfe13cf7fbfc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24 DistilBertForTokenClassification from ali2066
+author: John Snow Labs
+name: twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24` is an English model originally trained by ali2066.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24_en_5.2.0_3.0_1700584443413.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24_en_5.2.0_3.0_1700584443413.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_token_itr0_1e_05_essays_01_03_2022_14_40_24|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ali2066/twitter_RoBERTa_token_itr0_1e-05_essays_01_03_2022-14_40_24
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20_en.md b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20_en.md
new file mode 100644
index 000000000000..d34e82983d29
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20 DistilBertForTokenClassification from ali2066
+author: John Snow Labs
+name: twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20` is an English model originally trained by ali2066.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20_en_5.2.0_3.0_1700552215202.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20_en_5.2.0_3.0_1700552215202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_token_itr0_1e_05_webdiscourse_01_03_2022_14_45_20|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ali2066/twitter_RoBERTa_token_itr0_1e-05_webDiscourse_01_03_2022-14_45_20
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-urdu_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-urdu_distilbert_ner_en.md
new file mode 100644
index 000000000000..424b00cb477b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-urdu_distilbert_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English urdu_distilbert_ner DistilBertForTokenClassification from mirfan899
+author: John Snow Labs
+name: urdu_distilbert_ner
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urdu_distilbert_ner` is an English model originally trained by mirfan899.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urdu_distilbert_ner_en_5.2.0_3.0_1700532003420.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urdu_distilbert_ner_en_5.2.0_3.0_1700532003420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
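+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("urdu_distilbert_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("urdu_distilbert_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+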
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|urdu_distilbert_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|505.4 MB|
+
+## References
+
+https://huggingface.co/mirfan899/urdu-distilbert-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-week5_distilbert_base_multilingual_cased_finetuned_eng_xx.md b/docs/_posts/ahmedlone127/2023-11-21-week5_distilbert_base_multilingual_cased_finetuned_eng_xx.md
new file mode 100644
index 000000000000..95c19ce0da3a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-week5_distilbert_base_multilingual_cased_finetuned_eng_xx.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Multilingual week5_distilbert_base_multilingual_cased_finetuned_eng DistilBertForTokenClassification from ensw
+author: John Snow Labs
+name: week5_distilbert_base_multilingual_cased_finetuned_eng
+date: 2023-11-21
+tags: [bert, xx, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `week5_distilbert_base_multilingual_cased_finetuned_eng` is a Multilingual model originally trained by ensw.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/week5_distilbert_base_multilingual_cased_finetuned_eng_xx_5.2.0_3.0_1700585959398.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/week5_distilbert_base_multilingual_cased_finetuned_eng_xx_5.2.0_3.0_1700585959398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
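+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("week5_distilbert_base_multilingual_cased_finetuned_eng","xx") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("week5_distilbert_base_multilingual_cased_finetuned_eng", "xx")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+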
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|week5_distilbert_base_multilingual_cased_finetuned_eng|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|xx|
+|Size:|505.4 MB|
+
+## References
+
+https://huggingface.co/ensw/week5-distilbert-base-multilingual-cased-finetuned-eng
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-whispquote_chunkeddq_en.md b/docs/_posts/ahmedlone127/2023-11-21-whispquote_chunkeddq_en.md
new file mode 100644
index 000000000000..6f1e54c0f5dd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-whispquote_chunkeddq_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English whispquote_chunkeddq DistilBertForTokenClassification from lucafrost
+author: John Snow Labs
+name: whispquote_chunkeddq
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `whispquote_chunkeddq` is an English model originally trained by lucafrost.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/whispquote_chunkeddq_en_5.2.0_3.0_1700588823416.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/whispquote_chunkeddq_en_5.2.0_3.0_1700588823416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
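+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("whispquote_chunkeddq","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("whispquote_chunkeddq", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+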
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|whispquote_chunkeddq|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/lucafrost/whispQuote-ChunkedDQ
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-21-wiki_hungarian_ner_en.md b/docs/_posts/ahmedlone127/2023-11-21-wiki_hungarian_ner_en.md
new file mode 100644
index 000000000000..c296a308bc21
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-21-wiki_hungarian_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English wiki_hungarian_ner DistilBertForTokenClassification from terhdavid
+author: John Snow Labs
+name: wiki_hungarian_ner
+date: 2023-11-21
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wiki_hungarian_ner` is an English model originally trained by terhdavid.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wiki_hungarian_ner_en_5.2.0_3.0_1700551217273.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wiki_hungarian_ner_en_5.2.0_3.0_1700551217273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
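+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("wiki_hungarian_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("wiki_hungarian_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+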
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wiki_hungarian_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/terhdavid/wiki_hu_ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-230615_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-230615_wnut_model_en.md
new file mode 100644
index 000000000000..f9087dd5e274
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-230615_wnut_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English 230615_wnut_model DistilBertForTokenClassification from Sogangina
+author: John Snow Labs
+name: 230615_wnut_model
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `230615_wnut_model` is an English model originally trained by Sogangina.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/230615_wnut_model_en_5.2.0_3.0_1700638908062.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/230615_wnut_model_en_5.2.0_3.0_1700638908062.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
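+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("230615_wnut_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("230615_wnut_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+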
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|230615_wnut_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Sogangina/230615_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-affilgood_ner_test_en.md b/docs/_posts/ahmedlone127/2023-11-22-affilgood_ner_test_en.md
new file mode 100644
index 000000000000..b8c7a4b14a72
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-affilgood_ner_test_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English affilgood_ner_test DistilBertForTokenClassification from nicolauduran45
+author: John Snow Labs
+name: affilgood_ner_test
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `affilgood_ner_test` is an English model originally trained by nicolauduran45.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/affilgood_ner_test_en_5.2.0_3.0_1700671301510.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/affilgood_ner_test_en_5.2.0_3.0_1700671301510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
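+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("affilgood_ner_test","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("affilgood_ner_test", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+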
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|affilgood_ner_test|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|505.4 MB|
+
+## References
+
+https://huggingface.co/nicolauduran45/affilgood-ner-test
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-autotrain_ner2_50086120238_en.md b/docs/_posts/ahmedlone127/2023-11-22-autotrain_ner2_50086120238_en.md
new file mode 100644
index 000000000000..6370e8cb2fec
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-autotrain_ner2_50086120238_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_ner2_50086120238 DistilBertForTokenClassification from onevholy
+author: John Snow Labs
+name: autotrain_ner2_50086120238
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_ner2_50086120238` is an English model originally trained by onevholy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_ner2_50086120238_en_5.2.0_3.0_1700630396002.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_ner2_50086120238_en_5.2.0_3.0_1700630396002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
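+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("autotrain_ner2_50086120238","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("autotrain_ner2_50086120238", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+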
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_ner2_50086120238|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/onevholy/autotrain-ner2-50086120238
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-autotrain_test_ner_75401139975_en.md b/docs/_posts/ahmedlone127/2023-11-22-autotrain_test_ner_75401139975_en.md
new file mode 100644
index 000000000000..f6c76863e849
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-autotrain_test_ner_75401139975_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_test_ner_75401139975 DistilBertForTokenClassification from sophy
+author: John Snow Labs
+name: autotrain_test_ner_75401139975
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_test_ner_75401139975` is an English model originally trained by sophy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_test_ner_75401139975_en_5.2.0_3.0_1700650578059.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_test_ner_75401139975_en_5.2.0_3.0_1700650578059.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
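+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("autotrain_test_ner_75401139975","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("autotrain_test_ner_75401139975", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+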
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_test_ner_75401139975|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/sophy/autotrain-test-ner-75401139975
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2_en.md b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2_en.md
new file mode 100644
index 000000000000..21ff455e506e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2 DistilBertForTokenClassification from Jsevisal
+author: John Snow Labs
+name: balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2` is an English model originally trained by Jsevisal.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2_en_5.2.0_3.0_1700622115914.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2_en_5.2.0_3.0_1700622115914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
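+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+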
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.9 MB|
+
+## References
+
+https://huggingface.co/Jsevisal/balanced-augmented-distilbert-base-gest-pred-seqeval-partialmatch-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_en.md b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_en.md
new file mode 100644
index 000000000000..3e0be8012958
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch DistilBertForTokenClassification from Jsevisal
+author: John Snow Labs
+name: balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch` is an English model originally trained by Jsevisal.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700623391602.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700623391602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
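+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+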
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|balanced_augmented_distilbert_base_gest_pred_seqeval_partialmatch|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.9 MB|
+
+## References
+
+https://huggingface.co/Jsevisal/balanced-augmented-distilbert-base-gest-pred-seqeval-partialmatch
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2_en.md b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2_en.md
new file mode 100644
index 000000000000..7787ddb973d0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2 DistilBertForTokenClassification from Jsevisal
+author: John Snow Labs
+name: balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2` is an English model originally trained by Jsevisal.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2_en_5.2.0_3.0_1700620117873.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2_en_5.2.0_3.0_1700620117873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
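+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is a Spark DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier consumes a "token" column, so a Tokenizer stage has to produce it
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+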
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.9 MB|
+
+## References
+
+https://huggingface.co/Jsevisal/balanced-augmented-ft-distilbert-gest-pred-seqeval-partialmatch-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_en.md b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_en.md
new file mode 100644
index 000000000000..2c64ec2ca26b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch DistilBertForTokenClassification from Jsevisal
+author: John Snow Labs
+name: balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch` is an English model originally trained by Jsevisal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700620404065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700620404065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|balanced_augmented_ft_distilbert_gest_pred_seqeval_partialmatch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/Jsevisal/balanced-augmented-ft-distilbert-gest-pred-seqeval-partialmatch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-basic_wnut_en.md b/docs/_posts/ahmedlone127/2023-11-22-basic_wnut_en.md new file mode 100644 index 000000000000..948a8c2e9194 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-basic_wnut_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English basic_wnut DistilBertForTokenClassification from eren23 +author: John Snow Labs +name: basic_wnut +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`basic_wnut` is a English model originally trained by eren23. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/basic_wnut_en_5.2.0_3.0_1700681182283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/basic_wnut_en_5.2.0_3.0_1700681182283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("basic_wnut","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("basic_wnut", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|basic_wnut| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/eren23/basic_wnut \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_dayvidwang_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_dayvidwang_en.md new file mode 100644 index 000000000000..e954ec38b7d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_dayvidwang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_dayvidwang DistilBertForTokenClassification from dayvidwang +author: John Snow Labs +name: bert_finetuned_ner_dayvidwang +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_dayvidwang` is a English model originally trained by dayvidwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_dayvidwang_en_5.2.0_3.0_1700623394472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_dayvidwang_en_5.2.0_3.0_1700623394472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_dayvidwang","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_ner_dayvidwang", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_dayvidwang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/dayvidwang/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_ibrahimcse_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_ibrahimcse_en.md new file mode 100644 index 000000000000..7e1ad4ed24c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_ibrahimcse_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_ibrahimcse DistilBertForTokenClassification from ibrahimcse +author: John Snow Labs +name: bert_finetuned_ner_ibrahimcse +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_ibrahimcse` is a English model originally trained by ibrahimcse. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ibrahimcse_en_5.2.0_3.0_1700621277697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_ibrahimcse_en_5.2.0_3.0_1700621277697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_ibrahimcse","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_ner_ibrahimcse", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_ibrahimcse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ibrahimcse/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_jhdavis_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_jhdavis_en.md new file mode 100644 index 000000000000..d1d3b7509d26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_jhdavis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_jhdavis DistilBertForTokenClassification from jhdavis +author: John Snow Labs +name: bert_finetuned_ner_jhdavis +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_jhdavis` is a English model originally trained by jhdavis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jhdavis_en_5.2.0_3.0_1700628649196.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_jhdavis_en_5.2.0_3.0_1700628649196.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_jhdavis","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_ner_jhdavis", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_jhdavis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/jhdavis/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_serah3_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_serah3_en.md new file mode 100644 index 000000000000..585f8916f224 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_ner_serah3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_ner_serah3 DistilBertForTokenClassification from Serah3 +author: John Snow Labs +name: bert_finetuned_ner_serah3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_ner_serah3` is a English model originally trained by Serah3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_serah3_en_5.2.0_3.0_1700621128546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_ner_serah3_en_5.2.0_3.0_1700621128546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_ner_serah3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_ner_serah3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_ner_serah3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Serah3/bert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_radarr_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_radarr_en.md new file mode 100644 index 000000000000..60d540e844ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_finetuned_radarr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_radarr DistilBertForTokenClassification from Servarr +author: John Snow Labs +name: bert_finetuned_radarr +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_finetuned_radarr` is a English model originally trained by Servarr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_radarr_en_5.2.0_3.0_1700620409839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_radarr_en_5.2.0_3.0_1700620409839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_finetuned_radarr","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_finetuned_radarr", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_radarr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Servarr/bert-finetuned-radarr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_multilingual_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-22-bert_multilingual_ner_xx.md new file mode 100644 index 000000000000..e71e0bd64f32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_multilingual_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual bert_multilingual_ner DistilBertForTokenClassification from Giorgib +author: John Snow Labs +name: bert_multilingual_ner +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_multilingual_ner` is a Multilingual model originally trained by Giorgib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_multilingual_ner_xx_5.2.0_3.0_1700674081046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_multilingual_ner_xx_5.2.0_3.0_1700674081046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_multilingual_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_multilingual_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Giorgib/bert_multilingual_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-bert_warmup_en.md b/docs/_posts/ahmedlone127/2023-11-22-bert_warmup_en.md new file mode 100644 index 000000000000..6f85f2ac387a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-bert_warmup_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_warmup DistilBertForTokenClassification from mohsenfayyaz +author: John Snow Labs +name: bert_warmup +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`bert_warmup` is a English model originally trained by mohsenfayyaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_warmup_en_5.2.0_3.0_1700622639444.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_warmup_en_5.2.0_3.0_1700622639444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("bert_warmup","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("bert_warmup", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_warmup| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mohsenfayyaz/BERT_Warmup \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomed_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomed_ner_en.md new file mode 100644 index 000000000000..3bb8c4cd602f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-biomed_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomed_ner DistilBertForTokenClassification from Pkompally +author: John Snow Labs +name: biomed_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`biomed_ner` is a English model originally trained by Pkompally. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomed_ner_en_5.2.0_3.0_1700642877408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomed_ner_en_5.2.0_3.0_1700642877408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomed_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomed_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomed_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/Pkompally/biomed-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_2_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_2_en.md new file mode 100644 index 000000000000..1ea7719b46d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_all_anonimization_try_2 DistilBertForTokenClassification from Juan281992 +author: John Snow Labs +name: biomedical_ner_all_anonimization_try_2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`biomedical_ner_all_anonimization_try_2` is a English model originally trained by Juan281992. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_2_en_5.2.0_3.0_1700664326651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_2_en_5.2.0_3.0_1700664326651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_all_anonimization_try_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|biomedical_ner_all_anonimization_try_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_5_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_5_en.md new file mode 100644 index 000000000000..3172c0815329 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English biomedical_ner_all_anonimization_try_5 DistilBertForTokenClassification from Juan281992 +author: John Snow Labs +name: biomedical_ner_all_anonimization_try_5 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`biomedical_ner_all_anonimization_try_5` is a English model originally trained by Juan281992. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_5_en_5.2.0_3.0_1700627640399.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_5_en_5.2.0_3.0_1700627640399.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_5","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("biomedical_ner_all_anonimization_try_5", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biomedical_ner_all_anonimization_try_5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_6_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_6_en.md
new file mode 100644
index 000000000000..a9d20f834733
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_6_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English biomedical_ner_all_anonimization_try_6 DistilBertForTokenClassification from Juan281992
+author: John Snow Labs
+name: biomedical_ner_all_anonimization_try_6
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_anonimization_try_6` is an English model originally trained by Juan281992.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_6_en_5.2.0_3.0_1700634026957.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_6_en_5.2.0_3.0_1700634026957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_6","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("biomedical_ner_all_anonimization_try_6", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biomedical_ner_all_anonimization_try_6|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_6
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_7_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_7_en.md
new file mode 100644
index 000000000000..53c885c9bbe7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_7_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English biomedical_ner_all_anonimization_try_7 DistilBertForTokenClassification from Juan281992
+author: John Snow Labs
+name: biomedical_ner_all_anonimization_try_7
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_anonimization_try_7` is an English model originally trained by Juan281992.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_7_en_5.2.0_3.0_1700648307682.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_7_en_5.2.0_3.0_1700648307682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_7","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("biomedical_ner_all_anonimization_try_7", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biomedical_ner_all_anonimization_try_7|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_7
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_anonimization_try_9_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_anonimization_try_9_en.md
new file mode 100644
index 000000000000..90e574ae9adf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_anonimization_try_9_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English biomedical_ner_all_anonimization_try_8_anonimization_try_9 DistilBertForTokenClassification from Juan281992
+author: John Snow Labs
+name: biomedical_ner_all_anonimization_try_8_anonimization_try_9
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_anonimization_try_8_anonimization_try_9` is an English model originally trained by Juan281992.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_8_anonimization_try_9_en_5.2.0_3.0_1700642091267.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_8_anonimization_try_9_en_5.2.0_3.0_1700642091267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_8_anonimization_try_9","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("biomedical_ner_all_anonimization_try_8_anonimization_try_9", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biomedical_ner_all_anonimization_try_8_anonimization_try_9|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_8-anonimization_TRY_9
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_en.md b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_en.md
new file mode 100644
index 000000000000..d4f22a08d2c5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-biomedical_ner_all_anonimization_try_8_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English biomedical_ner_all_anonimization_try_8 DistilBertForTokenClassification from Juan281992
+author: John Snow Labs
+name: biomedical_ner_all_anonimization_try_8
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `biomedical_ner_all_anonimization_try_8` is an English model originally trained by Juan281992.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_8_en_5.2.0_3.0_1700668542564.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/biomedical_ner_all_anonimization_try_8_en_5.2.0_3.0_1700668542564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("biomedical_ner_all_anonimization_try_8","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("biomedical_ner_all_anonimization_try_8", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|biomedical_ner_all_anonimization_try_8|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Juan281992/biomedical-ner-all-anonimization_TRY_8
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_address_tokenizer_model_v7_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_address_tokenizer_model_v7_en.md
new file mode 100644
index 000000000000..2ad6e373ca64
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_address_tokenizer_model_v7_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_address_tokenizer_model_v7 DistilBertForTokenClassification from bhattronak
+author: John Snow Labs
+name: burmese_awesome_address_tokenizer_model_v7
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_address_tokenizer_model_v7` is an English model originally trained by bhattronak.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v7_en_5.2.0_3.0_1700648421765.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_address_tokenizer_model_v7_en_5.2.0_3.0_1700648421765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_address_tokenizer_model_v7","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_address_tokenizer_model_v7", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_address_tokenizer_model_v7|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/bhattronak/my-awesome-address-tokenizer-model-v7
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_indonesian_nergrit_corpus_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_indonesian_nergrit_corpus_model_en.md
new file mode 100644
index 000000000000..c44069884b5b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_indonesian_nergrit_corpus_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_indonesian_nergrit_corpus_model DistilBertForTokenClassification from nelsenputra
+author: John Snow Labs
+name: burmese_awesome_indonesian_nergrit_corpus_model
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_indonesian_nergrit_corpus_model` is an English model originally trained by nelsenputra.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_indonesian_nergrit_corpus_model_en_5.2.0_3.0_1700619318579.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_indonesian_nergrit_corpus_model_en_5.2.0_3.0_1700619318579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_indonesian_nergrit_corpus_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_indonesian_nergrit_corpus_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_indonesian_nergrit_corpus_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/nelsenputra/my_awesome_id_nergrit_corpus_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_ner_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_ner_model_en.md
new file mode 100644
index 000000000000..3739ab7b193e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_ner_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_ner_model DistilBertForTokenClassification from Tirendaz
+author: John Snow Labs
+name: burmese_awesome_ner_model
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_ner_model` is an English model originally trained by Tirendaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_ner_model_en_5.2.0_3.0_1700619459450.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_ner_model_en_5.2.0_3.0_1700619459450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_ner_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_ner_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_ner_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Tirendaz/my_awesome_ner_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_pakner_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_pakner_model_en.md
new file mode 100644
index 000000000000..d963615390e5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_pakner_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_pakner_model DistilBertForTokenClassification from Iftisyed
+author: John Snow Labs
+name: burmese_awesome_pakner_model
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_pakner_model` is an English model originally trained by Iftisyed.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_pakner_model_en_5.2.0_3.0_1700677444075.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_pakner_model_en_5.2.0_3.0_1700677444075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_pakner_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_pakner_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_pakner_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Iftisyed/my_awesome_pakner_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_reconstructor_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_reconstructor_model_en.md
new file mode 100644
index 000000000000..bd92db6dd55b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_reconstructor_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_reconstructor_model DistilBertForTokenClassification from abdulaziz1928
+author: John Snow Labs
+name: burmese_awesome_reconstructor_model
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_reconstructor_model` is an English model originally trained by abdulaziz1928.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_reconstructor_model_en_5.2.0_3.0_1700651765570.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_reconstructor_model_en_5.2.0_3.0_1700651765570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_reconstructor_model","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_reconstructor_model", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_reconstructor_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/abdulaziz1928/my_awesome_reconstructor_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model2_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model2_en.md
new file mode 100644
index 000000000000..ccaf88ef17e9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model2 DistilBertForTokenClassification from Atheer174
+author: John Snow Labs
+name: burmese_awesome_wnut_model2
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model2` is an English model originally trained by Atheer174.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model2_en_5.2.0_3.0_1700677533067.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model2_en_5.2.0_3.0_1700677533067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model2","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model2", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Atheer174/my_awesome_wnut_model2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model3_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model3_en.md
new file mode 100644
index 000000000000..427a341322f4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model3 DistilBertForTokenClassification from Atheer174
+author: John Snow Labs
+name: burmese_awesome_wnut_model3
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model3` is an English model originally trained by Atheer174.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model3_en_5.2.0_3.0_1700665041438.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model3_en_5.2.0_3.0_1700665041438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model3","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model3", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Atheer174/my_awesome_wnut_model3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_1_en.md new file mode 100644 index 000000000000..2d577aca51d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_1 DistilBertForTokenClassification from agdsga +author: John Snow Labs +name: burmese_awesome_wnut_model_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_1` is an English model originally trained by agdsga. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_1_en_5.2.0_3.0_1700680869364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_1_en_5.2.0_3.0_1700680869364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
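+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +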
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/agdsga/my_awesome_wnut_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_7dberry_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_7dberry_en.md new file mode 100644 index 000000000000..aafcec94ea2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_7dberry_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_7dberry DistilBertForTokenClassification from 7Dberry +author: John Snow Labs +name: burmese_awesome_wnut_model_7dberry +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_7dberry` is an English model originally trained by 7Dberry. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_7dberry_en_5.2.0_3.0_1700633407300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_7dberry_en_5.2.0_3.0_1700633407300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
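+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_7dberry","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_7dberry", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +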
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_7dberry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/7Dberry/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alayaran_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alayaran_en.md new file mode 100644 index 000000000000..25ef268f071a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alayaran_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_alayaran DistilBertForTokenClassification from alayaran +author: John Snow Labs +name: burmese_awesome_wnut_model_alayaran +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_alayaran` is an English model originally trained by alayaran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alayaran_en_5.2.0_3.0_1700641179680.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alayaran_en_5.2.0_3.0_1700641179680.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
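+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_alayaran","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_alayaran", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +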
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_alayaran| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/alayaran/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alexisdpc_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alexisdpc_en.md new file mode 100644 index 000000000000..535266dc6b41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alexisdpc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_alexisdpc DistilBertForTokenClassification from alexisdpc +author: John Snow Labs +name: burmese_awesome_wnut_model_alexisdpc +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_alexisdpc` is an English model originally trained by alexisdpc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alexisdpc_en_5.2.0_3.0_1700672982734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alexisdpc_en_5.2.0_3.0_1700672982734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
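+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_alexisdpc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_alexisdpc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +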
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_alexisdpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/alexisdpc/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alicenkbaytop_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alicenkbaytop_en.md new file mode 100644 index 000000000000..25b6be33cdcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_alicenkbaytop_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_alicenkbaytop DistilBertForTokenClassification from alicenkbaytop +author: John Snow Labs +name: burmese_awesome_wnut_model_alicenkbaytop +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_alicenkbaytop` is an English model originally trained by alicenkbaytop.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alicenkbaytop_en_5.2.0_3.0_1700656086944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_alicenkbaytop_en_5.2.0_3.0_1700656086944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
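+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_alicenkbaytop","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_alicenkbaytop", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +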
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_alicenkbaytop| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/alicenkbaytop/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anandbhat_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anandbhat_en.md new file mode 100644 index 000000000000..1eca0637fbd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anandbhat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_anandbhat DistilBertForTokenClassification from AnandBhat +author: John Snow Labs +name: burmese_awesome_wnut_model_anandbhat +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_anandbhat` is an English model originally trained by AnandBhat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_anandbhat_en_5.2.0_3.0_1700625842248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_anandbhat_en_5.2.0_3.0_1700625842248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
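+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_anandbhat","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_anandbhat", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +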
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_anandbhat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AnandBhat/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_andyrasika_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_andyrasika_en.md new file mode 100644 index 000000000000..9986a86eca5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_andyrasika_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_andyrasika DistilBertForTokenClassification from Andyrasika +author: John Snow Labs +name: burmese_awesome_wnut_model_andyrasika +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_andyrasika` is an English model originally trained by Andyrasika. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_andyrasika_en_5.2.0_3.0_1700660238195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_andyrasika_en_5.2.0_3.0_1700660238195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
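+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_andyrasika","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_andyrasika", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +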
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_andyrasika| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Andyrasika/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anyuanay_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anyuanay_en.md new file mode 100644 index 000000000000..ad433c3659af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_anyuanay_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_anyuanay DistilBertForTokenClassification from anyuanay +author: John Snow Labs +name: burmese_awesome_wnut_model_anyuanay +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_anyuanay` is an English model originally trained by anyuanay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_anyuanay_en_5.2.0_3.0_1700668541093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_anyuanay_en_5.2.0_3.0_1700668541093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
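+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_anyuanay","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_anyuanay", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +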
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_anyuanay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/anyuanay/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_atheer174_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_atheer174_en.md new file mode 100644 index 000000000000..c91a8fb13d68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_atheer174_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_atheer174 DistilBertForTokenClassification from Atheer174 +author: John Snow Labs +name: burmese_awesome_wnut_model_atheer174 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_atheer174` is an English model originally trained by Atheer174. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_atheer174_en_5.2.0_3.0_1700680434282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_atheer174_en_5.2.0_3.0_1700680434282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
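+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_atheer174","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_atheer174", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +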
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_atheer174| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Atheer174/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_blakemaster24_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_blakemaster24_en.md new file mode 100644 index 000000000000..656733079790 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_blakemaster24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_blakemaster24 DistilBertForTokenClassification from blakemaster24 +author: John Snow Labs +name: burmese_awesome_wnut_model_blakemaster24 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_blakemaster24` is an English model originally trained by blakemaster24.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_blakemaster24_en_5.2.0_3.0_1700682010372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_blakemaster24_en_5.2.0_3.0_1700682010372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
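+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_blakemaster24","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_blakemaster24", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +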
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_blakemaster24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/blakemaster24/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_claudehotline_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_claudehotline_en.md new file mode 100644 index 000000000000..cd77faee20b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_claudehotline_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_claudehotline DistilBertForTokenClassification from claudehotline +author: John Snow Labs +name: burmese_awesome_wnut_model_claudehotline +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_claudehotline` is an English model originally trained by claudehotline.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_claudehotline_en_5.2.0_3.0_1700655180955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_claudehotline_en_5.2.0_3.0_1700655180955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
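+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Example input; assumes an active Spark NLP session named `spark` +data = spark.createDataFrame([["I love Spark NLP"]]).toDF("text") + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_claudehotline","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Example input; assumes an active SparkSession named `spark` +val data = Seq("I love Spark NLP").toDF("text") + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_claudehotline", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +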
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_claudehotline| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/claudehotline/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_danstinga_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_danstinga_en.md new file mode 100644 index 000000000000..d39b79e20485 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_danstinga_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_danstinga DistilBertForTokenClassification from danstinga +author: John Snow Labs +name: burmese_awesome_wnut_model_danstinga +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_danstinga` is an English model originally trained by danstinga. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_danstinga_en_5.2.0_3.0_1700647632219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_danstinga_en_5.2.0_3.0_1700647632219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
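+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_danstinga","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_danstinga", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +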
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_danstinga| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/danstinga/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_darrenhinde_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_darrenhinde_en.md new file mode 100644 index 000000000000..a2502721a569 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_darrenhinde_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_darrenhinde DistilBertForTokenClassification from darrenhinde +author: John Snow Labs +name: burmese_awesome_wnut_model_darrenhinde +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_darrenhinde` is an English model originally trained by darrenhinde. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_darrenhinde_en_5.2.0_3.0_1700677609935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_darrenhinde_en_5.2.0_3.0_1700677609935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
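+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_darrenhinde","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_darrenhinde", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +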
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_darrenhinde| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/darrenhinde/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_davidliu1110_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_davidliu1110_en.md new file mode 100644 index 000000000000..145defb09f22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_davidliu1110_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_davidliu1110 DistilBertForTokenClassification from davidliu1110 +author: John Snow Labs +name: burmese_awesome_wnut_model_davidliu1110 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_davidliu1110` is an English model originally trained by davidliu1110.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_davidliu1110_en_5.2.0_3.0_1700658076882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_davidliu1110_en_5.2.0_3.0_1700658076882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
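+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_davidliu1110","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_davidliu1110", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +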
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_davidliu1110| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/davidliu1110/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_dimitriish_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_dimitriish_en.md new file mode 100644 index 000000000000..4de97b4117c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_dimitriish_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_dimitriish DistilBertForTokenClassification from dimitriish +author: John Snow Labs +name: burmese_awesome_wnut_model_dimitriish +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_dimitriish` is an English model originally trained by dimitriish. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_dimitriish_en_5.2.0_3.0_1700663496784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_dimitriish_en_5.2.0_3.0_1700663496784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
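+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_dimitriish","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_dimitriish", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +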
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_dimitriish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|504.0 MB| + +## References + +https://huggingface.co/dimitriish/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_edthomasset_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_edthomasset_en.md new file mode 100644 index 000000000000..80d2a50580d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_edthomasset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_edthomasset DistilBertForTokenClassification from EdThomasset +author: John Snow Labs +name: burmese_awesome_wnut_model_edthomasset +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_edthomasset` is an English model originally trained by EdThomasset. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_edthomasset_en_5.2.0_3.0_1700663820504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_edthomasset_en_5.2.0_3.0_1700663820504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
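+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_edthomasset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_edthomasset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +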
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_edthomasset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/EdThomasset/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_eitanli_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_eitanli_en.md new file mode 100644 index 000000000000..f5d605ba1b69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_eitanli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_eitanli DistilBertForTokenClassification from Eitanli +author: John Snow Labs +name: burmese_awesome_wnut_model_eitanli +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_eitanli` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_eitanli_en_5.2.0_3.0_1700665803920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_eitanli_en_5.2.0_3.0_1700665803920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
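+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_eitanli","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_eitanli", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +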
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_eitanli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Eitanli/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_emmanuelq2_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_emmanuelq2_en.md new file mode 100644 index 000000000000..b4d83878146e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_emmanuelq2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_emmanuelq2 DistilBertForTokenClassification from emmanuelq2 +author: John Snow Labs +name: burmese_awesome_wnut_model_emmanuelq2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_emmanuelq2` is an English model originally trained by emmanuelq2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_emmanuelq2_en_5.2.0_3.0_1700676281719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_emmanuelq2_en_5.2.0_3.0_1700676281719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
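+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_emmanuelq2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_emmanuelq2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +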
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_emmanuelq2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/emmanuelq2/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ggouda_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ggouda_en.md new file mode 100644 index 000000000000..db2e6181f48d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ggouda_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_ggouda DistilBertForTokenClassification from ggouda +author: John Snow Labs +name: burmese_awesome_wnut_model_ggouda +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_ggouda` is an English model originally trained by ggouda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ggouda_en_5.2.0_3.0_1700659423610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ggouda_en_5.2.0_3.0_1700659423610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
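+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_ggouda","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_ggouda", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +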
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_ggouda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ggouda/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_gwd77777_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_gwd77777_en.md new file mode 100644 index 000000000000..5a9d7473c3ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_gwd77777_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_gwd77777 DistilBertForTokenClassification from gwd77777 +author: John Snow Labs +name: burmese_awesome_wnut_model_gwd77777 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_gwd77777` is an English model originally trained by gwd77777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_gwd77777_en_5.2.0_3.0_1700650763877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_gwd77777_en_5.2.0_3.0_1700650763877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
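+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_gwd77777","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_gwd77777", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +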
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_gwd77777| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/gwd77777/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_hefeng0_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_hefeng0_en.md new file mode 100644 index 000000000000..9cfc805793e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_hefeng0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_hefeng0 DistilBertForTokenClassification from hefeng0 +author: John Snow Labs +name: burmese_awesome_wnut_model_hefeng0 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_hefeng0` is an English model originally trained by hefeng0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_hefeng0_en_5.2.0_3.0_1700679587351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_hefeng0_en_5.2.0_3.0_1700679587351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
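+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_hefeng0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_hefeng0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +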
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_hefeng0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hefeng0/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_idriska_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_idriska_en.md new file mode 100644 index 000000000000..4cfbf46bcad7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_idriska_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_idriska DistilBertForTokenClassification from Idriska +author: John Snow Labs +name: burmese_awesome_wnut_model_idriska +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_idriska` is an English model originally trained by Idriska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_idriska_en_5.2.0_3.0_1700666662441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_idriska_en_5.2.0_3.0_1700666662441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
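+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_idriska","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) +pipelineModel = pipeline.fit(data) # data: a Spark DataFrame with a "text" column +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_idriska", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) +val pipelineModel = pipeline.fit(data) // data: a DataFrame with a "text" column +val pipelineDF = pipelineModel.transform(data) +``` +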
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_idriska| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Idriska/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_iftisyed_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_iftisyed_en.md new file mode 100644 index 000000000000..3e8ac55151f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_iftisyed_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_iftisyed DistilBertForTokenClassification from Iftisyed +author: John Snow Labs +name: burmese_awesome_wnut_model_iftisyed +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_iftisyed` is an English model originally trained by Iftisyed. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_iftisyed_en_5.2.0_3.0_1700681267761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_iftisyed_en_5.2.0_3.0_1700681267761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_iftisyed","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_iftisyed", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_iftisyed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Iftisyed/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_jessicaassis_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_jessicaassis_en.md new file mode 100644 index 000000000000..eb6b9d697a83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_jessicaassis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_jessicaassis DistilBertForTokenClassification from jessicaassis +author: John Snow Labs +name: burmese_awesome_wnut_model_jessicaassis +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_jessicaassis` is an English model originally trained by jessicaassis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_jessicaassis_en_5.2.0_3.0_1700653423325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_jessicaassis_en_5.2.0_3.0_1700653423325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_jessicaassis","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_jessicaassis", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_jessicaassis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jessicaassis/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_kadir0_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_kadir0_en.md new file mode 100644 index 000000000000..fa8394d2fd43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_kadir0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_kadir0 DistilBertForTokenClassification from kadir0 +author: John Snow Labs +name: burmese_awesome_wnut_model_kadir0 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_kadir0` is an English model originally trained by kadir0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_kadir0_en_5.2.0_3.0_1700676150658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_kadir0_en_5.2.0_3.0_1700676150658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_kadir0","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_kadir0", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_kadir0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/kadir0/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_lathashree01_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_lathashree01_en.md new file mode 100644 index 000000000000..5c7bc447f0d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_lathashree01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_lathashree01 DistilBertForTokenClassification from lathashree01 +author: John Snow Labs +name: burmese_awesome_wnut_model_lathashree01 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_lathashree01` is an English model originally trained by lathashree01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_lathashree01_en_5.2.0_3.0_1700635736119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_lathashree01_en_5.2.0_3.0_1700635736119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_lathashree01","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_lathashree01", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_lathashree01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/lathashree01/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_liujunshi_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_liujunshi_en.md new file mode 100644 index 000000000000..32b305296cd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_liujunshi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_liujunshi DistilBertForTokenClassification from liujunshi +author: John Snow Labs +name: burmese_awesome_wnut_model_liujunshi +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_liujunshi` is an English model originally trained by liujunshi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_liujunshi_en_5.2.0_3.0_1700632777770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_liujunshi_en_5.2.0_3.0_1700632777770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_liujunshi","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_liujunshi", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_liujunshi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/liujunshi/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longmark_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longmark_en.md new file mode 100644 index 000000000000..9a9a1e91786e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longmark_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_longmark DistilBertForTokenClassification from longmark +author: John Snow Labs +name: burmese_awesome_wnut_model_longmark +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_longmark` is an English model originally trained by longmark. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_longmark_en_5.2.0_3.0_1700653524575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_longmark_en_5.2.0_3.0_1700653524575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_longmark","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_longmark", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_longmark| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/longmark/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longxiang_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longxiang_en.md new file mode 100644 index 000000000000..77a84ff46ecf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_longxiang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_longxiang DistilBertForTokenClassification from Longxiang +author: John Snow Labs +name: burmese_awesome_wnut_model_longxiang +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_longxiang` is an English model originally trained by Longxiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_longxiang_en_5.2.0_3.0_1700652542215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_longxiang_en_5.2.0_3.0_1700652542215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_longxiang","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_longxiang", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_longxiang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Longxiang/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mamuninfo_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mamuninfo_en.md new file mode 100644 index 000000000000..4d7d0644734b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mamuninfo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_mamuninfo DistilBertForTokenClassification from mamuninfo +author: John Snow Labs +name: burmese_awesome_wnut_model_mamuninfo +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_mamuninfo` is an English model originally trained by mamuninfo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_mamuninfo_en_5.2.0_3.0_1700678374729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_mamuninfo_en_5.2.0_3.0_1700678374729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_mamuninfo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_mamuninfo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_mamuninfo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mamuninfo/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_maunilvyas_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_maunilvyas_en.md new file mode 100644 index 000000000000..fa43a3cde759 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_maunilvyas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_maunilvyas DistilBertForTokenClassification from MaunilVyas +author: John Snow Labs +name: burmese_awesome_wnut_model_maunilvyas +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_maunilvyas` is an English model originally trained by MaunilVyas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_maunilvyas_en_5.2.0_3.0_1700660182683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_maunilvyas_en_5.2.0_3.0_1700660182683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_maunilvyas","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_maunilvyas", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_maunilvyas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/MaunilVyas/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_may33_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_may33_en.md new file mode 100644 index 000000000000..2529ad4f5ec6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_may33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_may33 DistilBertForTokenClassification from May33 +author: John Snow Labs +name: burmese_awesome_wnut_model_may33 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_may33` is an English model originally trained by May33. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_may33_en_5.2.0_3.0_1700624894018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_may33_en_5.2.0_3.0_1700624894018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_may33","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_may33", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_may33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/May33/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_me11997_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_me11997_en.md new file mode 100644 index 000000000000..aed98afa226b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_me11997_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_me11997 DistilBertForTokenClassification from me11997 +author: John Snow Labs +name: burmese_awesome_wnut_model_me11997 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_me11997` is an English model originally trained by me11997. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_me11997_en_5.2.0_3.0_1700652542211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_me11997_en_5.2.0_3.0_1700652542211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_me11997","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_me11997", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_me11997| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/me11997/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_merlynjoseph_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_merlynjoseph_en.md new file mode 100644 index 000000000000..55e0ee383ab3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_merlynjoseph_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_merlynjoseph DistilBertForTokenClassification from merlynjoseph +author: John Snow Labs +name: burmese_awesome_wnut_model_merlynjoseph +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_merlynjoseph` is an English model originally trained by merlynjoseph. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_merlynjoseph_en_5.2.0_3.0_1700621907136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_merlynjoseph_en_5.2.0_3.0_1700621907136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_merlynjoseph","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_merlynjoseph", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_merlynjoseph| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/merlynjoseph/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mu_mj_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mu_mj_en.md new file mode 100644 index 000000000000..2030a9832442 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_mu_mj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_mu_mj DistilBertForTokenClassification from Mu-mj +author: John Snow Labs +name: burmese_awesome_wnut_model_mu_mj +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_mu_mj` is an English model originally trained by Mu-mj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_mu_mj_en_5.2.0_3.0_1700628649830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_mu_mj_en_5.2.0_3.0_1700628649830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_mu_mj","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_mu_mj", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_mu_mj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Mu-mj/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_muibk_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_muibk_en.md new file mode 100644 index 000000000000..04bd6dac393f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_muibk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_muibk DistilBertForTokenClassification from muibk +author: John Snow Labs +name: burmese_awesome_wnut_model_muibk +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_muibk` is an English model originally trained by muibk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_muibk_en_5.2.0_3.0_1700672974341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_muibk_en_5.2.0_3.0_1700672974341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_muibk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_muibk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_muibk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/muibk/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_nadeemraja_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_nadeemraja_en.md new file mode 100644 index 000000000000..03d22a7b0520 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_nadeemraja_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_nadeemraja DistilBertForTokenClassification from nadeemraja +author: John Snow Labs +name: burmese_awesome_wnut_model_nadeemraja +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_nadeemraja` is an English model originally trained by nadeemraja. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_nadeemraja_en_5.2.0_3.0_1700675660754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_nadeemraja_en_5.2.0_3.0_1700675660754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_nadeemraja","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_nadeemraja", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_nadeemraja| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/nadeemraja/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ni4z_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ni4z_en.md new file mode 100644 index 000000000000..4fbe835a1d8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ni4z_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_ni4z DistilBertForTokenClassification from Ni4z +author: John Snow Labs +name: burmese_awesome_wnut_model_ni4z +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_ni4z` is an English model originally trained by Ni4z. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ni4z_en_5.2.0_3.0_1700635636005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ni4z_en_5.2.0_3.0_1700635636005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_ni4z","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_ni4z", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_ni4z| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Ni4z/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_prudhvirazz_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_prudhvirazz_en.md new file mode 100644 index 000000000000..f80ea7cb279c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_prudhvirazz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_prudhvirazz DistilBertForTokenClassification from prudhvirazz +author: John Snow Labs +name: burmese_awesome_wnut_model_prudhvirazz +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_prudhvirazz` is an English model originally trained by prudhvirazz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_prudhvirazz_en_5.2.0_3.0_1700674221018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_prudhvirazz_en_5.2.0_3.0_1700674221018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_prudhvirazz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_prudhvirazz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_prudhvirazz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prudhvirazz/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_qiaoqian_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_qiaoqian_en.md new file mode 100644 index 000000000000..cde77538801c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_qiaoqian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_qiaoqian DistilBertForTokenClassification from qiaoqian +author: John Snow Labs +name: burmese_awesome_wnut_model_qiaoqian +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_qiaoqian` is an English model originally trained by qiaoqian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_qiaoqian_en_5.2.0_3.0_1700624107296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_qiaoqian_en_5.2.0_3.0_1700624107296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_qiaoqian","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_qiaoqian", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_qiaoqian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/qiaoqian/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ramikassouf_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ramikassouf_en.md new file mode 100644 index 000000000000..91ab26424c3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_ramikassouf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_ramikassouf DistilBertForTokenClassification from RamiKassouf +author: John Snow Labs +name: burmese_awesome_wnut_model_ramikassouf +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_ramikassouf` is an English model originally trained by RamiKassouf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ramikassouf_en_5.2.0_3.0_1700669114888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_ramikassouf_en_5.2.0_3.0_1700669114888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_ramikassouf","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_ramikassouf", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_ramikassouf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/RamiKassouf/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sameerakoppana_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sameerakoppana_en.md new file mode 100644 index 000000000000..ec87be08ddab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sameerakoppana_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_sameerakoppana DistilBertForTokenClassification from SameeraKoppana +author: John Snow Labs +name: burmese_awesome_wnut_model_sameerakoppana +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_sameerakoppana` is an English model originally trained by SameeraKoppana.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sameerakoppana_en_5.2.0_3.0_1700674856471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sameerakoppana_en_5.2.0_3.0_1700674856471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_sameerakoppana","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_sameerakoppana", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_sameerakoppana| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SameeraKoppana/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_santoshuske_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_santoshuske_en.md new file mode 100644 index 000000000000..3994faa2a5ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_santoshuske_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_santoshuske DistilBertForTokenClassification from SantoshUske +author: John Snow Labs +name: burmese_awesome_wnut_model_santoshuske +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_santoshuske` is an English model originally trained by SantoshUske. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_santoshuske_en_5.2.0_3.0_1700625842248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_santoshuske_en_5.2.0_3.0_1700625842248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_santoshuske","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_santoshuske", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_santoshuske| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SantoshUske/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shadman_rohan_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shadman_rohan_en.md new file mode 100644 index 000000000000..e8ccfb61301d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shadman_rohan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_shadman_rohan DistilBertForTokenClassification from Shadman-Rohan +author: John Snow Labs +name: burmese_awesome_wnut_model_shadman_rohan +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_shadman_rohan` is an English model originally trained by Shadman-Rohan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_shadman_rohan_en_5.2.0_3.0_1700644707896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_shadman_rohan_en_5.2.0_3.0_1700644707896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_shadman_rohan","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_shadman_rohan", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_shadman_rohan|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Shadman-Rohan/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shaohantian_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shaohantian_en.md
new file mode 100644
index 000000000000..88b3ec33d5cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_shaohantian_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_shaohantian DistilBertForTokenClassification from ShaohanTian
+author: John Snow Labs
+name: burmese_awesome_wnut_model_shaohantian
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_shaohantian` is an English model originally trained by ShaohanTian.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_shaohantian_en_5.2.0_3.0_1700672210931.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_shaohantian_en_5.2.0_3.0_1700672210931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
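+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_shaohantian","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_shaohantian", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+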
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_shaohantian|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/ShaohanTian/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sinchir0_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sinchir0_en.md
new file mode 100644
index 000000000000..701c76d2b99d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sinchir0_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_sinchir0 DistilBertForTokenClassification from sinchir0
+author: John Snow Labs
+name: burmese_awesome_wnut_model_sinchir0
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_sinchir0` is an English model originally trained by sinchir0.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sinchir0_en_5.2.0_3.0_1700644707924.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sinchir0_en_5.2.0_3.0_1700644707924.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
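+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_sinchir0","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_sinchir0", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+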
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_sinchir0|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/sinchir0/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sofa566_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sofa566_en.md
new file mode 100644
index 000000000000..54bcb5a4dad9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sofa566_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_sofa566 DistilBertForTokenClassification from sofa566
+author: John Snow Labs
+name: burmese_awesome_wnut_model_sofa566
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_sofa566` is an English model originally trained by sofa566.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sofa566_en_5.2.0_3.0_1700660288569.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sofa566_en_5.2.0_3.0_1700660288569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
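+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_sofa566","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_sofa566", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+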
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_sofa566|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/sofa566/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_soroushbn_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_soroushbn_en.md
new file mode 100644
index 000000000000..228d7ffef1f2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_soroushbn_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_soroushbn DistilBertForTokenClassification from soroushbn
+author: John Snow Labs
+name: burmese_awesome_wnut_model_soroushbn
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_soroushbn` is an English model originally trained by soroushbn.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_soroushbn_en_5.2.0_3.0_1700628738696.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_soroushbn_en_5.2.0_3.0_1700628738696.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
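+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_soroushbn","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_soroushbn", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+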
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_soroushbn|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/soroushbn/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_spleonard1_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_spleonard1_en.md
new file mode 100644
index 000000000000..3506970efec4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_spleonard1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_spleonard1 DistilBertForTokenClassification from Spleonard1
+author: John Snow Labs
+name: burmese_awesome_wnut_model_spleonard1
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_spleonard1` is an English model originally trained by Spleonard1.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_spleonard1_en_5.2.0_3.0_1700641088379.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_spleonard1_en_5.2.0_3.0_1700641088379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
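+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_spleonard1","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_spleonard1", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+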
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_spleonard1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Spleonard1/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sravanipilla_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sravanipilla_en.md
new file mode 100644
index 000000000000..6a7af52a2c0d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_sravanipilla_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_sravanipilla DistilBertForTokenClassification from Sravanipilla
+author: John Snow Labs
+name: burmese_awesome_wnut_model_sravanipilla
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_sravanipilla` is an English model originally trained by Sravanipilla.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sravanipilla_en_5.2.0_3.0_1700672107247.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_sravanipilla_en_5.2.0_3.0_1700672107247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
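+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_sravanipilla","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_sravanipilla", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+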
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_sravanipilla|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Sravanipilla/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_stepa_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_stepa_en.md
new file mode 100644
index 000000000000..1be7f1c916ea
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_stepa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_stepa DistilBertForTokenClassification from Stepa
+author: John Snow Labs
+name: burmese_awesome_wnut_model_stepa
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_stepa` is an English model originally trained by Stepa.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_stepa_en_5.2.0_3.0_1700661065750.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_stepa_en_5.2.0_3.0_1700661065750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
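+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_stepa","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_stepa", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+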
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_stepa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Stepa/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_suhasparray_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_suhasparray_en.md
new file mode 100644
index 000000000000..60cc5ec28399
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_suhasparray_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_suhasparray DistilBertForTokenClassification from suhasparray
+author: John Snow Labs
+name: burmese_awesome_wnut_model_suhasparray
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_suhasparray` is an English model originally trained by suhasparray.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_suhasparray_en_5.2.0_3.0_1700673973408.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_suhasparray_en_5.2.0_3.0_1700673973408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
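+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_suhasparray","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_suhasparray", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+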
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_suhasparray|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/suhasparray/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_terhdavid_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_terhdavid_en.md
new file mode 100644
index 000000000000..b538e26e3ab9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_terhdavid_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_terhdavid DistilBertForTokenClassification from terhdavid
+author: John Snow Labs
+name: burmese_awesome_wnut_model_terhdavid
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_terhdavid` is an English model originally trained by terhdavid.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_terhdavid_en_5.2.0_3.0_1700662987844.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_terhdavid_en_5.2.0_3.0_1700662987844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
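+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_terhdavid","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_terhdavid", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+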
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_terhdavid|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/terhdavid/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_thrushwanth_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_thrushwanth_en.md
new file mode 100644
index 000000000000..426e54a8c86f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_thrushwanth_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_thrushwanth DistilBertForTokenClassification from Thrushwanth
+author: John Snow Labs
+name: burmese_awesome_wnut_model_thrushwanth
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_thrushwanth` is an English model originally trained by Thrushwanth.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_thrushwanth_en_5.2.0_3.0_1700667635611.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_thrushwanth_en_5.2.0_3.0_1700667635611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
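+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_thrushwanth","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_thrushwanth", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+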
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_wnut_model_thrushwanth|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Thrushwanth/my_awesome_wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_viktaradynets_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_viktaradynets_en.md
new file mode 100644
index 000000000000..fd2dc5b2ae90
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_viktaradynets_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_wnut_model_viktaradynets DistilBertForTokenClassification from ViktarAdynets
+author: John Snow Labs
+name: burmese_awesome_wnut_model_viktaradynets
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_viktaradynets` is an English model originally trained by ViktarAdynets.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_viktaradynets_en_5.2.0_3.0_1700672982777.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_viktaradynets_en_5.2.0_3.0_1700672982777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
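+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier expects a "token" column, which the Tokenizer provides.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_viktaradynets","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier expects a "token" column, which the Tokenizer provides.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("burmese_awesome_wnut_model_viktaradynets", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+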
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_viktaradynets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViktarAdynets/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsombhane_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsombhane_en.md new file mode 100644 index 000000000000..b02668e476ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsombhane_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_vsombhane DistilBertForTokenClassification from vsombhane +author: John Snow Labs +name: burmese_awesome_wnut_model_vsombhane +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_vsombhane` is an English model originally trained by vsombhane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_vsombhane_en_5.2.0_3.0_1700666005608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_vsombhane_en_5.2.0_3.0_1700666005608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_vsombhane","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_vsombhane", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_vsombhane| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vsombhane/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsufiy_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsufiy_en.md new file mode 100644 index 000000000000..40cf9726f918 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_vsufiy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_vsufiy DistilBertForTokenClassification from vsufiy +author: John Snow Labs +name: burmese_awesome_wnut_model_vsufiy +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_vsufiy` is an English model originally trained by vsufiy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_vsufiy_en_5.2.0_3.0_1700643787061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_vsufiy_en_5.2.0_3.0_1700643787061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_vsufiy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_vsufiy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_vsufiy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vsufiy/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xian_xiang_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xian_xiang_en.md new file mode 100644 index 000000000000..bac5a6ae4b87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xian_xiang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_xian_xiang DistilBertForTokenClassification from Xian-Xiang +author: John Snow Labs +name: burmese_awesome_wnut_model_xian_xiang +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_xian_xiang` is an English model originally trained by Xian-Xiang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_xian_xiang_en_5.2.0_3.0_1700624992662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_xian_xiang_en_5.2.0_3.0_1700624992662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_xian_xiang","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_xian_xiang", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_xian_xiang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Xian-Xiang/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xsf_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xsf_en.md new file mode 100644 index 000000000000..aabf1a30f623 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_xsf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_xsf DistilBertForTokenClassification from XSF +author: John Snow Labs +name: burmese_awesome_wnut_model_xsf +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_xsf` is an English model originally trained by XSF. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_xsf_en_5.2.0_3.0_1700667136216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_xsf_en_5.2.0_3.0_1700667136216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_xsf","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_xsf", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_xsf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/XSF/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yannhabib_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yannhabib_en.md new file mode 100644 index 000000000000..c02ae0d1b3fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yannhabib_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_yannhabib DistilBertForTokenClassification from yannhabib +author: John Snow Labs +name: burmese_awesome_wnut_model_yannhabib +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_yannhabib` is an English model originally trained by yannhabib. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yannhabib_en_5.2.0_3.0_1700652647890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yannhabib_en_5.2.0_3.0_1700652647890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_yannhabib","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_yannhabib", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_yannhabib| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yannhabib/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yuliang555_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yuliang555_en.md new file mode 100644 index 000000000000..88ee14542873 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yuliang555_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_yuliang555 DistilBertForTokenClassification from yuliang555 +author: John Snow Labs +name: burmese_awesome_wnut_model_yuliang555 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_yuliang555` is an English model originally trained by yuliang555. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yuliang555_en_5.2.0_3.0_1700667588685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yuliang555_en_5.2.0_3.0_1700667588685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_yuliang555","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_yuliang555", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_yuliang555| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yuliang555/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yyyy1992_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yyyy1992_en.md new file mode 100644 index 000000000000..663189e24c04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_yyyy1992_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_yyyy1992 DistilBertForTokenClassification from yyyy1992 +author: John Snow Labs +name: burmese_awesome_wnut_model_yyyy1992 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_yyyy1992` is an English model originally trained by yyyy1992. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yyyy1992_en_5.2.0_3.0_1700680522901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_yyyy1992_en_5.2.0_3.0_1700680522901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_yyyy1992","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_yyyy1992", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_yyyy1992| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yyyy1992/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_zinoli_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_zinoli_en.md new file mode 100644 index 000000000000..49d682c43fd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awesome_wnut_model_zinoli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_wnut_model_zinoli DistilBertForTokenClassification from zinoli +author: John Snow Labs +name: burmese_awesome_wnut_model_zinoli +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_wnut_model_zinoli` is an English model originally trained by zinoli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_zinoli_en_5.2.0_3.0_1700668434180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_wnut_model_zinoli_en_5.2.0_3.0_1700668434180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awesome_wnut_model_zinoli","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awesome_wnut_model_zinoli", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_wnut_model_zinoli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/zinoli/my_awesome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_awsome_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_awsome_wnut_model_en.md new file mode 100644 index 000000000000..365665eae2dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_awsome_wnut_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awsome_wnut_model DistilBertForTokenClassification from Laurie +author: John Snow Labs +name: burmese_awsome_wnut_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awsome_wnut_model` is an English model originally trained by Laurie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awsome_wnut_model_en_5.2.0_3.0_1700623963619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awsome_wnut_model_en_5.2.0_3.0_1700623963619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_awsome_wnut_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_awsome_wnut_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awsome_wnut_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Laurie/my_awsome_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_distilbert_finetune_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_distilbert_finetune_ner_en.md new file mode 100644 index 000000000000..389cb3cf5989 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_distilbert_finetune_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_distilbert_finetune_ner DistilBertForTokenClassification from DeepBird +author: John Snow Labs +name: burmese_distilbert_finetune_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_distilbert_finetune_ner` is an English model originally trained by DeepBird. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_distilbert_finetune_ner_en_5.2.0_3.0_1700624124322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_distilbert_finetune_ner_en_5.2.0_3.0_1700624124322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_distilbert_finetune_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_distilbert_finetune_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_distilbert_finetune_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/DeepBird/my-distilBERT-finetune-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_ebm_model_biobert_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_ebm_model_biobert_en.md new file mode 100644 index 000000000000..f114a02cd56c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_ebm_model_biobert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_ebm_model_biobert DistilBertForTokenClassification from Agniruudrra +author: John Snow Labs +name: burmese_ebm_model_biobert +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_ebm_model_biobert` is an English model originally trained by Agniruudrra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_ebm_model_biobert_en_5.2.0_3.0_1700640413677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_ebm_model_biobert_en_5.2.0_3.0_1700640413677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_ebm_model_biobert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_ebm_model_biobert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_ebm_model_biobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.3 MB| + +## References + +https://huggingface.co/Agniruudrra/my_ebm_model_biobert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_first_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_first_model_en.md new file mode 100644 index 000000000000..fe67f26e4ba9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_first_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_first_model DistilBertForTokenClassification from mami99 +author: John Snow Labs +name: burmese_first_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_first_model` is an English model originally trained by mami99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_first_model_en_5.2.0_3.0_1700673926458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_first_model_en_5.2.0_3.0_1700673926458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_first_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_first_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_first_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mami99/my_first_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_model_eeeureka_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_model_eeeureka_en.md new file mode 100644 index 000000000000..1a5c033e9ffe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_model_eeeureka_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_model_eeeureka DistilBertForTokenClassification from eeeureka +author: John Snow Labs +name: burmese_model_eeeureka +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_model_eeeureka` is an English model originally trained by eeeureka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_model_eeeureka_en_5.2.0_3.0_1700623681934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_model_eeeureka_en_5.2.0_3.0_1700623681934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_model_eeeureka","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_model_eeeureka", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_model_eeeureka| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/eeeureka/my_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_ner_model_jimi11_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_ner_model_jimi11_en.md new file mode 100644 index 000000000000..a3faa1a478b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_ner_model_jimi11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_ner_model_jimi11 DistilBertForTokenClassification from Jimi11 +author: John Snow Labs +name: burmese_ner_model_jimi11 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_ner_model_jimi11` is an English model originally trained by Jimi11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_ner_model_jimi11_en_5.2.0_3.0_1700656485667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_ner_model_jimi11_en_5.2.0_3.0_1700656485667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_ner_model_jimi11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_ner_model_jimi11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_ner_model_jimi11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Jimi11/my_ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_nlp_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_nlp_model_en.md new file mode 100644 index 000000000000..cb6bbf322fd2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_nlp_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_nlp_model DistilBertForTokenClassification from EdThomasset +author: John Snow Labs +name: burmese_nlp_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_nlp_model` is an English model originally trained by EdThomasset. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_nlp_model_en_5.2.0_3.0_1700679253423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_nlp_model_en_5.2.0_3.0_1700679253423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_nlp_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_nlp_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_nlp_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/EdThomasset/my_nlp_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_sequence_labelling_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_sequence_labelling_model_en.md new file mode 100644 index 000000000000..f051e55efe4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_sequence_labelling_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_sequence_labelling_model DistilBertForTokenClassification from Gayu +author: John Snow Labs +name: burmese_sequence_labelling_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_sequence_labelling_model` is an English model originally trained by Gayu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_sequence_labelling_model_en_5.2.0_3.0_1700622760026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_sequence_labelling_model_en_5.2.0_3.0_1700622760026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_sequence_labelling_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_sequence_labelling_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_sequence_labelling_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gayu/my_sequence_labelling_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_test2_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_test2_wnut_model_en.md new file mode 100644 index 000000000000..bed9b0254e10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_test2_wnut_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_test2_wnut_model DistilBertForTokenClassification from tro9999 +author: John Snow Labs +name: burmese_test2_wnut_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_test2_wnut_model` is an English model originally trained by tro9999. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_test2_wnut_model_en_5.2.0_3.0_1700672894642.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_test2_wnut_model_en_5.2.0_3.0_1700672894642.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_test2_wnut_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_test2_wnut_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_test2_wnut_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/tro9999/my_test2_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-burmese_test_big_ner_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-burmese_test_big_ner_model_en.md new file mode 100644 index 000000000000..4e0a876b4334 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-burmese_test_big_ner_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_test_big_ner_model DistilBertForTokenClassification from tro9999 +author: John Snow Labs +name: burmese_test_big_ner_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_test_big_ner_model` is an English model originally trained by tro9999. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_test_big_ner_model_en_5.2.0_3.0_1700681189973.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_test_big_ner_model_en_5.2.0_3.0_1700681189973.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("burmese_test_big_ner_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("burmese_test_big_ner_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_test_big_ner_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/tro9999/my_test_big_ner_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1_en.md new file mode 100644 index 000000000000..7964ea712005 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1 DistilBertForTokenClassification from quydchope +author: John Snow Labs +name: chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1` is an English model originally trained by quydchope.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1_en_5.2.0_3.0_1700621919205.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1_en_5.2.0_3.0_1700621919205.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chope_fine_dishing_distilbert_base_uncased_finetuned_ner_v0_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/quydchope/chope-fine-dishing-distilbert-base-uncased-finetuned-ner-v0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_cgensheimer_en.md b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_cgensheimer_en.md new file mode 100644 index 000000000000..d19f46ad9f97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_cgensheimer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English claims_data_model_cgensheimer DistilBertForTokenClassification from cgensheimer +author: John Snow Labs +name: claims_data_model_cgensheimer +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claims_data_model_cgensheimer` is an English model originally trained by cgensheimer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claims_data_model_cgensheimer_en_5.2.0_3.0_1700620723788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claims_data_model_cgensheimer_en_5.2.0_3.0_1700620723788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("claims_data_model_cgensheimer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("claims_data_model_cgensheimer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claims_data_model_cgensheimer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/cgensheimer/claims-data-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_jlandis_en.md b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_jlandis_en.md new file mode 100644 index 000000000000..a49e8175d5d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_jlandis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English claims_data_model_jlandis DistilBertForTokenClassification from JLandis +author: John Snow Labs +name: claims_data_model_jlandis +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claims_data_model_jlandis` is an English model originally trained by JLandis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claims_data_model_jlandis_en_5.2.0_3.0_1700658368544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claims_data_model_jlandis_en_5.2.0_3.0_1700658368544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("claims_data_model_jlandis","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("claims_data_model_jlandis", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claims_data_model_jlandis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/JLandis/claims-data-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_mjokich_en.md b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_mjokich_en.md new file mode 100644 index 000000000000..98fa46e1b39a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-claims_data_model_mjokich_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English claims_data_model_mjokich DistilBertForTokenClassification from mjokich +author: John Snow Labs +name: claims_data_model_mjokich +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `claims_data_model_mjokich` is an English model originally trained by mjokich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/claims_data_model_mjokich_en_5.2.0_3.0_1700678530628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/claims_data_model_mjokich_en_5.2.0_3.0_1700678530628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("claims_data_model_mjokich","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("claims_data_model_mjokich", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|claims_data_model_mjokich| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mjokich/claims-data-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-clinico_en.md b/docs/_posts/ahmedlone127/2023-11-22-clinico_en.md new file mode 100644 index 000000000000..258ca22b30fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-clinico_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinico DistilBertForTokenClassification from joheras +author: John Snow Labs +name: clinico +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinico` is an English model originally trained by joheras. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinico_en_5.2.0_3.0_1700629574997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinico_en_5.2.0_3.0_1700629574997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
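+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("clinico","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("clinico", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +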
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinico| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/joheras/clinico \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_augmented1_en.md b/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_augmented1_en.md new file mode 100644 index 000000000000..ea5a0e1c2d18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_augmented1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinico_finetuned_augmented1 DistilBertForTokenClassification from joheras +author: John Snow Labs +name: clinico_finetuned_augmented1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinico_finetuned_augmented1` is an English model originally trained by joheras. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinico_finetuned_augmented1_en_5.2.0_3.0_1700661180506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinico_finetuned_augmented1_en_5.2.0_3.0_1700661180506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
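+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("clinico_finetuned_augmented1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("clinico_finetuned_augmented1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +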
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinico_finetuned_augmented1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/joheras/clinico-finetuned-augmented1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_en.md new file mode 100644 index 000000000000..b151da3c110c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-clinico_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clinico_finetuned DistilBertForTokenClassification from joheras +author: John Snow Labs +name: clinico_finetuned +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clinico_finetuned` is an English model originally trained by joheras. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clinico_finetuned_en_5.2.0_3.0_1700646125131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clinico_finetuned_en_5.2.0_3.0_1700646125131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
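+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("clinico_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("clinico_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +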
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clinico_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/joheras/clinico-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_09_v2_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_09_v2_finetuned_ner_en.md new file mode 100644 index 000000000000..c5b7f07f1ecb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_09_v2_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English color_extraction_2023_02_09_v2_finetuned_ner DistilBertForTokenClassification from plpkpjph +author: John Snow Labs +name: color_extraction_2023_02_09_v2_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `color_extraction_2023_02_09_v2_finetuned_ner` is an English model originally trained by plpkpjph.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_09_v2_finetuned_ner_en_5.2.0_3.0_1700654252567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_09_v2_finetuned_ner_en_5.2.0_3.0_1700654252567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("color_extraction_2023_02_09_v2_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("color_extraction_2023_02_09_v2_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|color_extraction_2023_02_09_v2_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/plpkpjph/color_extraction_2023_02_09_v2-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v1_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v1_finetuned_ner_en.md new file mode 100644 index 000000000000..7b02cfd18ee8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v1_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English color_extraction_2023_02_10_v1_finetuned_ner DistilBertForTokenClassification from plpkpjph +author: John Snow Labs +name: color_extraction_2023_02_10_v1_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `color_extraction_2023_02_10_v1_finetuned_ner` is an English model originally trained by plpkpjph.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v1_finetuned_ner_en_5.2.0_3.0_1700680342766.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v1_finetuned_ner_en_5.2.0_3.0_1700680342766.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("color_extraction_2023_02_10_v1_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("color_extraction_2023_02_10_v1_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|color_extraction_2023_02_10_v1_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/plpkpjph/color_extraction_2023_02_10_v1-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v2_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v2_finetuned_ner_en.md new file mode 100644 index 000000000000..f99eb8092e13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v2_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English color_extraction_2023_02_10_v2_finetuned_ner DistilBertForTokenClassification from plpkpjph +author: John Snow Labs +name: color_extraction_2023_02_10_v2_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `color_extraction_2023_02_10_v2_finetuned_ner` is an English model originally trained by plpkpjph.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v2_finetuned_ner_en_5.2.0_3.0_1700663999283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v2_finetuned_ner_en_5.2.0_3.0_1700663999283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("color_extraction_2023_02_10_v2_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("color_extraction_2023_02_10_v2_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|color_extraction_2023_02_10_v2_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/plpkpjph/color_extraction_2023_02_10_v2-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v3_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v3_finetuned_ner_en.md new file mode 100644 index 000000000000..d1e9ffe1508c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-color_extraction_2023_02_10_v3_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English color_extraction_2023_02_10_v3_finetuned_ner DistilBertForTokenClassification from plpkpjph +author: John Snow Labs +name: color_extraction_2023_02_10_v3_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `color_extraction_2023_02_10_v3_finetuned_ner` is an English model originally trained by plpkpjph.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v3_finetuned_ner_en_5.2.0_3.0_1700665728569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/color_extraction_2023_02_10_v3_finetuned_ner_en_5.2.0_3.0_1700665728569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("color_extraction_2023_02_10_v3_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("color_extraction_2023_02_10_v3_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|color_extraction_2023_02_10_v3_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/plpkpjph/color_extraction_2023_02_10_v3-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-consejo_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-consejo_ner_en.md new file mode 100644 index 000000000000..26ca893618f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-consejo_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English consejo_ner DistilBertForTokenClassification from hucruz +author: John Snow Labs +name: consejo_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `consejo_ner` is an English model originally trained by hucruz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/consejo_ner_en_5.2.0_3.0_1700655180953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/consejo_ner_en_5.2.0_3.0_1700655180953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
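+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("consejo_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("consejo_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +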
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|consejo_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/hucruz/consejo-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-copilot_namanj_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-copilot_namanj_model_en.md new file mode 100644 index 000000000000..a3d2cb4b9baf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-copilot_namanj_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English copilot_namanj_model DistilBertForTokenClassification from bobbyw +author: John Snow Labs +name: copilot_namanj_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `copilot_namanj_model` is an English model originally trained by bobbyw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/copilot_namanj_model_en_5.2.0_3.0_1700673035628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/copilot_namanj_model_en_5.2.0_3.0_1700673035628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
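+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("copilot_namanj_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("copilot_namanj_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +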
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|copilot_namanj_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/bobbyw/copilot_namanj_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-copilot_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-copilot_wnut_model_en.md new file mode 100644 index 000000000000..c8a54b89a371 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-copilot_wnut_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English copilot_wnut_model DistilBertForTokenClassification from bobbyw +author: John Snow Labs +name: copilot_wnut_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `copilot_wnut_model` is an English model originally trained by bobbyw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/copilot_wnut_model_en_5.2.0_3.0_1700667458213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/copilot_wnut_model_en_5.2.0_3.0_1700667458213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
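+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("copilot_wnut_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("copilot_wnut_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +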
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|copilot_wnut_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bobbyw/copilot_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_fewnerd_en.md b/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_fewnerd_en.md new file mode 100644 index 000000000000..0ff378d6fe77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_fewnerd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybonto_distilbert_base_uncased_finetuned_ner_fewnerd DistilBertForTokenClassification from theResearchNinja +author: John Snow Labs +name: cybonto_distilbert_base_uncased_finetuned_ner_fewnerd +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybonto_distilbert_base_uncased_finetuned_ner_fewnerd` is an English model originally trained by theResearchNinja.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_fewnerd_en_5.2.0_3.0_1700619168095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_fewnerd_en_5.2.0_3.0_1700619168095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") +# Tokenizer added to produce the "token" column that the classifier reads +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("cybonto_distilbert_base_uncased_finetuned_ner_fewnerd","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP classes are imported and data is a DataFrame with a "text" column +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") +// Tokenizer added to produce the "token" column that the classifier reads +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("cybonto_distilbert_base_uncased_finetuned_ner_fewnerd", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybonto_distilbert_base_uncased_finetuned_ner_fewnerd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/theResearchNinja/Cybonto-distilbert-base-uncased-finetuned-ner-FewNerd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_v0_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_v0_1_en.md new file mode 100644 index 000000000000..be18cc2d5e45 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-cybonto_distilbert_base_uncased_finetuned_ner_v0_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cybonto_distilbert_base_uncased_finetuned_ner_v0_1 DistilBertForTokenClassification from theResearchNinja +author: John Snow Labs +name: cybonto_distilbert_base_uncased_finetuned_ner_v0_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybonto_distilbert_base_uncased_finetuned_ner_v0_1` is an English model originally trained by theResearchNinja.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_v0_1_en_5.2.0_3.0_1700622582594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybonto_distilbert_base_uncased_finetuned_ner_v0_1_en_5.2.0_3.0_1700622582594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
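+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("cybonto_distilbert_base_uncased_finetuned_ner_v0_1","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("cybonto_distilbert_base_uncased_finetuned_ner_v0_1", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+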
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|cybonto_distilbert_base_uncased_finetuned_ner_v0_1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.4 MB|
+
+## References
+
+https://huggingface.co/theResearchNinja/Cybonto-distilbert-base-uncased-finetuned-ner-v0.1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-datos_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-datos_ner_en.md
new file mode 100644
index 000000000000..90cc7d053f90
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-datos_ner_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English datos_ner DistilBertForTokenClassification from hucruz
+author: John Snow Labs
+name: datos_ner
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `datos_ner` is an English model originally trained by hucruz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datos_ner_en_5.2.0_3.0_1700654252516.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datos_ner_en_5.2.0_3.0_1700654252516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
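+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("datos_ner","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("datos_ner", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+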
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|datos_ner|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|250.3 MB|
+
+## References
+
+https://huggingface.co/hucruz/datos-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-dbert_finetuned_ct_2023_en.md b/docs/_posts/ahmedlone127/2023-11-22-dbert_finetuned_ct_2023_en.md
new file mode 100644
index 000000000000..7ddaf5cde211
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-dbert_finetuned_ct_2023_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English dbert_finetuned_ct_2023 DistilBertForTokenClassification from berkegocmen
+author: John Snow Labs
+name: dbert_finetuned_ct_2023
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert_finetuned_ct_2023` is an English model originally trained by berkegocmen.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbert_finetuned_ct_2023_en_5.2.0_3.0_1700662121482.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbert_finetuned_ct_2023_en_5.2.0_3.0_1700662121482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
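+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("dbert_finetuned_ct_2023","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("dbert_finetuned_ct_2023", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+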
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dbert_finetuned_ct_2023|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/berkegocmen/dbert-finetuned-ct-2023
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_30000_datatpoints_en.md b/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_30000_datatpoints_en.md
new file mode 100644
index 000000000000..1a401ab88eec
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_30000_datatpoints_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English dimensions_extraction_2023_02_10_30000_datatpoints DistilBertForTokenClassification from plpkpjph
+author: John Snow Labs
+name: dimensions_extraction_2023_02_10_30000_datatpoints
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dimensions_extraction_2023_02_10_30000_datatpoints` is an English model originally trained by plpkpjph.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dimensions_extraction_2023_02_10_30000_datatpoints_en_5.2.0_3.0_1700625842227.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dimensions_extraction_2023_02_10_30000_datatpoints_en_5.2.0_3.0_1700625842227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
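+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("dimensions_extraction_2023_02_10_30000_datatpoints","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("dimensions_extraction_2023_02_10_30000_datatpoints", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+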
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dimensions_extraction_2023_02_10_30000_datatpoints|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|250.3 MB|
+
+## References
+
+https://huggingface.co/plpkpjph/dimensions_extraction_2023_02_10_30000_datatpoints
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_v0_en.md b/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_v0_en.md
new file mode 100644
index 000000000000..bf6aabf2362e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-dimensions_extraction_2023_02_10_v0_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English dimensions_extraction_2023_02_10_v0 DistilBertForTokenClassification from plpkpjph
+author: John Snow Labs
+name: dimensions_extraction_2023_02_10_v0
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dimensions_extraction_2023_02_10_v0` is an English model originally trained by plpkpjph.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dimensions_extraction_2023_02_10_v0_en_5.2.0_3.0_1700666821751.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dimensions_extraction_2023_02_10_v0_en_5.2.0_3.0_1700666821751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
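+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("dimensions_extraction_2023_02_10_v0","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("dimensions_extraction_2023_02_10_v0", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+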
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dimensions_extraction_2023_02_10_v0|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|250.3 MB|
+
+## References
+
+https://huggingface.co/plpkpjph/dimensions_extraction_2023_02_10_v0
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-directquote_chunktext_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-22-directquote_chunktext_distilbert_en.md
new file mode 100644
index 000000000000..bd64eb4daf2b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-directquote_chunktext_distilbert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English directquote_chunktext_distilbert DistilBertForTokenClassification from whispAI
+author: John Snow Labs
+name: directquote_chunktext_distilbert
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `directquote_chunktext_distilbert` is an English model originally trained by whispAI.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/directquote_chunktext_distilbert_en_5.2.0_3.0_1700648866789.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/directquote_chunktext_distilbert_en_5.2.0_3.0_1700648866789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
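+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("directquote_chunktext_distilbert","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("directquote_chunktext_distilbert", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+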
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|directquote_chunktext_distilbert|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/whispAI/DirectQuote-ChunkText-DistilBERT
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-disbert_finetune_for_gentriple_malcolmcjj13_en.md b/docs/_posts/ahmedlone127/2023-11-22-disbert_finetune_for_gentriple_malcolmcjj13_en.md
new file mode 100644
index 000000000000..11b08d821bd7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-disbert_finetune_for_gentriple_malcolmcjj13_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English disbert_finetune_for_gentriple_malcolmcjj13 DistilBertForTokenClassification from Malcolmcjj13
+author: John Snow Labs
+name: disbert_finetune_for_gentriple_malcolmcjj13
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disbert_finetune_for_gentriple_malcolmcjj13` is an English model originally trained by Malcolmcjj13.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disbert_finetune_for_gentriple_malcolmcjj13_en_5.2.0_3.0_1700623984733.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disbert_finetune_for_gentriple_malcolmcjj13_en_5.2.0_3.0_1700623984733.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
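+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("disbert_finetune_for_gentriple_malcolmcjj13","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("disbert_finetune_for_gentriple_malcolmcjj13", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+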
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|disbert_finetune_for_gentriple_malcolmcjj13|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Malcolmcjj13/disbert_finetune_for_gentriple
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distil_added_voca_en.md b/docs/_posts/ahmedlone127/2023-11-22-distil_added_voca_en.md
new file mode 100644
index 000000000000..20a59306930b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distil_added_voca_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distil_added_voca DistilBertForTokenClassification from troesy
+author: John Snow Labs
+name: distil_added_voca
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_added_voca` is an English model originally trained by troesy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_added_voca_en_5.2.0_3.0_1700637500823.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_added_voca_en_5.2.0_3.0_1700637500823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
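+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distil_added_voca","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distil_added_voca", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+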
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_added_voca|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/troesy/distil-added-voca
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_latfalse_updatedalligning_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_latfalse_updatedalligning_en.md
new file mode 100644
index 000000000000..a46c30ee4b96
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_latfalse_updatedalligning_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_3epoch_latfalse_updatedalligning DistilBertForTokenClassification from troesy
+author: John Snow Labs
+name: distilbert_base_cased_3epoch_latfalse_updatedalligning
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_3epoch_latfalse_updatedalligning` is an English model originally trained by troesy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_3epoch_latfalse_updatedalligning_en_5.2.0_3.0_1700622578770.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_3epoch_latfalse_updatedalligning_en_5.2.0_3.0_1700622578770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
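+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_3epoch_latfalse_updatedalligning","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_cased_3epoch_latfalse_updatedalligning", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+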
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_3epoch_latfalse_updatedalligning|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/troesy/distilbert-base-cased-3epoch-LaTFalse-updatedAlligning
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_lattrue_updatedalligning_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_lattrue_updatedalligning_en.md
new file mode 100644
index 000000000000..a272f306254f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_3epoch_lattrue_updatedalligning_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_3epoch_lattrue_updatedalligning DistilBertForTokenClassification from troesy
+author: John Snow Labs
+name: distilbert_base_cased_3epoch_lattrue_updatedalligning
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_3epoch_lattrue_updatedalligning` is an English model originally trained by troesy.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_3epoch_lattrue_updatedalligning_en_5.2.0_3.0_1700630268063.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_3epoch_lattrue_updatedalligning_en_5.2.0_3.0_1700630268063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
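+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_3epoch_lattrue_updatedalligning","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_cased_3epoch_lattrue_updatedalligning", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+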
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_3epoch_lattrue_updatedalligning|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/troesy/distilbert-base-cased-3epoch-LaTTrue-updatedAlligning
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_nalllabel_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_nalllabel_en.md
new file mode 100644
index 000000000000..26ad4ee04eb8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_nalllabel_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_finetuned_nalllabel DistilBertForTokenClassification from reyhanemyr
+author: John Snow Labs
+name: distilbert_base_cased_finetuned_nalllabel
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_nalllabel` is an English model originally trained by reyhanemyr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_nalllabel_en_5.2.0_3.0_1700622558438.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_nalllabel_en_5.2.0_3.0_1700622558438.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
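+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text"
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# Produces the "token" column that the classifier expects as input
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_nalllabel","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text"
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// Produces the "token" column that the classifier expects as input
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_cased_finetuned_nalllabel", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+```
+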
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_finetuned_nalllabel|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-nalllabel
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_0220_j_oridata_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_0220_j_oridata_en.md
new file mode 100644
index 000000000000..6dcafa956230
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_0220_j_oridata_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_finetuned_ner_0220_j_oridata DistilBertForTokenClassification from morganchen1007
+author: John Snow Labs
+name: distilbert_base_cased_finetuned_ner_0220_j_oridata
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_0220_j_oridata` is an English model originally trained by morganchen1007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_0220_j_oridata_en_5.2.0_3.0_1700621889386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_0220_j_oridata_en_5.2.0_3.0_1700621889386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
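+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_0220_j_oridata","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_0220_j_oridata", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +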
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_0220_j_oridata| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/morganchen1007/distilbert-base-cased-finetuned-ner_0220_J_ORIDATA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_all_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_all_en.md new file mode 100644 index 000000000000..160483eb90e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_all_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_all DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_all +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_all` is an English model originally trained by RS7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_all_en_5.2.0_3.0_1700662987781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_all_en_5.2.0_3.0_1700662987781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
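+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_all","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_all", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +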
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..6ccf39ce7d19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_finetuned_ner DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_finetuned_ner` is an English model originally trained by reyhanemyr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1700623234852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_finetuned_ner_en_5.2.0_3.0_1700623234852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
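+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +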
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_linh101201_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_linh101201_en.md new file mode 100644 index 000000000000..84c68f72f8cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_linh101201_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_linh101201 DistilBertForTokenClassification from linh101201 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_linh101201 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_linh101201` is an English model originally trained by linh101201.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_linh101201_en_5.2.0_3.0_1700670001693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_linh101201_en_5.2.0_3.0_1700670001693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
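+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_linh101201","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_linh101201", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +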
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_linh101201| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.3 MB| + +## References + +https://huggingface.co/linh101201/distilbert-base-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_t2_g2_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_t2_g2_en.md new file mode 100644 index 000000000000..9beb0857a1fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_t2_g2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_t2_g2 DistilBertForTokenClassification from RS7 +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_t2_g2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_t2_g2` is an English model originally trained by RS7.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t2_g2_en_5.2.0_3.0_1700678303244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_t2_g2_en_5.2.0_3.0_1700678303244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
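+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_t2_g2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_t2_g2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +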
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_t2_g2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/RS7/distilbert-base-cased-finetuned-ner-t2-g2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_test_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_test_en.md new file mode 100644 index 000000000000..eabff10bde07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_ner_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_ner_test DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_ner_test +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_ner_test` is an English model originally trained by reyhanemyr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_test_en_5.2.0_3.0_1700622879089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_ner_test_en_5.2.0_3.0_1700622879089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
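+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_ner_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_ner_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +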
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_ner_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-ner-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper2_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper2_en.md new file mode 100644 index 000000000000..ebb1511da79d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_paper2 DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_paper2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_paper2` is an English model originally trained by reyhanemyr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper2_en_5.2.0_3.0_1700623542326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper2_en_5.2.0_3.0_1700623542326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
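+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_paper2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_paper2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +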
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_paper2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-paper2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper3_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper3_en.md new file mode 100644 index 000000000000..7fe7aa0b6fed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_paper3 DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_paper3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_paper3` is an English model originally trained by reyhanemyr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper3_en_5.2.0_3.0_1700621950752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper3_en_5.2.0_3.0_1700621950752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
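+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_paper3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_paper3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +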
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_paper3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-paper3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper_en.md new file mode 100644 index 000000000000..192cdd588b7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_finetuned_paper_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_paper DistilBertForTokenClassification from reyhanemyr +author: John Snow Labs +name: distilbert_base_cased_finetuned_paper +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_paper` is an English model originally trained by reyhanemyr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper_en_5.2.0_3.0_1700621412336.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_paper_en_5.2.0_3.0_1700621412336.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
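+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_finetuned_paper","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_finetuned_paper", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +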
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_paper| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/reyhanemyr/distilbert-base-cased-finetuned-paper \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_ner_trained_on_synthea_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_ner_trained_on_synthea_en.md new file mode 100644 index 000000000000..8808780f3157 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_cased_ner_trained_on_synthea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_ner_trained_on_synthea DistilBertForTokenClassification from jage +author: John Snow Labs +name: distilbert_base_cased_ner_trained_on_synthea +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_ner_trained_on_synthea` is an English model originally trained by jage.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_ner_trained_on_synthea_en_5.2.0_3.0_1700663869821.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_ner_trained_on_synthea_en_5.2.0_3.0_1700663869821.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
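+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_cased_ner_trained_on_synthea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_cased_ner_trained_on_synthea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +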
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_ner_trained_on_synthea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/jage/distilbert-base-cased-NER-trained-on-synthea \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_gest_pred_seqeval_partialmatch_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_gest_pred_seqeval_partialmatch_en.md new file mode 100644 index 000000000000..0d892918f3fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_gest_pred_seqeval_partialmatch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_gest_pred_seqeval_partialmatch DistilBertForTokenClassification from Jsevisal +author: John Snow Labs +name: distilbert_base_gest_pred_seqeval_partialmatch +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_gest_pred_seqeval_partialmatch` is an English model originally trained by Jsevisal.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700620255846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_gest_pred_seqeval_partialmatch_en_5.2.0_3.0_1700620255846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
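+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage must produce it +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_gest_pred_seqeval_partialmatch","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_gest_pred_seqeval_partialmatch", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +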
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_gest_pred_seqeval_partialmatch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/Jsevisal/distilbert-base-gest-pred-seqeval-partialmatch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small_xx.md new file mode 100644 index 000000000000..5bbe8a020770 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small DistilBertForTokenClassification from ram-and-jony +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small` is a Multilingual model originally trained by ram-and-jony.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small_xx_5.2.0_3.0_1700643013598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small_xx_5.2.0_3.0_1700643013598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/ram-and-jony/distilbert-base-multilingual-cased-finetuned-ner__dataset-ner-heb-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels_xx.md new file mode 100644 index 000000000000..5aefe5b243b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels DistilBertForTokenClassification from ram-and-jony +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels` is a Multilingual model originally trained by ram-and-jony.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels_xx_5.2.0_3.0_1700639138546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels_xx_5.2.0_3.0_1700639138546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_standard_labels| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/ram-and-jony/distilbert-base-multilingual-cased-finetuned-ner__dataset-ner-heb-standard-labels \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_xx.md new file mode 100644 index 000000000000..7f3a05770b29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb DistilBertForTokenClassification from ram-and-jony +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb` is a Multilingual model originally trained by ram-and-jony.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_xx_5.2.0_3.0_1700655281075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb_xx_5.2.0_3.0_1700655281075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_ner__dataset_ner_heb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/ram-and-jony/distilbert-base-multilingual-cased-finetuned-ner__dataset-ner-heb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner_xx.md new file mode 100644 index 000000000000..32ba3b59369f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_finetuned_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_ner DistilBertForTokenClassification from ALWN +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_ner +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_ner` is a Multilingual model originally trained by ALWN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner_xx_5.2.0_3.0_1700633536796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_ner_xx_5.2.0_3.0_1700633536796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_finetuned_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_finetuned_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/ALWN/distilbert-base-multilingual-cased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_naamapadam_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_naamapadam_xx.md new file mode 100644 index 000000000000..72dd08862c1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_naamapadam_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_naamapadam DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: distilbert_base_multilingual_cased_naamapadam +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_naamapadam` is a Multilingual model originally trained by AnanthZeke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_naamapadam_xx_5.2.0_3.0_1700623251338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_naamapadam_xx_5.2.0_3.0_1700623251338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_naamapadam","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_naamapadam", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_naamapadam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/AnanthZeke/distilbert-base-multilingual-cased-naamapadam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_ner_demo_amarsanaa1525_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_ner_demo_amarsanaa1525_xx.md new file mode 100644 index 000000000000..9fdf9c37a169 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_ner_demo_amarsanaa1525_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_ner_demo_amarsanaa1525 DistilBertForTokenClassification from Amarsanaa1525 +author: John Snow Labs +name: distilbert_base_multilingual_cased_ner_demo_amarsanaa1525 +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_ner_demo_amarsanaa1525` is a Multilingual model originally trained by Amarsanaa1525.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_ner_demo_amarsanaa1525_xx_5.2.0_3.0_1700680345229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_ner_demo_amarsanaa1525_xx_5.2.0_3.0_1700680345229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_ner_demo_amarsanaa1525","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_ner_demo_amarsanaa1525", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_ner_demo_amarsanaa1525| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Amarsanaa1525/distilbert-base-multilingual-cased-ner-demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wikineural_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wikineural_ner_xx.md new file mode 100644 index 000000000000..e9759ef0f9f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wikineural_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_wikineural_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_multilingual_cased_wikineural_ner +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_wikineural_ner` is a Multilingual model originally trained by dmargutierrez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_wikineural_ner_xx_5.2.0_3.0_1700623046168.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_wikineural_ner_xx_5.2.0_3.0_1700623046168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_wikineural_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_wikineural_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_wikineural_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-multilingual-cased-wikineural-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wnut_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wnut_ner_xx.md new file mode 100644 index 000000000000..e4d76ad6cd4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_multilingual_cased_wnut_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_wnut_ner DistilBertForTokenClassification from dmargutierrez +author: John Snow Labs +name: distilbert_base_multilingual_cased_wnut_ner +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_wnut_ner` is a Multilingual model originally trained by dmargutierrez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_wnut_ner_xx_5.2.0_3.0_1700621171835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_wnut_ner_xx_5.2.0_3.0_1700621171835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_multilingual_cased_wnut_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_multilingual_cased_wnut_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_wnut_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/dmargutierrez/distilbert-base-multilingual-cased-WNUT-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_pabloguinea_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_pabloguinea_en.md new file mode 100644 index 000000000000..5a9b65bc9276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_pabloguinea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_pabloguinea DistilBertForTokenClassification from PabloGuinea +author: John Snow Labs +name: distilbert_base_pabloguinea +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_pabloguinea` is an English model originally trained by PabloGuinea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_pabloguinea_en_5.2.0_3.0_1700621004690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_pabloguinea_en_5.2.0_3.0_1700621004690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_pabloguinea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_pabloguinea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_pabloguinea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_abdullahf129_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_abdullahf129_en.md new file mode 100644 index 000000000000..bd5b40873a06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_abdullahf129_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_abdullahf129 DistilBertForTokenClassification from abdullahf129 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_abdullahf129 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_abdullahf129` is an English model originally trained by abdullahf129.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_abdullahf129_en_5.2.0_3.0_1700650929936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_abdullahf129_en_5.2.0_3.0_1700650929936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_abdullahf129","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_abdullahf129", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_abdullahf129| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/abdullahf129/distilbert-base-uncased-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_chunk_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_chunk_en.md new file mode 100644 index 000000000000..ab46fb4e4a1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_chunk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_chunk DistilBertForTokenClassification from jh1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_chunk +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_chunk` is an English model originally trained by jh1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_chunk_en_5.2.0_3.0_1700638908052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_chunk_en_5.2.0_3.0_1700638908052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_chunk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_chunk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_chunk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jh1/distilbert-base-uncased-finetuned-chunk \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_combinedmodel1_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_combinedmodel1_ner_en.md new file mode 100644 index 000000000000..c80af7e8d3c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_combinedmodel1_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_combinedmodel1_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_combinedmodel1_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_combinedmodel1_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_combinedmodel1_ner_en_5.2.0_3.0_1700619813049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_combinedmodel1_ner_en_5.2.0_3.0_1700619813049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_combinedmodel1_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_combinedmodel1_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_combinedmodel1_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-combinedmodel1-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops1_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops1_ner_en.md new file mode 100644 index 000000000000..b30546886ebb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops1_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_devops1_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_devops1_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_devops1_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_devops1_ner_en_5.2.0_3.0_1700621571655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_devops1_ner_en_5.2.0_3.0_1700621571655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_devops1_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_devops1_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_devops1_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-devops1-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops_ner_en.md new file mode 100644 index 000000000000..b8839326d5d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_devops_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_devops_ner DistilBertForTokenClassification from akshaychaudhary +author: John Snow Labs +name: distilbert_base_uncased_finetuned_devops_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_devops_ner` is an English model originally trained by akshaychaudhary.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_devops_ner_en_5.2.0_3.0_1700619990027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_devops_ner_en_5.2.0_3.0_1700619990027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_devops_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_devops_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_devops_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akshaychaudhary/distilbert-base-uncased-finetuned-devops-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model1_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model1_ner_en.md new file mode 100644 index 000000000000..c78c5b85b859 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model1_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_model1_ner DistilBertForTokenClassification from prashant852 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_model1_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_model1_ner` is an English model originally trained by prashant852.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model1_ner_en_5.2.0_3.0_1700671189195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model1_ner_en_5.2.0_3.0_1700671189195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_model1_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_model1_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_model1_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prashant852/distilbert-base-uncased-finetuned_model1-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model2_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model2_ner_en.md new file mode 100644 index 000000000000..9299d720bff2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model2_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_model2_ner DistilBertForTokenClassification from prashant852 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_model2_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_model2_ner` is an English model originally trained by prashant852.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model2_ner_en_5.2.0_3.0_1700669931482.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model2_ner_en_5.2.0_3.0_1700669931482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_model2_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_model2_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_model2_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prashant852/distilbert-base-uncased-finetuned_model2-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model3_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model3_ner_en.md new file mode 100644 index 000000000000..17d4f6ff4e96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_model3_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_model3_ner DistilBertForTokenClassification from prashant852 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_model3_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_model3_ner` is an English model originally trained by prashant852.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model3_ner_en_5.2.0_3.0_1700672107160.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_model3_ner_en_5.2.0_3.0_1700672107160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_model3_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_model3_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_model3_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/prashant852/distilbert-base-uncased-finetuned_model3-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner1_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner1_en.md new file mode 100644 index 000000000000..b4441637bab4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner1 DistilBertForTokenClassification from AhmedTaha012 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner1` is an English model originally trained by AhmedTaha012.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner1_en_5.2.0_3.0_1700661179447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner1_en_5.2.0_3.0_1700661179447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AhmedTaha012/distilbert-base-uncased-finetuned-ner1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_0220_j_oridata_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_0220_j_oridata_en.md new file mode 100644 index 000000000000..92d16c33cdb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_0220_j_oridata_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_0220_j_oridata DistilBertForTokenClassification from morganchen1007 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_0220_j_oridata +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_0220_j_oridata` is an English model originally trained by morganchen1007.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_0220_j_oridata_en_5.2.0_3.0_1700622281764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_0220_j_oridata_en_5.2.0_3.0_1700622281764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_0220_j_oridata","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_0220_j_oridata", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_0220_j_oridata| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/morganchen1007/distilbert-base-uncased-finetuned-ner_0220_J_ORIDATA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_3_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_3_en.md new file mode 100644 index 000000000000..01c1e4075dee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_3 DistilBertForTokenClassification from aarroonn22 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_3` is an English model originally trained by aarroonn22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_3_en_5.2.0_3.0_1700646950402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_3_en_5.2.0_3.0_1700646950402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/aarroonn22/distilbert-base-uncased-finetuned-ner-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaraki_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaraki_en.md new file mode 100644 index 000000000000..3a4047fc97bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaraki_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_aaraki DistilBertForTokenClassification from aaraki +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_aaraki +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_aaraki` is an English model originally trained by aaraki.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aaraki_en_5.2.0_3.0_1700623817150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aaraki_en_5.2.0_3.0_1700623817150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_aaraki","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_aaraki", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_aaraki| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/aaraki/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaya_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaya_en.md new file mode 100644 index 000000000000..69d15277a1bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aaya_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_aaya DistilBertForTokenClassification from aaya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_aaya +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_aaya` is an English model originally trained by aaya. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aaya_en_5.2.0_3.0_1700620527076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aaya_en_5.2.0_3.0_1700620527076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_aaya","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_aaya", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_aaya| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/aaya/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aburak621_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aburak621_en.md new file mode 100644 index 000000000000..c05ed513a806 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aburak621_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_aburak621 DistilBertForTokenClassification from aburak621 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_aburak621 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_aburak621` is an English model originally trained by aburak621. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aburak621_en_5.2.0_3.0_1700619448683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aburak621_en_5.2.0_3.0_1700619448683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_aburak621","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_aburak621", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_aburak621| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/aburak621/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_acshcse_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_acshcse_en.md new file mode 100644 index 000000000000..400c47b71122 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_acshcse_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_acshcse DistilBertForTokenClassification from ACSHCSE +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_acshcse +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_acshcse` is an English model originally trained by ACSHCSE. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_acshcse_en_5.2.0_3.0_1700623822632.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_acshcse_en_5.2.0_3.0_1700623822632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_acshcse","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_acshcse", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_acshcse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ACSHCSE/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_alemanio_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_alemanio_en.md new file mode 100644 index 000000000000..f92db1b4ee3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_alemanio_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_alemanio DistilBertForTokenClassification from Alemanio +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_alemanio +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_alemanio` is an English model originally trained by Alemanio. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alemanio_en_5.2.0_3.0_1700626555832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_alemanio_en_5.2.0_3.0_1700626555832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_alemanio","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_alemanio", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_alemanio| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Alemanio/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_almentalist_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_almentalist_en.md new file mode 100644 index 000000000000..c08ff87b7800 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_almentalist_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_almentalist DistilBertForTokenClassification from almentalist +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_almentalist +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_almentalist` is an English model originally trained by almentalist. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_almentalist_en_5.2.0_3.0_1700620672708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_almentalist_en_5.2.0_3.0_1700620672708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_almentalist","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_almentalist", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_almentalist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/almentalist/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aloncohen_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aloncohen_en.md new file mode 100644 index 000000000000..04d76e4ef58d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_aloncohen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_aloncohen DistilBertForTokenClassification from AlonCohen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_aloncohen +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_aloncohen` is an English model originally trained by AlonCohen. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aloncohen_en_5.2.0_3.0_1700623052142.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_aloncohen_en_5.2.0_3.0_1700623052142.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_aloncohen","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_aloncohen", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_aloncohen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AlonCohen/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ambreen2_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ambreen2_en.md new file mode 100644 index 000000000000..ae2b4399079e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ambreen2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ambreen2 DistilBertForTokenClassification from Ambreen2 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ambreen2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ambreen2` is an English model originally trained by Ambreen2. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ambreen2_en_5.2.0_3.0_1700630359054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ambreen2_en_5.2.0_3.0_1700630359054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ambreen2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ambreen2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ambreen2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Ambreen2/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_amitkayal_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_amitkayal_en.md new file mode 100644 index 000000000000..4f64ad9be1a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_amitkayal_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_amitkayal DistilBertForTokenClassification from amitkayal +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_amitkayal +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_amitkayal` is an English model originally trained by amitkayal. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_amitkayal_en_5.2.0_3.0_1700619989791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_amitkayal_en_5.2.0_3.0_1700619989791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_amitkayal","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_amitkayal", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_amitkayal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/amitkayal/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_andrewlitv_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_andrewlitv_en.md new file mode 100644 index 000000000000..e205f814468a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_andrewlitv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_andrewlitv DistilBertForTokenClassification from andrewlitv +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_andrewlitv +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_andrewlitv` is an English model originally trained by andrewlitv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_andrewlitv_en_5.2.0_3.0_1700650866023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_andrewlitv_en_5.2.0_3.0_1700650866023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_andrewlitv","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_andrewlitv", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_andrewlitv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/andrewlitv/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_arifdknt_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_arifdknt_en.md new file mode 100644 index 000000000000..f811936f05d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_arifdknt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_arifdknt DistilBertForTokenClassification from ArifDKNT +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_arifdknt +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_arifdknt` is an English model originally trained by ArifDKNT. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_arifdknt_en_5.2.0_3.0_1700655180738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_arifdknt_en_5.2.0_3.0_1700655180738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_arifdknt","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_arifdknt", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_arifdknt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ArifDKNT/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ashishrag_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ashishrag_en.md new file mode 100644 index 000000000000..dceb1ce80c64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ashishrag_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ashishrag DistilBertForTokenClassification from ashishrag +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ashishrag +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ashishrag` is an English model originally trained by ashishrag. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ashishrag_en_5.2.0_3.0_1700629514473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ashishrag_en_5.2.0_3.0_1700629514473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ashishrag","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ashishrag", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ashishrag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ashishrag/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_balaka92_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_balaka92_en.md new file mode 100644 index 000000000000..98accde67bf3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_balaka92_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_balaka92 DistilBertForTokenClassification from balaka92 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_balaka92 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_balaka92` is an English model originally trained by balaka92. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_balaka92_en_5.2.0_3.0_1700631433756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_balaka92_en_5.2.0_3.0_1700631433756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_balaka92","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_balaka92", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_balaka92| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/balaka92/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar_en.md new file mode 100644 index 000000000000..01235c6c667c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar DistilBertForTokenClassification from basaanithanaveenkumar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar` is an English model originally trained by basaanithanaveenkumar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar_en_5.2.0_3.0_1700622435372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar_en_5.2.0_3.0_1700622435372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_basaanithanaveenkumar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/basaanithanaveenkumar/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bchaipats_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bchaipats_en.md new file mode 100644 index 000000000000..cdaf1b6d5610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bchaipats_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_bchaipats DistilBertForTokenClassification from bchaipats +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_bchaipats +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_bchaipats` is an English model originally trained by bchaipats. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bchaipats_en_5.2.0_3.0_1700621293802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bchaipats_en_5.2.0_3.0_1700621293802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_bchaipats","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_bchaipats", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_bchaipats| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bchaipats/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bjfxs_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bjfxs_en.md new file mode 100644 index 000000000000..de372c970489 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_bjfxs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_bjfxs DistilBertForTokenClassification from bjfxs +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_bjfxs +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_bjfxs` is an English model originally trained by bjfxs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bjfxs_en_5.2.0_3.0_1700645596598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_bjfxs_en_5.2.0_3.0_1700645596598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_bjfxs","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_bjfxs", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_bjfxs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/bjfxs/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_calin_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_calin_en.md new file mode 100644 index 000000000000..d4659a5dc5b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_calin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_calin DistilBertForTokenClassification from Calin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_calin +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_calin` is an English model originally trained by Calin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_calin_en_5.2.0_3.0_1700682553313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_calin_en_5.2.0_3.0_1700682553313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_calin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_calin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_calin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Calin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chancar_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chancar_en.md new file mode 100644 index 000000000000..751a976f49e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chancar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_chancar DistilBertForTokenClassification from chancar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_chancar +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_chancar` is an English model originally trained by chancar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chancar_en_5.2.0_3.0_1700622575408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chancar_en_5.2.0_3.0_1700622575408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_chancar","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_chancar", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_chancar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chancar/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chenyixin1986_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chenyixin1986_en.md new file mode 100644 index 000000000000..97a6fed4a077 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_chenyixin1986_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_chenyixin1986 DistilBertForTokenClassification from chenyixin1986 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_chenyixin1986 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_chenyixin1986` is an English model originally trained by chenyixin1986. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chenyixin1986_en_5.2.0_3.0_1700638542511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_chenyixin1986_en_5.2.0_3.0_1700638542511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_chenyixin1986","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_chenyixin1986", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_chenyixin1986| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chenyixin1986/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_davidliu1110_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_davidliu1110_en.md new file mode 100644 index 000000000000..c4e41804db22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_davidliu1110_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_davidliu1110 DistilBertForTokenClassification from davidliu1110 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_davidliu1110 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_davidliu1110` is an English model originally trained by davidliu1110. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_davidliu1110_en_5.2.0_3.0_1700628728296.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_davidliu1110_en_5.2.0_3.0_1700628728296.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_davidliu1110","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_davidliu1110", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_davidliu1110| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/davidliu1110/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_dingzhaohan_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_dingzhaohan_en.md new file mode 100644 index 000000000000..65a66eda45ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_dingzhaohan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_dingzhaohan DistilBertForTokenClassification from dingzhaohan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_dingzhaohan +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_dingzhaohan` is an English model originally trained by dingzhaohan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_dingzhaohan_en_5.2.0_3.0_1700621735276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_dingzhaohan_en_5.2.0_3.0_1700621735276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_dingzhaohan","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_dingzhaohan", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_dingzhaohan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/dingzhaohan/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_duyduong9htv_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_duyduong9htv_en.md new file mode 100644 index 000000000000..0bafca9850d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_duyduong9htv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_duyduong9htv DistilBertForTokenClassification from duyduong9htv +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_duyduong9htv +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_duyduong9htv` is an English model originally trained by duyduong9htv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_duyduong9htv_en_5.2.0_3.0_1700644707980.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_duyduong9htv_en_5.2.0_3.0_1700644707980.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_duyduong9htv","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_duyduong9htv", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_duyduong9htv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/duyduong9htv/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eca1g19_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eca1g19_en.md new file mode 100644 index 000000000000..b223d5347aff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eca1g19_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_eca1g19 DistilBertForTokenClassification from eca1g19 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_eca1g19 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_eca1g19` is an English model originally trained by eca1g19.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eca1g19_en_5.2.0_3.0_1700648710833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eca1g19_en_5.2.0_3.0_1700648710833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_eca1g19","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_eca1g19", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_eca1g19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/eca1g19/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ecartierlipn_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ecartierlipn_en.md new file mode 100644 index 000000000000..476a8c1ecd80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ecartierlipn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ecartierlipn DistilBertForTokenClassification from ecartierlipn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ecartierlipn +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ecartierlipn` is an English model originally trained by ecartierlipn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ecartierlipn_en_5.2.0_3.0_1700623397726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ecartierlipn_en_5.2.0_3.0_1700623397726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ecartierlipn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ecartierlipn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ecartierlipn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ecartierlipn/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eceozaydn_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eceozaydn_en.md new file mode 100644 index 000000000000..d7b48b00e4fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eceozaydn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_eceozaydn DistilBertForTokenClassification from eceozaydn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_eceozaydn +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_eceozaydn` is an English model originally trained by eceozaydn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eceozaydn_en_5.2.0_3.0_1700622435449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eceozaydn_en_5.2.0_3.0_1700622435449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_eceozaydn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_eceozaydn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_eceozaydn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/eceozaydn/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edmz_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edmz_en.md new file mode 100644 index 000000000000..2934c60f56f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edmz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_edmz DistilBertForTokenClassification from edmz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_edmz +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_edmz` is an English model originally trained by edmz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edmz_en_5.2.0_3.0_1700622257530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edmz_en_5.2.0_3.0_1700622257530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_edmz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_edmz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_edmz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/edmz/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edric111_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edric111_en.md new file mode 100644 index 000000000000..dcf601e300dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_edric111_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_edric111 DistilBertForTokenClassification from Edric111 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_edric111 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_edric111` is an English model originally trained by Edric111.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edric111_en_5.2.0_3.0_1700622287650.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_edric111_en_5.2.0_3.0_1700622287650.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_edric111","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_edric111", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_edric111| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Edric111/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eman222_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eman222_en.md new file mode 100644 index 000000000000..def96e75cc1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eman222_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_eman222 DistilBertForTokenClassification from Eman222 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_eman222 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_eman222` is an English model originally trained by Eman222.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eman222_en_5.2.0_3.0_1700623009432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eman222_en_5.2.0_3.0_1700623009432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_eman222","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_eman222", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_eman222| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Eman222/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eulaliefy_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eulaliefy_en.md new file mode 100644 index 000000000000..038c082e779f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_eulaliefy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_eulaliefy DistilBertForTokenClassification from Eulaliefy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_eulaliefy +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_eulaliefy` is an English model originally trained by Eulaliefy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eulaliefy_en_5.2.0_3.0_1700620409135.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_eulaliefy_en_5.2.0_3.0_1700620409135.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_eulaliefy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_eulaliefy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_eulaliefy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Eulaliefy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_fadhilarkn_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_fadhilarkn_en.md new file mode 100644 index 000000000000..5db5d14717e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_fadhilarkn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_fadhilarkn DistilBertForTokenClassification from fadhilarkn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_fadhilarkn +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_fadhilarkn` is an English model originally trained by fadhilarkn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fadhilarkn_en_5.2.0_3.0_1700619986427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_fadhilarkn_en_5.2.0_3.0_1700619986427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_fadhilarkn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_fadhilarkn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_fadhilarkn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/fadhilarkn/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_final_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_final_en.md new file mode 100644 index 000000000000..b9427606ba81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_final_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_final DistilBertForTokenClassification from Lilya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_final +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_final` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_final_en_5.2.0_3.0_1700621307604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_final_en_5.2.0_3.0_1700621307604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_final","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_final", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-finetuned-ner-final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gabrielzang_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gabrielzang_en.md new file mode 100644 index 000000000000..59d857a4a618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gabrielzang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_gabrielzang DistilBertForTokenClassification from gabrielZang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_gabrielzang +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_gabrielzang` is an English model originally trained by gabrielZang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gabrielzang_en_5.2.0_3.0_1700628008313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gabrielzang_en_5.2.0_3.0_1700628008313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_gabrielzang","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_gabrielzang", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_gabrielzang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/gabrielZang/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayatri_bt_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayatri_bt_en.md new file mode 100644 index 000000000000..af4f231b1610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayatri_bt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_gayatri_bt DistilBertForTokenClassification from Gayatri-BT +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_gayatri_bt +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_gayatri_bt` is an English model originally trained by Gayatri-BT.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gayatri_bt_en_5.2.0_3.0_1700619960785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gayatri_bt_en_5.2.0_3.0_1700619960785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_gayatri_bt","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_gayatri_bt", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_gayatri_bt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gayatri-BT/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayu_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayu_en.md new file mode 100644 index 000000000000..132fbcb962ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_gayu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_gayu DistilBertForTokenClassification from Gayu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_gayu +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_gayu` is an English model originally trained by Gayu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gayu_en_5.2.0_3.0_1700619316191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_gayu_en_5.2.0_3.0_1700619316191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_gayu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_gayu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_gayu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Gayu/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_grantitdhcka_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_grantitdhcka_en.md new file mode 100644 index 000000000000..133d7702a956 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_grantitdhcka_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_grantitdhcka DistilBertForTokenClassification from grantitdhcka +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_grantitdhcka +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_grantitdhcka` is an English model originally trained by grantitdhcka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_grantitdhcka_en_5.2.0_3.0_1700639628212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_grantitdhcka_en_5.2.0_3.0_1700639628212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_grantitdhcka","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_grantitdhcka", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_grantitdhcka| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/grantitdhcka/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_guhuawuli_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_guhuawuli_en.md new file mode 100644 index 000000000000..8b5946a3fc23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_guhuawuli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_guhuawuli DistilBertForTokenClassification from guhuawuli +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_guhuawuli +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_guhuawuli` is an English model originally trained by guhuawuli.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_guhuawuli_en_5.2.0_3.0_1700627305739.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_guhuawuli_en_5.2.0_3.0_1700627305739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_guhuawuli","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_guhuawuli", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_guhuawuli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/guhuawuli/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_haneen77_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_haneen77_en.md new file mode 100644 index 000000000000..cae1223ef73a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_haneen77_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_haneen77 DistilBertForTokenClassification from Haneen77 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_haneen77 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_haneen77` is an English model originally trained by Haneen77.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_haneen77_en_5.2.0_3.0_1700637596812.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_haneen77_en_5.2.0_3.0_1700637596812.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_haneen77","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_haneen77", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_haneen77| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Haneen77/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_hossay_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_hossay_en.md new file mode 100644 index 000000000000..728245f672fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_hossay_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_hossay DistilBertForTokenClassification from hossay +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_hossay +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_hossay` is an English model originally trained by hossay.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hossay_en_5.2.0_3.0_1700623511021.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_hossay_en_5.2.0_3.0_1700623511021.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_hossay","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_hossay", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_hossay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hossay/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_huynguyen208_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_huynguyen208_en.md new file mode 100644 index 000000000000..ee3ed92acadc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_huynguyen208_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_huynguyen208 DistilBertForTokenClassification from huynguyen208 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_huynguyen208 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_huynguyen208` is an English model originally trained by huynguyen208.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_huynguyen208_en_5.2.0_3.0_1700621151699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_huynguyen208_en_5.2.0_3.0_1700621151699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_huynguyen208","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_huynguyen208", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_huynguyen208| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/huynguyen208/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_isamaks_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_isamaks_en.md new file mode 100644 index 000000000000..c6bc8aa3a0bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_isamaks_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_isamaks DistilBertForTokenClassification from IsaMaks +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_isamaks +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_isamaks` is an English model originally trained by IsaMaks.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_isamaks_en_5.2.0_3.0_1700619850487.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_isamaks_en_5.2.0_3.0_1700619850487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_isamaks","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_isamaks", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_isamaks| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/IsaMaks/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jaiti_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jaiti_en.md new file mode 100644 index 000000000000..d4c554cf71d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jaiti_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_jaiti DistilBertForTokenClassification from Jaiti +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_jaiti +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_jaiti` is an English model originally trained by Jaiti.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jaiti_en_5.2.0_3.0_1700619934057.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jaiti_en_5.2.0_3.0_1700619934057.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_jaiti","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_jaiti", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_jaiti| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|324.4 MB| + +## References + +https://huggingface.co/Jaiti/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jasminebatra_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jasminebatra_en.md new file mode 100644 index 000000000000..fe0f8f5169c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jasminebatra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_jasminebatra DistilBertForTokenClassification from Jasminebatra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_jasminebatra +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_jasminebatra` is an English model originally trained by Jasminebatra.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jasminebatra_en_5.2.0_3.0_1700662987787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jasminebatra_en_5.2.0_3.0_1700662987787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_jasminebatra","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_jasminebatra", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_jasminebatra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Jasminebatra/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jb666_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jb666_en.md new file mode 100644 index 000000000000..eeaa5e437cfe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jb666_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_jb666 DistilBertForTokenClassification from jb666 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_jb666 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_jb666` is an English model originally trained by jb666.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jb666_en_5.2.0_3.0_1700622908419.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jb666_en_5.2.0_3.0_1700622908419.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_jb666","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_jb666", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_jb666| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jb666/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jgraves_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jgraves_en.md new file mode 100644 index 000000000000..e4773b5bf907 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_jgraves_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_jgraves DistilBertForTokenClassification from JGraves +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_jgraves +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_jgraves` is an English model originally trained by JGraves.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jgraves_en_5.2.0_3.0_1700645046620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_jgraves_en_5.2.0_3.0_1700645046620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_jgraves","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_jgraves", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_jgraves| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/JGraves/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_juanmillan85_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_juanmillan85_en.md new file mode 100644 index 000000000000..39a332285dad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_juanmillan85_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_juanmillan85 DistilBertForTokenClassification from juanmillan85 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_juanmillan85 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_juanmillan85` is an English model originally trained by juanmillan85.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_juanmillan85_en_5.2.0_3.0_1700619588797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_juanmillan85_en_5.2.0_3.0_1700619588797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_juanmillan85","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_juanmillan85", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_juanmillan85| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/juanmillan85/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kdat_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kdat_en.md new file mode 100644 index 000000000000..f031b864b087 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kdat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_kdat DistilBertForTokenClassification from KDat +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_kdat +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_kdat` is an English model originally trained by KDat.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kdat_en_5.2.0_3.0_1700624893086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kdat_en_5.2.0_3.0_1700624893086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_kdat","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_kdat", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_kdat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/KDat/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kinanmartin_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kinanmartin_en.md new file mode 100644 index 000000000000..fdc2cb156e3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kinanmartin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_kinanmartin DistilBertForTokenClassification from kinanmartin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_kinanmartin +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_kinanmartin` is an English model originally trained by kinanmartin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kinanmartin_en_5.2.0_3.0_1700624694532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kinanmartin_en_5.2.0_3.0_1700624694532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_kinanmartin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_kinanmartin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_kinanmartin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kinanmartin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kisma_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kisma_en.md new file mode 100644 index 000000000000..b94f14005d3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_kisma_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_kisma DistilBertForTokenClassification from kisma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_kisma +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_kisma` is an English model originally trained by kisma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kisma_en_5.2.0_3.0_1700656664072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_kisma_en_5.2.0_3.0_1700656664072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_kisma","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_kisma", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_kisma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/kisma/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_krag57_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_krag57_en.md new file mode 100644 index 000000000000..10785b0ff4b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_krag57_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_krag57 DistilBertForTokenClassification from krag57 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_krag57 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_krag57` is an English model originally trained by krag57.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_krag57_en_5.2.0_3.0_1700649069740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_krag57_en_5.2.0_3.0_1700649069740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_krag57","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_krag57", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_krag57| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/krag57/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_linh101201_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_linh101201_en.md new file mode 100644 index 000000000000..34694faea92e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_linh101201_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_linh101201 DistilBertForTokenClassification from linh101201 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_linh101201 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_linh101201` is an English model originally trained by linh101201.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_linh101201_en_5.2.0_3.0_1700669929943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_linh101201_en_5.2.0_3.0_1700669929943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_linh101201","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_linh101201", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_linh101201| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.7 MB| + +## References + +https://huggingface.co/linh101201/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_longxiang_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_longxiang_en.md new file mode 100644 index 000000000000..16a067978f4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_longxiang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_longxiang DistilBertForTokenClassification from Longxiang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_longxiang +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_longxiang` is an English model originally trained by Longxiang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_longxiang_en_5.2.0_3.0_1700647431751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_longxiang_en_5.2.0_3.0_1700647431751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_longxiang","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_longxiang", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_longxiang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Longxiang/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_luciferlizard_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_luciferlizard_en.md new file mode 100644 index 000000000000..1e8d773519a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_luciferlizard_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_luciferlizard DistilBertForTokenClassification from luciferlizard +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_luciferlizard +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_luciferlizard` is an English model originally trained by luciferlizard.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_luciferlizard_en_5.2.0_3.0_1700622069940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_luciferlizard_en_5.2.0_3.0_1700622069940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_luciferlizard","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_luciferlizard", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_luciferlizard| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/luciferlizard/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mandur_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mandur_en.md new file mode 100644 index 000000000000..f740a012090c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mandur_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mandur DistilBertForTokenClassification from Mandur +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mandur +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mandur` is an English model originally trained by Mandur.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mandur_en_5.2.0_3.0_1700621449118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mandur_en_5.2.0_3.0_1700621449118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mandur","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mandur", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mandur| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Mandur/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mankness_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mankness_en.md new file mode 100644 index 000000000000..0d5f123e54f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mankness_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mankness DistilBertForTokenClassification from mankness +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mankness +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mankness` is an English model originally trained by mankness. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mankness_en_5.2.0_3.0_1700636014108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mankness_en_5.2.0_3.0_1700636014108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mankness","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mankness", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mankness| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mankness/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mariahabib_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mariahabib_en.md new file mode 100644 index 000000000000..17fb769aeb46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mariahabib_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mariahabib DistilBertForTokenClassification from MariaHabib +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mariahabib +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mariahabib` is an English model originally trained by MariaHabib. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mariahabib_en_5.2.0_3.0_1700641179734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mariahabib_en_5.2.0_3.0_1700641179734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mariahabib","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mariahabib", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mariahabib| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/MariaHabib/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mdcox_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mdcox_en.md new file mode 100644 index 000000000000..b3755d853d97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mdcox_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mdcox DistilBertForTokenClassification from mdcox +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mdcox +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mdcox` is an English model originally trained by mdcox. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mdcox_en_5.2.0_3.0_1700620131276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mdcox_en_5.2.0_3.0_1700620131276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mdcox","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mdcox", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mdcox| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mdcox/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_michelebern_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_michelebern_en.md new file mode 100644 index 000000000000..7881df9e6f6a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_michelebern_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_michelebern DistilBertForTokenClassification from michelebern +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_michelebern +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_michelebern` is an English model originally trained by michelebern. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_michelebern_en_5.2.0_3.0_1700636590720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_michelebern_en_5.2.0_3.0_1700636590720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_michelebern","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_michelebern", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_michelebern| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/michelebern/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mingyangli_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mingyangli_en.md new file mode 100644 index 000000000000..92fbb3cd38d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mingyangli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mingyangli DistilBertForTokenClassification from MingyangLi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mingyangli +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mingyangli` is an English model originally trained by MingyangLi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mingyangli_en_5.2.0_3.0_1700625483464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mingyangli_en_5.2.0_3.0_1700625483464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mingyangli","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mingyangli", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mingyangli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/MingyangLi/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mke10_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mke10_en.md new file mode 100644 index 000000000000..b45c7416d14c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mke10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mke10 DistilBertForTokenClassification from mke10 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mke10 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mke10` is an English model originally trained by mke10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mke10_en_5.2.0_3.0_1700656143750.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mke10_en_5.2.0_3.0_1700656143750.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mke10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mke10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mke10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mke10/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mo7amed3ly_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mo7amed3ly_en.md new file mode 100644 index 000000000000..34155112d865 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mo7amed3ly_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mo7amed3ly DistilBertForTokenClassification from mo7amed3ly +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mo7amed3ly +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mo7amed3ly` is an English model originally trained by mo7amed3ly. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mo7amed3ly_en_5.2.0_3.0_1700619597202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mo7amed3ly_en_5.2.0_3.0_1700619597202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mo7amed3ly","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mo7amed3ly", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mo7amed3ly| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mo7amed3ly/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_morganchen1007_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_morganchen1007_en.md new file mode 100644 index 000000000000..ab9c447591d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_morganchen1007_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_morganchen1007 DistilBertForTokenClassification from morganchen1007 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_morganchen1007 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_morganchen1007` is an English model originally trained by morganchen1007. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_morganchen1007_en_5.2.0_3.0_1700621735325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_morganchen1007_en_5.2.0_3.0_1700621735325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_morganchen1007","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_morganchen1007", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_morganchen1007| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/morganchen1007/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mssjoyy_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mssjoyy_en.md new file mode 100644 index 000000000000..5a53bad150e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_mssjoyy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_mssjoyy DistilBertForTokenClassification from mssjoyy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_mssjoyy +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_mssjoyy` is an English model originally trained by mssjoyy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mssjoyy_en_5.2.0_3.0_1700621459450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_mssjoyy_en_5.2.0_3.0_1700621459450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_mssjoyy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_mssjoyy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_mssjoyy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/mssjoyy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_murdockthedude_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_murdockthedude_en.md new file mode 100644 index 000000000000..30427310aa73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_murdockthedude_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_murdockthedude DistilBertForTokenClassification from murdockthedude +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_murdockthedude +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_murdockthedude` is an English model originally trained by murdockthedude. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_murdockthedude_en_5.2.0_3.0_1700623662803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_murdockthedude_en_5.2.0_3.0_1700623662803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_murdockthedude","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_murdockthedude", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_murdockthedude| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/murdockthedude/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nerea06_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nerea06_en.md new file mode 100644 index 000000000000..00a420617405 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nerea06_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nerea06 DistilBertForTokenClassification from Nerea06 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nerea06 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nerea06` is an English model originally trained by Nerea06. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nerea06_en_5.2.0_3.0_1700636721690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nerea06_en_5.2.0_3.0_1700636721690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nerea06","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nerea06", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nerea06| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Nerea06/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nestoralvaro_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nestoralvaro_en.md new file mode 100644 index 000000000000..0bb16785714b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nestoralvaro_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nestoralvaro DistilBertForTokenClassification from nestoralvaro +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nestoralvaro +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nestoralvaro` is an English model originally trained by nestoralvaro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nestoralvaro_en_5.2.0_3.0_1700620244219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nestoralvaro_en_5.2.0_3.0_1700620244219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nestoralvaro","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nestoralvaro", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nestoralvaro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/nestoralvaro/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nicgh3_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nicgh3_en.md new file mode 100644 index 000000000000..34a15105764d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nicgh3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nicgh3 DistilBertForTokenClassification from nicgh3 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nicgh3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nicgh3` is an English model originally trained by nicgh3. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nicgh3_en_5.2.0_3.0_1700643786499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nicgh3_en_5.2.0_3.0_1700643786499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nicgh3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nicgh3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nicgh3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/nicgh3/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nlp_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nlp_en.md new file mode 100644 index 000000000000..840e52df122d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nlp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nlp DistilBertForTokenClassification from thomasfm +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nlp +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nlp` is an English model originally trained by thomasfm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nlp_en_5.2.0_3.0_1700622424262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nlp_en_5.2.0_3.0_1700622424262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nlp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nlp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/thomasfm/distilbert-base-uncased-finetuned-ner-nlp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_noura_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_noura_en.md new file mode 100644 index 000000000000..038728690079 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_noura_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_noura DistilBertForTokenClassification from Noura +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_noura +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_noura` is an English model originally trained by Noura. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_noura_en_5.2.0_3.0_1700622729790.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_noura_en_5.2.0_3.0_1700622729790.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_noura","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_noura", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_noura| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Noura/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_novik_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_novik_en.md new file mode 100644 index 000000000000..a703c51f6ec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_novik_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_novik DistilBertForTokenClassification from Novik +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_novik +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_novik` is an English model originally trained by Novik. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_novik_en_5.2.0_3.0_1700661180485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_novik_en_5.2.0_3.0_1700661180485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_novik","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_novik", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_novik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Novik/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nsandra_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nsandra_en.md new file mode 100644 index 000000000000..043e887e2671 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_nsandra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_nsandra DistilBertForTokenClassification from NSandra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_nsandra +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_nsandra` is an English model originally trained by NSandra. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nsandra_en_5.2.0_3.0_1700622113789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_nsandra_en_5.2.0_3.0_1700622113789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_nsandra","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_nsandra", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_nsandra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/NSandra/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_orthogonal_orca_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_orthogonal_orca_en.md new file mode 100644 index 000000000000..ba72baed2937 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_orthogonal_orca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_orthogonal_orca DistilBertForTokenClassification from orthogonal-orca +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_orthogonal_orca +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_orthogonal_orca` is an English model originally trained by orthogonal-orca. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_orthogonal_orca_en_5.2.0_3.0_1700647815174.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_orthogonal_orca_en_5.2.0_3.0_1700647815174.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_orthogonal_orca","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_orthogonal_orca", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_orthogonal_orca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/orthogonal-orca/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_oscarhoekstra_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_oscarhoekstra_en.md new file mode 100644 index 000000000000..7943a791f14c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_oscarhoekstra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_oscarhoekstra DistilBertForTokenClassification from OscarHoekstra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_oscarhoekstra +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_oscarhoekstra` is an English model originally trained by OscarHoekstra. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_oscarhoekstra_en_5.2.0_3.0_1700619861617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_oscarhoekstra_en_5.2.0_3.0_1700619861617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_oscarhoekstra","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_oscarhoekstra", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_oscarhoekstra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/OscarHoekstra/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pha_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pha_en.md new file mode 100644 index 000000000000..e6237bcedcab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pha DistilBertForTokenClassification from Calin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pha +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_pha` is an English model originally trained by Calin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pha_en_5.2.0_3.0_1700667789626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pha_en_5.2.0_3.0_1700667789626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pha","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_pha", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Calin/distilbert-base-uncased-finetuned-ner-pha \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pitronalldak_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pitronalldak_en.md new file mode 100644 index 000000000000..343b3bd00306 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pitronalldak_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pitronalldak DistilBertForTokenClassification from pitronalldak +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pitronalldak +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_pitronalldak` is an English model originally trained by pitronalldak. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pitronalldak_en_5.2.0_3.0_1700619596582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pitronalldak_en_5.2.0_3.0_1700619596582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pitronalldak","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_pitronalldak", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pitronalldak| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pitronalldak/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pivolan_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pivolan_en.md new file mode 100644 index 000000000000..34235673480c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_pivolan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_pivolan DistilBertForTokenClassification from pivolan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_pivolan +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_pivolan` is an English model originally trained by pivolan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pivolan_en_5.2.0_3.0_1700659352853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_pivolan_en_5.2.0_3.0_1700659352853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_pivolan","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_pivolan", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_pivolan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pivolan/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_qingm_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_qingm_en.md new file mode 100644 index 000000000000..329b9afeec06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_qingm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_qingm DistilBertForTokenClassification from qingm +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_qingm +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_qingm` is an English model originally trained by qingm. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_qingm_en_5.2.0_3.0_1700629575116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_qingm_en_5.2.0_3.0_1700629575116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_qingm","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_qingm", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_qingm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/qingm/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ravenk_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ravenk_en.md new file mode 100644 index 000000000000..ccf79f7030f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ravenk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ravenk DistilBertForTokenClassification from RavenK +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ravenk +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ravenk` is an English model originally trained by RavenK. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ravenk_en_5.2.0_3.0_1700620133773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ravenk_en_5.2.0_3.0_1700620133773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ravenk","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ravenk", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ravenk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/RavenK/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rebeccakoganlee_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rebeccakoganlee_en.md new file mode 100644 index 000000000000..1a45ba344266 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rebeccakoganlee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_rebeccakoganlee DistilBertForTokenClassification from rebeccakoganlee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_rebeccakoganlee +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_rebeccakoganlee` is an English model originally trained by rebeccakoganlee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rebeccakoganlee_en_5.2.0_3.0_1700630413538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rebeccakoganlee_en_5.2.0_3.0_1700630413538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_rebeccakoganlee","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_rebeccakoganlee", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_rebeccakoganlee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rebeccakoganlee/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_reugene_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_reugene_en.md new file mode 100644 index 000000000000..809207184062 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_reugene_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_reugene DistilBertForTokenClassification from reugene +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_reugene +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_reugene` is an English model originally trained by reugene. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_reugene_en_5.2.0_3.0_1700658076877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_reugene_en_5.2.0_3.0_1700658076877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_reugene","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_reugene", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_reugene| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/reugene/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rockyend_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rockyend_en.md new file mode 100644 index 000000000000..a87d90d9842e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rockyend_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_rockyend DistilBertForTokenClassification from rockyend +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_rockyend +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_rockyend` is an English model originally trained by rockyend. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rockyend_en_5.2.0_3.0_1700623991747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rockyend_en_5.2.0_3.0_1700623991747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_rockyend","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_rockyend", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_rockyend| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rockyend/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rohanv123_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rohanv123_en.md new file mode 100644 index 000000000000..05e4be7d3a10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_rohanv123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_rohanv123 DistilBertForTokenClassification from rohanv123 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_rohanv123 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_rohanv123` is an English model originally trained by rohanv123. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rohanv123_en_5.2.0_3.0_1700678530625.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_rohanv123_en_5.2.0_3.0_1700678530625.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_rohanv123","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_rohanv123", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_rohanv123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rohanv123/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_roschmid_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_roschmid_en.md new file mode 100644 index 000000000000..2efa7be70019 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_roschmid_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_roschmid DistilBertForTokenClassification from roschmid +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_roschmid +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_roschmid` is an English model originally trained by roschmid. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_roschmid_en_5.2.0_3.0_1700623215628.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_roschmid_en_5.2.0_3.0_1700623215628.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_roschmid","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_roschmid", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_roschmid| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/roschmid/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_saideekshith_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_saideekshith_en.md new file mode 100644 index 000000000000..7e12bbca2695 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_saideekshith_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_saideekshith DistilBertForTokenClassification from saideekshith +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_saideekshith +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_saideekshith` is an English model originally trained by saideekshith. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saideekshith_en_5.2.0_3.0_1700619645740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_saideekshith_en_5.2.0_3.0_1700619645740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_saideekshith","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_saideekshith", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_saideekshith| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/saideekshith/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sajib2023_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sajib2023_en.md new file mode 100644 index 000000000000..27b206589180 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sajib2023_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_sajib2023 DistilBertForTokenClassification from Sajib2023 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_sajib2023 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_sajib2023` is an English model originally trained by Sajib2023. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sajib2023_en_5.2.0_3.0_1700623374317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sajib2023_en_5.2.0_3.0_1700623374317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_sajib2023","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_sajib2023", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_sajib2023| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Sajib2023/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samake_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samake_en.md new file mode 100644 index 000000000000..818ee6e709e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samake_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_samake DistilBertForTokenClassification from samake +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_samake +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_samake` is an English model originally trained by samake. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_samake_en_5.2.0_3.0_1700624518898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_samake_en_5.2.0_3.0_1700624518898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
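+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_samake","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_samake", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +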
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_samake| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/samake/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samih1974_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samih1974_en.md new file mode 100644 index 000000000000..3e76604feb9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_samih1974_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_samih1974 DistilBertForTokenClassification from samih1974 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_samih1974 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_samih1974` is an English model originally trained by samih1974. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_samih1974_en_5.2.0_3.0_1700675746344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_samih1974_en_5.2.0_3.0_1700675746344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
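+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_samih1974","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_samih1974", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +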
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_samih1974| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/samih1974/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_savlron_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_savlron_en.md new file mode 100644 index 000000000000..d1e2fa0e07fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_savlron_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_savlron DistilBertForTokenClassification from Savlron +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_savlron +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_savlron` is an English model originally trained by Savlron. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_savlron_en_5.2.0_3.0_1700620974421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_savlron_en_5.2.0_3.0_1700620974421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
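+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_savlron","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_savlron", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +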
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_savlron| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Savlron/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_seawolf_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_seawolf_en.md new file mode 100644 index 000000000000..d2679c6b4074 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_seawolf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_seawolf DistilBertForTokenClassification from Seawolf +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_seawolf +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_seawolf` is an English model originally trained by Seawolf. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_seawolf_en_5.2.0_3.0_1700620567512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_seawolf_en_5.2.0_3.0_1700620567512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
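+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_seawolf","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_seawolf", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +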
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_seawolf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Seawolf/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_shulim_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_shulim_en.md new file mode 100644 index 000000000000..6d1e18b2d6c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_shulim_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_shulim DistilBertForTokenClassification from shulim +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_shulim +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_shulim` is an English model originally trained by shulim. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_shulim_en_5.2.0_3.0_1700622722428.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_shulim_en_5.2.0_3.0_1700622722428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
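+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_shulim","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_shulim", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +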
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_shulim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/shulim/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_slhoefel_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_slhoefel_en.md new file mode 100644 index 000000000000..007ef5ded2ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_slhoefel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_slhoefel DistilBertForTokenClassification from slhoefel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_slhoefel +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_slhoefel` is an English model originally trained by slhoefel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_slhoefel_en_5.2.0_3.0_1700658076909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_slhoefel_en_5.2.0_3.0_1700658076909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
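+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_slhoefel","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_slhoefel", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +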
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_slhoefel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/slhoefel/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_snailpoo_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_snailpoo_en.md new file mode 100644 index 000000000000..09e386910762 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_snailpoo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_snailpoo DistilBertForTokenClassification from SnailPoo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_snailpoo +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_snailpoo` is an English model originally trained by SnailPoo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_snailpoo_en_5.2.0_3.0_1700619720166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_snailpoo_en_5.2.0_3.0_1700619720166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
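+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_snailpoo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_snailpoo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +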
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_snailpoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SnailPoo/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_srosy_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_srosy_en.md new file mode 100644 index 000000000000..9a85220d6042 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_srosy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_srosy DistilBertForTokenClassification from srosy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_srosy +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_srosy` is an English model originally trained by srosy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_srosy_en_5.2.0_3.0_1700619276319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_srosy_en_5.2.0_3.0_1700619276319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
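+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_srosy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_srosy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +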
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_srosy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/srosy/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sultanithree_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sultanithree_en.md new file mode 100644 index 000000000000..c1417aef2fc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sultanithree_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_sultanithree DistilBertForTokenClassification from sultanithree +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_sultanithree +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_sultanithree` is an English model originally trained by sultanithree. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sultanithree_en_5.2.0_3.0_1700622430404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sultanithree_en_5.2.0_3.0_1700622430404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
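+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_sultanithree","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_sultanithree", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +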
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_sultanithree| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sultanithree/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sumanc_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sumanc_en.md new file mode 100644 index 000000000000..0f69b4e158cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_sumanc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_sumanc DistilBertForTokenClassification from sumanc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_sumanc +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_sumanc` is an English model originally trained by sumanc. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sumanc_en_5.2.0_3.0_1700656177204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_sumanc_en_5.2.0_3.0_1700656177204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
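+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_sumanc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_sumanc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +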
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_sumanc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sumanc/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_suwani_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_suwani_en.md new file mode 100644 index 000000000000..fb4098ff32ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_suwani_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_suwani DistilBertForTokenClassification from suwani +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_suwani +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_suwani` is an English model originally trained by suwani. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_suwani_en_5.2.0_3.0_1700621730557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_suwani_en_5.2.0_3.0_1700621730557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
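+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with a string column named "text" +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so a Tokenizer stage is needed +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_suwani","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with a string column named "text" +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so a Tokenizer stage is needed +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_suwani", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +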
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_suwani| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/suwani/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_taehyunzzz_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_taehyunzzz_en.md new file mode 100644 index 000000000000..88d6bc6e00fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_taehyunzzz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_taehyunzzz DistilBertForTokenClassification from taehyunzzz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_taehyunzzz +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_taehyunzzz` is an English model originally trained by taehyunzzz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_taehyunzzz_en_5.2.0_3.0_1700619732824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_taehyunzzz_en_5.2.0_3.0_1700619732824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_taehyunzzz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_taehyunzzz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_taehyunzzz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/taehyunzzz/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thanhnguyenvn_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thanhnguyenvn_en.md new file mode 100644 index 000000000000..ee8384366733 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thanhnguyenvn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_thanhnguyenvn DistilBertForTokenClassification from thanhnguyenvn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_thanhnguyenvn +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_thanhnguyenvn` is an English model originally trained by thanhnguyenvn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thanhnguyenvn_en_5.2.0_3.0_1700621587554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thanhnguyenvn_en_5.2.0_3.0_1700621587554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_thanhnguyenvn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_thanhnguyenvn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_thanhnguyenvn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/thanhnguyenvn/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thearif_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thearif_en.md new file mode 100644 index 000000000000..8a4184aae66e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thearif_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_thearif DistilBertForTokenClassification from theArif +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_thearif +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_thearif` is an English model originally trained by theArif.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thearif_en_5.2.0_3.0_1700624117716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thearif_en_5.2.0_3.0_1700624117716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_thearif","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_thearif", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_thearif| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/theArif/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thivin_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thivin_en.md new file mode 100644 index 000000000000..112ab844c965 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thivin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_thivin DistilBertForTokenClassification from Thivin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_thivin +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_thivin` is an English model originally trained by Thivin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thivin_en_5.2.0_3.0_1700623070055.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thivin_en_5.2.0_3.0_1700623070055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_thivin","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_thivin", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_thivin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Thivin/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thomaszz_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thomaszz_en.md new file mode 100644 index 000000000000..0d64c760a2a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thomaszz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_thomaszz DistilBertForTokenClassification from thomaszz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_thomaszz +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_thomaszz` is an English model originally trained by thomaszz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thomaszz_en_5.2.0_3.0_1700619439214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thomaszz_en_5.2.0_3.0_1700619439214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_thomaszz","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_thomaszz", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_thomaszz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/thomaszz/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thun11_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thun11_en.md new file mode 100644 index 000000000000..bdd9083f8f02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_thun11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_thun11 DistilBertForTokenClassification from Thun11 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_thun11 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_thun11` is an English model originally trained by Thun11.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thun11_en_5.2.0_3.0_1700670043012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_thun11_en_5.2.0_3.0_1700670043012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_thun11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_thun11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_thun11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Thun11/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tiennvcs_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tiennvcs_en.md new file mode 100644 index 000000000000..8d94bb1ba08a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tiennvcs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_tiennvcs DistilBertForTokenClassification from tiennvcs +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_tiennvcs +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_tiennvcs` is an English model originally trained by tiennvcs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tiennvcs_en_5.2.0_3.0_1700622182753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tiennvcs_en_5.2.0_3.0_1700622182753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_tiennvcs","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_tiennvcs", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_tiennvcs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/tiennvcs/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tingting_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tingting_en.md new file mode 100644 index 000000000000..796bec13e413 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tingting_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_tingting DistilBertForTokenClassification from tingting +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_tingting +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_tingting` is an English model originally trained by tingting.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tingting_en_5.2.0_3.0_1700623991769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tingting_en_5.2.0_3.0_1700623991769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_tingting","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_tingting", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_tingting| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/tingting/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tjklein_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tjklein_en.md new file mode 100644 index 000000000000..374688a54020 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_tjklein_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_tjklein DistilBertForTokenClassification from TJKlein +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_tjklein +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_tjklein` is an English model originally trained by TJKlein.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tjklein_en_5.2.0_3.0_1700623814558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_tjklein_en_5.2.0_3.0_1700623814558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_tjklein","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_tjklein", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_tjklein| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/TJKlein/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_turhana_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_turhana_en.md new file mode 100644 index 000000000000..618f3dd8be07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_turhana_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_turhana DistilBertForTokenClassification from turhana +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_turhana +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_turhana` is an English model originally trained by turhana.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_turhana_en_5.2.0_3.0_1700662077828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_turhana_en_5.2.0_3.0_1700662077828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_turhana","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_turhana", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_turhana| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/turhana/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ueb1_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ueb1_en.md new file mode 100644 index 000000000000..c70e2ff1d1e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_ueb1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_ueb1 DistilBertForTokenClassification from ueb1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_ueb1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_ueb1` is an English model originally trained by ueb1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ueb1_en_5.2.0_3.0_1700620556557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_ueb1_en_5.2.0_3.0_1700620556557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_ueb1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_ueb1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_ueb1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ueb1/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vaillant_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vaillant_en.md new file mode 100644 index 000000000000..42ea93bbf250 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vaillant_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vaillant DistilBertForTokenClassification from vaillant +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vaillant +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vaillant` is an English model originally trained by vaillant.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vaillant_en_5.2.0_3.0_1700631433756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vaillant_en_5.2.0_3.0_1700631433756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vaillant","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vaillant", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vaillant| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vaillant/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_valeriulacatusu_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_valeriulacatusu_en.md new file mode 100644 index 000000000000..afaa602cc925 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_valeriulacatusu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_valeriulacatusu DistilBertForTokenClassification from valeriulacatusu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_valeriulacatusu +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_valeriulacatusu` is an English model originally trained by valeriulacatusu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_valeriulacatusu_en_5.2.0_3.0_1700631589355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_valeriulacatusu_en_5.2.0_3.0_1700631589355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_valeriulacatusu","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_valeriulacatusu", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_valeriulacatusu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/valeriulacatusu/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vijays2_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vijays2_en.md new file mode 100644 index 000000000000..0e3adae235a1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vijays2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vijays2 DistilBertForTokenClassification from vijays2 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vijays2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vijays2` is an English model originally trained by vijays2.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vijays2_en_5.2.0_3.0_1700655181238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vijays2_en_5.2.0_3.0_1700655181238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vijays2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vijays2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vijays2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vijays2/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vikasaeta_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vikasaeta_en.md new file mode 100644 index 000000000000..d986b5278fd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vikasaeta_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vikasaeta DistilBertForTokenClassification from vikasaeta +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vikasaeta +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vikasaeta` is an English model originally trained by vikasaeta.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vikasaeta_en_5.2.0_3.0_1700619442231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vikasaeta_en_5.2.0_3.0_1700619442231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vikasaeta","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vikasaeta", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vikasaeta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/vikasaeta/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vincenzodeleo_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vincenzodeleo_en.md new file mode 100644 index 000000000000..006a8e828f9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vincenzodeleo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vincenzodeleo DistilBertForTokenClassification from vincenzodeleo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vincenzodeleo +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vincenzodeleo` is an English model originally trained by vincenzodeleo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vincenzodeleo_en_5.2.0_3.0_1700629488286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vincenzodeleo_en_5.2.0_3.0_1700629488286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vincenzodeleo","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vincenzodeleo", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vincenzodeleo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vincenzodeleo/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vutt_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vutt_en.md new file mode 100644 index 000000000000..9b43c1cedaa1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_vutt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_vutt DistilBertForTokenClassification from vutt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_vutt +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_vutt` is an English model originally trained by vutt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vutt_en_5.2.0_3.0_1700637864521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_vutt_en_5.2.0_3.0_1700637864521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_vutt","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_vutt", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_vutt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vutt/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_xsf_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_xsf_en.md new file mode 100644 index 000000000000..5426d82a4672 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_xsf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_xsf DistilBertForTokenClassification from XSF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_xsf +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_xsf` is an English model originally trained by XSF.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_xsf_en_5.2.0_3.0_1700653836078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_xsf_en_5.2.0_3.0_1700653836078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_xsf","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_xsf", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_xsf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/XSF/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilingwawa_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilingwawa_en.md new file mode 100644 index 000000000000..b75dd3e14aec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilingwawa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_yilingwawa DistilBertForTokenClassification from yilingwawa +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_yilingwawa +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_yilingwawa` is an English model originally trained by yilingwawa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yilingwawa_en_5.2.0_3.0_1700640413562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yilingwawa_en_5.2.0_3.0_1700640413562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_yilingwawa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_yilingwawa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_yilingwawa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yilingwawa/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilmazasl_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilmazasl_en.md new file mode 100644 index 000000000000..3526d0f8369e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yilmazasl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_yilmazasl DistilBertForTokenClassification from yilmazasl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_yilmazasl +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_yilmazasl` is an English model originally trained by yilmazasl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yilmazasl_en_5.2.0_3.0_1700659871073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yilmazasl_en_5.2.0_3.0_1700659871073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_yilmazasl","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_yilmazasl", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_ner_yilmazasl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yilmazasl/distilbert-base-uncased-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yougottheswag_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yougottheswag_en.md new file mode 100644 index 000000000000..0f42232c8ae4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_yougottheswag_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_ner_yougottheswag DistilBertForTokenClassification from yougottheswag +author: John Snow Labs +name: distilbert_base_uncased_finetuned_ner_yougottheswag +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_yougottheswag` is an English model originally trained by yougottheswag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yougottheswag_en_5.2.0_3.0_1700629643704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_yougottheswag_en_5.2.0_3.0_1700629643704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_yougottheswag","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_finetuned_ner_yougottheswag", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_ner_yougottheswag|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/yougottheswag/distilbert-base-uncased-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zakria_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zakria_en.md
new file mode 100644
index 000000000000..0633fd2e30aa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zakria_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_ner_zakria DistilBertForTokenClassification from zakria
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_ner_zakria
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_zakria` is an English model originally trained by zakria.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zakria_en_5.2.0_3.0_1700652125256.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zakria_en_5.2.0_3.0_1700652125256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
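+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_zakria","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_ner_zakria", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+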
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_ner_zakria|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/zakria/distilbert-base-uncased-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zald_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zald_en.md
new file mode 100644
index 000000000000..914771546d11
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_ner_zald_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_ner_zald DistilBertForTokenClassification from zald
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_ner_zald
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_ner_zald` is an English model originally trained by zald.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zald_en_5.2.0_3.0_1700621329430.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_ner_zald_en_5.2.0_3.0_1700621329430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
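+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_ner_zald","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_ner_zald", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+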
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_ner_zald|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/zald/distilbert-base-uncased-finetuned-ner
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_eval_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_eval_en.md
new file mode 100644
index 000000000000..f2c35acb2105
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_eval_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_recruitment_eval DistilBertForTokenClassification from reyhanemyr
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_recruitment_eval
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_recruitment_eval` is an English model originally trained by reyhanemyr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_recruitment_eval_en_5.2.0_3.0_1700656485657.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_recruitment_eval_en_5.2.0_3.0_1700656485657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
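+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_recruitment_eval","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_recruitment_eval", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+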
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_recruitment_eval|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/reyhanemyr/distilbert-base-uncased-finetuned-recruitment-eval
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_exp_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_exp_en.md
new file mode 100644
index 000000000000..4a78e716ede0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_recruitment_exp_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_recruitment_exp DistilBertForTokenClassification from reyhanemyr
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_recruitment_exp
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_recruitment_exp` is an English model originally trained by reyhanemyr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_recruitment_exp_en_5.2.0_3.0_1700620083065.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_recruitment_exp_en_5.2.0_3.0_1700620083065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
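+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_recruitment_exp","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_recruitment_exp", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+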
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_recruitment_exp|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/reyhanemyr/distilbert-base-uncased-finetuned-recruitment-exp
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_sayula_popoluca_arygx_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_sayula_popoluca_arygx_en.md
new file mode 100644
index 000000000000..6edfd3e71cbf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_sayula_popoluca_arygx_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_sayula_popoluca_arygx DistilBertForTokenClassification from arygx
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_sayula_popoluca_arygx
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sayula_popoluca_arygx` is an English model originally trained by arygx.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_arygx_en_5.2.0_3.0_1700621131518.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sayula_popoluca_arygx_en_5.2.0_3.0_1700621131518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
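+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_arygx","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_sayula_popoluca_arygx", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+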
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_sayula_popoluca_arygx|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.4 MB|
+
+## References
+
+https://huggingface.co/arygx/distilbert-base-uncased-finetuned-pos
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_scientific_exp_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_scientific_exp_en.md
new file mode 100644
index 000000000000..9c6d5678a991
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_finetuned_scientific_exp_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_scientific_exp DistilBertForTokenClassification from reyhanemyr
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_scientific_exp
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_scientific_exp` is an English model originally trained by reyhanemyr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_scientific_exp_en_5.2.0_3.0_1700630933301.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_scientific_exp_en_5.2.0_3.0_1700630933301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
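+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_finetuned_scientific_exp","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_finetuned_scientific_exp", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+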
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_scientific_exp|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/reyhanemyr/distilbert-base-uncased-finetuned-scientific-exp
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_marfinbirt_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_marfinbirt_en.md
new file mode 100644
index 000000000000..7b2f6c5299fa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_marfinbirt_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_marfinbirt DistilBertForTokenClassification from marfinbirt
+author: John Snow Labs
+name: distilbert_base_uncased_marfinbirt
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_marfinbirt` is an English model originally trained by marfinbirt.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_marfinbirt_en_5.2.0_3.0_1700662053303.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_marfinbirt_en_5.2.0_3.0_1700662053303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
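+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_marfinbirt","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_marfinbirt", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+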
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_marfinbirt|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/marfinbirt/distilbert-base-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery_en.md
new file mode 100644
index 000000000000..324eda35f09f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery DistilBertForTokenClassification from jonas-luehrs
+author: John Snow Labs
+name: distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery` is an English model originally trained by jonas-luehrs.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery_en_5.2.0_3.0_1700642091089.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery_en_5.2.0_3.0_1700642091089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
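+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+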
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_mlm_scirepeval_fos_chemistry_tokencls_battery|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/jonas-luehrs/distilbert-base-uncased-MLM-scirepeval_fos_chemistry-tokenCLS-BATTERY
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23_en.md
new file mode 100644
index 000000000000..f1817230697e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23 DistilBertForTokenClassification from Lilya
+author: John Snow Labs
+name: distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23` is an English model originally trained by Lilya.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23_en_5.2.0_3.0_1700622698988.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23_en_5.2.0_3.0_1700622698988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
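+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+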
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_17_02_23|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Lilya/distilbert-base-uncased-ner-invoiceSenderRecipient_all_inv_17_02_23
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02_en.md
new file mode 100644
index 000000000000..a05686be6699
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02 DistilBertForTokenClassification from Lilya
+author: John Snow Labs
+name: distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02` is an English model originally trained by Lilya.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02_en_5.2.0_3.0_1700620978485.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02_en_5.2.0_3.0_1700620978485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
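+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a Spark DataFrame with a string column named "text".
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+# The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+tokenizer = Tokenizer() \
+    .setInputCols(["documents"]) \
+    .setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with a string column named "text".
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+// The classifier reads a "token" column, so the pipeline needs a tokenizer stage.
+val tokenizer = new Tokenizer()
+    .setInputCols(Array("documents"))
+    .setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+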
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_20_02|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Lilya/distilbert-base-uncased-ner-invoiceSenderRecipient_all_inv_20_02
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12_en.md
new file mode 100644
index 000000000000..94217eb08374
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12 DistilBertForTokenClassification from Lilya
+author: John Snow Labs
+name: distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12_en_5.2.0_3.0_1700620114735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12_en_5.2.0_3.0_1700620114735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_invoicesenderrecipient_all_inv_26_12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-ner-invoiceSenderRecipient-all-inv-26-12 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02_en.md new file mode 100644 index 000000000000..7d25ea9641b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02 DistilBertForTokenClassification from Lilya +author: John Snow Labs +name: distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02` is an English model originally trained by Lilya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02_en_5.2.0_3.0_1700623240674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02_en_5.2.0_3.0_1700623240674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ner_invoicesenderrecipient_clean_inv_28_02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Lilya/distilbert-base-uncased-ner-invoiceSenderRecipient_clean_inv_28_02 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_test2_jethuestad_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_test2_jethuestad_en.md new file mode 100644 index 000000000000..12f1f5597195 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_test2_jethuestad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_test2_jethuestad DistilBertForTokenClassification from Jethuestad +author: John Snow Labs +name: distilbert_base_uncased_test2_jethuestad +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_test2_jethuestad` is an English model originally trained by Jethuestad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_test2_jethuestad_en_5.2.0_3.0_1700621263643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_test2_jethuestad_en_5.2.0_3.0_1700621263643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_test2_jethuestad","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_test2_jethuestad", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_test2_jethuestad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Jethuestad/distilbert-base-uncased-test2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ui_chope_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ui_chope_en.md new file mode 100644 index 000000000000..5e2a703e9cee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_ui_chope_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_ui_chope DistilBertForTokenClassification from ui-chope +author: John Snow Labs +name: distilbert_base_uncased_ui_chope +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_ui_chope` is an English model originally trained by ui-chope. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ui_chope_en_5.2.0_3.0_1700620865874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_ui_chope_en_5.2.0_3.0_1700620865874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_ui_chope","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_ui_chope", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_ui_chope| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ui-chope/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_un_huongle_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_un_huongle_en.md new file mode 100644 index 000000000000..98d1270f0ba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_base_uncased_un_huongle_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_un_huongle DistilBertForTokenClassification from Thi-Thu-Huong +author: John Snow Labs +name: distilbert_base_uncased_un_huongle +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_un_huongle` is an English model originally trained by Thi-Thu-Huong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_un_huongle_en_5.2.0_3.0_1700625504667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_un_huongle_en_5.2.0_3.0_1700625504667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_base_uncased_un_huongle","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_base_uncased_un_huongle", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_un_huongle| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Thi-Thu-Huong/distilbert-base-uncased-UN-huongle \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_binary_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_binary_en.md new file mode 100644 index 000000000000..aa4ccf2c7325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_binary_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_binary DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_binary +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_binary` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_binary_en_5.2.0_3.0_1700623987240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_binary_en_5.2.0_3.0_1700623987240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_binary","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_binary", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_binary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilBERT-binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_bio_pv_superset_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_bio_pv_superset_en.md new file mode 100644 index 000000000000..480203260dfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_bio_pv_superset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_bio_pv_superset DistilBertForTokenClassification from commanderstrife +author: John Snow Labs +name: distilbert_bio_pv_superset +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_bio_pv_superset` is an English model originally trained by commanderstrife. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_bio_pv_superset_en_5.2.0_3.0_1700637596883.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_bio_pv_superset_en_5.2.0_3.0_1700637596883.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_bio_pv_superset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_bio_pv_superset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_bio_pv_superset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/commanderstrife/distilBERT_bio_pv_superset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_bpmn_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_bpmn_en.md new file mode 100644 index 000000000000..ae7390114d81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_bpmn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_bpmn DistilBertForTokenClassification from jtlicardo +author: John Snow Labs +name: distilbert_bpmn +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_bpmn` is an English model originally trained by jtlicardo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_bpmn_en_5.2.0_3.0_1700619288022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_bpmn_en_5.2.0_3.0_1700619288022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_bpmn","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_bpmn", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_bpmn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/jtlicardo/distilbert-bpmn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_expense_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_expense_ner_en.md new file mode 100644 index 000000000000..c4c0f4c17a01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_expense_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_expense_ner DistilBertForTokenClassification from renjithks +author: John Snow Labs +name: distilbert_expense_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_expense_ner` is an English model originally trained by renjithks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_expense_ner_en_5.2.0_3.0_1700620417026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_expense_ner_en_5.2.0_3.0_1700620417026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_expense_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_expense_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_expense_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|280.9 MB| + +## References + +https://huggingface.co/renjithks/distilbert-expense-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_absa_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_absa_en.md new file mode 100644 index 000000000000..9979cf947486 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_absa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_absa DistilBertForTokenClassification from Joshwabail +author: John Snow Labs +name: distilbert_finetuned_absa +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_absa` is an English model originally trained by Joshwabail. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_absa_en_5.2.0_3.0_1700631194303.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_absa_en_5.2.0_3.0_1700631194303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_absa","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_absa", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_absa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Joshwabail/distilbert-finetuned-absa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_21_classes_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_21_classes_en.md new file mode 100644 index 000000000000..0eaab2412fa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_21_classes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_gesture_prediction_21_classes DistilBertForTokenClassification from qfrodicio +author: John Snow Labs +name: distilbert_finetuned_gesture_prediction_21_classes +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_gesture_prediction_21_classes` is an English model originally trained by qfrodicio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_21_classes_en_5.2.0_3.0_1700620723299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_21_classes_en_5.2.0_3.0_1700620723299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_gesture_prediction_21_classes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_gesture_prediction_21_classes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_gesture_prediction_21_classes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.9 MB| + +## References + +https://huggingface.co/qfrodicio/distilbert-finetuned-gesture-prediction-21-classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_5_classes_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_5_classes_en.md new file mode 100644 index 000000000000..fe4ea840dec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_gesture_prediction_5_classes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_gesture_prediction_5_classes DistilBertForTokenClassification from qfrodicio +author: John Snow Labs +name: distilbert_finetuned_gesture_prediction_5_classes +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_gesture_prediction_5_classes` is an English model originally trained by qfrodicio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_5_classes_en_5.2.0_3.0_1700620234826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_gesture_prediction_5_classes_en_5.2.0_3.0_1700620234826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_gesture_prediction_5_classes","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_gesture_prediction_5_classes", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_gesture_prediction_5_classes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/qfrodicio/distilbert-finetuned-gesture-prediction-5-classes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_mit_restaurant_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_mit_restaurant_ner_en.md new file mode 100644 index 000000000000..275a279ba981 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_mit_restaurant_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_mit_restaurant_ner DistilBertForTokenClassification from naufalso +author: John Snow Labs +name: distilbert_finetuned_mit_restaurant_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_mit_restaurant_ner` is an English model originally trained by naufalso. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_mit_restaurant_ner_en_5.2.0_3.0_1700622908403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_mit_restaurant_ner_en_5.2.0_3.0_1700622908403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_mit_restaurant_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_mit_restaurant_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_mit_restaurant_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/naufalso/distilbert-finetuned-mit-restaurant-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_conll2003_en.md new file mode 100644 index 000000000000..d6928f638a8e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_conll2003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_conll2003 DistilBertForTokenClassification from ViktorDo +author: John Snow Labs +name: distilbert_finetuned_ner_conll2003 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_conll2003` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_conll2003_en_5.2.0_3.0_1700628650426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_conll2003_en_5.2.0_3.0_1700628650426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_conll2003","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_conll2003", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ViktorDo/DistilBERT-finetuned-ner-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_copious_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_copious_en.md new file mode 100644 index 000000000000..b8995d4eee80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_copious_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_copious DistilBertForTokenClassification from ViktorDo +author: John Snow Labs +name: distilbert_finetuned_ner_copious +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_copious` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_copious_en_5.2.0_3.0_1700666751247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_copious_en_5.2.0_3.0_1700666751247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_copious","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_copious", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_copious| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViktorDo/DistilBERT-finetuned-ner-copious \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_pritam3355_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_pritam3355_en.md new file mode 100644 index 000000000000..a6c2918dd7e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_pritam3355_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_pritam3355 DistilBertForTokenClassification from pritam3355 +author: John Snow Labs +name: distilbert_finetuned_ner_pritam3355 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_pritam3355` is an English model originally trained by pritam3355. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_pritam3355_en_5.2.0_3.0_1700623355463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_pritam3355_en_5.2.0_3.0_1700623355463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_pritam3355","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_pritam3355", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_pritam3355| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/pritam3355/distilbert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_s800_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_s800_en.md new file mode 100644 index 000000000000..220a269ebf28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_s800_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_s800 DistilBertForTokenClassification from ViktorDo +author: John Snow Labs +name: distilbert_finetuned_ner_s800 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_s800` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_s800_en_5.2.0_3.0_1700650143536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_s800_en_5.2.0_3.0_1700650143536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_s800","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_s800", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_s800| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViktorDo/DistilBERT-finetuned-ner-S800 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_safaab_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_safaab_en.md new file mode 100644 index 000000000000..23df1cec16fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_finetuned_ner_safaab_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_ner_safaab DistilBertForTokenClassification from SafaAb +author: John Snow Labs +name: distilbert_finetuned_ner_safaab +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_ner_safaab` is an English model originally trained by SafaAb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_safaab_en_5.2.0_3.0_1700621933934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ner_safaab_en_5.2.0_3.0_1700621933934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_finetuned_ner_safaab","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_finetuned_ner_safaab", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_ner_safaab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/SafaAb/distilbert-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_10epoch_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_10epoch_en.md new file mode 100644 index 000000000000..1adab1e4db16 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_10epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_fresh_10epoch DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_fresh_10epoch +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fresh_10epoch` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fresh_10epoch_en_5.2.0_3.0_1700626699458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fresh_10epoch_en_5.2.0_3.0_1700626699458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_fresh_10epoch","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_fresh_10epoch", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fresh_10epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilBERT-fresh_10epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_en.md new file mode 100644 index 000000000000..78c9c76a1148 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_fresh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_fresh DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_fresh +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_fresh` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_fresh_en_5.2.0_3.0_1700638457772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_fresh_en_5.2.0_3.0_1700638457772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_fresh","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_fresh", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_fresh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilBERT-fresh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_en.md new file mode 100644 index 000000000000..d4df976c7e00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_hatexplain DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_hatexplain +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_hatexplain` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_hatexplain_en_5.2.0_3.0_1700619753982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_hatexplain_en_5.2.0_3.0_1700619753982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_hatexplain","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_hatexplain", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_hatexplain| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilbert-hatexplain \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_label_all_tokens_false_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_label_all_tokens_false_en.md new file mode 100644 index 000000000000..7eceb262e9c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_hatexplain_label_all_tokens_false_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_hatexplain_label_all_tokens_false DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_hatexplain_label_all_tokens_false +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_hatexplain_label_all_tokens_false` is an English model originally trained by troesy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_hatexplain_label_all_tokens_false_en_5.2.0_3.0_1700623239596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_hatexplain_label_all_tokens_false_en_5.2.0_3.0_1700623239596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_hatexplain_label_all_tokens_false","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_hatexplain_label_all_tokens_false", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_hatexplain_label_all_tokens_false| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilbert-hatexplain-label-all-tokens-False \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_label_all_tokens_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_label_all_tokens_en.md new file mode 100644 index 000000000000..48004ab09ca3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_label_all_tokens_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_label_all_tokens DistilBertForTokenClassification from troesy +author: John Snow Labs +name: distilbert_label_all_tokens +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_label_all_tokens` is an English model originally trained by troesy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_label_all_tokens_en_5.2.0_3.0_1700620564904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_label_all_tokens_en_5.2.0_3.0_1700620564904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_label_all_tokens","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_label_all_tokens", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_label_all_tokens| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/troesy/distilbert-label-all-tokens \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_pabloguinea_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_pabloguinea_en.md new file mode 100644 index 000000000000..b6c9a2024cdb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_pabloguinea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_pabloguinea DistilBertForTokenClassification from PabloGuinea +author: John Snow Labs +name: distilbert_pabloguinea +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_pabloguinea` is an English model originally trained by PabloGuinea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_pabloguinea_en_5.2.0_3.0_1700651745872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_pabloguinea_en_5.2.0_3.0_1700651745872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_pabloguinea","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_pabloguinea", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_pabloguinea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_judgements_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_judgements_en.md new file mode 100644 index 000000000000..b97622733c07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_judgements_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_judgements DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_judgements +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_judgements` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_judgements_en_5.2.0_3.0_1700621645904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_judgements_en_5.2.0_3.0_1700621645904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_judgements","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_judgements", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_judgements| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-judgements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_laws_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_laws_en.md new file mode 100644 index 000000000000..4eac231a4847 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_laws +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_laws_en_5.2.0_3.0_1700622111354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_laws_en_5.2.0_3.0_1700622111354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_spanish_italian_english_german_judgements_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_spanish_italian_english_german_judgements_en.md new file mode 100644 index 000000000000..9f2a68351508 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_french_spanish_italian_english_german_judgements_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_french_spanish_italian_english_german_judgements DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_french_spanish_italian_english_german_judgements +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_french_spanish_italian_english_german_judgements` is an English model originally trained by rcds.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_judgements_en_5.2.0_3.0_1700620808007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_french_spanish_italian_english_german_judgements_en_5.2.0_3.0_1700620808007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_french_spanish_italian_english_german_judgements","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_french_spanish_italian_english_german_judgements", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_french_spanish_italian_english_german_judgements| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-fr-es-it-en-de-judgements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_german_judgements_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_german_judgements_en.md new file mode 100644 index 000000000000..19f23b8ddcac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_german_judgements_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_german_judgements DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_german_judgements +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_german_judgements` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_judgements_en_5.2.0_3.0_1700621014968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_german_judgements_en_5.2.0_3.0_1700621014968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_german_judgements","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_german_judgements", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_german_judgements| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-de-judgements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_en.md new file mode 100644 index 000000000000..84132e6da4b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_italian_judgements DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_italian_judgements +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_italian_judgements` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_judgements_en_5.2.0_3.0_1700622757027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_judgements_en_5.2.0_3.0_1700622757027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_italian_judgements","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_italian_judgements", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_italian_judgements| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-it-judgements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_laws_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_laws_en.md new file mode 100644 index 000000000000..464e9085dccf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_judgements_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_italian_judgements_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_italian_judgements_laws +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_italian_judgements_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_judgements_laws_en_5.2.0_3.0_1700620386430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_judgements_laws_en_5.2.0_3.0_1700620386430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_italian_judgements_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_italian_judgements_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_italian_judgements_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-it-judgements-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_laws_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_laws_en.md new file mode 100644 index 000000000000..edc41e933c2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_italian_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_italian_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_italian_laws +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_italian_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_laws_en_5.2.0_3.0_1700622122916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_italian_laws_en_5.2.0_3.0_1700622122916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_italian_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_italian_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_italian_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-it-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_judgements_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_judgements_en.md new file mode 100644 index 000000000000..49b8aef68b68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_judgements_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_spanish_judgements DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_spanish_judgements +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_spanish_judgements` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_judgements_en_5.2.0_3.0_1700621215214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_judgements_en_5.2.0_3.0_1700621215214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_spanish_judgements","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_spanish_judgements", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_spanish_judgements| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-es-judgements \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_laws_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_laws_en.md new file mode 100644 index 000000000000..18b64a0d6dda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_sbd_spanish_laws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_sbd_spanish_laws DistilBertForTokenClassification from rcds +author: John Snow Labs +name: distilbert_sbd_spanish_laws +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_sbd_spanish_laws` is an English model originally trained by rcds. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_laws_en_5.2.0_3.0_1700621746249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_sbd_spanish_laws_en_5.2.0_3.0_1700621746249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_sbd_spanish_laws","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_sbd_spanish_laws", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_sbd_spanish_laws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/rcds/distilbert-SBD-es-laws \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilbert_token_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilbert_token_en.md new file mode 100644 index 000000000000..55f56ace7ec2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilbert_token_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_token DistilBertForTokenClassification from PabloGuinea +author: John Snow Labs +name: distilbert_token +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_token` is an English model originally trained by PabloGuinea. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_token_en_5.2.0_3.0_1700626182676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_token_en_5.2.0_3.0_1700626182676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilbert_token","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilbert_token", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_token| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert-token \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilkobert_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_finetuned_ner_en.md new file mode 100644 index 000000000000..360894a592e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilkobert_finetuned_ner DistilBertForTokenClassification from jeinsong +author: John Snow Labs +name: distilkobert_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilkobert_finetuned_ner` is an English model originally trained by jeinsong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilkobert_finetuned_ner_en_5.2.0_3.0_1700643784147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilkobert_finetuned_ner_en_5.2.0_3.0_1700643784147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilkobert_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilkobert_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilkobert_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|104.3 MB| + +## References + +https://huggingface.co/jeinsong/distilkobert-finetuned-NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_0925_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_0925_en.md new file mode 100644 index 000000000000..d5bee196974d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_0925_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilkobert_kemofact_0925 DistilBertForTokenClassification from xoyeop +author: John Snow Labs +name: distilkobert_kemofact_0925 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilkobert_kemofact_0925` is an English model originally trained by xoyeop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilkobert_kemofact_0925_en_5.2.0_3.0_1700648148254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilkobert_kemofact_0925_en_5.2.0_3.0_1700648148254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilkobert_kemofact_0925","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilkobert_kemofact_0925", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilkobert_kemofact_0925| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|104.4 MB| + +## References + +https://huggingface.co/xoyeop/distilkobert-KEmoFact-0925 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_efe_0927_en.md b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_efe_0927_en.md new file mode 100644 index 000000000000..02f8d8190181 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distilkobert_kemofact_efe_0927_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilkobert_kemofact_efe_0927 DistilBertForTokenClassification from xoyeop +author: John Snow Labs +name: distilkobert_kemofact_efe_0927 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilkobert_kemofact_efe_0927` is an English model originally trained by xoyeop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilkobert_kemofact_efe_0927_en_5.2.0_3.0_1700649420216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilkobert_kemofact_efe_0927_en_5.2.0_3.0_1700649420216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distilkobert_kemofact_efe_0927","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distilkobert_kemofact_efe_0927", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilkobert_kemofact_efe_0927| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|104.3 MB| + +## References + +https://huggingface.co/xoyeop/distilkobert-KEmoFact-EFE-0927 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-distillbert_base_uncase_conll2003_en.md b/docs/_posts/ahmedlone127/2023-11-22-distillbert_base_uncase_conll2003_en.md new file mode 100644 index 000000000000..d2daab8791fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-distillbert_base_uncase_conll2003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distillbert_base_uncase_conll2003 DistilBertForTokenClassification from satyamrajawat1994 +author: John Snow Labs +name: distillbert_base_uncase_conll2003 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncase_conll2003` is an English model originally trained by satyamrajawat1994. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncase_conll2003_en_5.2.0_3.0_1700622414159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncase_conll2003_en_5.2.0_3.0_1700622414159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("distillbert_base_uncase_conll2003","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("distillbert_base_uncase_conll2003", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncase_conll2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/satyamrajawat1994/distillbert-base-uncase-conll2003 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-dogebooch_biomedical_ner_all_datasets_4_en.md b/docs/_posts/ahmedlone127/2023-11-22-dogebooch_biomedical_ner_all_datasets_4_en.md new file mode 100644 index 000000000000..357d555278cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-dogebooch_biomedical_ner_all_datasets_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dogebooch_biomedical_ner_all_datasets_4 DistilBertForTokenClassification from Dogebooch +author: John Snow Labs +name: dogebooch_biomedical_ner_all_datasets_4 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dogebooch_biomedical_ner_all_datasets_4` is an English model originally trained by Dogebooch.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dogebooch_biomedical_ner_all_datasets_4_en_5.2.0_3.0_1700647815383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dogebooch_biomedical_ner_all_datasets_4_en_5.2.0_3.0_1700647815383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("dogebooch_biomedical_ner_all_datasets_4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("dogebooch_biomedical_ner_all_datasets_4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dogebooch_biomedical_ner_all_datasets_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/Dogebooch/Dogebooch_biomedical_ner_all_Datasets_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_copious_en.md b/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_copious_en.md new file mode 100644 index 000000000000..1cae15279d73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_copious_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ecobert_finetuned_ner_copious DistilBertForTokenClassification from ViktorDo +author: John Snow Labs +name: ecobert_finetuned_ner_copious +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ecobert_finetuned_ner_copious` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ecobert_finetuned_ner_copious_en_5.2.0_3.0_1700675099508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ecobert_finetuned_ner_copious_en_5.2.0_3.0_1700675099508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ecobert_finetuned_ner_copious","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ecobert_finetuned_ner_copious", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ecobert_finetuned_ner_copious| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViktorDo/EcoBERT-finetuned-ner-copious \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_s800_en.md b/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_s800_en.md new file mode 100644 index 000000000000..cc0928e2db14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ecobert_finetuned_ner_s800_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ecobert_finetuned_ner_s800 DistilBertForTokenClassification from ViktorDo +author: John Snow Labs +name: ecobert_finetuned_ner_s800 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ecobert_finetuned_ner_s800` is an English model originally trained by ViktorDo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ecobert_finetuned_ner_s800_en_5.2.0_3.0_1700672483090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ecobert_finetuned_ner_s800_en_5.2.0_3.0_1700672483090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ecobert_finetuned_ner_s800","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ecobert_finetuned_ner_s800", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ecobert_finetuned_ner_s800| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViktorDo/EcoBERT-finetuned-ner-S800 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-entity_extraction_not_evaluated_en.md b/docs/_posts/ahmedlone127/2023-11-22-entity_extraction_not_evaluated_en.md new file mode 100644 index 000000000000..e8ea94d39864 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-entity_extraction_not_evaluated_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English entity_extraction_not_evaluated DistilBertForTokenClassification from autoevaluate +author: John Snow Labs +name: entity_extraction_not_evaluated +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `entity_extraction_not_evaluated` is an English model originally trained by autoevaluate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_extraction_not_evaluated_en_5.2.0_3.0_1700653423170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/entity_extraction_not_evaluated_en_5.2.0_3.0_1700653423170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("entity_extraction_not_evaluated","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("entity_extraction_not_evaluated", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|entity_extraction_not_evaluated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/autoevaluate/entity-extraction-not-evaluated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-experiment_2_en.md b/docs/_posts/ahmedlone127/2023-11-22-experiment_2_en.md new file mode 100644 index 000000000000..877091146d78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-experiment_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English experiment_2 DistilBertForTokenClassification from sophiestein +author: John Snow Labs +name: experiment_2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `experiment_2` is an English model originally trained by sophiestein. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/experiment_2_en_5.2.0_3.0_1700631433791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/experiment_2_en_5.2.0_3.0_1700631433791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("experiment_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("experiment_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|experiment_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sophiestein/experiment_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-few_nerd_en.md b/docs/_posts/ahmedlone127/2023-11-22-few_nerd_en.md new file mode 100644 index 000000000000..d7baaadecac3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-few_nerd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English few_nerd DistilBertForTokenClassification from kolj4 +author: John Snow Labs +name: few_nerd +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `few_nerd` is an English model originally trained by kolj4. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/few_nerd_en_5.2.0_3.0_1700641405107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/few_nerd_en_5.2.0_3.0_1700641405107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("few_nerd","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("few_nerd", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|few_nerd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/kolj4/few_nerd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-fine_tuned_cybersecurity_ner2_en.md b/docs/_posts/ahmedlone127/2023-11-22-fine_tuned_cybersecurity_ner2_en.md new file mode 100644 index 000000000000..1c4107b1ddaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-fine_tuned_cybersecurity_ner2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fine_tuned_cybersecurity_ner2 DistilBertForTokenClassification from Abiral7 +author: John Snow Labs +name: fine_tuned_cybersecurity_ner2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_cybersecurity_ner2` is an English model originally trained by Abiral7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_cybersecurity_ner2_en_5.2.0_3.0_1700620715184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_cybersecurity_ner2_en_5.2.0_3.0_1700620715184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("fine_tuned_cybersecurity_ner2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("fine_tuned_cybersecurity_ner2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_cybersecurity_ner2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Abiral7/fine-tuned-cybersecurity-ner2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-finetune_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-finetune_wnut_model_en.md new file mode 100644 index 000000000000..3f6fb508b2be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-finetune_wnut_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetune_wnut_model DistilBertForTokenClassification from vsufiy +author: John Snow Labs +name: finetune_wnut_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune_wnut_model` is an English model originally trained by vsufiy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_wnut_model_en_5.2.0_3.0_1700665834421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_wnut_model_en_5.2.0_3.0_1700665834421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetune_wnut_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetune_wnut_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_wnut_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vsufiy/finetune_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-finetuned_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-finetuned_model_en.md new file mode 100644 index 000000000000..72fcf06ebe49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-finetuned_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_model DistilBertForTokenClassification from hemulitch +author: John Snow Labs +name: finetuned_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_model` is an English model originally trained by hemulitch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_model_en_5.2.0_3.0_1700643013355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_model_en_5.2.0_3.0_1700643013355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hemulitch/finetuned_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-finetuned_ner_finegrained_en.md b/docs/_posts/ahmedlone127/2023-11-22-finetuned_ner_finegrained_en.md new file mode 100644 index 000000000000..6cd14e7a0b31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-finetuned_ner_finegrained_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_ner_finegrained DistilBertForTokenClassification from kolj4 +author: John Snow Labs +name: finetuned_ner_finegrained +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_ner_finegrained` is an English model originally trained by kolj4. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_ner_finegrained_en_5.2.0_3.0_1700639857519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_ner_finegrained_en_5.2.0_3.0_1700639857519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("finetuned_ner_finegrained","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("finetuned_ner_finegrained", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_ner_finegrained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.4 MB| + +## References + +https://huggingface.co/kolj4/finetuned-ner-finegrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-genderprediction_en.md b/docs/_posts/ahmedlone127/2023-11-22-genderprediction_en.md new file mode 100644 index 000000000000..1631219eec23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-genderprediction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English genderprediction DistilBertForTokenClassification from rahulkhandelw +author: John Snow Labs +name: genderprediction +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `genderprediction` is an English model originally trained by rahulkhandelw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/genderprediction_en_5.2.0_3.0_1700624231648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/genderprediction_en_5.2.0_3.0_1700624231648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("genderprediction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("genderprediction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|genderprediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rahulkhandelw/GenderPrediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_1_en.md new file mode 100644 index 000000000000..0341edea4f19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hueta_finetuned_1 DistilBertForTokenClassification from hemulitch +author: John Snow Labs +name: hueta_finetuned_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hueta_finetuned_1` is an English model originally trained by hemulitch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hueta_finetuned_1_en_5.2.0_3.0_1700649752617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hueta_finetuned_1_en_5.2.0_3.0_1700649752617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("hueta_finetuned_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hueta_finetuned_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hueta_finetuned_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hemulitch/hueta-finetuned_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_en.md new file mode 100644 index 000000000000..e189ef356259 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-hueta_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hueta_finetuned DistilBertForTokenClassification from hemulitch +author: John Snow Labs +name: hueta_finetuned +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hueta_finetuned` is an English model originally trained by hemulitch. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hueta_finetuned_en_5.2.0_3.0_1700663098719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hueta_finetuned_en_5.2.0_3.0_1700663098719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("hueta_finetuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hueta_finetuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hueta_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/hemulitch/hueta-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-hun_wnut_modell_en.md b/docs/_posts/ahmedlone127/2023-11-22-hun_wnut_modell_en.md new file mode 100644 index 000000000000..136577db2db6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-hun_wnut_modell_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hun_wnut_modell DistilBertForTokenClassification from terhdavid +author: John Snow Labs +name: hun_wnut_modell +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hun_wnut_modell` is an English model originally trained by terhdavid. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hun_wnut_modell_en_5.2.0_3.0_1700636721718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hun_wnut_modell_en_5.2.0_3.0_1700636721718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("hun_wnut_modell","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("hun_wnut_modell", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hun_wnut_modell| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/terhdavid/hun_wnut_modell \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-increase_exp_en.md b/docs/_posts/ahmedlone127/2023-11-22-increase_exp_en.md new file mode 100644 index 000000000000..4bb2da07497a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-increase_exp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English increase_exp DistilBertForTokenClassification from GiladH +author: John Snow Labs +name: increase_exp +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `increase_exp` is an English model originally trained by GiladH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/increase_exp_en_5.2.0_3.0_1700668540091.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/increase_exp_en_5.2.0_3.0_1700668540091.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("increase_exp","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("increase_exp", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|increase_exp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/GiladH/increase_exp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-innox_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-22-innox_distilbert_en.md new file mode 100644 index 000000000000..63c04a50a126 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-innox_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English innox_distilbert DistilBertForTokenClassification from brao +author: John Snow Labs +name: innox_distilbert +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `innox_distilbert` is an English model originally trained by brao. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/innox_distilbert_en_5.2.0_3.0_1700620561905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/innox_distilbert_en_5.2.0_3.0_1700620561905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("innox_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("innox_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|innox_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/brao/innox-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertbert05_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertbert05_en.md new file mode 100644 index 000000000000..4c4b90342082 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertbert05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertbert05 DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertbert05 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertbert05` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertbert05_en_5.2.0_3.0_1700622289865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertbert05_en_5.2.0_3.0_1700622289865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertbert05","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertbert05", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertbert05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertbert05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_ls01_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_ls01_en.md new file mode 100644 index 000000000000..9d4a1f829a04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_ls01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertion_prop05_ls01 DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertion_prop05_ls01 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertion_prop05_ls01` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertion_prop05_ls01_en_5.2.0_3.0_1700658941706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertion_prop05_ls01_en_5.2.0_3.0_1700658941706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token") # supplies the "token" column the classifier reads + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertion_prop05_ls01","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer().setInputCols("documents").setOutputCol("token") // supplies the "token" column the classifier reads + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertion_prop05_ls01", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertion_prop05_ls01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertion-prop05-ls01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_vocab_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_vocab_en.md new file mode 100644 index 000000000000..39edab0abc25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop05_vocab_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertion_prop05_vocab DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertion_prop05_vocab +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertion_prop05_vocab` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertion_prop05_vocab_en_5.2.0_3.0_1700644271447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertion_prop05_vocab_en_5.2.0_3.0_1700644271447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertion_prop05_vocab","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertion_prop05_vocab", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertion_prop05_vocab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertion-prop05-vocab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_015_correct_data_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_015_correct_data_en.md new file mode 100644 index 000000000000..800eb50cf3f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_015_correct_data_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertion_prop_015_correct_data DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertion_prop_015_correct_data +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertion_prop_015_correct_data` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertion_prop_015_correct_data_en_5.2.0_3.0_1700641991572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertion_prop_015_correct_data_en_5.2.0_3.0_1700641991572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertion_prop_015_correct_data","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertion_prop_015_correct_data", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertion_prop_015_correct_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertion-prop-015-correct-data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_correct_data_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_correct_data_en.md new file mode 100644 index 000000000000..47110e43e8fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_correct_data_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertion_prop_05_correct_data DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertion_prop_05_correct_data +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertion_prop_05_correct_data` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertion_prop_05_correct_data_en_5.2.0_3.0_1700651293943.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertion_prop_05_correct_data_en_5.2.0_3.0_1700651293943.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertion_prop_05_correct_data","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertion_prop_05_correct_data", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertion_prop_05_correct_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertion-prop-05-correct-data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_en.md b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_en.md new file mode 100644 index 000000000000..d3aae60195b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-insertion_prop_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English insertion_prop_05 DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: insertion_prop_05 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `insertion_prop_05` is an English model originally trained by adasgaleus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/insertion_prop_05_en_5.2.0_3.0_1700623492647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/insertion_prop_05_en_5.2.0_3.0_1700623492647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("insertion_prop_05","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("insertion_prop_05", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|insertion_prop_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/insertion-prop-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-jl_distilbert_german_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-jl_distilbert_german_finetuned_ner_en.md new file mode 100644 index 000000000000..f895d5f1a57c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-jl_distilbert_german_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jl_distilbert_german_finetuned_ner DistilBertForTokenClassification from xander71988 +author: John Snow Labs +name: jl_distilbert_german_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jl_distilbert_german_finetuned_ner` is an English model originally trained by xander71988. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jl_distilbert_german_finetuned_ner_en_5.2.0_3.0_1700643352463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jl_distilbert_german_finetuned_ner_en_5.2.0_3.0_1700643352463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("jl_distilbert_german_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("jl_distilbert_german_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jl_distilbert_german_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.3 MB| + +## References + +https://huggingface.co/xander71988/jl-distilbert-de-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-loc_dataset_en.md b/docs/_posts/ahmedlone127/2023-11-22-loc_dataset_en.md new file mode 100644 index 000000000000..496537a63187 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-loc_dataset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English loc_dataset DistilBertForTokenClassification from jauyeung +author: John Snow Labs +name: loc_dataset +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `loc_dataset` is an English model originally trained by jauyeung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/loc_dataset_en_5.2.0_3.0_1700633823072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/loc_dataset_en_5.2.0_3.0_1700633823072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("loc_dataset","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("loc_dataset", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|loc_dataset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jauyeung/loc_dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-medhack_en.md b/docs/_posts/ahmedlone127/2023-11-22-medhack_en.md new file mode 100644 index 000000000000..4f1ec6ba0b84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-medhack_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English medhack DistilBertForTokenClassification from myxik +author: John Snow Labs +name: medhack +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medhack` is an English model originally trained by myxik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medhack_en_5.2.0_3.0_1700675338607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medhack_en_5.2.0_3.0_1700675338607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("medhack","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("medhack", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medhack| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/myxik/medhack \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-meow_tagging_en.md b/docs/_posts/ahmedlone127/2023-11-22-meow_tagging_en.md new file mode 100644 index 000000000000..3dfcc6f6f578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-meow_tagging_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English meow_tagging DistilBertForTokenClassification from h90a00l +author: John Snow Labs +name: meow_tagging +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `meow_tagging` is an English model originally trained by h90a00l. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/meow_tagging_en_5.2.0_3.0_1700634368485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/meow_tagging_en_5.2.0_3.0_1700634368485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("meow_tagging","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("meow_tagging", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|meow_tagging| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/h90a00l/meow_tagging \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-model_output_en.md b/docs/_posts/ahmedlone127/2023-11-22-model_output_en.md new file mode 100644 index 000000000000..60e0e9f7be3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-model_output_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_output DistilBertForTokenClassification from alicenkbaytop +author: John Snow Labs +name: model_output +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_output` is an English model originally trained by alicenkbaytop. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_output_en_5.2.0_3.0_1700659334169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_output_en_5.2.0_3.0_1700659334169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("model_output","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("model_output", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_output| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/alicenkbaytop/model_output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworking2_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworking2_en.md new file mode 100644 index 000000000000..d42f5644a6bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworking2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworking2 DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworking2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworking2` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworking2_en_5.2.0_3.0_1700674675396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworking2_en_5.2.0_3.0_1700674675396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworking2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworking2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworking2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorking2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworking3_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworking3_en.md new file mode 100644 index 000000000000..8fb27c0fa686 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworking3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworking3 DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworking3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworking3` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworking3_en_5.2.0_3.0_1700671189150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworking3_en_5.2.0_3.0_1700671189150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworking3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworking3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworking3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorking3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworking4_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworking4_en.md new file mode 100644 index 000000000000..f6752899a5d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworking4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworking4 DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworking4 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworking4` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworking4_en_5.2.0_3.0_1700665041664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworking4_en_5.2.0_3.0_1700665041664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworking4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenizer = new Tokenizer() + .setInputCols("documents") + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworking4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworking4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorking4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworking6_copy_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworking6_copy_en.md new file mode 100644 index 000000000000..e897d784c2d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworking6_copy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworking6_copy DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworking6_copy +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworking6_copy` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworking6_copy_en_5.2.0_3.0_1700669889975.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworking6_copy_en_5.2.0_3.0_1700669889975.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
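+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworking6_copy","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworking6_copy", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +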
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworking6_copy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorking6-copy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata2_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata2_en.md new file mode 100644 index 000000000000..7a367914f1f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworkingmanualdata2 DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworkingmanualdata2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworkingmanualdata2` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata2_en_5.2.0_3.0_1700666821676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata2_en_5.2.0_3.0_1700666821676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
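+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworkingmanualdata2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworkingmanualdata2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +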
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworkingmanualdata2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorkingManualData2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata3_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata3_en.md new file mode 100644 index 000000000000..17372927b00f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworkingmanualdata3 DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworkingmanualdata3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworkingmanualdata3` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata3_en_5.2.0_3.0_1700679071393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata3_en_5.2.0_3.0_1700679071393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
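+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworkingmanualdata3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworkingmanualdata3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +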
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworkingmanualdata3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorkingManualData3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata_en.md b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata_en.md new file mode 100644 index 000000000000..16008516f80b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-modelworkingmanualdata_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English modelworkingmanualdata DistilBertForTokenClassification from rijulnandy +author: John Snow Labs +name: modelworkingmanualdata +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `modelworkingmanualdata` is an English model originally trained by rijulnandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata_en_5.2.0_3.0_1700676815089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/modelworkingmanualdata_en_5.2.0_3.0_1700676815089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
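+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("modelworkingmanualdata","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("modelworkingmanualdata", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +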
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|modelworkingmanualdata| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rijulnandy/modelWorkingManualData \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-mongolian_distilbert_base_multilingual_cased_demo_xx.md b/docs/_posts/ahmedlone127/2023-11-22-mongolian_distilbert_base_multilingual_cased_demo_xx.md new file mode 100644 index 000000000000..053e6b67ed18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-mongolian_distilbert_base_multilingual_cased_demo_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual mongolian_distilbert_base_multilingual_cased_demo DistilBertForTokenClassification from Enkhbold +author: John Snow Labs +name: mongolian_distilbert_base_multilingual_cased_demo +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mongolian_distilbert_base_multilingual_cased_demo` is a multilingual model originally trained by Enkhbold.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_demo_xx_5.2.0_3.0_1700619329420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mongolian_distilbert_base_multilingual_cased_demo_xx_5.2.0_3.0_1700619329420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
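+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("mongolian_distilbert_base_multilingual_cased_demo","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("mongolian_distilbert_base_multilingual_cased_demo", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +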
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mongolian_distilbert_base_multilingual_cased_demo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Enkhbold/mongolian-distilbert-base-multilingual-cased-demo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-moodprediction_en.md b/docs/_posts/ahmedlone127/2023-11-22-moodprediction_en.md new file mode 100644 index 000000000000..114c51026d26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-moodprediction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English moodprediction DistilBertForTokenClassification from rahulkhandelw +author: John Snow Labs +name: moodprediction +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moodprediction` is an English model originally trained by rahulkhandelw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moodprediction_en_5.2.0_3.0_1700675807787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moodprediction_en_5.2.0_3.0_1700675807787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
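+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("moodprediction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("moodprediction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +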
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moodprediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rahulkhandelw/MoodPrediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-nassganbiomedical_en.md b/docs/_posts/ahmedlone127/2023-11-22-nassganbiomedical_en.md new file mode 100644 index 000000000000..bd921d06d3f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-nassganbiomedical_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nassganbiomedical DistilBertForTokenClassification from nassga +author: John Snow Labs +name: nassganbiomedical +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nassganbiomedical` is an English model originally trained by nassga. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nassganbiomedical_en_5.2.0_3.0_1700627762233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nassganbiomedical_en_5.2.0_3.0_1700627762233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
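+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("nassganbiomedical","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("nassganbiomedical", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +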
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nassganbiomedical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/nassga/nassGanBioMedical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_classification_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_classification_en.md new file mode 100644 index 000000000000..550952e8d140 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_classification_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_classification DistilBertForTokenClassification from oyvindgrutle +author: John Snow Labs +name: ner_classification +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_classification` is an English model originally trained by oyvindgrutle. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_classification_en_5.2.0_3.0_1700619870138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_classification_en_5.2.0_3.0_1700619870138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
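+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_classification","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_classification", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +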
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/oyvindgrutle/ner-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_distilbert_cased_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_distilbert_cased_en.md new file mode 100644 index 000000000000..78e9bec679c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_distilbert_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_distilbert_cased DistilBertForTokenClassification from rjac +author: John Snow Labs +name: ner_distilbert_cased +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_distilbert_cased` is an English model originally trained by rjac. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_distilbert_cased_en_5.2.0_3.0_1700623529181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_distilbert_cased_en_5.2.0_3.0_1700623529181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
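+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_distilbert_cased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_distilbert_cased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +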
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_distilbert_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/rjac/ner-distilbert-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_en.md new file mode 100644 index 000000000000..81294d835e9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_distillbert_ner DistilBertForTokenClassification from harvinder676 +author: John Snow Labs +name: ner_distillbert_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_distillbert_ner` is an English model originally trained by harvinder676. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_distillbert_ner_en_5.2.0_3.0_1700653017197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_distillbert_ner_en_5.2.0_3.0_1700653017197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
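+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_distillbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_distillbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +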
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_distillbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harvinder676/ner-distillbert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_tags_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_tags_en.md new file mode 100644 index 000000000000..fac5a61752a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_distillbert_ner_tags_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_distillbert_ner_tags DistilBertForTokenClassification from harvinder676 +author: John Snow Labs +name: ner_distillbert_ner_tags +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_distillbert_ner_tags` is an English model originally trained by harvinder676. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_distillbert_ner_tags_en_5.2.0_3.0_1700661243949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_distillbert_ner_tags_en_5.2.0_3.0_1700661243949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
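+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_distillbert_ner_tags","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_distillbert_ner_tags", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +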
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_distillbert_ner_tags| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harvinder676/ner-distillbert-ner-tags \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_loc_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_loc_en.md new file mode 100644 index 000000000000..76e630ff85fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_loc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_loc DistilBertForTokenClassification from jauyeung +author: John Snow Labs +name: ner_loc +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_loc` is an English model originally trained by jauyeung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_loc_en_5.2.0_3.0_1700645154600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_loc_en_5.2.0_3.0_1700645154600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
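+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, DistilBertForTokenClassification +from pyspark.ml import Pipeline + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + +# The classifier reads a "token" column, so tokenize the documents first +tokenizer = Tokenizer() \ + .setInputCols(["documents"]) \ + .setOutputCol("token") + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_loc","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier]) + +# data is assumed to be a Spark DataFrame with a "text" column +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +// The classifier reads a "token" column, so tokenize the documents first +val tokenizer = new Tokenizer() + .setInputCols(Array("documents")) + .setOutputCol("token") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_loc", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier)) + +// data is assumed to be a Spark DataFrame with a "text" column +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +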
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_loc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jauyeung/ner_loc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_model_v1_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_model_v1_en.md new file mode 100644 index 000000000000..1bca86f9992a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_model_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_model_v1 DistilBertForTokenClassification from AbderrahimAl +author: John Snow Labs +name: ner_model_v1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_model_v1` is an English model originally trained by AbderrahimAl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_model_v1_en_5.2.0_3.0_1700669035895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_model_v1_en_5.2.0_3.0_1700669035895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_model_v1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_model_v1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_model_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/AbderrahimAl/ner_model_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_our_base_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_our_base_model_en.md new file mode 100644 index 000000000000..daba5b8fc5e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_our_base_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_our_base_model DistilBertForTokenClassification from yashveer11 +author: John Snow Labs +name: ner_our_base_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_our_base_model` is an English model originally trained by yashveer11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_our_base_model_en_5.2.0_3.0_1700649940341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_our_base_model_en_5.2.0_3.0_1700649940341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_our_base_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_our_base_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_our_base_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yashveer11/Ner-our-base-Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_test3_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_test3_en.md new file mode 100644 index 000000000000..5921c50962ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_test3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_test3 DistilBertForTokenClassification from inesani +author: John Snow Labs +name: ner_test3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_test3` is an English model originally trained by inesani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_test3_en_5.2.0_3.0_1700621429951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_test3_en_5.2.0_3.0_1700621429951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_test3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_test3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_test3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/inesani/ner-test3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_testing_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_testing_1_en.md new file mode 100644 index 000000000000..2016be66f970 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_testing_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_testing_1 DistilBertForTokenClassification from yashveer11 +author: John Snow Labs +name: ner_testing_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_testing_1` is an English model originally trained by yashveer11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_testing_1_en_5.2.0_3.0_1700636973544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_testing_1_en_5.2.0_3.0_1700636973544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_testing_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_testing_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_testing_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/yashveer11/Ner-Testing-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-ner_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-22-ner_trainer_en.md new file mode 100644 index 000000000000..1f59b61511d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-ner_trainer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ner_trainer DistilBertForTokenClassification from jenniferjane +author: John Snow Labs +name: ner_trainer +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ner_trainer` is an English model originally trained by jenniferjane. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_trainer_en_5.2.0_3.0_1700623640746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ner_trainer_en_5.2.0_3.0_1700623640746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("ner_trainer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("ner_trainer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ner_trainer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/jenniferjane/ner_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model_en.md new file mode 100644 index 000000000000..eb39a0264255 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model DistilBertForTokenClassification from GuCuChiara +author: John Snow Labs +name: nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model` is an English model originally trained by GuCuChiara. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model_en_5.2.0_3.0_1700647431903.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model_en_5.2.0_3.0_1700647431903.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_hiba2_distemist_fine_tuned_biobert_pretrained_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/GuCuChiara/NLP-HIBA2_DisTEMIST_fine_tuned_biobert-pretrained-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-nlp_p4_en.md b/docs/_posts/ahmedlone127/2023-11-22-nlp_p4_en.md new file mode 100644 index 000000000000..7836891aeb0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-nlp_p4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp_p4 DistilBertForTokenClassification from sameearif88 +author: John Snow Labs +name: nlp_p4 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_p4` is an English model originally trained by sameearif88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_p4_en_5.2.0_3.0_1700679128298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_p4_en_5.2.0_3.0_1700679128298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("nlp_p4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("nlp_p4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_p4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sameearif88/nlp-p4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-nlpstudy_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-22-nlpstudy_distilbert_en.md new file mode 100644 index 000000000000..1ac7e17bd368 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-nlpstudy_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlpstudy_distilbert DistilBertForTokenClassification from d8888 +author: John Snow Labs +name: nlpstudy_distilbert +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlpstudy_distilbert` is an English model originally trained by d8888. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlpstudy_distilbert_en_5.2.0_3.0_1700632388472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlpstudy_distilbert_en_5.2.0_3.0_1700632388472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("nlpstudy_distilbert","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("nlpstudy_distilbert", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlpstudy_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/d8888/nlpstudy_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-numberprediction_en.md b/docs/_posts/ahmedlone127/2023-11-22-numberprediction_en.md new file mode 100644 index 000000000000..e00d93106bb5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-numberprediction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English numberprediction DistilBertForTokenClassification from rahulkhandelw +author: John Snow Labs +name: numberprediction +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `numberprediction` is an English model originally trained by rahulkhandelw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/numberprediction_en_5.2.0_3.0_1700634916081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/numberprediction_en_5.2.0_3.0_1700634916081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("numberprediction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("numberprediction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|numberprediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rahulkhandelw/NumberPrediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-panda_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-panda_ner_en.md new file mode 100644 index 000000000000..3ccf284822dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-panda_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English panda_ner DistilBertForTokenClassification from GOATsan +author: John Snow Labs +name: panda_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `panda_ner` is an English model originally trained by GOATsan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/panda_ner_en_5.2.0_3.0_1700671468026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/panda_ner_en_5.2.0_3.0_1700671468026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("panda_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("panda_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|panda_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/GOATsan/panda_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-pautas_en.md b/docs/_posts/ahmedlone127/2023-11-22-pautas_en.md new file mode 100644 index 000000000000..650c4c843c29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-pautas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pautas DistilBertForTokenClassification from hucruz +author: John Snow Labs +name: pautas +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pautas` is an English model originally trained by hucruz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pautas_en_5.2.0_3.0_1700646922169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pautas_en_5.2.0_3.0_1700646922169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("pautas","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("pautas", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pautas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|250.4 MB| + +## References + +https://huggingface.co/hucruz/pautas \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-personprediction_en.md b/docs/_posts/ahmedlone127/2023-11-22-personprediction_en.md new file mode 100644 index 000000000000..b6083c5a5d3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-personprediction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English personprediction DistilBertForTokenClassification from rahulkhandelw +author: John Snow Labs +name: personprediction +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `personprediction` is an English model originally trained by rahulkhandelw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/personprediction_en_5.2.0_3.0_1700619170470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/personprediction_en_5.2.0_3.0_1700619170470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("personprediction","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("personprediction", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|personprediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/rahulkhandelw/PersonPrediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-portuguese_traing_en.md b/docs/_posts/ahmedlone127/2023-11-22-portuguese_traing_en.md new file mode 100644 index 000000000000..fac660e294d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-portuguese_traing_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English portuguese_traing DistilBertForTokenClassification from Marumaru0 +author: John Snow Labs +name: portuguese_traing +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `portuguese_traing` is an English model originally trained by Marumaru0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/portuguese_traing_en_5.2.0_3.0_1700681996782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/portuguese_traing_en_5.2.0_3.0_1700681996782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("portuguese_traing","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("portuguese_traing", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|portuguese_traing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Marumaru0/pt_traing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-punjabi_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-punjabi_distilbert_ner_en.md new file mode 100644 index 000000000000..90d4b9b70a5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-punjabi_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English punjabi_distilbert_ner DistilBertForTokenClassification from mirfan899 +author: John Snow Labs +name: punjabi_distilbert_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `punjabi_distilbert_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/punjabi_distilbert_ner_en_5.2.0_3.0_1700649558408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/punjabi_distilbert_ner_en_5.2.0_3.0_1700649558408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("punjabi_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("punjabi_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|punjabi_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/mirfan899/punjabi-distilbert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-qspot_distilbert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-11-22-qspot_distilbert_base_multilingual_cased_xx.md new file mode 100644 index 000000000000..bc4cffd77efc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-qspot_distilbert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual qspot_distilbert_base_multilingual_cased DistilBertForTokenClassification from DataIntelligenceTeam +author: John Snow Labs +name: qspot_distilbert_base_multilingual_cased +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qspot_distilbert_base_multilingual_cased` is a Multilingual model originally trained by DataIntelligenceTeam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qspot_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700622937470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qspot_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700622937470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("qspot_distilbert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("qspot_distilbert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qspot_distilbert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.5 MB| + +## References + +https://huggingface.co/DataIntelligenceTeam/QSPOT-distilbert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-quote_model_delta_en.md b/docs/_posts/ahmedlone127/2023-11-22-quote_model_delta_en.md new file mode 100644 index 000000000000..22cb55b8a003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-quote_model_delta_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English quote_model_delta DistilBertForTokenClassification from Iceland +author: John Snow Labs +name: quote_model_delta +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quote_model_delta` is an English model originally trained by Iceland. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quote_model_delta_en_5.2.0_3.0_1700651856417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quote_model_delta_en_5.2.0_3.0_1700651856417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("quote_model_delta","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("quote_model_delta", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quote_model_delta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Iceland/quote-model-delta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-results_raucusreno_en.md b/docs/_posts/ahmedlone127/2023-11-22-results_raucusreno_en.md new file mode 100644 index 000000000000..146f41f874d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-results_raucusreno_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English results_raucusreno DistilBertForTokenClassification from RaucusReno +author: John Snow Labs +name: results_raucusreno +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results_raucusreno` is an English model originally trained by RaucusReno. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_raucusreno_en_5.2.0_3.0_1700662167953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_raucusreno_en_5.2.0_3.0_1700662167953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("results_raucusreno","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("results_raucusreno", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_raucusreno| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RaucusReno/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_augmanted_signatures_en.md b/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_augmanted_signatures_en.md new file mode 100644 index 000000000000..f3dcb7e59d33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_augmanted_signatures_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_distilbert_augmanted_signatures DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_distilbert_augmanted_signatures +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_distilbert_augmanted_signatures` is an English model originally trained by chilliadgl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_distilbert_augmanted_signatures_en_5.2.0_3.0_1700635914697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_distilbert_augmanted_signatures_en_5.2.0_3.0_1700635914697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("rg_distilbert_augmanted_signatures","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("rg_distilbert_augmanted_signatures", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rg_distilbert_augmanted_signatures| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chilliadgl/RG_distilbert_augmanted_signatures \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_big_data_en.md b/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_big_data_en.md new file mode 100644 index 000000000000..1c7e81f2fa5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-rg_distilbert_big_data_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_distilbert_big_data DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_distilbert_big_data +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_distilbert_big_data` is an English model originally trained by chilliadgl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_distilbert_big_data_en_5.2.0_3.0_1700664653691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_distilbert_big_data_en_5.2.0_3.0_1700664653691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("rg_distilbert_big_data","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("rg_distilbert_big_data", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rg_distilbert_big_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chilliadgl/RG_distilbert_big_data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-rg_ner_for_emails_en.md b/docs/_posts/ahmedlone127/2023-11-22-rg_ner_for_emails_en.md new file mode 100644 index 000000000000..b3d7a8dd6fcd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-rg_ner_for_emails_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_ner_for_emails DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_ner_for_emails +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_ner_for_emails` is an English model originally trained by chilliadgl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_ner_for_emails_en_5.2.0_3.0_1700646747753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_ner_for_emails_en_5.2.0_3.0_1700646747753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("rg_ner_for_emails","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("rg_ner_for_emails", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rg_ner_for_emails| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chilliadgl/RG_NER_for_emails \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-rg_ner_sign_en.md b/docs/_posts/ahmedlone127/2023-11-22-rg_ner_sign_en.md new file mode 100644 index 000000000000..f8fa80c62bfb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-rg_ner_sign_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rg_ner_sign DistilBertForTokenClassification from chilliadgl +author: John Snow Labs +name: rg_ner_sign +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rg_ner_sign` is an English model originally trained by chilliadgl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rg_ner_sign_en_5.2.0_3.0_1700630418874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rg_ner_sign_en_5.2.0_3.0_1700630418874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("rg_ner_sign","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("rg_ner_sign", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rg_ner_sign| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chilliadgl/RG_NER_sign \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-roberta_large_ner_model_mimic_top10_en.md b/docs/_posts/ahmedlone127/2023-11-22-roberta_large_ner_model_mimic_top10_en.md new file mode 100644 index 000000000000..5a641c269441 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-roberta_large_ner_model_mimic_top10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English roberta_large_ner_model_mimic_top10 DistilBertForTokenClassification from alecocc +author: John Snow Labs +name: roberta_large_ner_model_mimic_top10 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_ner_model_mimic_top10` is an English model originally trained by alecocc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_ner_model_mimic_top10_en_5.2.0_3.0_1700634368464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_ner_model_mimic_top10_en_5.2.0_3.0_1700634368464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("roberta_large_ner_model_mimic_top10","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("roberta_large_ner_model_mimic_top10", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_ner_model_mimic_top10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/alecocc/roberta_large_ner_model_mimic_top10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-sara_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-sara_model_en.md new file mode 100644 index 000000000000..94590b695331 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-sara_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sara_model DistilBertForTokenClassification from sarasarasara +author: John Snow Labs +name: sara_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sara_model` is an English model originally trained by sarasarasara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sara_model_en_5.2.0_3.0_1700658368554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sara_model_en_5.2.0_3.0_1700658368554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("sara_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sara_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sara_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/sarasarasara/sara-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_1_en.md b/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_1_en.md new file mode 100644 index 000000000000..7c726e262745 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sayula_popoluca_test_model_1 DistilBertForTokenClassification from natalierobbins +author: John Snow Labs +name: sayula_popoluca_test_model_1 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sayula_popoluca_test_model_1` is an English model originally trained by natalierobbins. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sayula_popoluca_test_model_1_en_5.2.0_3.0_1700641992370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sayula_popoluca_test_model_1_en_5.2.0_3.0_1700641992370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("sayula_popoluca_test_model_1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sayula_popoluca_test_model_1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sayula_popoluca_test_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/natalierobbins/pos_test_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_en.md new file mode 100644 index 000000000000..aec06141b939 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-sayula_popoluca_test_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sayula_popoluca_test_model DistilBertForTokenClassification from natalierobbins +author: John Snow Labs +name: sayula_popoluca_test_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sayula_popoluca_test_model` is an English model originally trained by natalierobbins. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sayula_popoluca_test_model_en_5.2.0_3.0_1700654309763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sayula_popoluca_test_model_en_5.2.0_3.0_1700654309763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("sayula_popoluca_test_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sayula_popoluca_test_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sayula_popoluca_test_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/natalierobbins/pos_test_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-sf_1_2epochs_distilbert_base_uncased_en.md b/docs/_posts/ahmedlone127/2023-11-22-sf_1_2epochs_distilbert_base_uncased_en.md new file mode 100644 index 000000000000..d0fdafb15b25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-sf_1_2epochs_distilbert_base_uncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sf_1_2epochs_distilbert_base_uncased DistilBertForTokenClassification from PiceTRP +author: John Snow Labs +name: sf_1_2epochs_distilbert_base_uncased +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sf_1_2epochs_distilbert_base_uncased` is an English model originally trained by PiceTRP. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sf_1_2epochs_distilbert_base_uncased_en_5.2.0_3.0_1700624127520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sf_1_2epochs_distilbert_base_uncased_en_5.2.0_3.0_1700624127520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("sf_1_2epochs_distilbert_base_uncased","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sf_1_2epochs_distilbert_base_uncased", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sf_1_2epochs_distilbert_base_uncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/PiceTRP/sf_1_2epochs_distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-social_groups_ner_first_try_en.md b/docs/_posts/ahmedlone127/2023-11-22-social_groups_ner_first_try_en.md new file mode 100644 index 000000000000..8fbea40da7e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-social_groups_ner_first_try_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English social_groups_ner_first_try DistilBertForTokenClassification from AlonCohen +author: John Snow Labs +name: social_groups_ner_first_try +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `social_groups_ner_first_try` is an English model originally trained by AlonCohen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/social_groups_ner_first_try_en_5.2.0_3.0_1700621574481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/social_groups_ner_first_try_en_5.2.0_3.0_1700621574481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("social_groups_ner_first_try","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("social_groups_ner_first_try", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|social_groups_ner_first_try| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/AlonCohen/social-groups-ner-first-try \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_aloncohen_en.md b/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_aloncohen_en.md new file mode 100644 index 000000000000..553cc5e6e039 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_aloncohen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English social_groups_second_try_aloncohen DistilBertForTokenClassification from AlonCohen +author: John Snow Labs +name: social_groups_second_try_aloncohen +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `social_groups_second_try_aloncohen` is an English model originally trained by AlonCohen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/social_groups_second_try_aloncohen_en_5.2.0_3.0_1700622835787.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/social_groups_second_try_aloncohen_en_5.2.0_3.0_1700622835787.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("social_groups_second_try_aloncohen","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("social_groups_second_try_aloncohen", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|social_groups_second_try_aloncohen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/AlonCohen/social_groups_second_try \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_giladh_en.md b/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_giladh_en.md new file mode 100644 index 000000000000..eb1a8f29dde6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-social_groups_second_try_giladh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English social_groups_second_try_giladh DistilBertForTokenClassification from GiladH +author: John Snow Labs +name: social_groups_second_try_giladh +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `social_groups_second_try_giladh` is an English model originally trained by GiladH. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/social_groups_second_try_giladh_en_5.2.0_3.0_1700677625371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/social_groups_second_try_giladh_en_5.2.0_3.0_1700677625371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("social_groups_second_try_giladh","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("social_groups_second_try_giladh", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|social_groups_second_try_giladh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|244.0 MB| + +## References + +https://huggingface.co/GiladH/social_groups_second_try \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-sophie_spanish_implementation_en.md b/docs/_posts/ahmedlone127/2023-11-22-sophie_spanish_implementation_en.md new file mode 100644 index 000000000000..3161500a8858 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-sophie_spanish_implementation_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sophie_spanish_implementation DistilBertForTokenClassification from sophiestein +author: John Snow Labs +name: sophie_spanish_implementation +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sophie_spanish_implementation` is an English model originally trained by sophiestein. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sophie_spanish_implementation_en_5.2.0_3.0_1700676749743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sophie_spanish_implementation_en_5.2.0_3.0_1700676749743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("sophie_spanish_implementation","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("sophie_spanish_implementation", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sophie_spanish_implementation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/sophiestein/sophie-spanish-implementation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-spanish_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-spanish_ner_en.md new file mode 100644 index 000000000000..816d9ceef3e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-spanish_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English spanish_ner DistilBertForTokenClassification from dayvidwang +author: John Snow Labs +name: spanish_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spanish_ner` is an English model originally trained by dayvidwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spanish_ner_en_5.2.0_3.0_1700676749060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spanish_ner_en_5.2.0_3.0_1700676749060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("spanish_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("spanish_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spanish_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/dayvidwang/spanish_ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tabert_1k_naamapadam_en.md b/docs/_posts/ahmedlone127/2023-11-22-tabert_1k_naamapadam_en.md new file mode 100644 index 000000000000..982faacbdadf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tabert_1k_naamapadam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tabert_1k_naamapadam DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tabert_1k_naamapadam +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tabert_1k_naamapadam` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tabert_1k_naamapadam_en_5.2.0_3.0_1700620970803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tabert_1k_naamapadam_en_5.2.0_3.0_1700620970803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tabert_1k_naamapadam","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tabert_1k_naamapadam", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tabert_1k_naamapadam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/AnanthZeke/tabert-1k-naamapadam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tabert_2k_naamapadam_en.md b/docs/_posts/ahmedlone127/2023-11-22-tabert_2k_naamapadam_en.md new file mode 100644 index 000000000000..f9b3024e9a47 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tabert_2k_naamapadam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tabert_2k_naamapadam DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tabert_2k_naamapadam +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tabert_2k_naamapadam` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tabert_2k_naamapadam_en_5.2.0_3.0_1700623109007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tabert_2k_naamapadam_en_5.2.0_3.0_1700623109007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tabert_2k_naamapadam","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tabert_2k_naamapadam", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tabert_2k_naamapadam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|166.7 MB| + +## References + +https://huggingface.co/AnanthZeke/tabert-2k-naamapadam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_1k_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_en.md new file mode 100644 index 000000000000..b349ec493126 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_1k DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_1k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_1k` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_1k_en_5.2.0_3.0_1700673817143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_1k_en_5.2.0_3.0_1700673817143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_1k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_1k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_1k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/livinNector/TaNER-1k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_1k_indic_glue_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_indic_glue_en.md new file mode 100644 index 000000000000..a997d23a7346 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_indic_glue_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_1k_indic_glue DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: taner_1k_indic_glue +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_1k_indic_glue` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_1k_indic_glue_en_5.2.0_3.0_1700635610034.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_1k_indic_glue_en_5.2.0_3.0_1700635610034.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_1k_indic_glue","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_1k_indic_glue", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_1k_indic_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/AnanthZeke/TaNER-1k-indic_glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_1k_naamapdam_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_naamapdam_fine_tuned_en.md new file mode 100644 index 000000000000..422c95a7c865 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_naamapdam_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_1k_naamapdam_fine_tuned DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_1k_naamapdam_fine_tuned +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_1k_naamapdam_fine_tuned` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_1k_naamapdam_fine_tuned_en_5.2.0_3.0_1700669145891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_1k_naamapdam_fine_tuned_en_5.2.0_3.0_1700669145891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_1k_naamapdam_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_1k_naamapdam_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_1k_naamapdam_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/livinNector/taNER-1k-naamapdam-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_1k_v2_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_v2_en.md new file mode 100644 index 000000000000..87c8922cc45f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_1k_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_1k_v2 DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_1k_v2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_1k_v2` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_1k_v2_en_5.2.0_3.0_1700632269703.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_1k_v2_en_5.2.0_3.0_1700632269703.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_1k_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_1k_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_1k_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/livinNector/taNER-1k-V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_2k_indic_glue_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_2k_indic_glue_en.md new file mode 100644 index 000000000000..20b4cfd29dda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_2k_indic_glue_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_2k_indic_glue DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: taner_2k_indic_glue +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_2k_indic_glue` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_2k_indic_glue_en_5.2.0_3.0_1700646553400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_2k_indic_glue_en_5.2.0_3.0_1700646553400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_2k_indic_glue","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_2k_indic_glue", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_2k_indic_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|166.7 MB| + +## References + +https://huggingface.co/AnanthZeke/TaNER-2k-indic_glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_4k_indic_glue_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_4k_indic_glue_en.md new file mode 100644 index 000000000000..15c4bda569b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_4k_indic_glue_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_4k_indic_glue DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: taner_4k_indic_glue +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_4k_indic_glue` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_4k_indic_glue_en_5.2.0_3.0_1700649044676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_4k_indic_glue_en_5.2.0_3.0_1700649044676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_4k_indic_glue","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_4k_indic_glue", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_4k_indic_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|172.5 MB| + +## References + +https://huggingface.co/AnanthZeke/TaNER-4k-indic_glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_500_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_500_en.md new file mode 100644 index 000000000000..f0bd9cb10c1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_500_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_500 DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_500 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_500` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_500_en_5.2.0_3.0_1700674856384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_500_en_5.2.0_3.0_1700674856384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_500","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_500", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_500| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|162.4 MB| + +## References + +https://huggingface.co/livinNector/TaNER-500 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_500_indic_glue_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_500_indic_glue_en.md new file mode 100644 index 000000000000..409942c98b92 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_500_indic_glue_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_500_indic_glue DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: taner_500_indic_glue +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_500_indic_glue` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_500_indic_glue_en_5.2.0_3.0_1700632284235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_500_indic_glue_en_5.2.0_3.0_1700632284235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_500_indic_glue","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_500_indic_glue", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_500_indic_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|162.4 MB| + +## References + +https://huggingface.co/AnanthZeke/TaNER-500-indic_glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_500_naamapdam_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_500_naamapdam_fine_tuned_en.md new file mode 100644 index 000000000000..5636fbe4c968 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_500_naamapdam_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_500_naamapdam_fine_tuned DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_500_naamapdam_fine_tuned +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_500_naamapdam_fine_tuned` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_500_naamapdam_fine_tuned_en_5.2.0_3.0_1700639738234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_500_naamapdam_fine_tuned_en_5.2.0_3.0_1700639738234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_500_naamapdam_fine_tuned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_500_naamapdam_fine_tuned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_500_naamapdam_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|162.4 MB| + +## References + +https://huggingface.co/livinNector/taNER-500-naamapdam-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-taner_500_v2_en.md b/docs/_posts/ahmedlone127/2023-11-22-taner_500_v2_en.md new file mode 100644 index 000000000000..f39a58920d81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-taner_500_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English taner_500_v2 DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: taner_500_v2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `taner_500_v2` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/taner_500_v2_en_5.2.0_3.0_1700645382200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/taner_500_v2_en_5.2.0_3.0_1700645382200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("taner_500_v2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("taner_500_v2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|taner_500_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|162.4 MB| + +## References + +https://huggingface.co/livinNector/taNER-500-V2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tbert_ner_test_en.md b/docs/_posts/ahmedlone127/2023-11-22-tbert_ner_test_en.md new file mode 100644 index 000000000000..d2e834ad7d9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tbert_ner_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tbert_ner_test DistilBertForTokenClassification from ArunaSaraswathy +author: John Snow Labs +name: tbert_ner_test +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tbert_ner_test` is an English model originally trained by ArunaSaraswathy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tbert_ner_test_en_5.2.0_3.0_1700644067138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tbert_ner_test_en_5.2.0_3.0_1700644067138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tbert_ner_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tbert_ner_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tbert_ner_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/ArunaSaraswathy/tbert_ner_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_distilbert_finetuned_ner_creditcardcontract_en.md b/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_distilbert_finetuned_ner_creditcardcontract_en.md new file mode 100644 index 000000000000..abb283665a18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_distilbert_finetuned_ner_creditcardcontract_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English team_gryffindor_distilbert_finetuned_ner_creditcardcontract DistilBertForTokenClassification from timhbach +author: John Snow Labs +name: team_gryffindor_distilbert_finetuned_ner_creditcardcontract +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `team_gryffindor_distilbert_finetuned_ner_creditcardcontract` is an English model originally trained by timhbach.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/team_gryffindor_distilbert_finetuned_ner_creditcardcontract_en_5.2.0_3.0_1700627761570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/team_gryffindor_distilbert_finetuned_ner_creditcardcontract_en_5.2.0_3.0_1700627761570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("team_gryffindor_distilbert_finetuned_ner_creditcardcontract","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("team_gryffindor_distilbert_finetuned_ner_creditcardcontract", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|team_gryffindor_distilbert_finetuned_ner_creditcardcontract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/timhbach/Team-Gryffindor-DistilBERT-finetuned-ner-creditcardcontract \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_ner_en.md new file mode 100644 index 000000000000..00617ba333c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-team_gryffindor_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English team_gryffindor_ner DistilBertForTokenClassification from timhbach +author: John Snow Labs +name: team_gryffindor_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `team_gryffindor_ner` is an English model originally trained by timhbach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/team_gryffindor_ner_en_5.2.0_3.0_1700623679547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/team_gryffindor_ner_en_5.2.0_3.0_1700623679547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("team_gryffindor_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("team_gryffindor_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|team_gryffindor_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/timhbach/Team_Gryffindor_NER \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test2_en.md b/docs/_posts/ahmedlone127/2023-11-22-test2_en.md new file mode 100644 index 000000000000..929532632133 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test2 DistilBertForTokenClassification from yam1ke +author: John Snow Labs +name: test2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test2` is an English model originally trained by yam1ke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test2_en_5.2.0_3.0_1700682010269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test2_en_5.2.0_3.0_1700682010269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yam1ke/test2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test4_en.md b/docs/_posts/ahmedlone127/2023-11-22-test4_en.md new file mode 100644 index 000000000000..d6c7be19d570 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test4 DistilBertForTokenClassification from yam1ke +author: John Snow Labs +name: test4 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test4` is an English model originally trained by yam1ke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test4_en_5.2.0_3.0_1700671278388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test4_en_5.2.0_3.0_1700671278388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test4","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test4", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/yam1ke/test4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_distilbert_base_multilingual_cased_xx.md b/docs/_posts/ahmedlone127/2023-11-22-test_distilbert_base_multilingual_cased_xx.md new file mode 100644 index 000000000000..300791100948 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_distilbert_base_multilingual_cased_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual test_distilbert_base_multilingual_cased DistilBertForTokenClassification from Erdenebold +author: John Snow Labs +name: test_distilbert_base_multilingual_cased +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_distilbert_base_multilingual_cased` is a Multilingual model originally trained by Erdenebold. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700621466558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_distilbert_base_multilingual_cased_xx_5.2.0_3.0_1700621466558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_distilbert_base_multilingual_cased","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_distilbert_base_multilingual_cased", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_distilbert_base_multilingual_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Erdenebold/test-distilbert-base-multilingual-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_ner3_en.md b/docs/_posts/ahmedlone127/2023-11-22-test_ner3_en.md new file mode 100644 index 000000000000..41504e7ce3d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_ner3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_ner3 DistilBertForTokenClassification from chintagunta85 +author: John Snow Labs +name: test_ner3 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_ner3` is an English model originally trained by chintagunta85. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_ner3_en_5.2.0_3.0_1700650330909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_ner3_en_5.2.0_3.0_1700650330909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_ner3","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_ner3", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_ner3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/chintagunta85/test_ner3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-test_ner_en.md new file mode 100644 index 000000000000..898e08a24e32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_ner DistilBertForTokenClassification from Falah +author: John Snow Labs +name: test_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_ner` is an English model originally trained by Falah. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_ner_en_5.2.0_3.0_1700626721192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_ner_en_5.2.0_3.0_1700626721192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Falah/test-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_ner_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-test_ner_finetuned_ner_en.md new file mode 100644 index 000000000000..9dfeff71aa1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_ner_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_ner_finetuned_ner DistilBertForTokenClassification from HYM +author: John Snow Labs +name: test_ner_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_ner_finetuned_ner` is an English model originally trained by HYM. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_ner_finetuned_ner_en_5.2.0_3.0_1700636973549.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_ner_finetuned_ner_en_5.2.0_3.0_1700636973549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_ner_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_ner_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_ner_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/HYM/test_ner-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_train_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-test_train_model_en.md new file mode 100644 index 000000000000..bdb02ee3e0b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_train_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_train_model DistilBertForTokenClassification from terhdavid +author: John Snow Labs +name: test_train_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_train_model` is an English model originally trained by terhdavid. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_train_model_en_5.2.0_3.0_1700681187340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_train_model_en_5.2.0_3.0_1700681187340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_train_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_train_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_train_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/terhdavid/test-train-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-test_wnut_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-test_wnut_model_en.md new file mode 100644 index 000000000000..54f18dacaa02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-test_wnut_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_wnut_model DistilBertForTokenClassification from blambert +author: John Snow Labs +name: test_wnut_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_wnut_model` is an English model originally trained by blambert. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_wnut_model_en_5.2.0_3.0_1700621773048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_wnut_model_en_5.2.0_3.0_1700621773048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("test_wnut_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("test_wnut_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_wnut_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/blambert/test_wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-testingmodel_mn.md b/docs/_posts/ahmedlone127/2023-11-22-testingmodel_mn.md new file mode 100644 index 000000000000..c35290ab67d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-testingmodel_mn.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Mongolian testingmodel DistilBertForTokenClassification from Blgn94 +author: John Snow Labs +name: testingmodel +date: 2023-11-22 +tags: [bert, mn, open_source, token_classification, onnx] +task: Named Entity Recognition +language: mn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testingmodel` is a Mongolian model originally trained by Blgn94. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testingmodel_mn_5.2.0_3.0_1700634803891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testingmodel_mn_5.2.0_3.0_1700634803891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("testingmodel","mn") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("testingmodel", "mn") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testingmodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|mn| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Blgn94/testingModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-testrun_model_en.md b/docs/_posts/ahmedlone127/2023-11-22-testrun_model_en.md new file mode 100644 index 000000000000..5d2eb69ae5fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-testrun_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English testrun_model DistilBertForTokenClassification from adasgaleus +author: John Snow Labs +name: testrun_model +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `testrun_model` is an English model originally trained by adasgaleus. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/testrun_model_en_5.2.0_3.0_1700645596478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/testrun_model_en_5.2.0_3.0_1700645596478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("testrun_model","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("testrun_model", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|testrun_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adasgaleus/testrun_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tiny_random_distilbertfortokenclassification_hf_tiny_model_private_en.md b/docs/_posts/ahmedlone127/2023-11-22-tiny_random_distilbertfortokenclassification_hf_tiny_model_private_en.md new file mode 100644 index 000000000000..622f18ec1c74 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tiny_random_distilbertfortokenclassification_hf_tiny_model_private_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_random_distilbertfortokenclassification_hf_tiny_model_private DistilBertForTokenClassification from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_distilbertfortokenclassification_hf_tiny_model_private +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbertfortokenclassification_hf_tiny_model_private` is an English model originally trained by hf-tiny-model-private.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertfortokenclassification_hf_tiny_model_private_en_5.2.0_3.0_1700678338641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertfortokenclassification_hf_tiny_model_private_en_5.2.0_3.0_1700678338641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tiny_random_distilbertfortokenclassification_hf_tiny_model_private","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tiny_random_distilbertfortokenclassification_hf_tiny_model_private", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
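The Download and Copy S3 URI links in these model cards all end in an archive name that appears to follow one pattern: model name, language code, Spark NLP version, Spark version, and a 13-digit number that looks like an upload time in epoch milliseconds. The following is a minimal sketch for splitting such a name into its parts. The pattern is inferred from the links on this page rather than from a documented contract, and `ModelArchive` and `parse_archive_name` are hypothetical helper names, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    """Parts of a model archive file name (hypothetical helper type)."""
    name: str
    language: str
    spark_nlp_version: str
    spark_version: str
    uploaded_at: datetime


# Assumed layout, inferred from the links in these model cards:
#   <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
_ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(url_or_file: str) -> ModelArchive:
    """Split an archive URL or file name into its assumed components."""
    # Keep only the last path segment so https:// and s3:// links both work.
    file_name = url_or_file.rsplit("/", 1)[-1]
    match = _ARCHIVE_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected archive name: {file_name!r}")
    # Assumption: the trailing number is milliseconds since the Unix epoch.
    uploaded_at = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return ModelArchive(
        name=match["name"],
        language=match["lang"],
        spark_nlp_version=match["nlp"],
        spark_version=match["spark"],
        uploaded_at=uploaded_at,
    )


if __name__ == "__main__":
    info = parse_archive_name(
        "s3://auxdata.johnsnowlabs.com/public/models/"
        "tiny_random_distilbertfortokenclassification_hf_tiny_model_private"
        "_en_5.2.0_3.0_1700678338641.zip"
    )
    print(info)
```

The name group is greedy on purpose, because model names themselves contain underscores; the language, version, and timestamp groups anchor the split from the right. If the naming scheme differs for other releases, the function raises `ValueError` instead of guessing.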
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbertfortokenclassification_hf_tiny_model_private| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|347.3 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-DistilBertForTokenClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tmp_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-22-tmp_trainer_en.md new file mode 100644 index 000000000000..6a066f64e507 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tmp_trainer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tmp_trainer DistilBertForTokenClassification from anyuanay +author: John Snow Labs +name: tmp_trainer +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tmp_trainer` is an English model originally trained by anyuanay. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tmp_trainer_en_5.2.0_3.0_1700664745538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tmp_trainer_en_5.2.0_3.0_1700664745538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tmp_trainer","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tmp_trainer", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tmp_trainer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/anyuanay/tmp_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_classification_test_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_classification_test_en.md new file mode 100644 index 000000000000..ed08d2f2ee17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_classification_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_classification_test DistilBertForTokenClassification from casonshep +author: John Snow Labs +name: token_classification_test +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_classification_test` is an English model originally trained by casonshep. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_classification_test_en_5.2.0_3.0_1700635298027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_classification_test_en_5.2.0_3.0_1700635298027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_classification_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_classification_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_classification_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/casonshep/token_classification_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_final_tunned_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_final_tunned_en.md new file mode 100644 index 000000000000..c1da497db082 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_final_tunned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_final_tunned DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_final_tunned +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_final_tunned` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_final_tunned_en_5.2.0_3.0_1700659234065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_final_tunned_en_5.2.0_3.0_1700659234065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_final_tunned","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_final_tunned", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_final_tunned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_final_tunned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_en.md new file mode 100644 index 000000000000..bc6c55eb8134 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2 DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_en_5.2.0_3.0_1700634798346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_en_5.2.0_3.0_1700634798346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_galician_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_galician_en.md new file mode 100644 index 000000000000..d9cba1e71383 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_galician_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2_galician DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2_galician +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2_galician` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_galician_en_5.2.0_3.0_1700654252458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_galician_en_5.2.0_3.0_1700654252458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2_galician","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2_galician", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2_galician| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2_gl \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl11_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl11_en.md new file mode 100644 index 000000000000..a6373f10d151 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl11_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2_gl11 DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2_gl11 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2_gl11` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl11_en_5.2.0_3.0_1700632388524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl11_en_5.2.0_3.0_1700632388524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2_gl11","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2_gl11", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2_gl11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2_gl11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl6_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl6_en.md new file mode 100644 index 000000000000..2dfc6dbdfbd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2_gl6 DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2_gl6 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2_gl6` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl6_en_5.2.0_3.0_1700623540069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl6_en_5.2.0_3.0_1700623540069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2_gl6","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2_gl6", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2_gl6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2_gl6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl7_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl7_en.md new file mode 100644 index 000000000000..c30c965c5e3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2_gl7 DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2_gl7 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2_gl7` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl7_en_5.2.0_3.0_1700652125288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl7_en_5.2.0_3.0_1700652125288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2_gl7","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2_gl7", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2_gl7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2_gl7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl9_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl9_en.md new file mode 100644 index 000000000000..b4d714b8238c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_2_gl9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart_2_gl9 DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart_2_gl9 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart_2_gl9` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl9_en_5.2.0_3.0_1700623819818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_2_gl9_en_5.2.0_3.0_1700623819818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart_2_gl9","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart_2_gl9", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart_2_gl9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart_2_gl9 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_en.md new file mode 100644 index 000000000000..e5ef82e6bf83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_fine_tunned_flipkart_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_fine_tunned_flipkart DistilBertForTokenClassification from vinayak361 +author: John Snow Labs +name: token_fine_tunned_flipkart +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_fine_tunned_flipkart` is an English model originally trained by vinayak361. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_en_5.2.0_3.0_1700653422928.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_fine_tunned_flipkart_en_5.2.0_3.0_1700653422928.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_fine_tunned_flipkart","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_fine_tunned_flipkart", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_fine_tunned_flipkart| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/vinayak361/token_fine_tunned_flipkart \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-token_level_stereotype_detector_en.md b/docs/_posts/ahmedlone127/2023-11-22-token_level_stereotype_detector_en.md new file mode 100644 index 000000000000..bc8887a6a6d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-token_level_stereotype_detector_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English token_level_stereotype_detector DistilBertForTokenClassification from wu981526092 +author: John Snow Labs +name: token_level_stereotype_detector +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `token_level_stereotype_detector` is an English model originally trained by wu981526092. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/token_level_stereotype_detector_en_5.2.0_3.0_1700623659429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/token_level_stereotype_detector_en_5.2.0_3.0_1700623659429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("token_level_stereotype_detector","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("token_level_stereotype_detector", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|token_level_stereotype_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/wu981526092/Token-Level-Stereotype-Detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tokenclass_wnut_en.md b/docs/_posts/ahmedlone127/2023-11-22-tokenclass_wnut_en.md new file mode 100644 index 000000000000..71a91ba32563 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tokenclass_wnut_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tokenclass_wnut DistilBertForTokenClassification from Madhura +author: John Snow Labs +name: tokenclass_wnut +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tokenclass_wnut` is an English model originally trained by Madhura. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tokenclass_wnut_en_5.2.0_3.0_1700650763879.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tokenclass_wnut_en_5.2.0_3.0_1700650763879.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tokenclass_wnut","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tokenclass_wnut", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tokenclass_wnut| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Madhura/tokenclass-wnut \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-try_connll_finetuned_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-try_connll_finetuned_ner_en.md new file mode 100644 index 000000000000..482c7da67c2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-try_connll_finetuned_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English try_connll_finetuned_ner DistilBertForTokenClassification from suwani +author: John Snow Labs +name: try_connll_finetuned_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `try_connll_finetuned_ner` is an English model originally trained by suwani. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/try_connll_finetuned_ner_en_5.2.0_3.0_1700642408528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/try_connll_finetuned_ner_en_5.2.0_3.0_1700642408528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("try_connll_finetuned_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("try_connll_finetuned_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|try_connll_finetuned_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/suwani/try_connll-finetuned-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_2k_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_2k_en.md new file mode 100644 index 000000000000..e51e7da6ce08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_2k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_2k DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: tryner_2k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_2k` is an English model originally trained by livinNector. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_2k_en_5.2.0_3.0_1700676570276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_2k_en_5.2.0_3.0_1700676570276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_2k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_2k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_2k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|166.7 MB| + +## References + +https://huggingface.co/livinNector/TryNer-2k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_4k_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_4k_en.md new file mode 100644 index 000000000000..074c7f9f6934 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_4k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_4k DistilBertForTokenClassification from livinNector +author: John Snow Labs +name: tryner_4k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_4k` is an English model originally trained by livinNector. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_4k_en_5.2.0_3.0_1700679660758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_4k_en_5.2.0_3.0_1700679660758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_4k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_4k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_4k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|172.5 MB| + +## References + +https://huggingface.co/livinNector/TryNer-4k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_1k_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_1k_en.md new file mode 100644 index 000000000000..0411994db8c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_1k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_tabert_1k DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tryner_tabert_1k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_tabert_1k` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_tabert_1k_en_5.2.0_3.0_1700633428325.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_tabert_1k_en_5.2.0_3.0_1700633428325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_tabert_1k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_tabert_1k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_tabert_1k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|163.8 MB| + +## References + +https://huggingface.co/AnanthZeke/TryNER-tabert-1k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_2k_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_2k_en.md new file mode 100644 index 000000000000..86899ddda752 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_2k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_tabert_2k DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tryner_tabert_2k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_tabert_2k` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_tabert_2k_en_5.2.0_3.0_1700642877270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_tabert_2k_en_5.2.0_3.0_1700642877270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_tabert_2k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_tabert_2k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_tabert_2k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|166.7 MB| + +## References + +https://huggingface.co/AnanthZeke/TryNER-tabert-2k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_4k_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_4k_en.md new file mode 100644 index 000000000000..dfb7c318ec96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_4k_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_tabert_4k DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tryner_tabert_4k +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_tabert_4k` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_tabert_4k_en_5.2.0_3.0_1700660105026.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_tabert_4k_en_5.2.0_3.0_1700660105026.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_tabert_4k","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_tabert_4k", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_tabert_4k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|172.5 MB| + +## References + +https://huggingface.co/AnanthZeke/TryNER-tabert-4k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_500_en.md b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_500_en.md new file mode 100644 index 000000000000..2b04f1a53340 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tryner_tabert_500_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tryner_tabert_500 DistilBertForTokenClassification from AnanthZeke +author: John Snow Labs +name: tryner_tabert_500 +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tryner_tabert_500` is an English model originally trained by AnanthZeke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tryner_tabert_500_en_5.2.0_3.0_1700637784683.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tryner_tabert_500_en_5.2.0_3.0_1700637784683.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tryner_tabert_500","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tryner_tabert_500", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tryner_tabert_500| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|162.4 MB| + +## References + +https://huggingface.co/AnanthZeke/TryNER-tabert-500 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-tutorial_en.md b/docs/_posts/ahmedlone127/2023-11-22-tutorial_en.md new file mode 100644 index 000000000000..230b8a5577d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-tutorial_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tutorial DistilBertForTokenClassification from lrmironova +author: John Snow Labs +name: tutorial +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tutorial` is an English model originally trained by lrmironova. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tutorial_en_5.2.0_3.0_1700665859595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tutorial_en_5.2.0_3.0_1700665859595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("tutorial","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("tutorial", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tutorial| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/lrmironova/tutorial \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-un_ner_test_en.md b/docs/_posts/ahmedlone127/2023-11-22-un_ner_test_en.md new file mode 100644 index 000000000000..80201ecc1000 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-un_ner_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English un_ner_test DistilBertForTokenClassification from saint1729 +author: John Snow Labs +name: un_ner_test +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `un_ner_test` is an English model originally trained by saint1729. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/un_ner_test_en_5.2.0_3.0_1700628008297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/un_ner_test_en_5.2.0_3.0_1700628008297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("un_ner_test","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("un_ner_test", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|un_ner_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/saint1729/un-ner-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-uner_distilbert_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-uner_distilbert_ner_en.md new file mode 100644 index 000000000000..43cba1f43ec7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-uner_distilbert_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English uner_distilbert_ner DistilBertForTokenClassification from mirfan899 +author: John Snow Labs +name: uner_distilbert_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uner_distilbert_ner` is an English model originally trained by mirfan899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uner_distilbert_ner_en_5.2.0_3.0_1700623832046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uner_distilbert_ner_en_5.2.0_3.0_1700623832046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("uner_distilbert_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("uner_distilbert_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uner_distilbert_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/mirfan899/uner-distilbert-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-vietai_asm1_ner_en.md b/docs/_posts/ahmedlone127/2023-11-22-vietai_asm1_ner_en.md new file mode 100644 index 000000000000..aeefc6a0b65c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-vietai_asm1_ner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vietai_asm1_ner DistilBertForTokenClassification from QyQy +author: John Snow Labs +name: vietai_asm1_ner +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vietai_asm1_ner` is an English model originally trained by QyQy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vietai_asm1_ner_en_5.2.0_3.0_1700646261262.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vietai_asm1_ner_en_5.2.0_3.0_1700646261262.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("vietai_asm1_ner","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("vietai_asm1_ner", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vietai_asm1_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/QyQy/VietAI-ASM1-Ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-wikineural_multilingual_ner_xx.md b/docs/_posts/ahmedlone127/2023-11-22-wikineural_multilingual_ner_xx.md new file mode 100644 index 000000000000..e80025ff5a4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-wikineural_multilingual_ner_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual wikineural_multilingual_ner DistilBertForTokenClassification from iiShreya +author: John Snow Labs +name: wikineural_multilingual_ner +date: 2023-11-22 +tags: [bert, xx, open_source, token_classification, onnx] +task: Named Entity Recognition +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wikineural_multilingual_ner` is a multilingual model originally trained by iiShreya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wikineural_multilingual_ner_xx_5.2.0_3.0_1700662121716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wikineural_multilingual_ner_xx_5.2.0_3.0_1700662121716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("wikineural_multilingual_ner","xx") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("wikineural_multilingual_ner", "xx") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wikineural_multilingual_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/iiShreya/wikineural-multilingual-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-wnut_model_navendux_en.md b/docs/_posts/ahmedlone127/2023-11-22-wnut_model_navendux_en.md new file mode 100644 index 000000000000..25742f5c1849 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-wnut_model_navendux_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wnut_model_navendux DistilBertForTokenClassification from navendux +author: John Snow Labs +name: wnut_model_navendux +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wnut_model_navendux` is an English model originally trained by navendux. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wnut_model_navendux_en_5.2.0_3.0_1700663932312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wnut_model_navendux_en_5.2.0_3.0_1700663932312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("wnut_model_navendux","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("wnut_model_navendux", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|wnut_model_navendux| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[ner]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/navendux/wnut_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-22-wnut_model_realgon_en.md b/docs/_posts/ahmedlone127/2023-11-22-wnut_model_realgon_en.md new file mode 100644 index 000000000000..b8a969a27c26 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-22-wnut_model_realgon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English wnut_model_realgon DistilBertForTokenClassification from Realgon +author: John Snow Labs +name: wnut_model_realgon +date: 2023-11-22 +tags: [bert, en, open_source, token_classification, onnx] +task: Named Entity Recognition +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForTokenClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wnut_model_realgon` is an English model originally trained by Realgon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wnut_model_realgon_en_5.2.0_3.0_1700672181854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wnut_model_realgon_en_5.2.0_3.0_1700672181854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +tokenClassifier = DistilBertForTokenClassification.pretrained("wnut_model_realgon","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("ner") + +pipeline = Pipeline().setStages([documentAssembler, tokenClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val tokenClassifier = DistilBertForTokenClassification + .pretrained("wnut_model_realgon", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("ner") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wnut_model_realgon|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/Realgon/wnut_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-22-wolof_finetuned_ner_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-22-wolof_finetuned_ner_accelerate_en.md
new file mode 100644
index 000000000000..86960644d3ef
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-22-wolof_finetuned_ner_accelerate_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English wolof_finetuned_ner_accelerate DistilBertForTokenClassification from vonewman
+author: John Snow Labs
+name: wolof_finetuned_ner_accelerate
+date: 2023-11-22
+tags: [bert, en, open_source, token_classification, onnx]
+task: Named Entity Recognition
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForTokenClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForTokenClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `wolof_finetuned_ner_accelerate` is an English model originally trained by vonewman.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/wolof_finetuned_ner_accelerate_en_5.2.0_3.0_1700667643646.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/wolof_finetuned_ner_accelerate_en_5.2.0_3.0_1700667643646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("documents")
+
+tokenizer = Tokenizer().setInputCols(["documents"]).setOutputCol("token")
+
+tokenClassifier = DistilBertForTokenClassification.pretrained("wolof_finetuned_ner_accelerate","en") \
+    .setInputCols(["documents","token"]) \
+    .setOutputCol("ner")
+
+pipeline = Pipeline().setStages([documentAssembler, tokenizer, tokenClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("documents")
+
+val tokenizer = new Tokenizer().setInputCols(Array("documents")).setOutputCol("token")
+
+val tokenClassifier = DistilBertForTokenClassification
+    .pretrained("wolof_finetuned_ner_accelerate", "en")
+    .setInputCols(Array("documents","token"))
+    .setOutputCol("ner")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, tokenClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|wolof_finetuned_ner_accelerate|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[ner]|
+|Language:|en|
+|Size:|247.3 MB|
+
+## References
+
+https://huggingface.co/vonewman/wolof-finetuned-ner-accelerate
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-adhd_test_qa_model_only_mike_en.md b/docs/_posts/ahmedlone127/2023-11-26-adhd_test_qa_model_only_mike_en.md
new file mode 100644
index 000000000000..84a0d8ad6186
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-adhd_test_qa_model_only_mike_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English adhd_test_qa_model_only_mike DistilBertForQuestionAnswering from Only-Mike
+author: John Snow Labs
+name: adhd_test_qa_model_only_mike
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adhd_test_qa_model_only_mike` is an English model originally trained by Only-Mike.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adhd_test_qa_model_only_mike_en_5.2.0_3.0_1701041835430.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adhd_test_qa_model_only_mike_en_5.2.0_3.0_1701041835430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("adhd_test_qa_model_only_mike","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("adhd_test_qa_model_only_mike", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|adhd_test_qa_model_only_mike|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Only-Mike/ADHD_Test_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-ai_challenge_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-ai_challenge_model_en.md
new file mode 100644
index 000000000000..f14fa211a572
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-ai_challenge_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English ai_challenge_model DistilBertForQuestionAnswering from minhcrafters
+author: John Snow Labs
+name: ai_challenge_model
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ai_challenge_model` is an English model originally trained by minhcrafters.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_challenge_model_en_5.2.0_3.0_1701016192549.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_challenge_model_en_5.2.0_3.0_1701016192549.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("ai_challenge_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("ai_challenge_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|ai_challenge_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/minhcrafters/ai-challenge-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-albertina_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-albertina_qa_model_en.md
new file mode 100644
index 000000000000..85d1c0d2ecc2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-albertina_qa_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English albertina_qa_model DistilBertForQuestionAnswering from Ramison
+author: John Snow Labs
+name: albertina_qa_model
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `albertina_qa_model` is an English model originally trained by Ramison.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albertina_qa_model_en_5.2.0_3.0_1701036118988.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albertina_qa_model_en_5.2.0_3.0_1701036118988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("albertina_qa_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("albertina_qa_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|albertina_qa_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Ramison/albertina_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_en.md
new file mode 100644
index 000000000000..942a14f060c6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English anirban_qa_model DistilBertForQuestionAnswering from anirbankgec
+author: John Snow Labs
+name: anirban_qa_model
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `anirban_qa_model` is an English model originally trained by anirbankgec.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/anirban_qa_model_en_5.2.0_3.0_1701037857672.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/anirban_qa_model_en_5.2.0_3.0_1701037857672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("anirban_qa_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("anirban_qa_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|anirban_qa_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/anirbankgec/anirban_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_finetuned_en.md
new file mode 100644
index 000000000000..c3c7a68bfcd2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-anirban_qa_model_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English anirban_qa_model_finetuned DistilBertForQuestionAnswering from AnirbanRC
+author: John Snow Labs
+name: anirban_qa_model_finetuned
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `anirban_qa_model_finetuned` is an English model originally trained by AnirbanRC.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/anirban_qa_model_finetuned_en_5.2.0_3.0_1701035385579.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/anirban_qa_model_finetuned_en_5.2.0_3.0_1701035385579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("anirban_qa_model_finetuned","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("anirban_qa_model_finetuned", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|anirban_qa_model_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/AnirbanRC/anirban_qa_model_finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-arabicdistilbert_qa_ar.md b/docs/_posts/ahmedlone127/2023-11-26-arabicdistilbert_qa_ar.md
new file mode 100644
index 000000000000..4d2ac170cd72
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-arabicdistilbert_qa_ar.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Arabic arabicdistilbert_qa DistilBertForQuestionAnswering from arabi-elidrisi
+author: John Snow Labs
+name: arabicdistilbert_qa
+date: 2023-11-26
+tags: [distilbert, ar, open_source, question_answering, onnx]
+task: Question Answering
+language: ar
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arabicdistilbert_qa` is an Arabic model originally trained by arabi-elidrisi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arabicdistilbert_qa_ar_5.2.0_3.0_1701016049647.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arabicdistilbert_qa_ar_5.2.0_3.0_1701016049647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("arabicdistilbert_qa","ar") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("arabicdistilbert_qa", "ar")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|arabicdistilbert_qa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|ar|
+|Size:|407.6 MB|
+
+## References
+
+https://huggingface.co/arabi-elidrisi/ArabicDistilBERT_QA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-askinvesto_distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-askinvesto_distilbert_model_en.md
new file mode 100644
index 000000000000..9fd554f9f71d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-askinvesto_distilbert_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English askinvesto_distilbert_model DistilBertForQuestionAnswering from jadegao
+author: John Snow Labs
+name: askinvesto_distilbert_model
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `askinvesto_distilbert_model` is an English model originally trained by jadegao.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/askinvesto_distilbert_model_en_5.2.0_3.0_1701040820428.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/askinvesto_distilbert_model_en_5.2.0_3.0_1701040820428.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("askinvesto_distilbert_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("askinvesto_distilbert_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|askinvesto_distilbert_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/jadegao/askinvesto-distilbert-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-autotrain_chatbot_part_34_54485127518_en.md b/docs/_posts/ahmedlone127/2023-11-26-autotrain_chatbot_part_34_54485127518_en.md
new file mode 100644
index 000000000000..712c1bf985b4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-autotrain_chatbot_part_34_54485127518_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_chatbot_part_34_54485127518 DistilBertForQuestionAnswering from harshith34
+author: John Snow Labs
+name: autotrain_chatbot_part_34_54485127518
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_chatbot_part_34_54485127518` is an English model originally trained by harshith34.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_chatbot_part_34_54485127518_en_5.2.0_3.0_1701041601123.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_chatbot_part_34_54485127518_en_5.2.0_3.0_1701041601123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_chatbot_part_34_54485127518","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("autotrain_chatbot_part_34_54485127518", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_chatbot_part_34_54485127518|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/harshith34/autotrain-chatbot-part-34-54485127518
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-autotrain_distill_bert_anda_neuralg8_99799147480_en.md b/docs/_posts/ahmedlone127/2023-11-26-autotrain_distill_bert_anda_neuralg8_99799147480_en.md
new file mode 100644
index 000000000000..f1980191fabd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-autotrain_distill_bert_anda_neuralg8_99799147480_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_distill_bert_anda_neuralg8_99799147480 DistilBertForQuestionAnswering from Samis922
+author: John Snow Labs
+name: autotrain_distill_bert_anda_neuralg8_99799147480
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_distill_bert_anda_neuralg8_99799147480` is an English model originally trained by Samis922.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_distill_bert_anda_neuralg8_99799147480_en_5.2.0_3.0_1701015690357.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_distill_bert_anda_neuralg8_99799147480_en_5.2.0_3.0_1701015690357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_distill_bert_anda_neuralg8_99799147480","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("autotrain_distill_bert_anda_neuralg8_99799147480", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_distill_bert_anda_neuralg8_99799147480|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Samis922/autotrain-distill_bert_anda_neuralg8-99799147480
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-autotrain_nlp_distilbert_2772181933_en.md b/docs/_posts/ahmedlone127/2023-11-26-autotrain_nlp_distilbert_2772181933_en.md
new file mode 100644
index 000000000000..e3932a0623ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-autotrain_nlp_distilbert_2772181933_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English autotrain_nlp_distilbert_2772181933 DistilBertForQuestionAnswering from nejox
+author: John Snow Labs
+name: autotrain_nlp_distilbert_2772181933
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_nlp_distilbert_2772181933` is an English model originally trained by nejox.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_nlp_distilbert_2772181933_en_5.2.0_3.0_1701037434465.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_nlp_distilbert_2772181933_en_5.2.0_3.0_1701037434465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_nlp_distilbert_2772181933","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("autotrain_nlp_distilbert_2772181933", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
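The snippets in these cards pass a `data` DataFrame to `fit` and `transform` without defining it. A minimal sketch of one way to build it, assuming an active Spark session (for example from `sparknlp.start()`) is passed in as `spark`; the sample question and context and the `build_data` helper are hypothetical and not part of the model cards:

```python
# Hypothetical sample row: one (question, context) pair.
rows = [("What is my name?", "My name is Clara and I live in Berkeley.")]

# Column names must match the MultiDocumentAssembler input columns.
columns = ["question", "context"]


def build_data(spark):
    """Return a DataFrame with one row per (question, context) pair."""
    return spark.createDataFrame(rows, columns)
```

With `data = build_data(spark)`, the pipeline above can be fitted and applied, and the predicted answer span can be inspected with `pipelineDF.select("answer.result").show(truncate=False)`.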
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_nlp_distilbert_2772181933| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nejox/autotrain-nlp_distilbert-2772181933 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_arabic_en.md b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_arabic_en.md new file mode 100644 index 000000000000..aa2102f46aef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_arabic_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English base_seq_lab_arabic DistilBertForQuestionAnswering from mathildeparlo +author: John Snow Labs +name: base_seq_lab_arabic +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_seq_lab_arabic` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_seq_lab_arabic_en_5.2.0_3.0_1701014477978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_seq_lab_arabic_en_5.2.0_3.0_1701014477978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("base_seq_lab_arabic","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("base_seq_lab_arabic", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_seq_lab_arabic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mathildeparlo/base_seq_lab_arabic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_bengali_en.md b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_bengali_en.md new file mode 100644 index 000000000000..92e1af5dce8d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_bengali_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English base_seq_lab_bengali DistilBertForQuestionAnswering from mathildeparlo +author: John Snow Labs +name: base_seq_lab_bengali +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_seq_lab_bengali` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_seq_lab_bengali_en_5.2.0_3.0_1701017579813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_seq_lab_bengali_en_5.2.0_3.0_1701017579813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("base_seq_lab_bengali","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("base_seq_lab_bengali", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_seq_lab_bengali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mathildeparlo/base_seq_lab_bengali \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_indonesian_en.md b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_indonesian_en.md new file mode 100644 index 000000000000..29610332d783 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-base_seq_lab_indonesian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English base_seq_lab_indonesian DistilBertForQuestionAnswering from mathildeparlo +author: John Snow Labs +name: base_seq_lab_indonesian +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `base_seq_lab_indonesian` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/base_seq_lab_indonesian_en_5.2.0_3.0_1701017688480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/base_seq_lab_indonesian_en_5.2.0_3.0_1701017688480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("base_seq_lab_indonesian","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("base_seq_lab_indonesian", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|base_seq_lab_indonesian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mathildeparlo/base_seq_lab_indonesian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bbert_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-bbert_qa_en.md new file mode 100644 index 000000000000..5b19f66ed293 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bbert_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bbert_qa DistilBertForQuestionAnswering from rsml +author: John Snow Labs +name: bbert_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bbert_qa` is an English model originally trained by rsml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bbert_qa_en_5.2.0_3.0_1701019434593.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bbert_qa_en_5.2.0_3.0_1701019434593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bbert_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bbert_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bbert_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rsml/bbert_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bbertqa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-bbertqa_model_en.md new file mode 100644 index 000000000000..032d2a1dcc01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bbertqa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bbertqa_model DistilBertForQuestionAnswering from rsml +author: John Snow Labs +name: bbertqa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bbertqa_model` is an English model originally trained by rsml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bbertqa_model_en_5.2.0_3.0_1701030537141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bbertqa_model_en_5.2.0_3.0_1701030537141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bbertqa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bbertqa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bbertqa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rsml/bbertqa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_chat_finetune_app_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_chat_finetune_app_en.md new file mode 100644 index 000000000000..fa3ac970780f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_chat_finetune_app_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_chat_finetune_app DistilBertForQuestionAnswering from atharvapawar +author: John Snow Labs +name: bert_chat_finetune_app +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_chat_finetune_app` is an English model originally trained by atharvapawar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_chat_finetune_app_en_5.2.0_3.0_1701038024850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_chat_finetune_app_en_5.2.0_3.0_1701038024850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_chat_finetune_app","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_chat_finetune_app", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_chat_finetune_app| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/atharvapawar/BERT-chat-finetune-APP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_clinicalqa_emrqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_clinicalqa_emrqa_en.md new file mode 100644 index 000000000000..2471179b0670 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_clinicalqa_emrqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_clinicalqa_emrqa DistilBertForQuestionAnswering from aaditya +author: John Snow Labs +name: bert_clinicalqa_emrqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_clinicalqa_emrqa` is an English model originally trained by aaditya. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_clinicalqa_emrqa_en_5.2.0_3.0_1701016585621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_clinicalqa_emrqa_en_5.2.0_3.0_1701016585621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_clinicalqa_emrqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_clinicalqa_emrqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_clinicalqa_emrqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aaditya/BERT-ClinicalQA_emrqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_2_en.md new file mode 100644 index 000000000000..cb3cc5db3c3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_2 DistilBertForQuestionAnswering from tomXBE +author: John Snow Labs +name: bert_finetuned_squad_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_2` is an English model originally trained by tomXBE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_2_en_5.2.0_3.0_1701026222570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_2_en_5.2.0_3.0_1701026222570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_finetuned_squad_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_finetuned_squad_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tomXBE/bert-finetuned-squad_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_johnjose223_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_johnjose223_en.md new file mode 100644 index 000000000000..7810825b0959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_johnjose223_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_johnjose223 DistilBertForQuestionAnswering from johnjose223 +author: John Snow Labs +name: bert_finetuned_squad_johnjose223 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_johnjose223` is an English model originally trained by johnjose223. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_johnjose223_en_5.2.0_3.0_1701022575949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_johnjose223_en_5.2.0_3.0_1701022575949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_finetuned_squad_johnjose223","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_finetuned_squad_johnjose223", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_johnjose223| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/johnjose223/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_tanishj_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_tanishj_en.md new file mode 100644 index 000000000000..d89926389733 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_squad_tanishj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_squad_tanishj DistilBertForQuestionAnswering from tanishj +author: John Snow Labs +name: bert_finetuned_squad_tanishj +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_squad_tanishj` is an English model originally trained by tanishj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tanishj_en_5.2.0_3.0_1701040976411.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_squad_tanishj_en_5.2.0_3.0_1701040976411.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_finetuned_squad_tanishj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_finetuned_squad_tanishj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_squad_tanishj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tanishj/bert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_subjqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_subjqa_en.md new file mode 100644 index 000000000000..1111d70d0e65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_subjqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_subjqa DistilBertForQuestionAnswering from chaimaae +author: John Snow Labs +name: bert_finetuned_subjqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_subjqa` is an English model originally trained by chaimaae. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_subjqa_en_5.2.0_3.0_1701030122116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_subjqa_en_5.2.0_3.0_1701030122116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_finetuned_subjqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_finetuned_subjqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_subjqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chaimaae/bert-finetuned-subjqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_syllabus_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_syllabus_en.md new file mode 100644 index 000000000000..433366e52ace --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_finetuned_syllabus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_finetuned_syllabus DistilBertForQuestionAnswering from jteng +author: John Snow Labs +name: bert_finetuned_syllabus +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_finetuned_syllabus` is an English model originally trained by jteng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_finetuned_syllabus_en_5.2.0_3.0_1701018120190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_finetuned_syllabus_en_5.2.0_3.0_1701018120190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_finetuned_syllabus","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_finetuned_syllabus", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_finetuned_syllabus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jteng/bert-finetuned-syllabus \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_qa_model_krikstaponyte_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_qa_model_krikstaponyte_en.md new file mode 100644 index 000000000000..c013cb21e51e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_qa_model_krikstaponyte_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_model_krikstaponyte DistilBertForQuestionAnswering from krikstaponyte +author: John Snow Labs +name: bert_qa_model_krikstaponyte +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_model_krikstaponyte` is an English model originally trained by krikstaponyte. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_model_krikstaponyte_en_5.2.0_3.0_1701028443945.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_model_krikstaponyte_en_5.2.0_3.0_1701028443945.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_qa_model_krikstaponyte","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_qa_model_krikstaponyte", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_model_krikstaponyte| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/krikstaponyte/bert_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_qna_custom_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_qna_custom_tuned_en.md new file mode 100644 index 000000000000..f8b55915ff87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_qna_custom_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qna_custom_tuned DistilBertForQuestionAnswering from EnND +author: John Snow Labs +name: bert_qna_custom_tuned +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qna_custom_tuned` is an English model originally trained by EnND. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qna_custom_tuned_en_5.2.0_3.0_1701024372560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qna_custom_tuned_en_5.2.0_3.0_1701024372560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_qna_custom_tuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_qna_custom_tuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qna_custom_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/EnND/bert-qna-custom-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_kk1729_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_kk1729_en.md new file mode 100644 index 000000000000..d4491998062d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_kk1729_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tomi_kk1729 DistilBertForQuestionAnswering from kk1729 +author: John Snow Labs +name: bert_tomi_kk1729 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tomi_kk1729` is an English model originally trained by kk1729. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tomi_kk1729_en_5.2.0_3.0_1701024119572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tomi_kk1729_en_5.2.0_3.0_1701024119572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_tomi_kk1729","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_tomi_kk1729", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tomi_kk1729| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kk1729/bert-tomi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_xavia0012_en.md b/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_xavia0012_en.md new file mode 100644 index 000000000000..81583fc64438 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bert_tomi_xavia0012_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_tomi_xavia0012 DistilBertForQuestionAnswering from Xavia0012 +author: John Snow Labs +name: bert_tomi_xavia0012 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_tomi_xavia0012` is an English model originally trained by Xavia0012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_tomi_xavia0012_en_5.2.0_3.0_1701023821364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_tomi_xavia0012_en_5.2.0_3.0_1701023821364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_tomi_xavia0012","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_tomi_xavia0012", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_tomi_xavia0012| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Xavia0012/bert-tomi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa2_en.md b/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa2_en.md new file mode 100644 index 000000000000..55504fd5188a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bertmodelqa2 DistilBertForQuestionAnswering from trinket2023 +author: John Snow Labs +name: bertmodelqa2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertmodelqa2` is an English model originally trained by trinket2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertmodelqa2_en_5.2.0_3.0_1701039857707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertmodelqa2_en_5.2.0_3.0_1701039857707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bertmodelqa2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bertmodelqa2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertmodelqa2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/trinket2023/BERTModelQA2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa_en.md new file mode 100644 index 000000000000..6bf967b921e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-bertmodelqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bertmodelqa DistilBertForQuestionAnswering from trinket2023 +author: John Snow Labs +name: bertmodelqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertmodelqa` is an English model originally trained by trinket2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertmodelqa_en_5.2.0_3.0_1701034457782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertmodelqa_en_5.2.0_3.0_1701034457782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bertmodelqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bertmodelqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertmodelqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/trinket2023/BERTModelQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_amrutha3899_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_amrutha3899_en.md new file mode 100644 index 000000000000..5bd037e550a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_amrutha3899_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model2_amrutha3899 DistilBertForQuestionAnswering from amrutha3899 +author: John Snow Labs +name: burmese_awesome_qa_model2_amrutha3899 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model2_amrutha3899` is an English model originally trained by amrutha3899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model2_amrutha3899_en_5.2.0_3.0_1701031592676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model2_amrutha3899_en_5.2.0_3.0_1701031592676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model2_amrutha3899","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model2_amrutha3899", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model2_amrutha3899| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/amrutha3899/my_awesome_qa_model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_laolong9191_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_laolong9191_en.md new file mode 100644 index 000000000000..f022bf1cb9dc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model2_laolong9191_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model2_laolong9191 DistilBertForQuestionAnswering from laolong9191 +author: John Snow Labs +name: burmese_awesome_qa_model2_laolong9191 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model2_laolong9191` is an English model originally trained by laolong9191. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model2_laolong9191_en_5.2.0_3.0_1701042453974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model2_laolong9191_en_5.2.0_3.0_1701042453974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model2_laolong9191","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model2_laolong9191", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model2_laolong9191| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/laolong9191/my_awesome_qa_model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_01_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_01_en.md new file mode 100644 index 000000000000..122fec870e43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_01 DistilBertForQuestionAnswering from AlexPerkin +author: John Snow Labs +name: burmese_awesome_qa_model_01 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_01` is an English model originally trained by AlexPerkin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_01_en_5.2.0_3.0_1701040991170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_01_en_5.2.0_3.0_1701040991170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_01","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_01", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AlexPerkin/my_awesome_qa_model_01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_adrienbin_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_adrienbin_en.md new file mode 100644 index 000000000000..8948ee16edfd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_adrienbin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_adrienbin DistilBertForQuestionAnswering from AdrienBin +author: John Snow Labs +name: burmese_awesome_qa_model_adrienbin +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_adrienbin` is an English model originally trained by AdrienBin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_adrienbin_en_5.2.0_3.0_1701034487621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_adrienbin_en_5.2.0_3.0_1701034487621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_adrienbin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_adrienbin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_adrienbin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AdrienBin/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_anandshende_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_anandshende_en.md new file mode 100644 index 000000000000..f2741268d0b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_anandshende_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_anandshende DistilBertForQuestionAnswering from anandshende +author: John Snow Labs +name: burmese_awesome_qa_model_anandshende +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_anandshende` is an English model originally trained by anandshende. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anandshende_en_5.2.0_3.0_1701040276795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anandshende_en_5.2.0_3.0_1701040276795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_anandshende","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_anandshende", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_anandshende| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anandshende/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_annasnezhevna_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_annasnezhevna_en.md new file mode 100644 index 000000000000..37150cfd249f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_annasnezhevna_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_annasnezhevna DistilBertForQuestionAnswering from AnnaSnezhevna +author: John Snow Labs +name: burmese_awesome_qa_model_annasnezhevna +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_annasnezhevna` is an English model originally trained by AnnaSnezhevna.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_annasnezhevna_en_5.2.0_3.0_1701037436851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_annasnezhevna_en_5.2.0_3.0_1701037436851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_annasnezhevna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_annasnezhevna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_annasnezhevna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AnnaSnezhevna/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apes07_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apes07_en.md new file mode 100644 index 000000000000..675c10d72574 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apes07_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_apes07 DistilBertForQuestionAnswering from Apes07 +author: John Snow Labs +name: burmese_awesome_qa_model_apes07 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_apes07` is an English model originally trained by Apes07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_apes07_en_5.2.0_3.0_1701032347220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_apes07_en_5.2.0_3.0_1701032347220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_apes07","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_apes07", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_apes07| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Apes07/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apurv_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apurv_en.md new file mode 100644 index 000000000000..792ffe441c71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_apurv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_apurv DistilBertForQuestionAnswering from Apurv +author: John Snow Labs +name: burmese_awesome_qa_model_apurv +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_apurv` is an English model originally trained by Apurv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_apurv_en_5.2.0_3.0_1701032735600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_apurv_en_5.2.0_3.0_1701032735600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_apurv","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_apurv", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_apurv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Apurv/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ashmitg_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ashmitg_en.md new file mode 100644 index 000000000000..3c8341b8fff4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ashmitg_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ashmitg DistilBertForQuestionAnswering from ashmitg +author: John Snow Labs +name: burmese_awesome_qa_model_ashmitg +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ashmitg` is an English model originally trained by ashmitg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ashmitg_en_5.2.0_3.0_1701032203974.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ashmitg_en_5.2.0_3.0_1701032203974.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ashmitg","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ashmitg", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ashmitg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ashmitg/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_axel_0087_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_axel_0087_en.md new file mode 100644 index 000000000000..344044340ea7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_axel_0087_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_axel_0087 DistilBertForQuestionAnswering from Axel-0087 +author: John Snow Labs +name: burmese_awesome_qa_model_axel_0087 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_axel_0087` is an English model originally trained by Axel-0087. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_axel_0087_en_5.2.0_3.0_1701035969871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_axel_0087_en_5.2.0_3.0_1701035969871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_axel_0087","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_axel_0087", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_axel_0087| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Axel-0087/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bathmaraj_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bathmaraj_en.md new file mode 100644 index 000000000000..82b65bbe0c34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bathmaraj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_bathmaraj DistilBertForQuestionAnswering from bathmaraj +author: John Snow Labs +name: burmese_awesome_qa_model_bathmaraj +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_bathmaraj` is an English model originally trained by bathmaraj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bathmaraj_en_5.2.0_3.0_1701022153526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bathmaraj_en_5.2.0_3.0_1701022153526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_bathmaraj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_bathmaraj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_bathmaraj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bathmaraj/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bernardsw99_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bernardsw99_en.md new file mode 100644 index 000000000000..992c85630e82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bernardsw99_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_bernardsw99 DistilBertForQuestionAnswering from bernardsw99 +author: John Snow Labs +name: burmese_awesome_qa_model_bernardsw99 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_bernardsw99` is an English model originally trained by bernardsw99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bernardsw99_en_5.2.0_3.0_1701030804509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bernardsw99_en_5.2.0_3.0_1701030804509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_bernardsw99","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_bernardsw99", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_bernardsw99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bernardsw99/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bluemetaldragon_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bluemetaldragon_en.md new file mode 100644 index 000000000000..21a502d39a5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_bluemetaldragon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_bluemetaldragon DistilBertForQuestionAnswering from bluemetaldragon +author: John Snow Labs +name: burmese_awesome_qa_model_bluemetaldragon +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_bluemetaldragon` is an English model originally trained by bluemetaldragon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bluemetaldragon_en_5.2.0_3.0_1701037038526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bluemetaldragon_en_5.2.0_3.0_1701037038526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_bluemetaldragon","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_bluemetaldragon", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_bluemetaldragon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bluemetaldragon/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_booksummarize_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_booksummarize_en.md new file mode 100644 index 000000000000..d2925acca9ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_booksummarize_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_booksummarize DistilBertForQuestionAnswering from booksummarize +author: John Snow Labs +name: burmese_awesome_qa_model_booksummarize +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_booksummarize` is an English model originally trained by booksummarize.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_booksummarize_en_5.2.0_3.0_1701025322852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_booksummarize_en_5.2.0_3.0_1701025322852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_booksummarize","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_booksummarize", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_booksummarize|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/booksummarize/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chet4_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chet4_en.md
new file mode 100644
index 000000000000..5960a5997fc2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chet4_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_chet4 DistilBertForQuestionAnswering from chet4
+author: John Snow Labs
+name: burmese_awesome_qa_model_chet4
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_chet4` is an English model originally trained by chet4.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chet4_en_5.2.0_3.0_1701031732807.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chet4_en_5.2.0_3.0_1701031732807.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_chet4","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_chet4", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_chet4|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/chet4/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chetna19_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chetna19_en.md
new file mode 100644
index 000000000000..8a467c7fd1bd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chetna19_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_chetna19 DistilBertForQuestionAnswering from Chetna19
+author: John Snow Labs
+name: burmese_awesome_qa_model_chetna19
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_chetna19` is an English model originally trained by Chetna19.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chetna19_en_5.2.0_3.0_1701031151576.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chetna19_en_5.2.0_3.0_1701031151576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_chetna19","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_chetna19", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_chetna19|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Chetna19/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chhabi_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chhabi_en.md
new file mode 100644
index 000000000000..bc36b5cf41fd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chhabi_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_chhabi DistilBertForQuestionAnswering from Chhabi
+author: John Snow Labs
+name: burmese_awesome_qa_model_chhabi
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_chhabi` is an English model originally trained by Chhabi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chhabi_en_5.2.0_3.0_1701023139593.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chhabi_en_5.2.0_3.0_1701023139593.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_chhabi","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_chhabi", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_chhabi|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Chhabi/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chunwoolee0_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chunwoolee0_en.md
new file mode 100644
index 000000000000..0e3c362e9846
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_chunwoolee0_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_chunwoolee0 DistilBertForQuestionAnswering from chunwoolee0
+author: John Snow Labs
+name: burmese_awesome_qa_model_chunwoolee0
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_chunwoolee0` is an English model originally trained by chunwoolee0.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chunwoolee0_en_5.2.0_3.0_1701034579706.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_chunwoolee0_en_5.2.0_3.0_1701034579706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_chunwoolee0","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_chunwoolee0", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_chunwoolee0|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/chunwoolee0/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ckaschny_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ckaschny_en.md
new file mode 100644
index 000000000000..152c67d730a6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ckaschny_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_ckaschny DistilBertForQuestionAnswering from ckaschny
+author: John Snow Labs
+name: burmese_awesome_qa_model_ckaschny
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ckaschny` is an English model originally trained by ckaschny.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ckaschny_en_5.2.0_3.0_1701030410172.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ckaschny_en_5.2.0_3.0_1701030410172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ckaschny","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_ckaschny", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_ckaschny|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ckaschny/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_claraldk01_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_claraldk01_en.md
new file mode 100644
index 000000000000..091f69a0555b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_claraldk01_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_claraldk01 DistilBertForQuestionAnswering from claraldk01
+author: John Snow Labs
+name: burmese_awesome_qa_model_claraldk01
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_claraldk01` is an English model originally trained by claraldk01.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_claraldk01_en_5.2.0_3.0_1701032261918.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_claraldk01_en_5.2.0_3.0_1701032261918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_claraldk01","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_claraldk01", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_claraldk01|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/claraldk01/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_coderunner007_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_coderunner007_en.md
new file mode 100644
index 000000000000..e602eaef7b4e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_coderunner007_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_coderunner007 DistilBertForQuestionAnswering from coderunner007
+author: John Snow Labs
+name: burmese_awesome_qa_model_coderunner007
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_coderunner007` is an English model originally trained by coderunner007.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_coderunner007_en_5.2.0_3.0_1701015156957.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_coderunner007_en_5.2.0_3.0_1701015156957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_coderunner007","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_coderunner007", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_coderunner007|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|253.8 MB|
+
+## References
+
+https://huggingface.co/coderunner007/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_cotran2_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_cotran2_en.md
new file mode 100644
index 000000000000..c391076eaf44
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_cotran2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_cotran2 DistilBertForQuestionAnswering from cotran2
+author: John Snow Labs
+name: burmese_awesome_qa_model_cotran2
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_cotran2` is an English model originally trained by cotran2.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_cotran2_en_5.2.0_3.0_1701034972765.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_cotran2_en_5.2.0_3.0_1701034972765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_cotran2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_cotran2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_cotran2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/cotran2/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_daghspam_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_daghspam_en.md
new file mode 100644
index 000000000000..026e625d26bf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_daghspam_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_daghspam DistilBertForQuestionAnswering from daghspam
+author: John Snow Labs
+name: burmese_awesome_qa_model_daghspam
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_daghspam` is an English model originally trained by daghspam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_daghspam_en_5.2.0_3.0_1701038354441.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_daghspam_en_5.2.0_3.0_1701038354441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_daghspam","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_daghspam", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_daghspam|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/daghspam/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_drelihan_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_drelihan_en.md
new file mode 100644
index 000000000000..a0dc8d8107bd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_drelihan_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_drelihan DistilBertForQuestionAnswering from drelihan
+author: John Snow Labs
+name: burmese_awesome_qa_model_drelihan
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_drelihan` is an English model originally trained by drelihan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_drelihan_en_5.2.0_3.0_1701037781739.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_drelihan_en_5.2.0_3.0_1701037781739.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_drelihan","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("burmese_awesome_qa_model_drelihan", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_drelihan|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/drelihan/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_greymatterz_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_greymatterz_en.md
new file mode 100644
index 000000000000..42a9d37f4a86
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_greymatterz_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_greymatterz DistilBertForQuestionAnswering from greymatterz
+author: John Snow Labs
+name: burmese_awesome_qa_model_greymatterz
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_greymatterz` is an English model originally trained by greymatterz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_greymatterz_en_5.2.0_3.0_1701032055632.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_greymatterz_en_5.2.0_3.0_1701032055632.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_greymatterz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_greymatterz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_greymatterz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/greymatterz/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_haurajahra_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_haurajahra_en.md new file mode 100644 index 000000000000..75f62b3f5bf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_haurajahra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_haurajahra DistilBertForQuestionAnswering from haurajahra +author: John Snow Labs +name: burmese_awesome_qa_model_haurajahra +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_haurajahra` is an English model originally trained by haurajahra. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_haurajahra_en_5.2.0_3.0_1701038628384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_haurajahra_en_5.2.0_3.0_1701038628384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_haurajahra","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_haurajahra", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_haurajahra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/haurajahra/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_helojo_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_helojo_en.md new file mode 100644 index 000000000000..ba6c2469be19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_helojo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_helojo DistilBertForQuestionAnswering from helojo +author: John Snow Labs +name: burmese_awesome_qa_model_helojo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_helojo` is an English model originally trained by helojo. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_helojo_en_5.2.0_3.0_1701039979291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_helojo_en_5.2.0_3.0_1701039979291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_helojo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_helojo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_helojo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/helojo/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_heon98_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_heon98_en.md new file mode 100644 index 000000000000..0ed51badd1b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_heon98_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_heon98 DistilBertForQuestionAnswering from heon98 +author: John Snow Labs +name: burmese_awesome_qa_model_heon98 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_heon98` is an English model originally trained by heon98. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_heon98_en_5.2.0_3.0_1701035658466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_heon98_en_5.2.0_3.0_1701035658466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_heon98","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_heon98", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_heon98| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/heon98/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_hyunjoocheong_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_hyunjoocheong_en.md new file mode 100644 index 000000000000..baf47ac498ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_hyunjoocheong_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_hyunjoocheong DistilBertForQuestionAnswering from HyunjooCheong +author: John Snow Labs +name: burmese_awesome_qa_model_hyunjoocheong +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_hyunjoocheong` is an English model originally trained by HyunjooCheong. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hyunjoocheong_en_5.2.0_3.0_1701042286579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hyunjoocheong_en_5.2.0_3.0_1701042286579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_hyunjoocheong","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_hyunjoocheong", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_hyunjoocheong| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HyunjooCheong/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ieyriay_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ieyriay_en.md new file mode 100644 index 000000000000..ff4e3be5405d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ieyriay_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ieyriay DistilBertForQuestionAnswering from ieyriay +author: John Snow Labs +name: burmese_awesome_qa_model_ieyriay +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ieyriay` is an English model originally trained by ieyriay. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ieyriay_en_5.2.0_3.0_1701040627328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ieyriay_en_5.2.0_3.0_1701040627328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ieyriay","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ieyriay", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ieyriay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ieyriay/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ilidiolopes_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ilidiolopes_en.md new file mode 100644 index 000000000000..c1c08058485d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ilidiolopes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ilidiolopes DistilBertForQuestionAnswering from ilidiolopes +author: John Snow Labs +name: burmese_awesome_qa_model_ilidiolopes +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ilidiolopes` is an English model originally trained by ilidiolopes. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ilidiolopes_en_5.2.0_3.0_1701033845953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ilidiolopes_en_5.2.0_3.0_1701033845953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ilidiolopes","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ilidiolopes", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ilidiolopes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ilidiolopes/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_joshuali19_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_joshuali19_en.md new file mode 100644 index 000000000000..330bc1509630 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_joshuali19_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_joshuali19 DistilBertForQuestionAnswering from joshuali19 +author: John Snow Labs +name: burmese_awesome_qa_model_joshuali19 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_joshuali19` is an English model originally trained by joshuali19. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_joshuali19_en_5.2.0_3.0_1701031900857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_joshuali19_en_5.2.0_3.0_1701031900857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_joshuali19","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_joshuali19", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_joshuali19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/joshuali19/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_jwenpaq_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_jwenpaq_en.md new file mode 100644 index 000000000000..9b66497fc8c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_jwenpaq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jwenpaq DistilBertForQuestionAnswering from jwenpaq +author: John Snow Labs +name: burmese_awesome_qa_model_jwenpaq +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jwenpaq` is an English model originally trained by jwenpaq. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jwenpaq_en_5.2.0_3.0_1701034749690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jwenpaq_en_5.2.0_3.0_1701034749690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jwenpaq","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jwenpaq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jwenpaq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jwenpaq/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_katxtong_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_katxtong_en.md new file mode 100644 index 000000000000..9c8ccb800e92 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_katxtong_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_katxtong DistilBertForQuestionAnswering from katxtong +author: John Snow Labs +name: burmese_awesome_qa_model_katxtong +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_katxtong` is an English model originally trained by katxtong. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_katxtong_en_5.2.0_3.0_1701015096538.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_katxtong_en_5.2.0_3.0_1701015096538.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_katxtong","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_katxtong", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_katxtong| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/katxtong/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ketong3906_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ketong3906_en.md new file mode 100644 index 000000000000..54d96c667615 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ketong3906_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ketong3906 DistilBertForQuestionAnswering from ketong3906 +author: John Snow Labs +name: burmese_awesome_qa_model_ketong3906 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ketong3906` is an English model originally trained by ketong3906. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ketong3906_en_5.2.0_3.0_1701033285709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ketong3906_en_5.2.0_3.0_1701033285709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ketong3906","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ketong3906", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ketong3906| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ketong3906/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_kkkh1_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_kkkh1_en.md new file mode 100644 index 000000000000..ca882e6af70d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_kkkh1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_kkkh1 DistilBertForQuestionAnswering from kkkh1 +author: John Snow Labs +name: burmese_awesome_qa_model_kkkh1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kkkh1` is an English model originally trained by kkkh1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kkkh1_en_5.2.0_3.0_1701037203075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kkkh1_en_5.2.0_3.0_1701037203075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kkkh1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_kkkh1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_kkkh1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kkkh1/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_krisha05_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_krisha05_en.md new file mode 100644 index 000000000000..191ed46187f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_krisha05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_krisha05 DistilBertForQuestionAnswering from krisha05 +author: John Snow Labs +name: burmese_awesome_qa_model_krisha05 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_krisha05` is an English model originally trained by krisha05. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_krisha05_en_5.2.0_3.0_1701038442237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_krisha05_en_5.2.0_3.0_1701038442237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_krisha05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_krisha05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_krisha05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/krisha05/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_liujunshi_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_liujunshi_en.md new file mode 100644 index 000000000000..c51692a822ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_liujunshi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_liujunshi DistilBertForQuestionAnswering from liujunshi +author: John Snow Labs +name: burmese_awesome_qa_model_liujunshi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_liujunshi` is an English model originally trained by liujunshi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_liujunshi_en_5.2.0_3.0_1701037563202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_liujunshi_en_5.2.0_3.0_1701037563202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_liujunshi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_liujunshi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_liujunshi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/liujunshi/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_lovenoo_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_lovenoo_en.md new file mode 100644 index 000000000000..47d88f7e51c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_lovenoo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_lovenoo DistilBertForQuestionAnswering from LovenOO +author: John Snow Labs +name: burmese_awesome_qa_model_lovenoo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_lovenoo` is an English model originally trained by LovenOO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lovenoo_en_5.2.0_3.0_1701017421364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lovenoo_en_5.2.0_3.0_1701017421364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_lovenoo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_lovenoo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_lovenoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LovenOO/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_mandy555_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_mandy555_en.md new file mode 100644 index 000000000000..9e5dfaf363f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_mandy555_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_mandy555 DistilBertForQuestionAnswering from mandy555 +author: John Snow Labs +name: burmese_awesome_qa_model_mandy555 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_mandy555` is an English model originally trained by mandy555. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mandy555_en_5.2.0_3.0_1701038282929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mandy555_en_5.2.0_3.0_1701038282929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_mandy555","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_mandy555", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_mandy555| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mandy555/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_narayan02_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_narayan02_en.md new file mode 100644 index 000000000000..203e6191418a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_narayan02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_narayan02 DistilBertForQuestionAnswering from narayan02 +author: John Snow Labs +name: burmese_awesome_qa_model_narayan02 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_narayan02` is an English model originally trained by narayan02. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_narayan02_en_5.2.0_3.0_1701039037848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_narayan02_en_5.2.0_3.0_1701039037848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_narayan02","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_narayan02", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_narayan02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/narayan02/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nc33_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nc33_en.md new file mode 100644 index 000000000000..7724842c0c04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nc33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_nc33 DistilBertForQuestionAnswering from nc33 +author: John Snow Labs +name: burmese_awesome_qa_model_nc33 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_nc33` is an English model originally trained by nc33. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nc33_en_5.2.0_3.0_1701042967185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nc33_en_5.2.0_3.0_1701042967185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_nc33","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_nc33", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_nc33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nc33/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_neildave_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_neildave_en.md new file mode 100644 index 000000000000..b2dd2c19a172 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_neildave_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_neildave DistilBertForQuestionAnswering from Neildave +author: John Snow Labs +name: burmese_awesome_qa_model_neildave +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_neildave` is an English model originally trained by Neildave. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_neildave_en_5.2.0_3.0_1701034859611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_neildave_en_5.2.0_3.0_1701034859611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_neildave","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_neildave", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_neildave| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Neildave/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_niklas25_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_niklas25_en.md new file mode 100644 index 000000000000..5b7acdded3f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_niklas25_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_niklas25 DistilBertForQuestionAnswering from Niklas25 +author: John Snow Labs +name: burmese_awesome_qa_model_niklas25 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_niklas25` is an English model originally trained by Niklas25. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_niklas25_en_5.2.0_3.0_1701036013321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_niklas25_en_5.2.0_3.0_1701036013321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_niklas25","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_niklas25", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_niklas25| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Niklas25/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_notericwang_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_notericwang_en.md new file mode 100644 index 000000000000..7bde6581394f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_notericwang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_notericwang DistilBertForQuestionAnswering from notericwang +author: John Snow Labs +name: burmese_awesome_qa_model_notericwang +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_notericwang` is an English model originally trained by notericwang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_notericwang_en_5.2.0_3.0_1701041912125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_notericwang_en_5.2.0_3.0_1701041912125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_notericwang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_notericwang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_notericwang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/notericwang/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nyxabhi_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nyxabhi_en.md new file mode 100644 index 000000000000..111dda64fae3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_nyxabhi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_nyxabhi DistilBertForQuestionAnswering from nyxabhi +author: John Snow Labs +name: burmese_awesome_qa_model_nyxabhi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_nyxabhi` is an English model originally trained by nyxabhi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nyxabhi_en_5.2.0_3.0_1701042392691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nyxabhi_en_5.2.0_3.0_1701042392691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_nyxabhi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_nyxabhi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_nyxabhi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nyxabhi/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_oleksandr2003_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_oleksandr2003_en.md new file mode 100644 index 000000000000..ed002512df8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_oleksandr2003_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_oleksandr2003 DistilBertForQuestionAnswering from Oleksandr2003 +author: John Snow Labs +name: burmese_awesome_qa_model_oleksandr2003 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_oleksandr2003` is an English model originally trained by Oleksandr2003. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_oleksandr2003_en_5.2.0_3.0_1701037114540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_oleksandr2003_en_5.2.0_3.0_1701037114540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_oleksandr2003","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_oleksandr2003", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_oleksandr2003| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Oleksandr2003/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankaj10034_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankaj10034_en.md new file mode 100644 index 000000000000..1d5b0dd79d30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankaj10034_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pankaj10034 DistilBertForQuestionAnswering from pankaj10034 +author: John Snow Labs +name: burmese_awesome_qa_model_pankaj10034 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pankaj10034` is an English model originally trained by pankaj10034. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pankaj10034_en_5.2.0_3.0_1701036294871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pankaj10034_en_5.2.0_3.0_1701036294871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pankaj10034","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pankaj10034", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pankaj10034| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pankaj10034/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankajmistry_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankajmistry_en.md new file mode 100644 index 000000000000..d7795686d5eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pankajmistry_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pankajmistry DistilBertForQuestionAnswering from Pankajmistry +author: John Snow Labs +name: burmese_awesome_qa_model_pankajmistry +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pankajmistry` is an English model originally trained by Pankajmistry. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pankajmistry_en_5.2.0_3.0_1701042040115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pankajmistry_en_5.2.0_3.0_1701042040115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pankajmistry","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pankajmistry", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pankajmistry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pankajmistry/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_parvinder_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_parvinder_en.md new file mode 100644 index 000000000000..04d6e3285fa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_parvinder_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_parvinder DistilBertForQuestionAnswering from Parvinder +author: John Snow Labs +name: burmese_awesome_qa_model_parvinder +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_parvinder` is an English model originally trained by Parvinder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_parvinder_en_5.2.0_3.0_1701032387459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_parvinder_en_5.2.0_3.0_1701032387459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_parvinder","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_parvinder", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_parvinder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Parvinder/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_peterandrew987_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_peterandrew987_en.md new file mode 100644 index 000000000000..7ee969069ad9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_peterandrew987_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_peterandrew987 DistilBertForQuestionAnswering from peterandrew987 +author: John Snow Labs +name: burmese_awesome_qa_model_peterandrew987 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_peterandrew987` is an English model originally trained by peterandrew987. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_peterandrew987_en_5.2.0_3.0_1701039255047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_peterandrew987_en_5.2.0_3.0_1701039255047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_peterandrew987","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_peterandrew987", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_peterandrew987| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/peterandrew987/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_piotrtrochim_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_piotrtrochim_en.md new file mode 100644 index 000000000000..468d6803edda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_piotrtrochim_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_piotrtrochim DistilBertForQuestionAnswering from piotrtrochim +author: John Snow Labs +name: burmese_awesome_qa_model_piotrtrochim +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_piotrtrochim` is an English model originally trained by piotrtrochim. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_piotrtrochim_en_5.2.0_3.0_1701032581720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_piotrtrochim_en_5.2.0_3.0_1701032581720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_piotrtrochim","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_piotrtrochim", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_piotrtrochim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/piotrtrochim/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pols0_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pols0_en.md new file mode 100644 index 000000000000..1595614634f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pols0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pols0 DistilBertForQuestionAnswering from pols0 +author: John Snow Labs +name: burmese_awesome_qa_model_pols0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pols0` is an English model originally trained by pols0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pols0_en_5.2.0_3.0_1701038423207.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pols0_en_5.2.0_3.0_1701038423207.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pols0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pols0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pols0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pols0/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pushpendra1bel_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pushpendra1bel_en.md new file mode 100644 index 000000000000..a33c6fd97871 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_pushpendra1bel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pushpendra1bel DistilBertForQuestionAnswering from pushpendra1bel +author: John Snow Labs +name: burmese_awesome_qa_model_pushpendra1bel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pushpendra1bel` is an English model originally trained by pushpendra1bel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pushpendra1bel_en_5.2.0_3.0_1701040143465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pushpendra1bel_en_5.2.0_3.0_1701040143465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pushpendra1bel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pushpendra1bel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pushpendra1bel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pushpendra1bel/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_qiaoqian_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_qiaoqian_en.md new file mode 100644 index 000000000000..ce8e460b4d2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_qiaoqian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_qiaoqian DistilBertForQuestionAnswering from qiaoqian +author: John Snow Labs +name: burmese_awesome_qa_model_qiaoqian +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_qiaoqian` is an English model originally trained by qiaoqian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_qiaoqian_en_5.2.0_3.0_1701041998178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_qiaoqian_en_5.2.0_3.0_1701041998178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_qiaoqian","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_qiaoqian", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_qiaoqian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/qiaoqian/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_rajaganapathy_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_rajaganapathy_en.md new file mode 100644 index 000000000000..6395ca25bf54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_rajaganapathy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_rajaganapathy DistilBertForQuestionAnswering from Rajaganapathy +author: John Snow Labs +name: burmese_awesome_qa_model_rajaganapathy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_rajaganapathy` is an English model originally trained by Rajaganapathy. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rajaganapathy_en_5.2.0_3.0_1701031464031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rajaganapathy_en_5.2.0_3.0_1701031464031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_rajaganapathy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_rajaganapathy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_rajaganapathy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Rajaganapathy/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ricardofaria_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ricardofaria_en.md new file mode 100644 index 000000000000..c3413600d54c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_ricardofaria_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ricardofaria DistilBertForQuestionAnswering from Ricardofaria +author: John Snow Labs +name: burmese_awesome_qa_model_ricardofaria +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ricardofaria` is an English model originally trained by Ricardofaria. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ricardofaria_en_5.2.0_3.0_1701035928522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ricardofaria_en_5.2.0_3.0_1701035928522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ricardofaria","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ricardofaria", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ricardofaria| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Ricardofaria/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_russellhaley_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_russellhaley_en.md new file mode 100644 index 000000000000..9cfaf226753f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_russellhaley_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_russellhaley DistilBertForQuestionAnswering from RussellHaley +author: John Snow Labs +name: burmese_awesome_qa_model_russellhaley +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_russellhaley` is an English model originally trained by RussellHaley. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_russellhaley_en_5.2.0_3.0_1701034513751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_russellhaley_en_5.2.0_3.0_1701034513751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_russellhaley","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_russellhaley", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_russellhaley| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RussellHaley/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saikatkumardey_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saikatkumardey_en.md new file mode 100644 index 000000000000..0b9e36663f72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saikatkumardey_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_saikatkumardey DistilBertForQuestionAnswering from saikatkumardey +author: John Snow Labs +name: burmese_awesome_qa_model_saikatkumardey +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_saikatkumardey` is an English model originally trained by saikatkumardey. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saikatkumardey_en_5.2.0_3.0_1701031652511.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saikatkumardey_en_5.2.0_3.0_1701031652511.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_saikatkumardey","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_saikatkumardey", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_saikatkumardey| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saikatkumardey/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_salad99_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_salad99_en.md new file mode 100644 index 000000000000..0d2841af9c72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_salad99_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_salad99 DistilBertForQuestionAnswering from Salad99 +author: John Snow Labs +name: burmese_awesome_qa_model_salad99 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_salad99` is an English model originally trained by Salad99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_salad99_en_5.2.0_3.0_1701040428566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_salad99_en_5.2.0_3.0_1701040428566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_salad99","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_salad99", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_salad99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Salad99/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_satyamverma_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_satyamverma_en.md new file mode 100644 index 000000000000..b7c2ab4ac21c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_satyamverma_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_satyamverma DistilBertForQuestionAnswering from satyamverma +author: John Snow Labs +name: burmese_awesome_qa_model_satyamverma +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_satyamverma` is an English model originally trained by satyamverma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_satyamverma_en_5.2.0_3.0_1701037201962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_satyamverma_en_5.2.0_3.0_1701037201962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_satyamverma","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_satyamverma", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_satyamverma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/satyamverma/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saurabhgupta_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saurabhgupta_en.md new file mode 100644 index 000000000000..dbc0d000f07d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_saurabhgupta_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_saurabhgupta DistilBertForQuestionAnswering from saurabhgupta +author: John Snow Labs +name: burmese_awesome_qa_model_saurabhgupta +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_saurabhgupta` is an English model originally trained by saurabhgupta. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saurabhgupta_en_5.2.0_3.0_1701042788013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saurabhgupta_en_5.2.0_3.0_1701042788013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_saurabhgupta","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_saurabhgupta", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_saurabhgupta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saurabhgupta/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_shiddiqsugiono_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_shiddiqsugiono_en.md new file mode 100644 index 000000000000..c0c190d34276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_shiddiqsugiono_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_shiddiqsugiono DistilBertForQuestionAnswering from shiddiqsugiono +author: John Snow Labs +name: burmese_awesome_qa_model_shiddiqsugiono +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_shiddiqsugiono` is an English model originally trained by shiddiqsugiono. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_shiddiqsugiono_en_5.2.0_3.0_1701039329209.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_shiddiqsugiono_en_5.2.0_3.0_1701039329209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_shiddiqsugiono","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_shiddiqsugiono", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_shiddiqsugiono| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shiddiqsugiono/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sofa566_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sofa566_en.md new file mode 100644 index 000000000000..30bc9c16529a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sofa566_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sofa566 DistilBertForQuestionAnswering from sofa566 +author: John Snow Labs +name: burmese_awesome_qa_model_sofa566 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sofa566` is an English model originally trained by sofa566. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sofa566_en_5.2.0_3.0_1701038025768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sofa566_en_5.2.0_3.0_1701038025768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sofa566","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sofa566", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sofa566| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sofa566/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sprithivi123_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sprithivi123_en.md new file mode 100644 index 000000000000..b343d81f1788 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sprithivi123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sprithivi123 DistilBertForQuestionAnswering from sprithivi123 +author: John Snow Labs +name: burmese_awesome_qa_model_sprithivi123 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sprithivi123` is an English model originally trained by sprithivi123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sprithivi123_en_5.2.0_3.0_1701040030019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sprithivi123_en_5.2.0_3.0_1701040030019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sprithivi123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sprithivi123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sprithivi123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sprithivi123/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sss7_7_7_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sss7_7_7_en.md new file mode 100644 index 000000000000..96bc235bea53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sss7_7_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sss7_7_7 DistilBertForQuestionAnswering from Sss7-7-7 +author: John Snow Labs +name: burmese_awesome_qa_model_sss7_7_7 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sss7_7_7` is an English model originally trained by Sss7-7-7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sss7_7_7_en_5.2.0_3.0_1701016015466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sss7_7_7_en_5.2.0_3.0_1701016015466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sss7_7_7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sss7_7_7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sss7_7_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sss7-7-7/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_suryavenkat_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_suryavenkat_en.md new file mode 100644 index 000000000000..64373bf8c4c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_suryavenkat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_suryavenkat DistilBertForQuestionAnswering from SuryaVenkat +author: John Snow Labs +name: burmese_awesome_qa_model_suryavenkat +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_suryavenkat` is an English model originally trained by SuryaVenkat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_suryavenkat_en_5.2.0_3.0_1701036439887.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_suryavenkat_en_5.2.0_3.0_1701036439887.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_suryavenkat", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_suryavenkat", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_suryavenkat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SuryaVenkat/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sybghat_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sybghat_en.md new file mode 100644 index 000000000000..bacc52216602 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_sybghat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sybghat DistilBertForQuestionAnswering from Sybghat +author: John Snow Labs +name: burmese_awesome_qa_model_sybghat +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sybghat` is an English model originally trained by Sybghat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sybghat_en_5.2.0_3.0_1701026874655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sybghat_en_5.2.0_3.0_1701026874655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sybghat", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sybghat", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sybghat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sybghat/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_tejavoo_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_tejavoo_en.md new file mode 100644 index 000000000000..8c67bf943b49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_tejavoo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_tejavoo DistilBertForQuestionAnswering from tejavoo +author: John Snow Labs +name: burmese_awesome_qa_model_tejavoo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_tejavoo` is an English model originally trained by tejavoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_tejavoo_en_5.2.0_3.0_1701039251888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_tejavoo_en_5.2.0_3.0_1701039251888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_tejavoo", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_tejavoo", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_tejavoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tejavoo/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_thejosephloy_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_thejosephloy_en.md new file mode 100644 index 000000000000..2c77177070ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_thejosephloy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_thejosephloy DistilBertForQuestionAnswering from thejosephloy +author: John Snow Labs +name: burmese_awesome_qa_model_thejosephloy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_thejosephloy` is an English model originally trained by thejosephloy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_thejosephloy_en_5.2.0_3.0_1701025912039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_thejosephloy_en_5.2.0_3.0_1701025912039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_thejosephloy", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_thejosephloy", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_thejosephloy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/thejosephloy/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_trinket2023_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_trinket2023_en.md new file mode 100644 index 000000000000..5dbb7c783f75 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_trinket2023_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_trinket2023 DistilBertForQuestionAnswering from trinket2023 +author: John Snow Labs +name: burmese_awesome_qa_model_trinket2023 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_trinket2023` is an English model originally trained by trinket2023. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_trinket2023_en_5.2.0_3.0_1701039587329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_trinket2023_en_5.2.0_3.0_1701039587329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_trinket2023", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_trinket2023", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_trinket2023| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/trinket2023/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_umanagendra_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_umanagendra_en.md new file mode 100644 index 000000000000..cab1f4f32a57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_umanagendra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_umanagendra DistilBertForQuestionAnswering from umanagendra +author: John Snow Labs +name: burmese_awesome_qa_model_umanagendra +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_umanagendra` is an English model originally trained by umanagendra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_umanagendra_en_5.2.0_3.0_1701032579394.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_umanagendra_en_5.2.0_3.0_1701032579394.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_umanagendra", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_umanagendra", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_umanagendra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umanagendra/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_unbelievable111_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_unbelievable111_en.md new file mode 100644 index 000000000000..a672ad86464c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_unbelievable111_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_unbelievable111 DistilBertForQuestionAnswering from unbelievable111 +author: John Snow Labs +name: burmese_awesome_qa_model_unbelievable111 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_unbelievable111` is an English model originally trained by unbelievable111.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_unbelievable111_en_5.2.0_3.0_1701037754202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_unbelievable111_en_5.2.0_3.0_1701037754202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_unbelievable111", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_unbelievable111", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_unbelievable111| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/unbelievable111/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_vxbrandon_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_vxbrandon_en.md new file mode 100644 index 000000000000..a070522c5b4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_vxbrandon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_vxbrandon DistilBertForQuestionAnswering from vxbrandon +author: John Snow Labs +name: burmese_awesome_qa_model_vxbrandon +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_vxbrandon` is an English model originally trained by vxbrandon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vxbrandon_en_5.2.0_3.0_1701030818676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vxbrandon_en_5.2.0_3.0_1701030818676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_vxbrandon", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_vxbrandon", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_vxbrandon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vxbrandon/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_ketong3906_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_ketong3906_en.md new file mode 100644 index 000000000000..0c0e7cd597f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_ketong3906_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_w_adapter_ketong3906 DistilBertForQuestionAnswering from ketong3906 +author: John Snow Labs +name: burmese_awesome_qa_model_w_adapter_ketong3906 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_w_adapter_ketong3906` is an English model originally trained by ketong3906.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_ketong3906_en_5.2.0_3.0_1701034283797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_ketong3906_en_5.2.0_3.0_1701034283797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_w_adapter_ketong3906", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_w_adapter_ketong3906", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_w_adapter_ketong3906| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ketong3906/my_awesome_qa_model_w_adapter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_normanstorm_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_normanstorm_en.md new file mode 100644 index 000000000000..82e62c9c9ce8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_w_adapter_normanstorm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_w_adapter_normanstorm DistilBertForQuestionAnswering from normanStorm +author: John Snow Labs +name: burmese_awesome_qa_model_w_adapter_normanstorm +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_w_adapter_normanstorm` is an English model originally trained by normanStorm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_normanstorm_en_5.2.0_3.0_1701040148949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_normanstorm_en_5.2.0_3.0_1701040148949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_w_adapter_normanstorm", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_w_adapter_normanstorm", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_w_adapter_normanstorm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/normanStorm/my_awesome_qa_model_w_adapter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_weicheng112_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_weicheng112_en.md new file mode 100644 index 000000000000..8b6f22671039 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_weicheng112_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_weicheng112 DistilBertForQuestionAnswering from weicheng112 +author: John Snow Labs +name: burmese_awesome_qa_model_weicheng112 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_weicheng112` is an English model originally trained by weicheng112.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_weicheng112_en_5.2.0_3.0_1701037567769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_weicheng112_en_5.2.0_3.0_1701037567769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_weicheng112", "en") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_weicheng112", "en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_weicheng112| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/weicheng112/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xiaoyang112_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xiaoyang112_en.md new file mode 100644 index 000000000000..8fc42a2ecaf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xiaoyang112_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_xiaoyang112 DistilBertForQuestionAnswering from xiaoyang112 +author: John Snow Labs +name: burmese_awesome_qa_model_xiaoyang112 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_xiaoyang112` is an English model originally trained by xiaoyang112. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xiaoyang112_en_5.2.0_3.0_1701035128816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xiaoyang112_en_5.2.0_3.0_1701035128816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_xiaoyang112","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_xiaoyang112", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_xiaoyang112| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/xiaoyang112/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xillolxlbln_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xillolxlbln_en.md new file mode 100644 index 000000000000..1d86209f485a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xillolxlbln_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_xillolxlbln DistilBertForQuestionAnswering from Xillolxlbln +author: John Snow Labs +name: burmese_awesome_qa_model_xillolxlbln +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_xillolxlbln` is an English model originally trained by Xillolxlbln. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xillolxlbln_en_5.2.0_3.0_1701041602604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xillolxlbln_en_5.2.0_3.0_1701041602604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_xillolxlbln","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_xillolxlbln", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_xillolxlbln| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Xillolxlbln/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xuanye_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xuanye_en.md new file mode 100644 index 000000000000..a35156d53e1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_xuanye_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_xuanye DistilBertForQuestionAnswering from xuanye +author: John Snow Labs +name: burmese_awesome_qa_model_xuanye +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_xuanye` is an English model originally trained by xuanye. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xuanye_en_5.2.0_3.0_1701033748686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xuanye_en_5.2.0_3.0_1701033748686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_xuanye","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_xuanye", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_xuanye| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/xuanye/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yilinw_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yilinw_en.md new file mode 100644 index 000000000000..98c492a1176f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yilinw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_yilinw DistilBertForQuestionAnswering from yilinw +author: John Snow Labs +name: burmese_awesome_qa_model_yilinw +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_yilinw` is an English model originally trained by yilinw. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_yilinw_en_5.2.0_3.0_1701037543589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_yilinw_en_5.2.0_3.0_1701037543589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_yilinw","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_yilinw", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_yilinw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yilinw/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yogita91_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yogita91_en.md new file mode 100644 index 000000000000..ed00c304d638 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_awesome_qa_model_yogita91_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_yogita91 DistilBertForQuestionAnswering from yogita91 +author: John Snow Labs +name: burmese_awesome_qa_model_yogita91 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_yogita91` is an English model originally trained by yogita91. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_yogita91_en_5.2.0_3.0_1701031024295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_yogita91_en_5.2.0_3.0_1701031024295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_yogita91","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_yogita91", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_yogita91| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yogita91/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_distillbert_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_distillbert_qa_model_en.md new file mode 100644 index 000000000000..35f05a0716d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_distillbert_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_distillbert_qa_model DistilBertForQuestionAnswering from hzsushiqiren +author: John Snow Labs +name: burmese_distillbert_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_distillbert_qa_model` is an English model originally trained by hzsushiqiren. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_distillbert_qa_model_en_5.2.0_3.0_1701032736801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_distillbert_qa_model_en_5.2.0_3.0_1701032736801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_distillbert_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_distillbert_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_distillbert_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/hzsushiqiren/my_distillBert_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_faq_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_faq_model_en.md new file mode 100644 index 000000000000..daf4fc01cd7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_faq_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_faq_model DistilBertForQuestionAnswering from anon98801 +author: John Snow Labs +name: burmese_faq_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_faq_model` is an English model originally trained by anon98801. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_faq_model_en_5.2.0_3.0_1701036574806.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_faq_model_en_5.2.0_3.0_1701036574806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_faq_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_faq_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_faq_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/anon98801/my_faq_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_first_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_first_model_en.md new file mode 100644 index 000000000000..c956f78d97a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_first_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_first_model DistilBertForQuestionAnswering from DarrenLo +author: John Snow Labs +name: burmese_first_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_first_model` is an English model originally trained by DarrenLo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_first_model_en_5.2.0_3.0_1701026773432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_first_model_en_5.2.0_3.0_1701026773432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_first_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_first_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_first_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DarrenLo/my_first_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_nepal_bhasa_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_nepal_bhasa_qa_model_en.md new file mode 100644 index 000000000000..4386b2450d0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_nepal_bhasa_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_nepal_bhasa_qa_model DistilBertForQuestionAnswering from advaithS7857 +author: John Snow Labs +name: burmese_nepal_bhasa_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_nepal_bhasa_qa_model` is an English model originally trained by advaithS7857. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_nepal_bhasa_qa_model_en_5.2.0_3.0_1701034857194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_nepal_bhasa_qa_model_en_5.2.0_3.0_1701034857194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_nepal_bhasa_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_nepal_bhasa_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_nepal_bhasa_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/advaithS7857/my_new_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_own_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_own_qa_model_en.md new file mode 100644 index 000000000000..612ab868e6bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_own_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_own_qa_model DistilBertForQuestionAnswering from SUhlemeyer +author: John Snow Labs +name: burmese_own_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_own_qa_model` is an English model originally trained by SUhlemeyer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_own_qa_model_en_5.2.0_3.0_1701015504348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_own_qa_model_en_5.2.0_3.0_1701015504348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_own_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_own_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_own_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SUhlemeyer/my_own_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_1_en.md new file mode 100644 index 000000000000..52cd4c89c72e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_1 DistilBertForQuestionAnswering from chichang +author: John Snow Labs +name: burmese_qa_model_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_1` is an English model originally trained by chichang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_1_en_5.2.0_3.0_1701034499455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_1_en_5.2.0_3.0_1701034499455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chichang/my_qa_model_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_aravind_selvam_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_aravind_selvam_en.md new file mode 100644 index 000000000000..6494dc6b1875 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_aravind_selvam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_aravind_selvam DistilBertForQuestionAnswering from aravind-selvam +author: John Snow Labs +name: burmese_qa_model_aravind_selvam +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_aravind_selvam` is an English model originally trained by aravind-selvam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_aravind_selvam_en_5.2.0_3.0_1701042168116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_aravind_selvam_en_5.2.0_3.0_1701042168116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_aravind_selvam","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_aravind_selvam", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_aravind_selvam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aravind-selvam/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_elaaaf_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_elaaaf_en.md new file mode 100644 index 000000000000..199e77d2798b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_elaaaf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_elaaaf DistilBertForQuestionAnswering from Elaaaf +author: John Snow Labs +name: burmese_qa_model_elaaaf +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_elaaaf` is an English model originally trained by Elaaaf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_elaaaf_en_5.2.0_3.0_1701037020469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_elaaaf_en_5.2.0_3.0_1701037020469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_elaaaf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_elaaaf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_elaaaf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Elaaaf/my-qa-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_koltunov_matthew_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_koltunov_matthew_en.md new file mode 100644 index 000000000000..f082db4f8a81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_koltunov_matthew_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_koltunov_matthew DistilBertForQuestionAnswering from Koltunov-Matthew +author: John Snow Labs +name: burmese_qa_model_koltunov_matthew +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_koltunov_matthew` is an English model originally trained by Koltunov-Matthew. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_koltunov_matthew_en_5.2.0_3.0_1701042954460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_koltunov_matthew_en_5.2.0_3.0_1701042954460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_koltunov_matthew","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_koltunov_matthew", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_koltunov_matthew| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Koltunov-Matthew/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sai2499_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sai2499_en.md new file mode 100644 index 000000000000..6c818037bc00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sai2499_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_sai2499 DistilBertForQuestionAnswering from sai2499 +author: John Snow Labs +name: burmese_qa_model_sai2499 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_sai2499` is an English model originally trained by sai2499. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_sai2499_en_5.2.0_3.0_1701033545718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_sai2499_en_5.2.0_3.0_1701033545718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_sai2499","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_sai2499", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_sai2499| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sai2499/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sjadhav3_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sjadhav3_en.md new file mode 100644 index 000000000000..41e112f20454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_sjadhav3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_sjadhav3 DistilBertForQuestionAnswering from sjadhav3 +author: John Snow Labs +name: burmese_qa_model_sjadhav3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_sjadhav3` is an English model originally trained by sjadhav3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_sjadhav3_en_5.2.0_3.0_1701039575341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_sjadhav3_en_5.2.0_3.0_1701039575341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_sjadhav3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_sjadhav3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_sjadhav3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sjadhav3/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_vlso_en.md b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_vlso_en.md new file mode 100644 index 000000000000..170a62fca4ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-burmese_qa_model_vlso_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_vlso DistilBertForQuestionAnswering from vlso +author: John Snow Labs +name: burmese_qa_model_vlso +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_vlso` is an English model originally trained by vlso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_vlso_en_5.2.0_3.0_1701033676083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_vlso_en_5.2.0_3.0_1701033676083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_vlso","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_vlso", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_vlso| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vlso/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-canine_model_js_en.md b/docs/_posts/ahmedlone127/2023-11-26-canine_model_js_en.md new file mode 100644 index 000000000000..556521efbbbd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-canine_model_js_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English canine_model_js DistilBertForQuestionAnswering from Goico192 +author: John Snow Labs +name: canine_model_js +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `canine_model_js` is an English model originally trained by Goico192. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/canine_model_js_en_5.2.0_3.0_1701015690379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/canine_model_js_en_5.2.0_3.0_1701015690379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("canine_model_js","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("canine_model_js", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|canine_model_js| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Goico192/Canine_model_JS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_sglasher_en.md b/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_sglasher_en.md new file mode 100644 index 000000000000..7ad40f97a145 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_sglasher_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chatterbotqa_sglasher DistilBertForQuestionAnswering from sglasher +author: John Snow Labs +name: chatterbotqa_sglasher +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatterbotqa_sglasher` is an English model originally trained by sglasher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatterbotqa_sglasher_en_5.2.0_3.0_1701015725177.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatterbotqa_sglasher_en_5.2.0_3.0_1701015725177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("chatterbotqa_sglasher","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("chatterbotqa_sglasher", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatterbotqa_sglasher| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/sglasher/ChatterBotQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_teslanando_en.md b/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_teslanando_en.md new file mode 100644 index 000000000000..2c9c7be66522 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-chatterbotqa_teslanando_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English chatterbotqa_teslanando DistilBertForQuestionAnswering from teslanando +author: John Snow Labs +name: chatterbotqa_teslanando +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatterbotqa_teslanando` is an English model originally trained by teslanando. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatterbotqa_teslanando_en_5.2.0_3.0_1701017161850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatterbotqa_teslanando_en_5.2.0_3.0_1701017161850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("chatterbotqa_teslanando","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("chatterbotqa_teslanando", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatterbotqa_teslanando| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/teslanando/ChatterBotQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_1_en.md new file mode 100644 index 000000000000..58bac32fc2fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clickbait_spoiling_model_trial_1 DistilBertForQuestionAnswering from intanm +author: John Snow Labs +name: clickbait_spoiling_model_trial_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clickbait_spoiling_model_trial_1` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clickbait_spoiling_model_trial_1_en_5.2.0_3.0_1701035208818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clickbait_spoiling_model_trial_1_en_5.2.0_3.0_1701035208818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("clickbait_spoiling_model_trial_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("clickbait_spoiling_model_trial_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clickbait_spoiling_model_trial_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/intanm/clickbait_spoiling_model_trial_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_2_en.md new file mode 100644 index 000000000000..ac3e9cef6eed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-clickbait_spoiling_model_trial_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English clickbait_spoiling_model_trial_2 DistilBertForQuestionAnswering from intanm +author: John Snow Labs +name: clickbait_spoiling_model_trial_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clickbait_spoiling_model_trial_2` is an English model originally trained by intanm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clickbait_spoiling_model_trial_2_en_5.2.0_3.0_1701032060588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clickbait_spoiling_model_trial_2_en_5.2.0_3.0_1701032060588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("clickbait_spoiling_model_trial_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("clickbait_spoiling_model_trial_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clickbait_spoiling_model_trial_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/intanm/clickbait_spoiling_model_trial_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-comprehenso_en.md b/docs/_posts/ahmedlone127/2023-11-26-comprehenso_en.md new file mode 100644 index 000000000000..c88cb539e3c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-comprehenso_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English comprehenso DistilBertForQuestionAnswering from itsatarax +author: John Snow Labs +name: comprehenso +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comprehenso` is an English model originally trained by itsatarax. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comprehenso_en_5.2.0_3.0_1701016177634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comprehenso_en_5.2.0_3.0_1701016177634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("comprehenso","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("comprehenso", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comprehenso| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/itsatarax/comprehenso \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-content_finasgame_en.md b/docs/_posts/ahmedlone127/2023-11-26-content_finasgame_en.md new file mode 100644 index 000000000000..782cbfa1f504 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-content_finasgame_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English content_finasgame DistilBertForQuestionAnswering from finasgame +author: John Snow Labs +name: content_finasgame +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `content_finasgame` is an English model originally trained by finasgame. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/content_finasgame_en_5.2.0_3.0_1701041023060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/content_finasgame_en_5.2.0_3.0_1701041023060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("content_finasgame","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("content_finasgame", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|content_finasgame| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/finasgame/content \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-contextuality_en.md b/docs/_posts/ahmedlone127/2023-11-26-contextuality_en.md new file mode 100644 index 000000000000..7aae9335f126 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-contextuality_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English contextuality DistilBertForQuestionAnswering from camie-cool-2903 +author: John Snow Labs +name: contextuality +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `contextuality` is an English model originally trained by camie-cool-2903. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/contextuality_en_5.2.0_3.0_1701023498646.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/contextuality_en_5.2.0_3.0_1701023498646.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("contextuality","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("contextuality", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|contextuality| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/camie-cool-2903/contextuality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-covid_qa_distillbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-covid_qa_distillbert_en.md new file mode 100644 index 000000000000..13c223a5f436 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-covid_qa_distillbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English covid_qa_distillbert DistilBertForQuestionAnswering from shainahub +author: John Snow Labs +name: covid_qa_distillbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_qa_distillbert` is an English model originally trained by shainahub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_qa_distillbert_en_5.2.0_3.0_1701017283568.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_qa_distillbert_en_5.2.0_3.0_1701017283568.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("covid_qa_distillbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("covid_qa_distillbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_qa_distillbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shainahub/covid_qa_distillbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-cuad_distil_multi_fields_08_29_v1_en.md b/docs/_posts/ahmedlone127/2023-11-26-cuad_distil_multi_fields_08_29_v1_en.md new file mode 100644 index 000000000000..36b8af34724d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-cuad_distil_multi_fields_08_29_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English cuad_distil_multi_fields_08_29_v1 DistilBertForQuestionAnswering from saraks +author: John Snow Labs +name: cuad_distil_multi_fields_08_29_v1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cuad_distil_multi_fields_08_29_v1` is an English model originally trained by saraks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cuad_distil_multi_fields_08_29_v1_en_5.2.0_3.0_1701040590670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cuad_distil_multi_fields_08_29_v1_en_5.2.0_3.0_1701040590670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("cuad_distil_multi_fields_08_29_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("cuad_distil_multi_fields_08_29_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cuad_distil_multi_fields_08_29_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saraks/cuad-distil-multi_fields-08-29-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-datasetforspotify_en.md b/docs/_posts/ahmedlone127/2023-11-26-datasetforspotify_en.md new file mode 100644 index 000000000000..a7991301abcf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-datasetforspotify_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English datasetforspotify DistilBertForQuestionAnswering from ajaydvrj +author: John Snow Labs +name: datasetforspotify +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `datasetforspotify` is an English model originally trained by ajaydvrj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/datasetforspotify_en_5.2.0_3.0_1701035389886.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/datasetforspotify_en_5.2.0_3.0_1701035389886.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("datasetforspotify","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("datasetforspotify", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|datasetforspotify| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ajaydvrj/datasetForSpotify \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-dbert_qa_model_070623_en.md b/docs/_posts/ahmedlone127/2023-11-26-dbert_qa_model_070623_en.md new file mode 100644 index 000000000000..767d3a18c18e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-dbert_qa_model_070623_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dbert_qa_model_070623 DistilBertForQuestionAnswering from asure22 +author: John Snow Labs +name: dbert_qa_model_070623 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert_qa_model_070623` is an English model originally trained by asure22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbert_qa_model_070623_en_5.2.0_3.0_1701029929313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbert_qa_model_070623_en_5.2.0_3.0_1701029929313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("dbert_qa_model_070623","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("dbert_qa_model_070623", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbert_qa_model_070623| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/asure22/dbert_qa_model_070623 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-dbt4_en.md b/docs/_posts/ahmedlone127/2023-11-26-dbt4_en.md new file mode 100644 index 000000000000..f44d303c4d19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-dbt4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dbt4 DistilBertForQuestionAnswering from SUTS102779289 +author: John Snow Labs +name: dbt4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbt4` is an English model originally trained by SUTS102779289. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbt4_en_5.2.0_3.0_1701017855001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbt4_en_5.2.0_3.0_1701017855001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("dbt4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("dbt4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbt4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SUTS102779289/dbt4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-deep_project_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-deep_project_qa_model_en.md new file mode 100644 index 000000000000..62aeefee288c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-deep_project_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English deep_project_qa_model DistilBertForQuestionAnswering from nadidebeyza +author: John Snow Labs +name: deep_project_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deep_project_qa_model` is an English model originally trained by nadidebeyza. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deep_project_qa_model_en_5.2.0_3.0_1701038971082.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deep_project_qa_model_en_5.2.0_3.0_1701038971082.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("deep_project_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("deep_project_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deep_project_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nadidebeyza/deep_project_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-diff_model2_en.md b/docs/_posts/ahmedlone127/2023-11-26-diff_model2_en.md new file mode 100644 index 000000000000..76a983d6df0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-diff_model2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English diff_model2 DistilBertForQuestionAnswering from radyad +author: John Snow Labs +name: diff_model2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `diff_model2` is an English model originally trained by radyad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/diff_model2_en_5.2.0_3.0_1701037038092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/diff_model2_en_5.2.0_3.0_1701037038092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("diff_model2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("diff_model2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|diff_model2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/radyad/diff_model2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distil_bert_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_1_en.md new file mode 100644 index 000000000000..9c8184f08d2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distil_bert_1 DistilBertForQuestionAnswering from satyamverma +author: John Snow Labs +name: distil_bert_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_1` is an English model originally trained by satyamverma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_1_en_5.2.0_3.0_1701016394045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_1_en_5.2.0_3.0_1701016394045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distil_bert_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distil_bert_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/satyamverma/Distil_BERT_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distil_bert_6_en.md b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_6_en.md new file mode 100644 index 000000000000..5900cb2c7ed3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distil_bert_6 DistilBertForQuestionAnswering from hung200504 +author: John Snow Labs +name: distil_bert_6 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_6` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_6_en_5.2.0_3.0_1701037266757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_6_en_5.2.0_3.0_1701037266757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distil_bert_6","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distil_bert_6", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/hung200504/distil-bert-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distil_bert_fine_tune_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_fine_tune_3_en.md new file mode 100644 index 000000000000..d5ebe007c5d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distil_bert_fine_tune_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distil_bert_fine_tune_3 DistilBertForQuestionAnswering from satyamverma +author: John Snow Labs +name: distil_bert_fine_tune_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_fine_tune_3` is an English model originally trained by satyamverma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tune_3_en_5.2.0_3.0_1701022570715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tune_3_en_5.2.0_3.0_1701022570715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distil_bert_fine_tune_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distil_bert_fine_tune_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distil_bert_fine_tune_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/satyamverma/Distil_BERT_Fine_Tune_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_4_en.md new file mode 100644 index 000000000000..eab000749e36 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_4 DistilBertForQuestionAnswering from hung200504 +author: John Snow Labs +name: distilbert_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_4` is an English model originally trained by hung200504. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_4_en_5.2.0_3.0_1701036632701.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_4_en_5.2.0_3.0_1701036632701.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hung200504/distilbert-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_anaghasavit_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_anaghasavit_en.md new file mode 100644 index 000000000000..52c3c5dc6f76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_anaghasavit_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_anaghasavit DistilBertForQuestionAnswering from anaghasavit +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_anaghasavit +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_anaghasavit` is an English model originally trained by anaghasavit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_anaghasavit_en_5.2.0_3.0_1701030682302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_anaghasavit_en_5.2.0_3.0_1701030682302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_anaghasavit","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_anaghasavit", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_anaghasavit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/anaghasavit/distilbert-base-cased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_autoevaluate_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_autoevaluate_en.md new file mode 100644 index 000000000000..7a4c22db4b81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_autoevaluate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_autoevaluate DistilBertForQuestionAnswering from autoevaluate +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_autoevaluate +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_autoevaluate` is an English model originally trained by autoevaluate.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_autoevaluate_en_5.2.0_3.0_1701017582030.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_autoevaluate_en_5.2.0_3.0_1701017582030.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_autoevaluate","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_autoevaluate", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_autoevaluate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/autoevaluate/distilbert-base-cased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_coffee20230108_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_coffee20230108_en.md new file mode 100644 index 000000000000..16407a7e5571 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_coffee20230108_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_coffee20230108 DistilBertForQuestionAnswering from nejox +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_coffee20230108 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_coffee20230108` is an English model originally trained by nejox.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_coffee20230108_en_5.2.0_3.0_1701018858025.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_coffee20230108_en_5.2.0_3.0_1701018858025.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_coffee20230108","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_coffee20230108", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_coffee20230108| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/nejox/distilbert-base-cased-distilled-squad-coffee20230108 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_emrqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_emrqa_en.md new file mode 100644 index 000000000000..0b87b9883dd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_emrqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_emrqa DistilBertForQuestionAnswering from aaditya +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_emrqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_emrqa` is an English model originally trained by aaditya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_emrqa_en_5.2.0_3.0_1701034615513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_emrqa_en_5.2.0_3.0_1701034615513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_emrqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_emrqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_emrqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/aaditya/distilbert-base-cased-distilled-squad_emrqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15_en.md new file mode 100644 index 000000000000..d2285e3e6cb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15_en_5.2.0_3.0_1701026935206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15_en_5.2.0_3.0_1701026935206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-05-epochs15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20_en.md new file mode 100644 index 000000000000..499f0dca36d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20_en_5.2.0_3.0_1701019001072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20_en_5.2.0_3.0_1701019001072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_05_epochs20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-05-epochs20 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15_en.md new file mode 100644 index 000000000000..9e47d58bde54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15_en_5.2.0_3.0_1701023956276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15_en_5.2.0_3.0_1701023956276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-06-epochs15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50_en.md new file mode 100644 index 000000000000..fab8bbe12000 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50_en_5.2.0_3.0_1701020857823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50_en_5.2.0_3.0_1701020857823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_06_epochs50| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-06-epochs50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15_en.md new file mode 100644 index 000000000000..870ae1e29f7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15_en_5.2.0_3.0_1701028790942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15_en_5.2.0_3.0_1701028790942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_07_epochs15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-07-epochs15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100_en.md new file mode 100644 index 000000000000..6771982640bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100_en_5.2.0_3.0_1701022886579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100_en_5.2.0_3.0_1701022886579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr1e_08_epochs100| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr1e-08-epochs100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100_en.md new file mode 100644 index 000000000000..a165195a207c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100_en_5.2.0_3.0_1701027759151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100_en_5.2.0_3.0_1701027759151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_lr3e_06_epochs100| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-lr3e-06-epochs100 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit_en.md new file mode 100644 index 000000000000..7418aa796267 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit DistilBertForQuestionAnswering from anaghasavit +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit` is an English model originally trained by anaghasavit.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit_en_5.2.0_3.0_1701031277388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit_en_5.2.0_3.0_1701031277388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_squad_anaghasavit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/anaghasavit/distilbert-base-cased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter_en.md new file mode 100644 index 000000000000..c7acd1c516b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter DistilBertForQuestionAnswering from EricPeter +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter` is an English model originally trained by EricPeter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter_en_5.2.0_3.0_1701034612421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter_en_5.2.0_3.0_1701034612421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_squad_ericpeter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/EricPeter/distilbert-base-cased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi_en.md new file mode 100644 index 000000000000..5ce555a644ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi` is an English model originally trained by gallyamovi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi_en_5.2.0_3.0_1701029506425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi_en_5.2.0_3.0_1701029506425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_squad_gallyamovi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-base-cased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu_en.md new file mode 100644 index 000000000000..f73e1c0b950c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu DistilBertForQuestionAnswering from kuberpmu +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu` is an English model originally trained by kuberpmu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu_en_5.2.0_3.0_1701032065291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu_en_5.2.0_3.0_1701032065291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_squad_kuberpmu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/kuberpmu/distilbert-base-cased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk_en.md new file mode 100644 index 000000000000..fa59d9d4c9ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk DistilBertForQuestionAnswering from yashwantk +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk` is an English model originally trained by yashwantk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk_en_5.2.0_3.0_1701019905287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk_en_5.2.0_3.0_1701019905287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes the Spark NLP imports (sparknlp.base, sparknlp.annotator, pyspark.ml.Pipeline) and a DataFrame `data` with "question" and "context" columns + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes the Spark NLP imports and a DataFrame `data` with "question" and "context" columns + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_finetuned_squad_yashwantk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/yashwantk/distilbert-base-cased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_1e_4_en.md new file mode 100644 index 000000000000..bc0f36838104 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_1e_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_how_1e_4 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_how_1e_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_how_1e_4` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_how_1e_4_en_5.2.0_3.0_1701024683272.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_how_1e_4_en_5.2.0_3.0_1701024683272.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_how_1e_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_how_1e_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_how_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-how-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_5e_05_en.md new file mode 100644 index 000000000000..88e84850a3ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_how_5e_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_how_5e_05 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_how_5e_05 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_how_5e_05` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_how_5e_05_en_5.2.0_3.0_1701023528179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_how_5e_05_en_5.2.0_3.0_1701023528179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_how_5e_05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_how_5e_05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_how_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-how-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4_en.md new file mode 100644 index 000000000000..d1e4638a3212 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4_en_5.2.0_3.0_1701028779829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4_en_5.2.0_3.0_1701028779829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_norwegian_label_1e_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-no-label-1e-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05_en.md new file mode 100644 index 000000000000..d70b9037fdf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05_en_5.2.0_3.0_1701032276385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05_en_5.2.0_3.0_1701032276385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_norwegian_label_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-no-label-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_1e_04_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_1e_04_en.md new file mode 100644 index 000000000000..cde4fc837bff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_1e_04_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_what_1e_04 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_what_1e_04 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_what_1e_04` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_what_1e_04_en_5.2.0_3.0_1701029298494.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_what_1e_04_en_5.2.0_3.0_1701029298494.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_what_1e_04","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_what_1e_04", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_what_1e_04| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-what-1e-04 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_5e_05_en.md new file mode 100644 index 000000000000..4c93605931d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_what_5e_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_what_5e_05 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_what_5e_05 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_what_5e_05` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_what_5e_05_en_5.2.0_3.0_1701021970710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_what_5e_05_en_5.2.0_3.0_1701021970710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_what_5e_05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_what_5e_05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_what_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-what-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_1e_04_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_1e_04_en.md new file mode 100644 index 000000000000..287a96f59328 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_1e_04_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_which_1e_04 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_which_1e_04 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_which_1e_04` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_which_1e_04_en_5.2.0_3.0_1701024709098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_which_1e_04_en_5.2.0_3.0_1701024709098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_which_1e_04","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_which_1e_04", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_which_1e_04| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-which-1e-04 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_5e_05_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_5e_05_en.md new file mode 100644 index 000000000000..c5231e4c9d9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_distilled_squad_orkg_which_5e_05_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_distilled_squad_orkg_which_5e_05 DistilBertForQuestionAnswering from Moussab +author: John Snow Labs +name: distilbert_base_cased_distilled_squad_orkg_which_5e_05 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_orkg_which_5e_05` is an English model originally trained by Moussab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_which_5e_05_en_5.2.0_3.0_1701019001694.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_orkg_which_5e_05_en_5.2.0_3.0_1701019001694.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_orkg_which_5e_05","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_distilled_squad_orkg_which_5e_05", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_distilled_squad_orkg_which_5e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Moussab/distilbert-base-cased-distilled-squad-orkg-which-5e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad2_en.md new file mode 100644 index 000000000000..c4b339299121 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_squad2 DistilBertForQuestionAnswering from prbocca +author: John Snow Labs +name: distilbert_base_cased_finetuned_squad2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_squad2` is an English model originally trained by prbocca.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad2_en_5.2.0_3.0_1701028939039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad2_en_5.2.0_3.0_1701028939039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_finetuned_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_finetuned_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/prbocca/distilbert-base-cased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_monakth_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_monakth_en.md new file mode 100644 index 000000000000..56b7394e2290 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_monakth_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_squad_monakth DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distilbert_base_cased_finetuned_squad_monakth +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_squad_monakth` is an English model originally trained by monakth.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_monakth_en_5.2.0_3.0_1701026652319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_monakth_en_5.2.0_3.0_1701026652319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_finetuned_squad_monakth","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_finetuned_squad_monakth", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_squad_monakth| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/monakth/distilbert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_v2_en.md new file mode 100644 index 000000000000..ff2f55657ca2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_squad_v2 DistilBertForQuestionAnswering from victorlee071200 +author: John Snow Labs +name: distilbert_base_cased_finetuned_squad_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_squad_v2` is an English model originally trained by victorlee071200.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_v2_en_5.2.0_3.0_1701030099829.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_v2_en_5.2.0_3.0_1701030099829.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_finetuned_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_finetuned_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/victorlee071200/distilbert-base-cased-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_victorlee071200_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_victorlee071200_en.md new file mode 100644 index 000000000000..52d1c2f9757b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squad_victorlee071200_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_squad_victorlee071200 DistilBertForQuestionAnswering from victorlee071200 +author: John Snow Labs +name: distilbert_base_cased_finetuned_squad_victorlee071200 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_squad_victorlee071200` is an English model originally trained by victorlee071200.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_victorlee071200_en_5.2.0_3.0_1701029053206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squad_victorlee071200_en_5.2.0_3.0_1701029053206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_finetuned_squad_victorlee071200","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_finetuned_squad_victorlee071200", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_squad_victorlee071200| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/victorlee071200/distilbert-base-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squadv2_en.md new file mode 100644 index 000000000000..84e14dd7ee9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_finetuned_squadv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_finetuned_squadv2 DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distilbert_base_cased_finetuned_squadv2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_finetuned_squadv2` is an English model originally trained by monakth.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squadv2_en_5.2.0_3.0_1701030656409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_finetuned_squadv2_en_5.2.0_3.0_1701030656409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_finetuned_squadv2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_finetuned_squadv2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_finetuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/monakth/distilbert-base-cased-finetuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_qa_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_qa_squad2_en.md new file mode 100644 index 000000000000..2ef4fbe1943e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_qa_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model +author: John Snow Labs +name: distilbert_base_cased_qa_squad2 +date: 2023-11-26 +tags: [open_source, distilbert, question_answering, en, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-distilled-squad` is an English model originally trained by Hugging Face. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_qa_squad2_en_5.2.0_3.0_1701010357429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_qa_squad2_en_5.2.0_3.0_1701010357429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_qa_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_qa_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base_cased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_qa_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +https://huggingface.co/distilbert-base-cased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_squad_en.md new file mode 100644 index 000000000000..f68b98273e08 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_cased_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_cased_squad DistilBertForQuestionAnswering from sento800 +author: John Snow Labs +name: distilbert_base_cased_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_squad` is an English model originally trained by sento800. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_squad_en_5.2.0_3.0_1701039804621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_squad_en_5.2.0_3.0_1701039804621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_cased_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_cased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/sento800/distilbert-base-cased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_dutch_cased_finetuned_squad_nl.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_dutch_cased_finetuned_squad_nl.md new file mode 100644 index 000000000000..069680ca6f71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_dutch_cased_finetuned_squad_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish distilbert_base_dutch_cased_finetuned_squad DistilBertForQuestionAnswering from tclungu +author: John Snow Labs +name: distilbert_base_dutch_cased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, nl, open_source, question_answering, onnx] +task: Question Answering +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_dutch_cased_finetuned_squad` is a Dutch, Flemish model originally trained by tclungu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_dutch_cased_finetuned_squad_nl_5.2.0_3.0_1701015254493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_dutch_cased_finetuned_squad_nl_5.2.0_3.0_1701015254493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_dutch_cased_finetuned_squad","nl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_dutch_cased_finetuned_squad", "nl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_dutch_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|nl| +|Size:|229.0 MB| + +## References + +https://huggingface.co/tclungu/distilbert-base-nl-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_english_dutch_cased_finetuned_squad_nl.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_english_dutch_cased_finetuned_squad_nl.md new file mode 100644 index 000000000000..f7d7c08bdf1b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_english_dutch_cased_finetuned_squad_nl.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Dutch, Flemish distilbert_base_english_dutch_cased_finetuned_squad DistilBertForQuestionAnswering from tclungu +author: John Snow Labs +name: distilbert_base_english_dutch_cased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, nl, open_source, question_answering, onnx] +task: Question Answering +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_english_dutch_cased_finetuned_squad` is a Dutch, Flemish model originally trained by tclungu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_english_dutch_cased_finetuned_squad_nl_5.2.0_3.0_1701014681965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_english_dutch_cased_finetuned_squad_nl_5.2.0_3.0_1701014681965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_english_dutch_cased_finetuned_squad","nl") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_english_dutch_cased_finetuned_squad", "nl") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_english_dutch_cased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|nl| +|Size:|256.4 MB| + +## References + +https://huggingface.co/tclungu/distilbert-base-en-nl-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_finetuned_recipe_modified_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_finetuned_recipe_modified_en.md new file mode 100644 index 000000000000..66d0dd59a2e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_finetuned_recipe_modified_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_finetuned_recipe_modified DistilBertForQuestionAnswering from saumyasinha0510 +author: John Snow Labs +name: distilbert_base_finetuned_recipe_modified +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_finetuned_recipe_modified` is an English model originally trained by saumyasinha0510.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_finetuned_recipe_modified_en_5.2.0_3.0_1701030975929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_finetuned_recipe_modified_en_5.2.0_3.0_1701030975929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_finetuned_recipe_modified","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_finetuned_recipe_modified", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_finetuned_recipe_modified| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saumyasinha0510/distilbert-base-finetuned-recipe-modified \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_khanh_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_khanh_xx.md new file mode 100644 index 000000000000..c770eecab4e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_khanh_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_squad_khanh DistilBertForQuestionAnswering from Khanh +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_squad_khanh +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_squad_khanh` is a Multilingual model originally trained by Khanh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_khanh_xx_5.2.0_3.0_1701019603622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_khanh_xx_5.2.0_3.0_1701019603622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_squad_khanh","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_squad_khanh", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_squad_khanh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Khanh/distilbert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_monakth_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_monakth_xx.md new file mode 100644 index 000000000000..bafd25598071 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_monakth_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_squad_monakth DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_squad_monakth +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_multilingual_cased_finetuned_squad_monakth` is a Multilingual model originally trained by monakth.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_monakth_xx_5.2.0_3.0_1701015899863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_monakth_xx_5.2.0_3.0_1701015899863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_squad_monakth","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_squad_monakth", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_squad_monakth| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/monakth/distilbert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ruselkomp_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ruselkomp_xx.md new file mode 100644 index 000000000000..0bd3d9bc0670 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ruselkomp_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_squad_ruselkomp DistilBertForQuestionAnswering from ruselkomp +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_squad_ruselkomp +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_squad_ruselkomp` is a Multilingual model originally trained by ruselkomp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_ruselkomp_xx_5.2.0_3.0_1701020359574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_ruselkomp_xx_5.2.0_3.0_1701020359574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_squad_ruselkomp","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_squad_ruselkomp", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_squad_ruselkomp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/ruselkomp/distilbert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_silveto_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_silveto_xx.md new file mode 100644 index 000000000000..6843fe93184f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_silveto_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_squad_silveto DistilBertForQuestionAnswering from silveto +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_squad_silveto +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_squad_silveto` is a Multilingual model originally trained by silveto. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_silveto_xx_5.2.0_3.0_1701023709166.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_silveto_xx_5.2.0_3.0_1701023709166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_squad_silveto","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_squad_silveto", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_squad_silveto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/silveto/distilbert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ztijn_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ztijn_xx.md new file mode 100644 index 000000000000..19cf3060cc51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_squad_ztijn_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_squad_ztijn DistilBertForQuestionAnswering from Ztijn +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_squad_ztijn +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_squad_ztijn` is a Multilingual model originally trained by Ztijn. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_ztijn_xx_5.2.0_3.0_1701028083895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_squad_ztijn_xx_5.2.0_3.0_1701028083895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_squad_ztijn","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_squad_ztijn", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_squad_ztijn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Ztijn/distilbert-base-multilingual-cased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_viquad_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_viquad_xx.md new file mode 100644 index 000000000000..38aa2099c14b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_finetuned_viquad_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_finetuned_viquad DistilBertForQuestionAnswering from Khanh +author: John Snow Labs +name: distilbert_base_multilingual_cased_finetuned_viquad +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_finetuned_viquad` is a Multilingual model originally trained by Khanh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_viquad_xx_5.2.0_3.0_1701017711830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_finetuned_viquad_xx_5.2.0_3.0_1701017711830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_finetuned_viquad","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_finetuned_viquad", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_finetuned_viquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/Khanh/distilbert-base-multilingual-cased-finetuned-viquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_squadv2_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_squadv2_xx.md new file mode 100644 index 000000000000..1dc4a9a73731 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_squadv2_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_squadv2 DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distilbert_base_multilingual_cased_squadv2 +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_squadv2` is a Multilingual model originally trained by monakth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_squadv2_xx_5.2.0_3.0_1701014819388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_squadv2_xx_5.2.0_3.0_1701014819388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_squadv2","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_squadv2", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/monakth/distilbert-base-multilingual-cased-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_sv2_xx.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_sv2_xx.md new file mode 100644 index 000000000000..c6d57849734e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_multilingual_cased_sv2_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_base_multilingual_cased_sv2 DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distilbert_base_multilingual_cased_sv2 +date: 2023-11-26 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_base_multilingual_cased_sv2` is a Multilingual model originally trained by monakth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sv2_xx_5.2.0_3.0_1701028665808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_multilingual_cased_sv2_xx_5.2.0_3.0_1701028665808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_multilingual_cased_sv2","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_multilingual_cased_sv2", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_multilingual_cased_sv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|505.4 MB| + +## References + +https://huggingface.co/monakth/distilbert-base-multilingual-cased-sv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_mlqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_mlqa_en.md new file mode 100644 index 000000000000..0d9eca227ae0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_mlqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_qa_mlqa DistilBertForQuestionAnswering from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_qa_mlqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_qa_mlqa` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_mlqa_en_5.2.0_3.0_1701015382591.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_mlqa_en_5.2.0_3.0_1701015382591.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_spanish_uncased_finetuned_qa_mlqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_spanish_uncased_finetuned_qa_mlqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|250.2 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_sqac_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_sqac_en.md new file mode 100644 index 000000000000..cfbf5e1621bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_spanish_uncased_finetuned_qa_sqac_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_spanish_uncased_finetuned_qa_sqac DistilBertForQuestionAnswering from dccuchile +author: John Snow Labs +name: distilbert_base_spanish_uncased_finetuned_qa_sqac +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_qa_sqac` is an English model originally trained by dccuchile.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_sqac_en_5.2.0_3.0_1701024853920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_sqac_en_5.2.0_3.0_1701024853920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_spanish_uncased_finetuned_qa_sqac","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_spanish_uncased_finetuned_qa_sqac", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_spanish_uncased_finetuned_qa_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|250.2 MB| + +## References + +https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-qa-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_squad_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_squad_finetuned_en.md new file mode 100644 index 000000000000..9b5d199d47b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_squad_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_squad_finetuned DistilBertForQuestionAnswering from rahulchakwate +author: John Snow Labs +name: distilbert_base_squad_finetuned +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_squad_finetuned` is an English model originally trained by rahulchakwate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_squad_finetuned_en_5.2.0_3.0_1701037910259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_squad_finetuned_en_5.2.0_3.0_1701037910259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_squad_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_squad_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_squad_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/rahulchakwate/distilbert-base-squad-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_test_en.md new file mode 100644 index 000000000000..aaa4abda8436 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_test DistilBertForQuestionAnswering from jeffnjy +author: John Snow Labs +name: distilbert_base_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_test` is an English model originally trained by jeffnjy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_test_en_5.2.0_3.0_1701014418631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_test_en_5.2.0_3.0_1701014418631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jeffnjy/distilbert-base-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_0_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_0_en.md new file mode 100644 index 000000000000..ba9db7b69066 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_0 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_0` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_0_en_5.2.0_3.0_1701026084043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_0_en_5.2.0_3.0_1701026084043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_1_en.md new file mode 100644 index 000000000000..0b0328edc0e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_1 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_1` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_1_en_5.2.0_3.0_1701027335764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_1_en_5.2.0_3.0_1701027335764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_2_en.md new file mode 100644 index 000000000000..e96bed7d0a2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_2 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_2` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_2_en_5.2.0_3.0_1701025463801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_2_en_5.2.0_3.0_1701025463801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.5 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_3_en.md new file mode 100644 index 000000000000..fb10444a7356 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_3 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_3` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_3_en_5.2.0_3.0_1701019287686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_3_en_5.2.0_3.0_1701019287686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_4_en.md new file mode 100644 index 000000000000..45c8beb89efa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_4 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_4` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_4_en_5.2.0_3.0_1701020627290.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_4_en_5.2.0_3.0_1701020627290.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_5_en.md new file mode 100644 index 000000000000..e5062b431422 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_5 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_5` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_5_en_5.2.0_3.0_1701032996771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_5_en_5.2.0_3.0_1701032996771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_6_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_6_en.md new file mode 100644 index 000000000000..85ca581efd98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_6 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_6 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_6` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_6_en_5.2.0_3.0_1701019141585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_6_en_5.2.0_3.0_1701019141585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_6","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_6", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_7_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_7_en.md new file mode 100644 index 000000000000..2776b262ae14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becas_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becas_7 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becas_7 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becas_7` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_7_en_5.2.0_3.0_1701030975893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becas_7_en_5.2.0_3.0_1701030975893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becas_7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becas_7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becas_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becas-7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_1_en.md new file mode 100644 index 000000000000..39207a912e6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becasv2_1 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becasv2_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_1` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_1_en_5.2.0_3.0_1701023026305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_1_en_5.2.0_3.0_1701023026305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becasv2_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_becasv2_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_2_en.md new file mode 100644 index 000000000000..a86873f3d5a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_becasv2_2 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_becasv2_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_2` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_2_en_5.2.0_3.0_1701019390911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_2_en_5.2.0_3.0_1701019390911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_becasv2_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv2_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_3_en.md
new file mode 100644
index 000000000000..428538e0e530
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_becasv2_3 DistilBertForQuestionAnswering from Evelyn18
+author: John Snow Labs
+name: distilbert_base_uncased_becasv2_3
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_3` is an English model originally trained by Evelyn18.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_3_en_5.2.0_3.0_1701018858209.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_3_en_5.2.0_3.0_1701018858209.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_3", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_becasv2_3", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv2_3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_4_en.md
new file mode 100644
index 000000000000..d27b22989011
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_4_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_becasv2_4 DistilBertForQuestionAnswering from Evelyn18
+author: John Snow Labs
+name: distilbert_base_uncased_becasv2_4
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_4` is an English model originally trained by Evelyn18.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_4_en_5.2.0_3.0_1701020689100.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_4_en_5.2.0_3.0_1701020689100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_4", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_becasv2_4", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv2_4|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_5_en.md
new file mode 100644
index 000000000000..3d91840a0424
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_5_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_becasv2_5 DistilBertForQuestionAnswering from Evelyn18
+author: John Snow Labs
+name: distilbert_base_uncased_becasv2_5
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_5` is an English model originally trained by Evelyn18.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_5_en_5.2.0_3.0_1701023929894.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_5_en_5.2.0_3.0_1701023929894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_5", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_becasv2_5", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv2_5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_6_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_6_en.md
new file mode 100644
index 000000000000..1aef2f0a09cc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv2_6_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_becasv2_6 DistilBertForQuestionAnswering from Evelyn18
+author: John Snow Labs
+name: distilbert_base_uncased_becasv2_6
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv2_6` is an English model originally trained by Evelyn18.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_6_en_5.2.0_3.0_1701030250934.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv2_6_en_5.2.0_3.0_1701030250934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv2_6", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_becasv2_6", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv2_6|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv2-6
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv3_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv3_1_en.md
new file mode 100644
index 000000000000..7f40023d47b5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_becasv3_1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_becasv3_1 DistilBertForQuestionAnswering from Evelyn18
+author: John Snow Labs
+name: distilbert_base_uncased_becasv3_1
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_becasv3_1` is an English model originally trained by Evelyn18.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv3_1_en_5.2.0_3.0_1701022741106.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_becasv3_1_en_5.2.0_3.0_1701022741106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_becasv3_1", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_becasv3_1", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_becasv3_1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Evelyn18/distilbert-base-uncased-becasv3-1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_combined_squad_adversarial_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_combined_squad_adversarial_en.md
new file mode 100644
index 000000000000..1ff19908d7af
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_combined_squad_adversarial_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_combined_squad_adversarial DistilBertForQuestionAnswering from stevemobs
+author: John Snow Labs
+name: distilbert_base_uncased_combined_squad_adversarial
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_combined_squad_adversarial` is an English model originally trained by stevemobs.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_combined_squad_adversarial_en_5.2.0_3.0_1701024854077.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_combined_squad_adversarial_en_5.2.0_3.0_1701024854077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_combined_squad_adversarial", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_combined_squad_adversarial", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_combined_squad_adversarial|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/stevemobs/distilbert-base-uncased-combined-squad-adversarial
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_diffuserconfuser_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_diffuserconfuser_en.md
new file mode 100644
index 000000000000..5ef1d82ed479
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_diffuserconfuser_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_diffuserconfuser DistilBertForQuestionAnswering from diffuserconfuser
+author: John Snow Labs
+name: distilbert_base_uncased_diffuserconfuser
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_diffuserconfuser` is an English model originally trained by diffuserconfuser.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_diffuserconfuser_en_5.2.0_3.0_1701014622283.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_diffuserconfuser_en_5.2.0_3.0_1701014622283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_diffuserconfuser", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_diffuserconfuser", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_diffuserconfuser|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/diffuserconfuser/distilbert-base-uncased
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_arinakosovskaia_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_arinakosovskaia_en.md
new file mode 100644
index 000000000000..22fc1f2e6211
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_arinakosovskaia_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_distilled_squad_arinakosovskaia DistilBertForQuestionAnswering from arinakosovskaia
+author: John Snow Labs
+name: distilbert_base_uncased_distilled_squad_arinakosovskaia
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_squad_arinakosovskaia` is an English model originally trained by arinakosovskaia.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_arinakosovskaia_en_5.2.0_3.0_1701025594161.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_arinakosovskaia_en_5.2.0_3.0_1701025594161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_distilled_squad_arinakosovskaia", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_distilled_squad_arinakosovskaia", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_distilled_squad_arinakosovskaia|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/arinakosovskaia/distilbert-base-uncased-distilled-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_coffee20230108_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_coffee20230108_en.md
new file mode 100644
index 000000000000..c145f2dce86f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_coffee20230108_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_distilled_squad_coffee20230108 DistilBertForQuestionAnswering from nejox
+author: John Snow Labs
+name: distilbert_base_uncased_distilled_squad_coffee20230108
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_squad_coffee20230108` is an English model originally trained by nejox.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_coffee20230108_en_5.2.0_3.0_1701023238662.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_coffee20230108_en_5.2.0_3.0_1701023238662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_distilled_squad_coffee20230108", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_distilled_squad_coffee20230108", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_distilled_squad_coffee20230108|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/nejox/distilbert-base-uncased-distilled-squad-coffee20230108
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_finetuned_squad_en.md
new file mode 100644
index 000000000000..dd0dab8c33f7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_finetuned_squad_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_distilled_squad_finetuned_squad DistilBertForQuestionAnswering from M4ycon
+author: John Snow Labs
+name: distilbert_base_uncased_distilled_squad_finetuned_squad
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_squad_finetuned_squad` is an English model originally trained by M4ycon.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_finetuned_squad_en_5.2.0_3.0_1701030255117.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_finetuned_squad_en_5.2.0_3.0_1701030255117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from pyspark.ml import Pipeline
+from sparknlp.base import *
+from sparknlp.annotator import *
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_distilled_squad_finetuned_squad", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+# `data` is a DataFrame with "question" and "context" string columns
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_distilled_squad_finetuned_squad", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+// `data` is a DataFrame with "question" and "context" string columns
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_squad_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/M4ycon/distilbert-base-uncased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_qa_model_chetna19_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_qa_model_chetna19_en.md new file mode 100644 index 000000000000..99b0148f87d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_qa_model_chetna19_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_squad_qa_model_chetna19 DistilBertForQuestionAnswering from Chetna19 +author: John Snow Labs +name: distilbert_base_uncased_distilled_squad_qa_model_chetna19 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_squad_qa_model_chetna19` is an English model originally trained by Chetna19.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_qa_model_chetna19_en_5.2.0_3.0_1701021368547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_qa_model_chetna19_en_5.2.0_3.0_1701021368547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_distilled_squad_qa_model_chetna19","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_distilled_squad_qa_model_chetna19", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_squad_qa_model_chetna19| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Chetna19/distilbert_base_uncased_distilled_squad_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_sarmila_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_sarmila_en.md new file mode 100644 index 000000000000..37bc329074d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_distilled_squad_sarmila_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_distilled_squad_sarmila DistilBertForQuestionAnswering from Sarmila +author: John Snow Labs +name: distilbert_base_uncased_distilled_squad_sarmila +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_distilled_squad_sarmila` is an English model originally trained by Sarmila.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_sarmila_en_5.2.0_3.0_1701022290355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_distilled_squad_sarmila_en_5.2.0_3.0_1701022290355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_distilled_squad_sarmila","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_distilled_squad_sarmila", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_distilled_squad_sarmila| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sarmila/distilbert-base-uncased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_fine_tuned_en.md new file mode 100644 index 000000000000..bea1a08d3040 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_fine_tuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_fine_tuned DistilBertForQuestionAnswering from nicotaroni +author: John Snow Labs +name: distilbert_base_uncased_fine_tuned +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_fine_tuned` is an English model originally trained by nicotaroni. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_en_5.2.0_3.0_1701020842154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_fine_tuned_en_5.2.0_3.0_1701020842154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_fine_tuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_fine_tuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nicotaroni/distilbert-base-uncased_fine_tuned_ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_amz_brander_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_amz_brander_en.md new file mode 100644 index 000000000000..a4c154e31e4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_amz_brander_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_amz_brander DistilBertForQuestionAnswering from Aleron12 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_amz_brander +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_amz_brander` is an English model originally trained by Aleron12.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amz_brander_en_5.2.0_3.0_1701029438075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_amz_brander_en_5.2.0_3.0_1701029438075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_amz_brander","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_amz_brander", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_amz_brander| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Aleron12/distilbert-base-uncased-finetuned-amz_brander \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_atuscol_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_atuscol_en.md new file mode 100644 index 000000000000..d65ef867b833 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_atuscol_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_atuscol DistilBertForQuestionAnswering from kpeyton +author: John Snow Labs +name: distilbert_base_uncased_finetuned_atuscol +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_atuscol` is an English model originally trained by kpeyton.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_atuscol_en_5.2.0_3.0_1701023241307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_atuscol_en_5.2.0_3.0_1701023241307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_atuscol","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_atuscol", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_atuscol| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kpeyton/distilbert-base-uncased-finetuned-atuscol \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_covdistilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_covdistilbert_en.md new file mode 100644 index 000000000000..d3cb7c0ffff4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_covdistilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_covdistilbert DistilBertForQuestionAnswering from juliusco +author: John Snow Labs +name: distilbert_base_uncased_finetuned_covdistilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_covdistilbert` is an English model originally trained by juliusco.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_covdistilbert_en_5.2.0_3.0_1701027254281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_covdistilbert_en_5.2.0_3.0_1701027254281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_covdistilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_covdistilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_covdistilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/juliusco/distilbert-base-uncased-finetuned-covdistilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_danlobo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_danlobo_en.md new file mode 100644 index 000000000000..74e6a2ab385b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_danlobo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cuad_danlobo DistilBertForQuestionAnswering from danlobo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cuad_danlobo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_danlobo` is an English model originally trained by danlobo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_danlobo_en_5.2.0_3.0_1701020926767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_danlobo_en_5.2.0_3.0_1701020926767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_danlobo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_cuad_danlobo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cuad_danlobo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/danlobo/distilbert-base-uncased-finetuned-cuad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_distilbert_en.md new file mode 100644 index 000000000000..0f5e7949fda5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cuad_distilbert DistilBertForQuestionAnswering from Gam +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cuad_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_distilbert` is an English model originally trained by Gam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_distilbert_en_5.2.0_3.0_1701033164587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_distilbert_en_5.2.0_3.0_1701033164587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_cuad_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cuad_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Gam/distilbert-base-uncased-finetuned-cuad-distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_2_en.md new file mode 100644 index 000000000000..5b29096afb2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cuad_smaller_2 DistilBertForQuestionAnswering from yogesh0502 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cuad_smaller_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_smaller_2` is an English model originally trained by yogesh0502.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_2_en_5.2.0_3.0_1701026084359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_2_en_5.2.0_3.0_1701026084359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_smaller_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_cuad_smaller_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_cuad_smaller_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yogesh0502/distilbert-base-uncased-finetuned-cuad_smaller_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_3_en.md new file mode 100644 index 000000000000..a8b3f4666ba9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_cuad_smaller_3 DistilBertForQuestionAnswering from yogesh0502 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_cuad_smaller_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_smaller_3` is an English model originally trained by yogesh0502.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_3_en_5.2.0_3.0_1701036885715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_3_en_5.2.0_3.0_1701036885715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_smaller_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_cuad_smaller_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_cuad_smaller_3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/yogesh0502/distilbert-base-uncased-finetuned-cuad_smaller_3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_4_en.md
new file mode 100644
index 000000000000..7f1444aa1e89
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_4_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_cuad_smaller_4 DistilBertForQuestionAnswering from yogesh0502
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_cuad_smaller_4
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_smaller_4` is an English model originally trained by yogesh0502.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_4_en_5.2.0_3.0_1701025183874.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_4_en_5.2.0_3.0_1701025183874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_smaller_4","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_cuad_smaller_4", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_cuad_smaller_4|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/yogesh0502/distilbert-base-uncased-finetuned-cuad_smaller_4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_en.md
new file mode 100644
index 000000000000..c0907561d74a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_cuad_smaller_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_cuad_smaller DistilBertForQuestionAnswering from yogesh0502
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_cuad_smaller
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_cuad_smaller` is an English model originally trained by yogesh0502.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_en_5.2.0_3.0_1701041773214.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_cuad_smaller_en_5.2.0_3.0_1701041773214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_cuad_smaller","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_cuad_smaller", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_cuad_smaller|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/yogesh0502/distilbert-base-uncased-finetuned-cuad_smaller
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_en.md
new file mode 100644
index 000000000000..0bca3cc4f787
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_diabetes DistilBertForQuestionAnswering from LeWince
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_diabetes
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_diabetes` is an English model originally trained by LeWince.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_en_5.2.0_3.0_1701042550836.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_en_5.2.0_3.0_1701042550836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_diabetes","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_diabetes", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_diabetes|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/LeWince/distilbert-base-uncased-finetuned-diabetes
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_v2_en.md
new file mode 100644
index 000000000000..93d4d6ff4588
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_diabetes_v2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_diabetes_v2 DistilBertForQuestionAnswering from LeWince
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_diabetes_v2
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_diabetes_v2` is an English model originally trained by LeWince.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_v2_en_5.2.0_3.0_1701018282956.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_v2_en_5.2.0_3.0_1701018282956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_diabetes_v2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_diabetes_v2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_diabetes_v2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/LeWince/distilbert-base-uncased-finetuned-diabetes-v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_en.md
new file mode 100644
index 000000000000..458781caafdc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned DistilBertForQuestionAnswering from stig
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned` is an English model originally trained by stig.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_en_5.2.0_3.0_1701024435385.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_en_5.2.0_3.0_1701024435385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/stig/distilbert-base-uncased-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_fira_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_fira_en.md
new file mode 100644
index 000000000000..2347fc5b3f88
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_fira_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_fira DistilBertForQuestionAnswering from ThaisBeham
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_fira
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_fira` is an English model originally trained by ThaisBeham.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_fira_en_5.2.0_3.0_1701027522372.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_fira_en_5.2.0_3.0_1701027522372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_fira","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_fira", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_fira|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ThaisBeham/distilbert-base-uncased-finetuned-fira
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_h2physics_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_h2physics_en.md
new file mode 100644
index 000000000000..0ff4eb854011
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_h2physics_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_h2physics DistilBertForQuestionAnswering from danielcwq
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_h2physics
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_h2physics` is an English model originally trained by danielcwq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_h2physics_en_5.2.0_3.0_1701016348797.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_h2physics_en_5.2.0_3.0_1701016348797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_h2physics","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_h2physics", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_h2physics|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/danielcwq/distilbert-base-uncased-finetuned-H2Physics
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_hotpot_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_hotpot_qa_en.md
new file mode 100644
index 000000000000..132c8913ed3f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_hotpot_qa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_hotpot_qa DistilBertForQuestionAnswering from vish88
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_hotpot_qa
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_hotpot_qa` is an English model originally trained by vish88.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_hotpot_qa_en_5.2.0_3.0_1701018706580.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_hotpot_qa_en_5.2.0_3.0_1701018706580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_hotpot_qa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_hotpot_qa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_hotpot_qa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/vish88/distilbert-base-uncased-finetuned-hotpot_qa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_legal_data_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_legal_data_en.md
new file mode 100644
index 000000000000..a87140c239cd
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_legal_data_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_legal_data DistilBertForQuestionAnswering from MariamD
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_legal_data
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_legal_data` is an English model originally trained by MariamD.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_legal_data_en_5.2.0_3.0_1701016348121.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_legal_data_en_5.2.0_3.0_1701016348121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_legal_data","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_legal_data", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_finetuned_legal_data|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/MariamD/distilbert-base-uncased-finetuned-legal_data
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_org_address_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_org_address_en.md
new file mode 100644
index 000000000000..75b04cfabf25
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_org_address_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_org_address DistilBertForQuestionAnswering from mansee
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_org_address
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_org_address` is an English model originally trained by mansee.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_org_address_en_5.2.0_3.0_1701017714366.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_org_address_en_5.2.0_3.0_1701017714366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_org_address","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_finetuned_org_address", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_org_address| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mansee/distilbert-base-uncased-finetuned-org-address \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_policies_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_policies_en.md new file mode 100644 index 000000000000..77c026cea82e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_policies_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_policies DistilBertForQuestionAnswering from Ineract +author: John Snow Labs +name: distilbert_base_uncased_finetuned_policies +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_policies` is an English model originally trained by Ineract.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_policies_en_5.2.0_3.0_1701025448553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_policies_en_5.2.0_3.0_1701025448553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_policies","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_policies", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_policies| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Ineract/distilbert-base-uncased-finetuned-policies \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedbykrs_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedbykrs_en.md new file mode 100644 index 000000000000..709bbe13b12d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedbykrs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pubmedbykrs DistilBertForQuestionAnswering from pythonist +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pubmedbykrs +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pubmedbykrs` is an English model originally trained by pythonist.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pubmedbykrs_en_5.2.0_3.0_1701025745481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pubmedbykrs_en_5.2.0_3.0_1701025745481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_pubmedbykrs","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_pubmedbykrs", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pubmedbykrs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pythonist/distilbert-base-uncased-finetuned-pubmedbykrs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedqa_en.md new file mode 100644 index 000000000000..919588c7e284 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_pubmedqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_pubmedqa DistilBertForQuestionAnswering from pythonist +author: John Snow Labs +name: distilbert_base_uncased_finetuned_pubmedqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_pubmedqa` is an English model originally trained by pythonist.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pubmedqa_en_5.2.0_3.0_1701017720875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_pubmedqa_en_5.2.0_3.0_1701017720875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_pubmedqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_pubmedqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_pubmedqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pythonist/distilbert-base-uncased-finetuned-PubmedQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_qa_en.md new file mode 100644 index 000000000000..853fbcc85e80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_qa DistilBertForQuestionAnswering from Sneka +author: John Snow Labs +name: distilbert_base_uncased_finetuned_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_qa` is an English model originally trained by Sneka.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qa_en_5.2.0_3.0_1701018704193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_qa_en_5.2.0_3.0_1701018704193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sneka/distilbert-base-uncased-finetuned-QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_sqaud_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_sqaud_en.md new file mode 100644 index 000000000000..ba972ca5cd18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_sqaud_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_sqaud DistilBertForQuestionAnswering from adisomani +author: John Snow Labs +name: distilbert_base_uncased_finetuned_sqaud +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_sqaud` is an English model originally trained by adisomani.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sqaud_en_5.2.0_3.0_1701020926873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_sqaud_en_5.2.0_3.0_1701020926873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_sqaud","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_sqaud", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_sqaud| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adisomani/distilbert-base-uncased-finetuned-sqaud \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squac_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squac_en.md new file mode 100644 index 000000000000..eeb1e89a2e2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squac_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squac DistilBertForQuestionAnswering from adrinanou +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squac +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squac` is an English model originally trained by adrinanou.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squac_en_5.2.0_3.0_1701040814986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squac_en_5.2.0_3.0_1701040814986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squac","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squac", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adrinanou/distilbert-base-uncased-finetuned-squac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad1_en.md new file mode 100644 index 000000000000..f71027af329f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad1 DistilBertForQuestionAnswering from sasuke +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad1` is an English model originally trained by sasuke.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad1_en_5.2.0_3.0_1701036632387.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad1_en_5.2.0_3.0_1701036632387.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sasuke/distilbert-base-uncased-finetuned-squad1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_0_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_0_en.md new file mode 100644 index 000000000000..fa9cf4e27a79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad2_0 DistilBertForQuestionAnswering from lauraparra28 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad2_0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad2_0` is an English model originally trained by lauraparra28.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_0_en_5.2.0_3.0_1701041093113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_0_en_5.2.0_3.0_1701041093113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad2_0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad2_0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad2_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lauraparra28/Distilbert-base-uncased-finetuned-SQuAD2.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_iproject_10_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_iproject_10_en.md new file mode 100644 index 000000000000..c2e8dc2f8300 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_iproject_10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad2_iproject_10 DistilBertForQuestionAnswering from IProject-10 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad2_iproject_10 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad2_iproject_10` is an English model originally trained by IProject-10.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_iproject_10_en_5.2.0_3.0_1701033281491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_iproject_10_en_5.2.0_3.0_1701033281491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad2_iproject_10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad2_iproject_10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad2_iproject_10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/IProject-10/distilbert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_kevinbror_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_kevinbror_en.md new file mode 100644 index 000000000000..617aa9daf6b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_kevinbror_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad2_kevinbror DistilBertForQuestionAnswering from kevinbror +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad2_kevinbror +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad2_kevinbror` is an English model originally trained by kevinbror.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_kevinbror_en_5.2.0_3.0_1701018514283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_kevinbror_en_5.2.0_3.0_1701018514283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad2_kevinbror","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad2_kevinbror", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad2_kevinbror| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kevinbror/distilbert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_zeroro80_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_zeroro80_en.md new file mode 100644 index 000000000000..94fd3093e46a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2_zeroro80_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad2_zeroro80 DistilBertForQuestionAnswering from zeroro80 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad2_zeroro80 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad2_zeroro80` is an English model originally trained by zeroro80.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_zeroro80_en_5.2.0_3.0_1701025035878.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2_zeroro80_en_5.2.0_3.0_1701025035878.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad2_zeroro80","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad2_zeroro80", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad2_zeroro80| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zeroro80/distilbert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2test1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2test1_en.md new file mode 100644 index 000000000000..f4711409a084 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad2test1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad2test1 DistilBertForQuestionAnswering from zeroro80 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad2test1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad2test1` is an English model originally trained by zeroro80.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2test1_en_5.2.0_3.0_1701028216359.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad2test1_en_5.2.0_3.0_1701028216359.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad2test1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad2test1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad2test1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zeroro80/distilbert-base-uncased-finetuned-squad2test1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1987kostya_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1987kostya_en.md new file mode 100644 index 000000000000..6a2dc953dd9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1987kostya_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_1987kostya DistilBertForQuestionAnswering from 1987kostya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_1987kostya +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_1987kostya` is an English model originally trained by 1987kostya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_1987kostya_en_5.2.0_3.0_1701031946358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_1987kostya_en_5.2.0_3.0_1701031946358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_1987kostya","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_1987kostya", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_1987kostya| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/1987kostya/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1_en.md new file mode 100644 index 000000000000..e9bba6f00f33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_1 DistilBertForQuestionAnswering from ChutianTao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_1` is an English model originally trained by ChutianTao.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_1_en_5.2.0_3.0_1701030640507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_1_en_5.2.0_3.0_1701030640507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ChutianTao/distilbert-base-uncased-finetuned-squad-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2020uee0139_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2020uee0139_en.md new file mode 100644 index 000000000000..8966f7ceb083 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2020uee0139_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_2020uee0139 DistilBertForQuestionAnswering from 2020uee0139 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_2020uee0139 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_2020uee0139` is an English model originally trained by 2020uee0139.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2020uee0139_en_5.2.0_3.0_1701021806297.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2020uee0139_en_5.2.0_3.0_1701021806297.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_2020uee0139","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_2020uee0139", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_2020uee0139| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/2020uee0139/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_384_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_384_1_en.md new file mode 100644 index 000000000000..5c61ded8aeeb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_384_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_2_384_1 DistilBertForQuestionAnswering from raisinbl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_2_384_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_2_384_1` is an English model originally trained by raisinbl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_384_1_en_5.2.0_3.0_1701032063098.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_384_1_en_5.2.0_3.0_1701032063098.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_2_384_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_2_384_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_2_384_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/raisinbl/distilbert-base-uncased-finetuned-squad_2_384_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_512_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_512_1_en.md new file mode 100644 index 000000000000..4697564192fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_512_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_2_512_1 DistilBertForQuestionAnswering from raisinbl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_2_512_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_2_512_1` is an English model originally trained by raisinbl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_512_1_en_5.2.0_3.0_1701027393445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_512_1_en_5.2.0_3.0_1701027393445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_2_512_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_2_512_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_2_512_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/raisinbl/distilbert-base-uncased-finetuned-squad_2_512_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_en.md new file mode 100644 index 000000000000..704a3c401ab7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_2 DistilBertForQuestionAnswering from ChutianTao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_2` is an English model originally trained by ChutianTao.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_en_5.2.0_3.0_1701027482530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_2_en_5.2.0_3.0_1701027482530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ChutianTao/distilbert-base-uncased-finetuned-squad-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_745h1n_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_745h1n_en.md new file mode 100644 index 000000000000..ea937c330156 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_745h1n_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_745h1n DistilBertForQuestionAnswering from 745H1N +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_745h1n +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_745h1n` is an English model originally trained by 745H1N.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_745h1n_en_5.2.0_3.0_1701027255617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_745h1n_en_5.2.0_3.0_1701027255617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_745h1n","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_745h1n", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_745h1n| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/745H1N/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_a4_q3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_a4_q3_en.md new file mode 100644 index 000000000000..4b4e8e99d8ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_a4_q3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_a4_q3 DistilBertForQuestionAnswering from qiny17 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_a4_q3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_a4_q3` is an English model originally trained by qiny17.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_a4_q3_en_5.2.0_3.0_1701027925072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_a4_q3_en_5.2.0_3.0_1701027925072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_a4_q3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_a4_q3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_a4_q3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/qiny17/distilbert-base-uncased-finetuned-squad-a4-q3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ab20211112_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ab20211112_en.md new file mode 100644 index 000000000000..71177d688c52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ab20211112_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ab20211112 DistilBertForQuestionAnswering from ab20211112 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ab20211112 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ab20211112` is an English model originally trained by ab20211112.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ab20211112_en_5.2.0_3.0_1701017858246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ab20211112_en_5.2.0_3.0_1701017858246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ab20211112","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ab20211112", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ab20211112| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ab20211112/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_abbynewcomb_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_abbynewcomb_en.md new file mode 100644 index 000000000000..8d130ac1cafc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_abbynewcomb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_abbynewcomb DistilBertForQuestionAnswering from abbynewcomb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_abbynewcomb +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_abbynewcomb` is an English model originally trained by abbynewcomb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_abbynewcomb_en_5.2.0_3.0_1701027096060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_abbynewcomb_en_5.2.0_3.0_1701027096060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_abbynewcomb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_abbynewcomb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_abbynewcomb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abbynewcomb/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_acremoux3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_acremoux3_en.md new file mode 100644 index 000000000000..478fbaf94051 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_acremoux3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_acremoux3 DistilBertForQuestionAnswering from acremoux3 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_acremoux3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_acremoux3` is an English model originally trained by acremoux3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_acremoux3_en_5.2.0_3.0_1701020989991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_acremoux3_en_5.2.0_3.0_1701020989991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_acremoux3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_acremoux3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_acremoux3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/acremoux3/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_adilhafeez_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_adilhafeez_en.md new file mode 100644 index 000000000000..52ca21b5252a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_adilhafeez_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_adilhafeez DistilBertForQuestionAnswering from adilhafeez +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_adilhafeez +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_adilhafeez` is an English model originally trained by adilhafeez.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_adilhafeez_en_5.2.0_3.0_1701013789814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_adilhafeez_en_5.2.0_3.0_1701013789814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_adilhafeez","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_adilhafeez", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_adilhafeez| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adilhafeez/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aglasnovic_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aglasnovic_en.md new file mode 100644 index 000000000000..76c80c3873aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aglasnovic_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_aglasnovic DistilBertForQuestionAnswering from aglasnovic +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_aglasnovic +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_aglasnovic` is an English model originally trained by aglasnovic.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_aglasnovic_en_5.2.0_3.0_1701021658416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_aglasnovic_en_5.2.0_3.0_1701021658416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_aglasnovic","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_aglasnovic", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_aglasnovic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aglasnovic/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahcene_ikram_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahcene_ikram_en.md new file mode 100644 index 000000000000..74fec84d0a2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahcene_ikram_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ahcene_ikram DistilBertForQuestionAnswering from ahcene-ikram +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ahcene_ikram +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ahcene_ikram` is an English model originally trained by ahcene-ikram.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahcene_ikram_en_5.2.0_3.0_1701018709074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahcene_ikram_en_5.2.0_3.0_1701018709074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ahcene_ikram","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ahcene_ikram", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ahcene_ikram| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ahcene-ikram/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahirtonlopes_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahirtonlopes_en.md new file mode 100644 index 000000000000..5082a2fe135c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahirtonlopes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ahirtonlopes DistilBertForQuestionAnswering from ahirtonlopes +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ahirtonlopes +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ahirtonlopes` is an English model originally trained by ahirtonlopes.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahirtonlopes_en_5.2.0_3.0_1701022848040.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahirtonlopes_en_5.2.0_3.0_1701022848040.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ahirtonlopes","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ahirtonlopes", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ahirtonlopes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ahirtonlopes/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahujaniharika95_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahujaniharika95_en.md new file mode 100644 index 000000000000..986ff6d796cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ahujaniharika95_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ahujaniharika95 DistilBertForQuestionAnswering from ahujaniharika95 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ahujaniharika95 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ahujaniharika95` is an English model originally trained by ahujaniharika95.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahujaniharika95_en_5.2.0_3.0_1701029295456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ahujaniharika95_en_5.2.0_3.0_1701029295456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ahujaniharika95","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ahujaniharika95", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ahujaniharika95| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ahujaniharika95/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aindrakumar26_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aindrakumar26_en.md new file mode 100644 index 000000000000..b949b1afb028 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_aindrakumar26_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_aindrakumar26 DistilBertForQuestionAnswering from aindrakumar26 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_aindrakumar26 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_aindrakumar26` is an English model originally trained by aindrakumar26.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_aindrakumar26_en_5.2.0_3.0_1701018535474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_aindrakumar26_en_5.2.0_3.0_1701018535474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_aindrakumar26","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_aindrakumar26", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_aindrakumar26| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aindrakumar26/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ak987_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ak987_en.md new file mode 100644 index 000000000000..2b8f3a0f2070 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ak987_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ak987 DistilBertForQuestionAnswering from ak987 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ak987 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ak987` is an English model originally trained by ak987.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ak987_en_5.2.0_3.0_1701024245944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ak987_en_5.2.0_3.0_1701024245944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ak987","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ak987", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ak987| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ak987/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akamsali_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akamsali_en.md new file mode 100644 index 000000000000..f42bb416a499 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akamsali_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_akamsali DistilBertForQuestionAnswering from akamsali +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_akamsali +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_akamsali` is an English model originally trained by akamsali.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_akamsali_en_5.2.0_3.0_1701024853949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_akamsali_en_5.2.0_3.0_1701024853949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_akamsali","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_akamsali", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_akamsali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akamsali/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akrishna5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akrishna5_en.md new file mode 100644 index 000000000000..9d5eb17cb367 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_akrishna5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_akrishna5 DistilBertForQuestionAnswering from akrishna5 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_akrishna5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_akrishna5` is an English model originally trained by akrishna5.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_akrishna5_en_5.2.0_3.0_1701026225060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_akrishna5_en_5.2.0_3.0_1701026225060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_akrishna5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_akrishna5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_akrishna5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akrishna5/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_albertz_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_albertz_en.md new file mode 100644 index 000000000000..c5ede8902269 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_albertz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_albertz DistilBertForQuestionAnswering from albertz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_albertz +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_albertz` is an English model originally trained by albertz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_albertz_en_5.2.0_3.0_1701018699793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_albertz_en_5.2.0_3.0_1701018699793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_albertz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_albertz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_albertz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/albertz/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_alexperkin_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_alexperkin_en.md new file mode 100644 index 000000000000..7d6611653149 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_alexperkin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_alexperkin DistilBertForQuestionAnswering from AlexPerkin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_alexperkin +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_alexperkin` is an English model originally trained by AlexPerkin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_alexperkin_en_5.2.0_3.0_1701018109502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_alexperkin_en_5.2.0_3.0_1701018109502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_alexperkin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_alexperkin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_alexperkin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AlexPerkin/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anasaqsme_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anasaqsme_en.md new file mode 100644 index 000000000000..43092510bd9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anasaqsme_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_anasaqsme DistilBertForQuestionAnswering from anasaqsme +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_anasaqsme +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_anasaqsme` is an English model originally trained by anasaqsme.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anasaqsme_en_5.2.0_3.0_1701019138938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anasaqsme_en_5.2.0_3.0_1701019138938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_anasaqsme","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_anasaqsme", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_anasaqsme| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anasaqsme/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_annt5396_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_annt5396_en.md new file mode 100644 index 000000000000..a33147bb90c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_annt5396_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_annt5396 DistilBertForQuestionAnswering from annt5396 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_annt5396 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_annt5396` is an English model originally trained by annt5396.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_annt5396_en_5.2.0_3.0_1701029750255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_annt5396_en_5.2.0_3.0_1701029750255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_annt5396","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_annt5396", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_annt5396| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/annt5396/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anogmeld_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anogmeld_en.md new file mode 100644 index 000000000000..69972bf53fed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anogmeld_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_anogmeld DistilBertForQuestionAnswering from Anogmeld +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_anogmeld +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_anogmeld` is an English model originally trained by Anogmeld.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anogmeld_en_5.2.0_3.0_1701025033545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anogmeld_en_5.2.0_3.0_1701025033545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_anogmeld","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_anogmeld", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_anogmeld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Anogmeld/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anu24_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anu24_en.md new file mode 100644 index 000000000000..06c3d877d740 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anu24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_anu24 DistilBertForQuestionAnswering from anu24 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_anu24 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_anu24` is an English model originally trained by anu24.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anu24_en_5.2.0_3.0_1701030800754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anu24_en_5.2.0_3.0_1701030800754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_anu24","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_anu24", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_anu24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anu24/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anuran_roy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anuran_roy_en.md new file mode 100644 index 000000000000..b0469b33472e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_anuran_roy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_anuran_roy DistilBertForQuestionAnswering from anuran-roy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_anuran_roy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_anuran_roy` is an English model originally trained by anuran-roy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anuran_roy_en_5.2.0_3.0_1701029482337.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anuran_roy_en_5.2.0_3.0_1701029482337.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_anuran_roy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_anuran_roy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_anuran_roy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anuran-roy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arpitsharma_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arpitsharma_en.md new file mode 100644 index 000000000000..3ceddcad93e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arpitsharma_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_arpitsharma DistilBertForQuestionAnswering from ArpitSharma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_arpitsharma +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_arpitsharma` is an English model originally trained by ArpitSharma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arpitsharma_en_5.2.0_3.0_1701033430263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arpitsharma_en_5.2.0_3.0_1701033430263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_arpitsharma","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_arpitsharma", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_arpitsharma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ArpitSharma/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arshiya20_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arshiya20_en.md new file mode 100644 index 000000000000..5e0a0186119f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arshiya20_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_arshiya20 DistilBertForQuestionAnswering from arshiya20 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_arshiya20 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_arshiya20` is an English model originally trained by arshiya20.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arshiya20_en_5.2.0_3.0_1701021272634.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arshiya20_en_5.2.0_3.0_1701021272634.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_arshiya20","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_arshiya20", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_arshiya20| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/arshiya20/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arunkumar629_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arunkumar629_en.md new file mode 100644 index 000000000000..4f19f108d970 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_arunkumar629_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_arunkumar629 DistilBertForQuestionAnswering from arunkumar629 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_arunkumar629 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_arunkumar629` is an English model originally trained by arunkumar629.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arunkumar629_en_5.2.0_3.0_1701024896069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_arunkumar629_en_5.2.0_3.0_1701024896069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_arunkumar629","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_arunkumar629", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_arunkumar629| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/arunkumar629/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashhyun_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashhyun_en.md new file mode 100644 index 000000000000..88cab80f47de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashhyun_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ashhyun DistilBertForQuestionAnswering from ashhyun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ashhyun +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ashhyun` is an English model originally trained by ashhyun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashhyun_en_5.2.0_3.0_1701030372826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashhyun_en_5.2.0_3.0_1701030372826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ashhyun","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ashhyun", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ashhyun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ashhyun/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashutoshyadav4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashutoshyadav4_en.md new file mode 100644 index 000000000000..943ac5ab3a59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ashutoshyadav4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ashutoshyadav4 DistilBertForQuestionAnswering from ashutoshyadav4 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ashutoshyadav4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ashutoshyadav4` is an English model originally trained by ashutoshyadav4.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashutoshyadav4_en_5.2.0_3.0_1701020384643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashutoshyadav4_en_5.2.0_3.0_1701020384643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ashutoshyadav4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ashutoshyadav4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ashutoshyadav4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ashutoshyadav4/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_asmaa_ali_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_asmaa_ali_en.md new file mode 100644 index 000000000000..08a1905fbcc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_asmaa_ali_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_asmaa_ali DistilBertForQuestionAnswering from asmaa-ali +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_asmaa_ali +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_asmaa_ali` is an English model originally trained by asmaa-ali.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_asmaa_ali_en_5.2.0_3.0_1701042704367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_asmaa_ali_en_5.2.0_3.0_1701042704367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_asmaa_ali","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_asmaa_ali", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_asmaa_ali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/asmaa-ali/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_atoivat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_atoivat_en.md new file mode 100644 index 000000000000..734a368afaef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_atoivat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_atoivat DistilBertForQuestionAnswering from atoivat +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_atoivat +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_atoivat` is an English model originally trained by atoivat.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_atoivat_en_5.2.0_3.0_1701031274429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_atoivat_en_5.2.0_3.0_1701031274429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_atoivat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_atoivat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_atoivat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/atoivat/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_augustin99_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_augustin99_en.md new file mode 100644 index 000000000000..e7b6ae8fd700 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_augustin99_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_augustin99 DistilBertForQuestionAnswering from Augustin99 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_augustin99 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_augustin99` is an English model originally trained by Augustin99.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_augustin99_en_5.2.0_3.0_1701036123783.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_augustin99_en_5.2.0_3.0_1701036123783.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_augustin99","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_augustin99", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_augustin99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Augustin99/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_averyb123_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_averyb123_en.md new file mode 100644 index 000000000000..bd8c97931968 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_averyb123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_averyb123 DistilBertForQuestionAnswering from averyb123 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_averyb123 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_averyb123` is an English model originally trained by averyb123.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_averyb123_en_5.2.0_3.0_1701020689008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_averyb123_en_5.2.0_3.0_1701020689008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_averyb123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_averyb123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_averyb123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/averyb123/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_b_mu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_b_mu_en.md new file mode 100644 index 000000000000..f2f0d94eff4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_b_mu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_b_mu DistilBertForQuestionAnswering from b-mu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_b_mu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_b_mu` is an English model originally trained by b-mu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_b_mu_en_5.2.0_3.0_1701029923377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_b_mu_en_5.2.0_3.0_1701029923377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_b_mu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_b_mu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_b_mu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/b-mu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_babs001seye_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_babs001seye_en.md new file mode 100644 index 000000000000..b15155df916e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_babs001seye_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_babs001seye DistilBertForQuestionAnswering from babs001seye +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_babs001seye +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_babs001seye` is an English model originally trained by babs001seye.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_babs001seye_en_5.2.0_3.0_1701029612835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_babs001seye_en_5.2.0_3.0_1701029612835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_babs001seye","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_babs001seye", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_babs001seye| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/babs001seye/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_badokorach_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_badokorach_en.md new file mode 100644 index 000000000000..27400fabd2a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_badokorach_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_badokorach DistilBertForQuestionAnswering from badokorach +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_badokorach +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_badokorach` is an English model originally trained by badokorach.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_badokorach_en_5.2.0_3.0_1701017860743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_badokorach_en_5.2.0_3.0_1701017860743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_badokorach","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_badokorach", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_badokorach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/badokorach/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bagusdp_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bagusdp_en.md new file mode 100644 index 000000000000..f9df503b7146 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bagusdp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_bagusdp DistilBertForQuestionAnswering from BagusDP +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_bagusdp +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_bagusdp` is an English model originally trained by BagusDP.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bagusdp_en_5.2.0_3.0_1701016684901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bagusdp_en_5.2.0_3.0_1701016684901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_bagusdp","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_bagusdp", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_bagusdp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BagusDP/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_balivickas_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_balivickas_en.md new file mode 100644 index 000000000000..389b5aa5bb10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_balivickas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_balivickas DistilBertForQuestionAnswering from balivickas +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_balivickas +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_balivickas` is an English model originally trained by balivickas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_balivickas_en_5.2.0_3.0_1701017000361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_balivickas_en_5.2.0_3.0_1701017000361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_balivickas","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_balivickas", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_balivickas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/balivickas/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_baru98_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_baru98_en.md new file mode 100644 index 000000000000..b83221e0ad7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_baru98_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_baru98 DistilBertForQuestionAnswering from baru98 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_baru98 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_baru98` is an English model originally trained by baru98.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_baru98_en_5.2.0_3.0_1701031132848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_baru98_en_5.2.0_3.0_1701031132848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_baru98","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_baru98", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_baru98| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/baru98/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bhan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bhan_en.md new file mode 100644 index 000000000000..78ddff05cb32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bhan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_bhan DistilBertForQuestionAnswering from bhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_bhan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_bhan` is an English model originally trained by bhan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bhan_en_5.2.0_3.0_1701017365405.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bhan_en_5.2.0_3.0_1701017365405.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_bhan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_bhan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_bhan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bhan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_billzou_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_billzou_en.md new file mode 100644 index 000000000000..66171d659972 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_billzou_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_billzou DistilBertForQuestionAnswering from BillZou +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_billzou +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_billzou` is an English model originally trained by BillZou.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_billzou_en_5.2.0_3.0_1701019693892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_billzou_en_5.2.0_3.0_1701019693892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_billzou","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_billzou", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_billzou| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BillZou/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_botika_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_botika_en.md new file mode 100644 index 000000000000..525ead02ba53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_botika_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_botika DistilBertForQuestionAnswering from botika +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_botika +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_botika` is an English model originally trained by botika.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_botika_en_5.2.0_3.0_1701028921097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_botika_en_5.2.0_3.0_1701028921097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_botika","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_botika", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_botika| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/botika/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_brucezjc_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_brucezjc_en.md new file mode 100644 index 000000000000..a51426180604 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_brucezjc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_brucezjc DistilBertForQuestionAnswering from BruceZJC +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_brucezjc +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_brucezjc` is an English model originally trained by BruceZJC.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_brucezjc_en_5.2.0_3.0_1701029930505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_brucezjc_en_5.2.0_3.0_1701029930505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_brucezjc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_brucezjc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_brucezjc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BruceZJC/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bunjuk_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bunjuk_en.md new file mode 100644 index 000000000000..570056e7b25a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bunjuk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_bunjuk DistilBertForQuestionAnswering from Bunjuk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_bunjuk +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_bunjuk` is an English model originally trained by Bunjuk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bunjuk_en_5.2.0_3.0_1701027461804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bunjuk_en_5.2.0_3.0_1701027461804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_bunjuk","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_bunjuk", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_bunjuk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Bunjuk/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bwaj_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bwaj_en.md new file mode 100644 index 000000000000..88bb782aa7d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_bwaj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_bwaj DistilBertForQuestionAnswering from bwaj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_bwaj +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_bwaj` is an English model originally trained by bwaj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bwaj_en_5.2.0_3.0_1701034097075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bwaj_en_5.2.0_3.0_1701034097075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_bwaj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_bwaj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_bwaj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bwaj/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_carolgao66_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_carolgao66_en.md new file mode 100644 index 000000000000..e95835da5776 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_carolgao66_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_carolgao66 DistilBertForQuestionAnswering from carolgao66 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_carolgao66 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_carolgao66` is an English model originally trained by carolgao66. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_carolgao66_en_5.2.0_3.0_1701018599799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_carolgao66_en_5.2.0_3.0_1701018599799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_carolgao66","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_carolgao66", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_carolgao66| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/carolgao66/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cavazd_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cavazd_en.md new file mode 100644 index 000000000000..0a6b10e88b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cavazd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_cavazd DistilBertForQuestionAnswering from cavazd +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_cavazd +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_cavazd` is an English model originally trained by cavazd. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cavazd_en_5.2.0_3.0_1701038573073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cavazd_en_5.2.0_3.0_1701038573073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_cavazd","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_cavazd", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_cavazd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cavazd/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_changjin_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_changjin_en.md new file mode 100644 index 000000000000..c1a1898adb65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_changjin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_changjin DistilBertForQuestionAnswering from changjin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_changjin +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_changjin` is an English model originally trained by changjin. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_changjin_en_5.2.0_3.0_1701033587661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_changjin_en_5.2.0_3.0_1701033587661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_changjin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_changjin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_changjin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/changjin/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_charinet_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_charinet_en.md new file mode 100644 index 000000000000..6fef44fb95a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_charinet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_charinet DistilBertForQuestionAnswering from Charinet +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_charinet +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_charinet` is an English model originally trained by Charinet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_charinet_en_5.2.0_3.0_1701029619686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_charinet_en_5.2.0_3.0_1701029619686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_charinet","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_charinet", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_charinet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Charinet/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chiendvhust_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chiendvhust_en.md new file mode 100644 index 000000000000..cdaaab3ea19c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chiendvhust_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_chiendvhust DistilBertForQuestionAnswering from chiendvhust +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_chiendvhust +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_chiendvhust` is an English model originally trained by chiendvhust. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chiendvhust_en_5.2.0_3.0_1701019134218.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chiendvhust_en_5.2.0_3.0_1701019134218.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_chiendvhust","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_chiendvhust", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_chiendvhust| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chiendvhust/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_choz_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_choz_en.md new file mode 100644 index 000000000000..fc0481fa77fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_choz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_choz DistilBertForQuestionAnswering from choz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_choz +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_choz` is an English model originally trained by choz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_choz_en_5.2.0_3.0_1701043034762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_choz_en_5.2.0_3.0_1701043034762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_choz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_choz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_choz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/choz/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chuchun9_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chuchun9_en.md new file mode 100644 index 000000000000..bee71b30879f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_chuchun9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_chuchun9 DistilBertForQuestionAnswering from chuchun9 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_chuchun9 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_chuchun9` is an English model originally trained by chuchun9. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chuchun9_en_5.2.0_3.0_1701025312911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chuchun9_en_5.2.0_3.0_1701025312911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_chuchun9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_chuchun9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_chuchun9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chuchun9/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ckadam15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ckadam15_en.md new file mode 100644 index 000000000000..fc575a6e7636 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ckadam15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ckadam15 DistilBertForQuestionAnswering from ckadam15 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ckadam15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ckadam15` is an English model originally trained by ckadam15. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ckadam15_en_5.2.0_3.0_1701039467715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ckadam15_en_5.2.0_3.0_1701039467715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ckadam15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ckadam15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ckadam15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ckadam15/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cosmo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cosmo_en.md new file mode 100644 index 000000000000..4adb2c7be9ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cosmo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_cosmo DistilBertForQuestionAnswering from cosmo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_cosmo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_cosmo` is an English model originally trained by cosmo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cosmo_en_5.2.0_3.0_1701019899181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cosmo_en_5.2.0_3.0_1701019899181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_cosmo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_cosmo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_cosmo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cosmo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_crepot_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_crepot_en.md new file mode 100644 index 000000000000..0a7b38e2bfe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_crepot_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_crepot DistilBertForQuestionAnswering from Crepot +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_crepot +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_crepot` is an English model originally trained by Crepot. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_crepot_en_5.2.0_3.0_1701020326171.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_crepot_en_5.2.0_3.0_1701020326171.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_crepot","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_crepot", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_crepot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Crepot/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cv43_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cv43_en.md new file mode 100644 index 000000000000..a543d27095a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_cv43_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_cv43 DistilBertForQuestionAnswering from cv43 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_cv43 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_cv43` is an English model originally trained by cv43.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cv43_en_5.2.0_3.0_1701029782115.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_cv43_en_5.2.0_3.0_1701029782115.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_cv43","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_cv43", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_cv43| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cv43/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_daidv1112_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_daidv1112_en.md new file mode 100644 index 000000000000..ef7044d20de8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_daidv1112_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_daidv1112 DistilBertForQuestionAnswering from daidv1112 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_daidv1112 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_daidv1112` is an English model originally trained by daidv1112.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_daidv1112_en_5.2.0_3.0_1701021509410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_daidv1112_en_5.2.0_3.0_1701021509410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_daidv1112","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_daidv1112", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_daidv1112| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/daidv1112/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_damdauvaotran_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_damdauvaotran_en.md new file mode 100644 index 000000000000..e7db7d36785b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_damdauvaotran_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_damdauvaotran DistilBertForQuestionAnswering from damdauvaotran +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_damdauvaotran +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_damdauvaotran` is an English model originally trained by damdauvaotran.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_damdauvaotran_en_5.2.0_3.0_1701036582601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_damdauvaotran_en_5.2.0_3.0_1701036582601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_damdauvaotran","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_damdauvaotran", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_damdauvaotran| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/damdauvaotran/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dannycho1530_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dannycho1530_en.md new file mode 100644 index 000000000000..a5f92e2d0286 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dannycho1530_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dannycho1530 DistilBertForQuestionAnswering from dannycho1530 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dannycho1530 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dannycho1530` is an English model originally trained by dannycho1530.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dannycho1530_en_5.2.0_3.0_1701024308206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dannycho1530_en_5.2.0_3.0_1701024308206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dannycho1530","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dannycho1530", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dannycho1530| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dannycho1530/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dcerys_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dcerys_en.md new file mode 100644 index 000000000000..739e28987dc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dcerys_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dcerys DistilBertForQuestionAnswering from dcerys +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dcerys +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dcerys` is an English model originally trained by dcerys.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dcerys_en_5.2.0_3.0_1701024432840.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dcerys_en_5.2.0_3.0_1701024432840.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dcerys","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dcerys", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dcerys| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dcerys/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ddxplagueaws_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ddxplagueaws_en.md new file mode 100644 index 000000000000..3c549b2f73c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ddxplagueaws_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ddxplagueaws DistilBertForQuestionAnswering from DDxPlagueAWS +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ddxplagueaws +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ddxplagueaws` is an English model originally trained by DDxPlagueAWS.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ddxplagueaws_en_5.2.0_3.0_1701043094084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ddxplagueaws_en_5.2.0_3.0_1701043094084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ddxplagueaws","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ddxplagueaws", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ddxplagueaws| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DDxPlagueAWS/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_deepakrish_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_deepakrish_en.md new file mode 100644 index 000000000000..074cacce0926 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_deepakrish_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_deepakrish DistilBertForQuestionAnswering from DeepaKrish +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_deepakrish +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_deepakrish` is an English model originally trained by DeepaKrish.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_deepakrish_en_5.2.0_3.0_1701024705522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_deepakrish_en_5.2.0_3.0_1701024705522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_deepakrish","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_deepakrish", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_deepakrish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DeepaKrish/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_desak_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_desak_en.md new file mode 100644 index 000000000000..323f8ff3b4f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_desak_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_desak DistilBertForQuestionAnswering from Desak +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_desak +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_desak` is an English model originally trained by Desak.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_desak_en_5.2.0_3.0_1701031900813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_desak_en_5.2.0_3.0_1701031900813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_desak","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_desak", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_desak| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Desak/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dfountoukidis_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dfountoukidis_en.md new file mode 100644 index 000000000000..9005688658e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dfountoukidis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dfountoukidis DistilBertForQuestionAnswering from dfountoukidis +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dfountoukidis +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dfountoukidis` is an English model originally trained by dfountoukidis.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dfountoukidis_en_5.2.0_3.0_1701042390306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dfountoukidis_en_5.2.0_3.0_1701042390306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dfountoukidis","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dfountoukidis", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dfountoukidis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dfountoukidis/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dieexbr_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dieexbr_en.md new file mode 100644 index 000000000000..85fc89a53e0f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dieexbr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dieexbr DistilBertForQuestionAnswering from dieexbr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dieexbr +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dieexbr` is an English model originally trained by dieexbr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dieexbr_en_5.2.0_3.0_1701036122595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dieexbr_en_5.2.0_3.0_1701036122595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dieexbr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dieexbr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dieexbr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dieexbr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dingzhaohan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dingzhaohan_en.md new file mode 100644 index 000000000000..126f064c8c5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dingzhaohan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dingzhaohan DistilBertForQuestionAnswering from dingzhaohan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dingzhaohan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dingzhaohan` is an English model originally trained by dingzhaohan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dingzhaohan_en_5.2.0_3.0_1701036062552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dingzhaohan_en_5.2.0_3.0_1701036062552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dingzhaohan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dingzhaohan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dingzhaohan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dingzhaohan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_drewski_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_drewski_en.md new file mode 100644 index 000000000000..d7312c50d36a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_drewski_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_drewski DistilBertForQuestionAnswering from drewski +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_drewski +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_drewski` is an English model originally trained by drewski.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_drewski_en_5.2.0_3.0_1701023818396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_drewski_en_5.2.0_3.0_1701023818396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_drewski","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_drewski", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_drewski| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/drewski/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dspg_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dspg_en.md new file mode 100644 index 000000000000..cd47de36171b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dspg_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dspg DistilBertForQuestionAnswering from dspg +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dspg +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dspg` is an English model originally trained by dspg.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dspg_en_5.2.0_3.0_1701032864969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dspg_en_5.2.0_3.0_1701032864969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dspg","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dspg", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dspg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dspg/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_duplets_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_duplets_en.md new file mode 100644 index 000000000000..b591172a83bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_duplets_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_duplets DistilBertForQuestionAnswering from Duplets +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_duplets +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_duplets` is an English model originally trained by Duplets.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_duplets_en_5.2.0_3.0_1701029091519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_duplets_en_5.2.0_3.0_1701029091519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_duplets","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_duplets", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_duplets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Duplets/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dylan1999_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dylan1999_en.md new file mode 100644 index 000000000000..6741f05f80ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_dylan1999_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dylan1999 DistilBertForQuestionAnswering from Dylan1999 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dylan1999 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dylan1999` is an English model originally trained by Dylan1999.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dylan1999_en_5.2.0_3.0_1701025904900.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dylan1999_en_5.2.0_3.0_1701025904900.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dylan1999","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dylan1999", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dylan1999| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Dylan1999/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ecmoy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ecmoy_en.md new file mode 100644 index 000000000000..bc30b82a7cf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ecmoy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ecmoy DistilBertForQuestionAnswering from ecmoy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ecmoy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ecmoy` is an English model originally trained by ecmoy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ecmoy_en_5.2.0_3.0_1701038729589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ecmoy_en_5.2.0_3.0_1701038729589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ecmoy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ecmoy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ecmoy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ecmoy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_egemenkoroglu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_egemenkoroglu_en.md new file mode 100644 index 000000000000..99c8b13c38fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_egemenkoroglu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_egemenkoroglu DistilBertForQuestionAnswering from EgemenKoroglu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_egemenkoroglu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_egemenkoroglu` is an English model originally trained by EgemenKoroglu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_egemenkoroglu_en_5.2.0_3.0_1701023531541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_egemenkoroglu_en_5.2.0_3.0_1701023531541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_egemenkoroglu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_egemenkoroglu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_egemenkoroglu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/EgemenKoroglu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_elasticdotventures_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_elasticdotventures_en.md new file mode 100644 index 000000000000..47ad69304591 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_elasticdotventures_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_elasticdotventures DistilBertForQuestionAnswering from elasticdotventures +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_elasticdotventures +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_elasticdotventures` is an English model originally trained by elasticdotventures.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_elasticdotventures_en_5.2.0_3.0_1701041602136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_elasticdotventures_en_5.2.0_3.0_1701041602136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_elasticdotventures","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_elasticdotventures", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_elasticdotventures| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/elasticdotventures/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_eldadshulman_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_eldadshulman_en.md new file mode 100644 index 000000000000..691cdfe8f258 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_eldadshulman_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_eldadshulman DistilBertForQuestionAnswering from eldadshulman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_eldadshulman +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_eldadshulman` is an English model originally trained by eldadshulman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_eldadshulman_en_5.2.0_3.0_1701033025169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_eldadshulman_en_5.2.0_3.0_1701033025169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_eldadshulman","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_eldadshulman", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_eldadshulman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/eldadshulman/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv_en.md new file mode 100644 index 000000000000..ce9119763712 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv DistilBertForQuestionAnswering from LenaSchmidt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv` is an English model originally trained by LenaSchmidt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv_en_5.2.0_3.0_1701015367761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv_en_5.2.0_3.0_1701015367761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_endpoint_with_impossible_csv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LenaSchmidt/distilbert-base-uncased-finetuned-squad-Endpoint_with_impossible.csv \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ericpeter_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ericpeter_en.md new file mode 100644 index 000000000000..4f4c810e1cce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ericpeter_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ericpeter DistilBertForQuestionAnswering from EricPeter +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ericpeter +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ericpeter` is an English model originally trained by EricPeter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ericpeter_en_5.2.0_3.0_1701040702532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ericpeter_en_5.2.0_3.0_1701040702532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ericpeter","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ericpeter", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ericpeter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/EricPeter/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_esculapeso_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_esculapeso_en.md new file mode 100644 index 000000000000..b4548676cfdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_esculapeso_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_esculapeso DistilBertForQuestionAnswering from esculapeso +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_esculapeso +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_esculapeso` is an English model originally trained by esculapeso.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_esculapeso_en_5.2.0_3.0_1701041104784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_esculapeso_en_5.2.0_3.0_1701041104784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_esculapeso","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_esculapeso", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_esculapeso| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/esculapeso/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_evelyn18_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_evelyn18_en.md new file mode 100644 index 000000000000..acc8234c7d0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_evelyn18_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_evelyn18 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_evelyn18 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_evelyn18` is an English model originally trained by Evelyn18.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_evelyn18_en_5.2.0_3.0_1701029279720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_evelyn18_en_5.2.0_3.0_1701029279720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_evelyn18","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_evelyn18", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_evelyn18| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fabianwillner_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fabianwillner_en.md new file mode 100644 index 000000000000..d78a2844e272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fabianwillner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_fabianwillner DistilBertForQuestionAnswering from FabianWillner +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_fabianwillner +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_fabianwillner` is an English model originally trained by FabianWillner.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fabianwillner_en_5.2.0_3.0_1701019282458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fabianwillner_en_5.2.0_3.0_1701019282458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_fabianwillner","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_fabianwillner", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_fabianwillner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FabianWillner/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fedegallo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fedegallo_en.md new file mode 100644 index 000000000000..b8d5bf75ccc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fedegallo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_fedegallo DistilBertForQuestionAnswering from fedegallo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_fedegallo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_fedegallo` is an English model originally trained by fedegallo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fedegallo_en_5.2.0_3.0_1701017128963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fedegallo_en_5.2.0_3.0_1701017128963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_fedegallo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_fedegallo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_fedegallo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fedegallo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_filial_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_filial_en.md new file mode 100644 index 000000000000..34719670485c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_filial_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_filial DistilBertForQuestionAnswering from Filial +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_filial +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_filial` is an English model originally trained by Filial.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_filial_en_5.2.0_3.0_1701018540422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_filial_en_5.2.0_3.0_1701018540422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_filial","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_filial", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_filial| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Filial/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial_en.md new file mode 100644 index 000000000000..b4a68a39a2c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial DistilBertForQuestionAnswering from stevemobs +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial` is an English model originally trained by stevemobs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial_en_5.2.0_3.0_1701028229972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial_en_5.2.0_3.0_1701028229972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finetuned_squad_adversarial| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/stevemobs/distilbert-base-uncased-finetuned-squad-finetuned-squad_adversarial \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_en.md new file mode 100644 index 000000000000..b929a2d7013e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finetuned_squad DistilBertForQuestionAnswering from flowing-concepts-ai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finetuned_squad` is an English model originally trained by flowing-concepts-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_squad_en_5.2.0_3.0_1701030640508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_squad_en_5.2.0_3.0_1701030640508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/flowing-concepts-ai/distilbert-base-uncased-finetuned-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_thread_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_thread_en.md new file mode 100644 index 000000000000..ef710c9cbb68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_thread_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finetuned_thread DistilBertForQuestionAnswering from ashtrevi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finetuned_thread +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finetuned_thread` is an English model originally trained by ashtrevi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_thread_en_5.2.0_3.0_1701035813193.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_thread_en_5.2.0_3.0_1701035813193.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finetuned_thread","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finetuned_thread", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finetuned_thread| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ashtrevi/distilbert-base-uncased-finetuned-squad-finetuned-thread \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_triviaqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_triviaqa_en.md new file mode 100644 index 000000000000..61a0204ddef8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finetuned_triviaqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finetuned_triviaqa DistilBertForQuestionAnswering from FabianWillner +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finetuned_triviaqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finetuned_triviaqa` is an English model originally trained by FabianWillner.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_triviaqa_en_5.2.0_3.0_1701017123517.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_triviaqa_en_5.2.0_3.0_1701017123517.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finetuned_triviaqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finetuned_triviaqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finetuned_triviaqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FabianWillner/distilbert-base-uncased-finetuned-squad-finetuned-triviaqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finlaymiller_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finlaymiller_en.md new file mode 100644 index 000000000000..c5f74d547f6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_finlaymiller_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finlaymiller DistilBertForQuestionAnswering from finlaymiller +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finlaymiller +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finlaymiller` is an English model originally trained by finlaymiller.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finlaymiller_en_5.2.0_3.0_1701043092423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finlaymiller_en_5.2.0_3.0_1701043092423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finlaymiller","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finlaymiller", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finlaymiller| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/finlaymiller/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flopijut_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flopijut_en.md new file mode 100644 index 000000000000..0b01f39b454c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flopijut_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_flopijut DistilBertForQuestionAnswering from flopijut +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_flopijut +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_flopijut` is an English model originally trained by flopijut.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_flopijut_en_5.2.0_3.0_1701028357535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_flopijut_en_5.2.0_3.0_1701028357535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_flopijut","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_flopijut", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_flopijut| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/flopijut/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flowing_concepts_ai_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flowing_concepts_ai_en.md new file mode 100644 index 000000000000..9c1e5e0a8640 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_flowing_concepts_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_flowing_concepts_ai DistilBertForQuestionAnswering from flowing-concepts-ai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_flowing_concepts_ai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_flowing_concepts_ai` is an English model originally trained by flowing-concepts-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_flowing_concepts_ai_en_5.2.0_3.0_1701028385976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_flowing_concepts_ai_en_5.2.0_3.0_1701028385976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_flowing_concepts_ai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_flowing_concepts_ai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_flowing_concepts_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/flowing-concepts-ai/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_forturne_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_forturne_en.md new file mode 100644 index 000000000000..222cb9e2cefa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_forturne_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_forturne DistilBertForQuestionAnswering from Forturne +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_forturne +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_forturne` is an English model originally trained by Forturne.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_forturne_en_5.2.0_3.0_1701037898530.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_forturne_en_5.2.0_3.0_1701037898530.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_forturne","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_forturne", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_forturne| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Forturne/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_foxasdf_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_foxasdf_en.md new file mode 100644 index 000000000000..946f81697d27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_foxasdf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_foxasdf DistilBertForQuestionAnswering from Foxasdf +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_foxasdf +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_foxasdf` is an English model originally trained by Foxasdf.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_foxasdf_en_5.2.0_3.0_1701023023243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_foxasdf_en_5.2.0_3.0_1701023023243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_foxasdf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_foxasdf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_foxasdf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Foxasdf/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francisco_denilson_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francisco_denilson_en.md new file mode 100644 index 000000000000..7107b760a730 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francisco_denilson_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_francisco_denilson DistilBertForQuestionAnswering from francisco-denilson +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_francisco_denilson +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_francisco_denilson` is an English model originally trained by francisco-denilson.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_francisco_denilson_en_5.2.0_3.0_1701035227972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_francisco_denilson_en_5.2.0_3.0_1701035227972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_francisco_denilson","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_francisco_denilson", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_francisco_denilson| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/francisco-denilson/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francistembo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francistembo_en.md new file mode 100644 index 000000000000..afba347f2f13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_francistembo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_francistembo DistilBertForQuestionAnswering from francistembo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_francistembo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_francistembo` is an English model originally trained by francistembo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_francistembo_en_5.2.0_3.0_1701026386377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_francistembo_en_5.2.0_3.0_1701026386377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_francistembo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_francistembo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_francistembo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/francistembo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_frozen_v1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_frozen_v1_en.md new file mode 100644 index 000000000000..45b8aee69873 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_frozen_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_frozen_v1 DistilBertForQuestionAnswering from ericRosello +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_frozen_v1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_frozen_v1` is an English model originally trained by ericRosello.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_frozen_v1_en_5.2.0_3.0_1701016519904.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_frozen_v1_en_5.2.0_3.0_1701016519904.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_frozen_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_frozen_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_frozen_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ericRosello/distilbert-base-uncased-finetuned-squad-frozen-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ftorres_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ftorres_en.md new file mode 100644 index 000000000000..967f7f76fbfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ftorres_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ftorres DistilBertForQuestionAnswering from ftorres +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ftorres +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ftorres` is an English model originally trained by ftorres.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ftorres_en_5.2.0_3.0_1701028610291.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ftorres_en_5.2.0_3.0_1701028610291.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ftorres","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ftorres", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ftorres| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ftorres/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fuh990202_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fuh990202_en.md new file mode 100644 index 000000000000..2e34934fba3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_fuh990202_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_fuh990202 DistilBertForQuestionAnswering from fuh990202 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_fuh990202 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_fuh990202` is an English model originally trained by fuh990202.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fuh990202_en_5.2.0_3.0_1701022717108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fuh990202_en_5.2.0_3.0_1701022717108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_fuh990202","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_fuh990202", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_fuh990202| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fuh990202/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gameboy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gameboy_en.md new file mode 100644 index 000000000000..e99c82139742 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gameboy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gameboy DistilBertForQuestionAnswering from GameBoy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gameboy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gameboy` is an English model originally trained by GameBoy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gameboy_en_5.2.0_3.0_1701029069131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gameboy_en_5.2.0_3.0_1701029069131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gameboy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gameboy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gameboy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/GameBoy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gaya_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gaya_en.md new file mode 100644 index 000000000000..f3c0b1291c4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gaya_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gaya DistilBertForQuestionAnswering from gaya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gaya +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gaya` is an English model originally trained by gaya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gaya_en_5.2.0_3.0_1701018399849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gaya_en_5.2.0_3.0_1701018399849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gaya","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gaya", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gaya| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gaya/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_georgio_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_georgio_en.md new file mode 100644 index 000000000000..1ebbded7ecba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_georgio_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_georgio DistilBertForQuestionAnswering from georgio +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_georgio +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_georgio` is an English model originally trained by georgio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_georgio_en_5.2.0_3.0_1701017990008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_georgio_en_5.2.0_3.0_1701017990008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_georgio","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_georgio", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_georgio| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/georgio/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ghostzen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ghostzen_en.md new file mode 100644 index 000000000000..52305f541859 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ghostzen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ghostzen DistilBertForQuestionAnswering from GhostZen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ghostzen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ghostzen` is an English model originally trained by GhostZen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ghostzen_en_5.2.0_3.0_1701028218640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ghostzen_en_5.2.0_3.0_1701028218640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ghostzen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ghostzen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ghostzen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/GhostZen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gkss_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gkss_en.md new file mode 100644 index 000000000000..61a9ed70b9d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gkss_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gkss DistilBertForQuestionAnswering from gkss +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gkss +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gkss` is an English model originally trained by gkss.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gkss_en_5.2.0_3.0_1701018402400.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gkss_en_5.2.0_3.0_1701018402400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gkss","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gkss", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gkss| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gkss/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_golightly_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_golightly_en.md new file mode 100644 index 000000000000..34a213ed0769 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_golightly_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_golightly DistilBertForQuestionAnswering from golightly +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_golightly +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_golightly` is an English model originally trained by golightly.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_golightly_en_5.2.0_3.0_1701032838371.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_golightly_en_5.2.0_3.0_1701032838371.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_golightly","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_golightly", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_golightly| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/golightly/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guapeton_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guapeton_en.md new file mode 100644 index 000000000000..1d922acca9dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guapeton_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_guapeton DistilBertForQuestionAnswering from Guapeton +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_guapeton +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_guapeton` is an English model originally trained by Guapeton.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_guapeton_en_5.2.0_3.0_1701039640616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_guapeton_en_5.2.0_3.0_1701039640616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_guapeton","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_guapeton", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_guapeton| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Guapeton/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gudjonk93_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gudjonk93_en.md new file mode 100644 index 000000000000..ce0394cc6103 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gudjonk93_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gudjonk93 DistilBertForQuestionAnswering from gudjonk93 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gudjonk93 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gudjonk93` is an English model originally trained by gudjonk93.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gudjonk93_en_5.2.0_3.0_1701020071890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gudjonk93_en_5.2.0_3.0_1701020071890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gudjonk93","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gudjonk93", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gudjonk93| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gudjonk93/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gulteng_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gulteng_en.md new file mode 100644 index 000000000000..74d8ac2f300a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gulteng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gulteng DistilBertForQuestionAnswering from gulteng +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gulteng +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gulteng` is an English model originally trained by gulteng.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gulteng_en_5.2.0_3.0_1701032200467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gulteng_en_5.2.0_3.0_1701032200467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gulteng","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gulteng", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gulteng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gulteng/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guo_zikun_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guo_zikun_en.md new file mode 100644 index 000000000000..22404b9bfd2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_guo_zikun_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_guo_zikun DistilBertForQuestionAnswering from Guo-Zikun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_guo_zikun +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_guo_zikun` is an English model originally trained by Guo-Zikun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_guo_zikun_en_5.2.0_3.0_1701029209757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_guo_zikun_en_5.2.0_3.0_1701029209757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_guo_zikun","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_guo_zikun", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_guo_zikun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Guo-Zikun/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gyubeen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gyubeen_en.md new file mode 100644 index 000000000000..0de2f7e5f788 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gyubeen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gyubeen DistilBertForQuestionAnswering from GyuBeen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gyubeen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gyubeen` is an English model originally trained by GyuBeen. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gyubeen_en_5.2.0_3.0_1701031893708.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gyubeen_en_5.2.0_3.0_1701031893708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gyubeen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gyubeen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gyubeen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/GyuBeen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gzencha_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gzencha_en.md new file mode 100644 index 000000000000..2a20dd72b70a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_gzencha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_gzencha DistilBertForQuestionAnswering from gzencha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_gzencha +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_gzencha` is an English model originally trained by gzencha. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gzencha_en_5.2.0_3.0_1701039977327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_gzencha_en_5.2.0_3.0_1701039977327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_gzencha","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_gzencha", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_gzencha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gzencha/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_habib1030_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_habib1030_en.md new file mode 100644 index 000000000000..91b19e8c2b7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_habib1030_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_habib1030 DistilBertForQuestionAnswering from habib1030 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_habib1030 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_habib1030` is an English model originally trained by habib1030. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_habib1030_en_5.2.0_3.0_1701031221253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_habib1030_en_5.2.0_3.0_1701031221253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_habib1030","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_habib1030", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_habib1030| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/habib1030/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haddadalwi_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haddadalwi_en.md new file mode 100644 index 000000000000..c55e140ac925 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haddadalwi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_haddadalwi DistilBertForQuestionAnswering from haddadalwi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_haddadalwi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_haddadalwi` is an English model originally trained by haddadalwi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haddadalwi_en_5.2.0_3.0_1701031463552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haddadalwi_en_5.2.0_3.0_1701031463552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_haddadalwi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_haddadalwi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_haddadalwi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/haddadalwi/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hadjer_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hadjer_en.md new file mode 100644 index 000000000000..1b29ea8a264b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hadjer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hadjer DistilBertForQuestionAnswering from Hadjer +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hadjer +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hadjer` is an English model originally trained by Hadjer. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hadjer_en_5.2.0_3.0_1701025770544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hadjer_en_5.2.0_3.0_1701025770544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hadjer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hadjer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hadjer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Hadjer/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hamdan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hamdan_en.md new file mode 100644 index 000000000000..3984dcb67c83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hamdan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hamdan DistilBertForQuestionAnswering from Hamdan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hamdan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hamdan` is an English model originally trained by Hamdan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hamdan_en_5.2.0_3.0_1701035230691.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hamdan_en_5.2.0_3.0_1701035230691.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hamdan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hamdan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hamdan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Hamdan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hangerrits_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hangerrits_en.md new file mode 100644 index 000000000000..0731cde3a275 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hangerrits_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hangerrits DistilBertForQuestionAnswering from hangerrits +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hangerrits +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hangerrits` is an English model originally trained by hangerrits. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hangerrits_en_5.2.0_3.0_1701038717259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hangerrits_en_5.2.0_3.0_1701038717259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hangerrits","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hangerrits", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hangerrits| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hangerrits/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harling_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harling_en.md new file mode 100644 index 000000000000..5eb49c5b1a07 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harling_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_harling DistilBertForQuestionAnswering from harling +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_harling +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_harling` is an English model originally trained by harling. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harling_en_5.2.0_3.0_1701024679001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harling_en_5.2.0_3.0_1701024679001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_harling","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_harling", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_harling| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harling/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harshit_070_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harshit_070_en.md new file mode 100644 index 000000000000..156f1d302260 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_harshit_070_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_harshit_070 DistilBertForQuestionAnswering from harshit-070 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_harshit_070 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_harshit_070` is an English model originally trained by harshit-070. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harshit_070_en_5.2.0_3.0_1701019771688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harshit_070_en_5.2.0_3.0_1701019771688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_harshit_070","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_harshit_070", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_harshit_070| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harshit-070/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haudren_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haudren_en.md new file mode 100644 index 000000000000..af0a47857909 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haudren_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_haudren DistilBertForQuestionAnswering from HaudreN +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_haudren +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_haudren` is an English model originally trained by HaudreN. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haudren_en_5.2.0_3.0_1701018249424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haudren_en_5.2.0_3.0_1701018249424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_haudren","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_haudren", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_haudren| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HaudreN/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haunt224_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haunt224_en.md new file mode 100644 index 000000000000..500e29b598c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_haunt224_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_haunt224 DistilBertForQuestionAnswering from haunt224 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_haunt224 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_haunt224` is an English model originally trained by haunt224.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haunt224_en_5.2.0_3.0_1701030519617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_haunt224_en_5.2.0_3.0_1701030519617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_haunt224","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_haunt224", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_haunt224| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/haunt224/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hedronstone_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hedronstone_en.md new file mode 100644 index 000000000000..48875c8eeec4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hedronstone_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hedronstone DistilBertForQuestionAnswering from hedronstone +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hedronstone +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hedronstone` is an English model originally trained by hedronstone.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hedronstone_en_5.2.0_3.0_1701028782684.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hedronstone_en_5.2.0_3.0_1701028782684.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hedronstone","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hedronstone", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hedronstone| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hedronstone/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_herrydaniel_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_herrydaniel_en.md new file mode 100644 index 000000000000..8f95d4095e85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_herrydaniel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_herrydaniel DistilBertForQuestionAnswering from Herrydaniel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_herrydaniel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_herrydaniel` is an English model originally trained by Herrydaniel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_herrydaniel_en_5.2.0_3.0_1701023675764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_herrydaniel_en_5.2.0_3.0_1701023675764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_herrydaniel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_herrydaniel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_herrydaniel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Herrydaniel/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hikarubear_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hikarubear_en.md new file mode 100644 index 000000000000..305d6b7125bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hikarubear_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hikarubear DistilBertForQuestionAnswering from HikaruBear +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hikarubear +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hikarubear` is an English model originally trained by HikaruBear.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hikarubear_en_5.2.0_3.0_1701027595302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hikarubear_en_5.2.0_3.0_1701027595302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hikarubear","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hikarubear", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hikarubear| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HikaruBear/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hjds0923_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hjds0923_en.md new file mode 100644 index 000000000000..c42e2af15ea7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hjds0923_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hjds0923 DistilBertForQuestionAnswering from hjds0923 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hjds0923 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hjds0923` is an English model originally trained by hjds0923.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hjds0923_en_5.2.0_3.0_1701019748292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hjds0923_en_5.2.0_3.0_1701019748292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hjds0923","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hjds0923", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hjds0923| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hjds0923/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hogger32_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hogger32_en.md new file mode 100644 index 000000000000..09785bfb1b3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hogger32_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hogger32 DistilBertForQuestionAnswering from hogger32 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hogger32 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hogger32` is an English model originally trained by hogger32.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hogger32_en_5.2.0_3.0_1701020687648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hogger32_en_5.2.0_3.0_1701020687648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hogger32","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hogger32", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hogger32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hogger32/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hongyangli_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hongyangli_en.md new file mode 100644 index 000000000000..4dd3dbe25bc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hongyangli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hongyangli DistilBertForQuestionAnswering from HongyangLi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hongyangli +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hongyangli` is an English model originally trained by HongyangLi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hongyangli_en_5.2.0_3.0_1701022454099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hongyangli_en_5.2.0_3.0_1701022454099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hongyangli","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hongyangli", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hongyangli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HongyangLi/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_htermotto_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_htermotto_en.md new file mode 100644 index 000000000000..0558a7fe6ab6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_htermotto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_htermotto DistilBertForQuestionAnswering from htermotto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_htermotto +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_htermotto` is an English model originally trained by htermotto.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_htermotto_en_5.2.0_3.0_1701026936119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_htermotto_en_5.2.0_3.0_1701026936119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_htermotto","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_htermotto", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_htermotto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/htermotto/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huangtuoyue_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huangtuoyue_en.md new file mode 100644 index 000000000000..1aef816f9912 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huangtuoyue_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_huangtuoyue DistilBertForQuestionAnswering from huangtuoyue +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_huangtuoyue +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_huangtuoyue` is an English model originally trained by huangtuoyue.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_huangtuoyue_en_5.2.0_3.0_1701023964251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_huangtuoyue_en_5.2.0_3.0_1701023964251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_huangtuoyue","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_huangtuoyue", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_huangtuoyue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/huangtuoyue/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huggingliang_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huggingliang_en.md new file mode 100644 index 000000000000..ef2527a8cc48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_huggingliang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_huggingliang DistilBertForQuestionAnswering from huggingliang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_huggingliang +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_huggingliang` is an English model originally trained by huggingliang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_huggingliang_en_5.2.0_3.0_1701017144566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_huggingliang_en_5.2.0_3.0_1701017144566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_huggingliang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_huggingliang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_huggingliang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/huggingliang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hugovoxx_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hugovoxx_en.md new file mode 100644 index 000000000000..803167d00097 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hugovoxx_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hugovoxx DistilBertForQuestionAnswering from HugoVoxx +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hugovoxx +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hugovoxx` is an English model originally trained by HugoVoxx.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hugovoxx_en_5.2.0_3.0_1701039807838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hugovoxx_en_5.2.0_3.0_1701039807838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hugovoxx","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hugovoxx", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hugovoxx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HugoVoxx/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hyan97_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hyan97_en.md new file mode 100644 index 000000000000..5fe8235b7c8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_hyan97_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hyan97 DistilBertForQuestionAnswering from hyan97 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hyan97 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hyan97` is an English model originally trained by hyan97.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hyan97_en_5.2.0_3.0_1701021809267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hyan97_en_5.2.0_3.0_1701021809267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hyan97","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hyan97", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hyan97| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hyan97/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_igorpestretsov_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_igorpestretsov_en.md new file mode 100644 index 000000000000..d14127bb58d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_igorpestretsov_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_igorpestretsov DistilBertForQuestionAnswering from IgorPestretsov +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_igorpestretsov +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_igorpestretsov` is an English model originally trained by IgorPestretsov.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_igorpestretsov_en_5.2.0_3.0_1701035658561.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_igorpestretsov_en_5.2.0_3.0_1701035658561.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_igorpestretsov","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_igorpestretsov", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_igorpestretsov| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/IgorPestretsov/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_implementacion_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_implementacion_en.md new file mode 100644 index 000000000000..0160cb8e2bb8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_implementacion_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_implementacion DistilBertForQuestionAnswering from Implementacion +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_implementacion +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_implementacion` is an English model originally trained by Implementacion.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_implementacion_en_5.2.0_3.0_1701039326012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_implementacion_en_5.2.0_3.0_1701039326012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_implementacion","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_implementacion", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_implementacion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Implementacion/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ioanfr_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ioanfr_en.md new file mode 100644 index 000000000000..b33cf15295a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ioanfr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ioanfr DistilBertForQuestionAnswering from ioanfr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ioanfr +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ioanfr` is an English model originally trained by ioanfr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ioanfr_en_5.2.0_3.0_1701042548676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ioanfr_en_5.2.0_3.0_1701042548676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ioanfr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ioanfr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ioanfr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ioanfr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ivanhf_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ivanhf_en.md new file mode 100644 index 000000000000..96177f24516f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ivanhf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ivanhf DistilBertForQuestionAnswering from IvanHF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ivanhf +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ivanhf` is an English model originally trained by IvanHF.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ivanhf_en_5.2.0_3.0_1701026778846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ivanhf_en_5.2.0_3.0_1701026778846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ivanhf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ivanhf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ivanhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/IvanHF/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jajos_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jajos_en.md new file mode 100644 index 000000000000..851bd5876101 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jajos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jajos DistilBertForQuestionAnswering from JaJoS +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jajos +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jajos` is an English model originally trained by JaJoS.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jajos_en_5.2.0_3.0_1701031900837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jajos_en_5.2.0_3.0_1701031900837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jajos","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jajos", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jajos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JaJoS/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jessica_ecosia_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jessica_ecosia_en.md new file mode 100644 index 000000000000..f9b851c46523 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jessica_ecosia_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jessica_ecosia DistilBertForQuestionAnswering from jessica-ecosia +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jessica_ecosia +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jessica_ecosia` is an English model originally trained by jessica-ecosia.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jessica_ecosia_en_5.2.0_3.0_1701019421488.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jessica_ecosia_en_5.2.0_3.0_1701019421488.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jessica_ecosia","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jessica_ecosia", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jessica_ecosia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jessica-ecosia/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jiading_zhu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jiading_zhu_en.md new file mode 100644 index 000000000000..93e50fe70c72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jiading_zhu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jiading_zhu DistilBertForQuestionAnswering from jiading-zhu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jiading_zhu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jiading_zhu` is an English model originally trained by jiading-zhu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jiading_zhu_en_5.2.0_3.0_1701038112345.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jiading_zhu_en_5.2.0_3.0_1701038112345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jiading_zhu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jiading_zhu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jiading_zhu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jiading-zhu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_joaking1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_joaking1_en.md new file mode 100644 index 000000000000..7ccc54c75bfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_joaking1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_joaking1 DistilBertForQuestionAnswering from Joaking1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_joaking1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_joaking1` is an English model originally trained by Joaking1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_joaking1_en_5.2.0_3.0_1701018996880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_joaking1_en_5.2.0_3.0_1701018996880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
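+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_joaking1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_joaking1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +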
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_joaking1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Joaking1/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jpabbuehl_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jpabbuehl_en.md new file mode 100644 index 000000000000..67dbd1686f3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jpabbuehl_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jpabbuehl DistilBertForQuestionAnswering from jpabbuehl +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jpabbuehl +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jpabbuehl` is an English model originally trained by jpabbuehl.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jpabbuehl_en_5.2.0_3.0_1701026229905.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jpabbuehl_en_5.2.0_3.0_1701026229905.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
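+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jpabbuehl","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jpabbuehl", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +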
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jpabbuehl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jpabbuehl/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jrisch_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jrisch_en.md new file mode 100644 index 000000000000..99a77f9c1b14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_jrisch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jrisch DistilBertForQuestionAnswering from jrisch +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jrisch +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jrisch` is an English model originally trained by jrisch.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jrisch_en_5.2.0_3.0_1701028470105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jrisch_en_5.2.0_3.0_1701028470105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
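+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jrisch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jrisch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +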
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jrisch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jrisch/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juanmarmol_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juanmarmol_en.md new file mode 100644 index 000000000000..ecc89cd7e50e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juanmarmol_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_juanmarmol DistilBertForQuestionAnswering from juanmarmol +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_juanmarmol +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_juanmarmol` is an English model originally trained by juanmarmol.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_juanmarmol_en_5.2.0_3.0_1701021371224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_juanmarmol_en_5.2.0_3.0_1701021371224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
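+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_juanmarmol","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_juanmarmol", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +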
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_juanmarmol| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/juanmarmol/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juliusco_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juliusco_en.md new file mode 100644 index 000000000000..4e6e2b97610d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_juliusco_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_juliusco DistilBertForQuestionAnswering from juliusco +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_juliusco +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_juliusco` is an English model originally trained by juliusco.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_juliusco_en_5.2.0_3.0_1701029073470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_juliusco_en_5.2.0_3.0_1701029073470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
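+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_juliusco","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_juliusco", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +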
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_juliusco| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/juliusco/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_justalittlecrew_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_justalittlecrew_en.md new file mode 100644 index 000000000000..5a38aba090e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_justalittlecrew_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_justalittlecrew DistilBertForQuestionAnswering from justalittlecrew +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_justalittlecrew +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_justalittlecrew` is an English model originally trained by justalittlecrew.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_justalittlecrew_en_5.2.0_3.0_1701018261652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_justalittlecrew_en_5.2.0_3.0_1701018261652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
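+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_justalittlecrew","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_justalittlecrew", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +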
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_justalittlecrew| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/justalittlecrew/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kahkasha_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kahkasha_en.md new file mode 100644 index 000000000000..a51e97674d15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kahkasha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kahkasha DistilBertForQuestionAnswering from kahkasha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kahkasha +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kahkasha` is an English model originally trained by kahkasha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kahkasha_en_5.2.0_3.0_1701033130875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kahkasha_en_5.2.0_3.0_1701033130875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
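+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kahkasha","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kahkasha", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +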
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kahkasha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kahkasha/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaipo_chang_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaipo_chang_en.md new file mode 100644 index 000000000000..4461c5df2043 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaipo_chang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kaipo_chang DistilBertForQuestionAnswering from kaipo-chang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kaipo_chang +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kaipo_chang` is an English model originally trained by kaipo-chang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaipo_chang_en_5.2.0_3.0_1701029779552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaipo_chang_en_5.2.0_3.0_1701029779552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
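+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kaipo_chang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kaipo_chang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +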
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kaipo_chang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kaipo-chang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaku0o0_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaku0o0_en.md new file mode 100644 index 000000000000..1c2b0ceccf71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaku0o0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kaku0o0 DistilBertForQuestionAnswering from Kaku0o0 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kaku0o0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kaku0o0` is an English model originally trained by Kaku0o0.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaku0o0_en_5.2.0_3.0_1701023159417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaku0o0_en_5.2.0_3.0_1701023159417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
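+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kaku0o0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kaku0o0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +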
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kaku0o0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Kaku0o0/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kamioon_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kamioon_en.md new file mode 100644 index 000000000000..079d236f5fed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kamioon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kamioon DistilBertForQuestionAnswering from kamioon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kamioon +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kamioon` is an English model originally trained by kamioon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kamioon_en_5.2.0_3.0_1701035513576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kamioon_en_5.2.0_3.0_1701035513576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
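+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Illustrative input; assumes an active SparkSession named `spark` +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kamioon","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ + +// Illustrative input; assumes an active SparkSession named `spark` +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kamioon", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +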
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kamioon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kamioon/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaouther_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaouther_en.md new file mode 100644 index 000000000000..3bdf55d407a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kaouther_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kaouther DistilBertForQuestionAnswering from kaouther +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kaouther +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kaouther` is an English model originally trained by kaouther.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaouther_en_5.2.0_3.0_1701020168231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kaouther_en_5.2.0_3.0_1701020168231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kaouther","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kaouther", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kaouther| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kaouther/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karthikeya_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karthikeya_en.md new file mode 100644 index 000000000000..fa73a741b710 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karthikeya_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_karthikeya DistilBertForQuestionAnswering from Karthikeya +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_karthikeya +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_karthikeya` is an English model originally trained by Karthikeya.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_karthikeya_en_5.2.0_3.0_1701020544682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_karthikeya_en_5.2.0_3.0_1701020544682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_karthikeya","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_karthikeya", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_karthikeya| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Karthikeya/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karukapur_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karukapur_en.md new file mode 100644 index 000000000000..bf0e00dcb233 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_karukapur_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_karukapur DistilBertForQuestionAnswering from karukapur +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_karukapur +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_karukapur` is an English model originally trained by karukapur.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_karukapur_en_5.2.0_3.0_1701028669230.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_karukapur_en_5.2.0_3.0_1701028669230.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_karukapur","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_karukapur", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_karukapur| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/karukapur/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kd02_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kd02_en.md new file mode 100644 index 000000000000..88437ec4a677 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kd02_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kd02 DistilBertForQuestionAnswering from KD02 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kd02 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kd02` is an English model originally trained by KD02.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kd02_en_5.2.0_3.0_1701019545725.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kd02_en_5.2.0_3.0_1701019545725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kd02","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kd02", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kd02| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/KD02/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kdot_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kdot_en.md new file mode 100644 index 000000000000..d7ad52f01ecb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kdot_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kdot DistilBertForQuestionAnswering from kdot +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kdot +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kdot` is an English model originally trained by kdot.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kdot_en_5.2.0_3.0_1701031590302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kdot_en_5.2.0_3.0_1701031590302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kdot","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kdot", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kdot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kdot/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kenlevine_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kenlevine_en.md new file mode 100644 index 000000000000..262df75446a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kenlevine_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kenlevine DistilBertForQuestionAnswering from kenlevine +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kenlevine +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kenlevine` is an English model originally trained by kenlevine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kenlevine_en_5.2.0_3.0_1701019288194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kenlevine_en_5.2.0_3.0_1701019288194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kenlevine","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kenlevine", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kenlevine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kenlevine/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kevin123_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kevin123_en.md new file mode 100644 index 000000000000..7617d40943b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kevin123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kevin123 DistilBertForQuestionAnswering from Kevin123 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kevin123 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kevin123` is an English model originally trained by Kevin123.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kevin123_en_5.2.0_3.0_1701023813464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kevin123_en_5.2.0_3.0_1701023813464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kevin123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kevin123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kevin123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Kevin123/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_khoadan9_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_khoadan9_en.md new file mode 100644 index 000000000000..b9bff3b069b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_khoadan9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_khoadan9 DistilBertForQuestionAnswering from KhoaDan9 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_khoadan9 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_khoadan9` is an English model originally trained by KhoaDan9.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_khoadan9_en_5.2.0_3.0_1701031575075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_khoadan9_en_5.2.0_3.0_1701031575075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_khoadan9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_khoadan9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_khoadan9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/KhoaDan9/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiana_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiana_en.md new file mode 100644 index 000000000000..d2b2b257dee0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiana_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kiana DistilBertForQuestionAnswering from kiana +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kiana +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kiana` is an English model originally trained by kiana.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kiana_en_5.2.0_3.0_1701031427061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kiana_en_5.2.0_3.0_1701031427061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kiana","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kiana", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kiana| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kiana/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiu020_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiu020_en.md new file mode 100644 index 000000000000..56134f5a5bc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kiu020_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kiu020 DistilBertForQuestionAnswering from kiu020 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kiu020 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kiu020` is an English model originally trained by kiu020.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kiu020_en_5.2.0_3.0_1701020092168.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kiu020_en_5.2.0_3.0_1701020092168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" string columns +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kiu020","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with "question" and "context" string columns +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kiu020", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kiu020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kiu020/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kizunasunhy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kizunasunhy_en.md new file mode 100644 index 000000000000..f1df962979ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kizunasunhy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kizunasunhy DistilBertForQuestionAnswering from kizunasunhy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kizunasunhy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kizunasunhy` is an English model originally trained by kizunasunhy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kizunasunhy_en_5.2.0_3.0_1701034146699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kizunasunhy_en_5.2.0_3.0_1701034146699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kizunasunhy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kizunasunhy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kizunasunhy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kizunasunhy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kj141_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kj141_en.md new file mode 100644 index 000000000000..aea8dab1b3c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kj141_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kj141 DistilBertForQuestionAnswering from kj141 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kj141 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kj141` is an English model originally trained by kj141.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kj141_en_5.2.0_3.0_1701016845474.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kj141_en_5.2.0_3.0_1701016845474.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kj141","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kj141", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kj141| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kj141/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kkhyun_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kkhyun_en.md new file mode 100644 index 000000000000..6c370f9e8543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kkhyun_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kkhyun DistilBertForQuestionAnswering from KKHyun +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kkhyun +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kkhyun` is an English model originally trained by KKHyun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kkhyun_en_5.2.0_3.0_1701033987483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kkhyun_en_5.2.0_3.0_1701033987483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kkhyun","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kkhyun", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kkhyun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/KKHyun/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_koflynn_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_koflynn_en.md new file mode 100644 index 000000000000..db418c44ce6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_koflynn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_koflynn DistilBertForQuestionAnswering from koflynn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_koflynn +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_koflynn` is an English model originally trained by koflynn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_koflynn_en_5.2.0_3.0_1701036129042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_koflynn_en_5.2.0_3.0_1701036129042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_koflynn","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_koflynn", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_koflynn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/koflynn/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kopankom_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kopankom_en.md new file mode 100644 index 000000000000..b083ff6cbd3e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kopankom_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kopankom DistilBertForQuestionAnswering from kopankom +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kopankom +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kopankom` is an English model originally trained by kopankom.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kopankom_en_5.2.0_3.0_1701041359329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kopankom_en_5.2.0_3.0_1701041359329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kopankom","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kopankom", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kopankom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kopankom/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kyungsukim_ai_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kyungsukim_ai_en.md new file mode 100644 index 000000000000..ba6fb0465e69 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_kyungsukim_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_kyungsukim_ai DistilBertForQuestionAnswering from kyungsukim-ai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_kyungsukim_ai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_kyungsukim_ai` is an English model originally trained by kyungsukim-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kyungsukim_ai_en_5.2.0_3.0_1701027887849.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_kyungsukim_ai_en_5.2.0_3.0_1701027887849.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_kyungsukim_ai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_kyungsukim_ai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_kyungsukim_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kyungsukim-ai/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_laampt_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_laampt_en.md new file mode 100644 index 000000000000..afecfc4f97d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_laampt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_laampt DistilBertForQuestionAnswering from laampt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_laampt +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_laampt` is an English model originally trained by laampt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_laampt_en_5.2.0_3.0_1701016738249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_laampt_en_5.2.0_3.0_1701016738249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_laampt","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_laampt", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_laampt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/laampt/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lagorio_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lagorio_en.md new file mode 100644 index 000000000000..3efe8603ab0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lagorio_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lagorio DistilBertForQuestionAnswering from lagorio +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lagorio +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lagorio` is an English model originally trained by lagorio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lagorio_en_5.2.0_3.0_1701027611909.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lagorio_en_5.2.0_3.0_1701027611909.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lagorio","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lagorio", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lagorio| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lagorio/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lahen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lahen_en.md new file mode 100644 index 000000000000..31ef99d19592 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lahen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lahen DistilBertForQuestionAnswering from Lahen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lahen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lahen` is an English model originally trained by Lahen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lahen_en_5.2.0_3.0_1701026752313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lahen_en_5.2.0_3.0_1701026752313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lahen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lahen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lahen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lahen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lalitrajput_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lalitrajput_en.md new file mode 100644 index 000000000000..ce4833927180 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lalitrajput_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lalitrajput DistilBertForQuestionAnswering from lalitrajput +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lalitrajput +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lalitrajput` is an English model originally trained by lalitrajput.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lalitrajput_en_5.2.0_3.0_1701034355532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lalitrajput_en_5.2.0_3.0_1701034355532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lalitrajput","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lalitrajput", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lalitrajput| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lalitrajput/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lekazuha_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lekazuha_en.md new file mode 100644 index 000000000000..31acc2d258e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lekazuha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lekazuha DistilBertForQuestionAnswering from LeKazuha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lekazuha +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lekazuha` is an English model originally trained by LeKazuha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lekazuha_en_5.2.0_3.0_1701025427576.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lekazuha_en_5.2.0_3.0_1701025427576.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lekazuha","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lekazuha", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lekazuha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LeKazuha/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lenaschmidt_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lenaschmidt_en.md new file mode 100644 index 000000000000..fe67bd48d2eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lenaschmidt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lenaschmidt DistilBertForQuestionAnswering from LenaSchmidt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lenaschmidt +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lenaschmidt` is an English model originally trained by LenaSchmidt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lenaschmidt_en_5.2.0_3.0_1701017542938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lenaschmidt_en_5.2.0_3.0_1701017542938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lenaschmidt","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lenaschmidt", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lenaschmidt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LenaSchmidt/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lewince_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lewince_en.md new file mode 100644 index 000000000000..9ce4fb381430 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lewince_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lewince DistilBertForQuestionAnswering from LeWince +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lewince +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lewince` is an English model originally trained by LeWince.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lewince_en_5.2.0_3.0_1701023006447.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lewince_en_5.2.0_3.0_1701023006447.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lewince","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lewince", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lewince| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LeWince/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lingchensanwen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lingchensanwen_en.md new file mode 100644 index 000000000000..0de2ff1912e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lingchensanwen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lingchensanwen DistilBertForQuestionAnswering from lingchensanwen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lingchensanwen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lingchensanwen` is an English model originally trained by lingchensanwen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lingchensanwen_en_5.2.0_3.0_1701022029414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lingchensanwen_en_5.2.0_3.0_1701022029414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lingchensanwen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lingchensanwen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lingchensanwen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lingchensanwen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_liuhaor4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_liuhaor4_en.md new file mode 100644 index 000000000000..72710b6381aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_liuhaor4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_liuhaor4 DistilBertForQuestionAnswering from liuhaor4 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_liuhaor4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_liuhaor4` is an English model originally trained by liuhaor4.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_liuhaor4_en_5.2.0_3.0_1701027764121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_liuhaor4_en_5.2.0_3.0_1701027764121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_liuhaor4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_liuhaor4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_liuhaor4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/liuhaor4/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_livzandau_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_livzandau_en.md new file mode 100644 index 000000000000..0d75fc7c2043 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_livzandau_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_livzandau DistilBertForQuestionAnswering from livzandau +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_livzandau +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_livzandau` is an English model originally trained by livzandau.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_livzandau_en_5.2.0_3.0_1701022869558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_livzandau_en_5.2.0_3.0_1701022869558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_livzandau","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_livzandau", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_livzandau| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/livzandau/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmassai_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmassai_en.md new file mode 100644 index 000000000000..5e90ff36feaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmassai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lmassai DistilBertForQuestionAnswering from lmassai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lmassai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lmassai` is an English model originally trained by lmassai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lmassai_en_5.2.0_3.0_1701019431409.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lmassai_en_5.2.0_3.0_1701019431409.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lmassai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lmassai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lmassai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lmassai/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmbsoft_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmbsoft_en.md new file mode 100644 index 000000000000..3e7959e05e7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lmbsoft_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lmbsoft DistilBertForQuestionAnswering from lmbsoft +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lmbsoft +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lmbsoft` is an English model originally trained by lmbsoft.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lmbsoft_en_5.2.0_3.0_1701035104465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lmbsoft_en_5.2.0_3.0_1701035104465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lmbsoft","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lmbsoft", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lmbsoft| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lmbsoft/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_logisto_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_logisto_en.md new file mode 100644 index 000000000000..4104179276a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_logisto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_logisto DistilBertForQuestionAnswering from logisto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_logisto +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_logisto` is an English model originally trained by logisto.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_logisto_en_5.2.0_3.0_1701031575746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_logisto_en_5.2.0_3.0_1701031575746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_logisto","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_logisto", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_logisto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/logisto/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lolaibrin_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lolaibrin_en.md new file mode 100644 index 000000000000..dbda13ccfd6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lolaibrin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lolaibrin DistilBertForQuestionAnswering from Lolaibrin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lolaibrin +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lolaibrin` is an English model originally trained by Lolaibrin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lolaibrin_en_5.2.0_3.0_1701026988239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lolaibrin_en_5.2.0_3.0_1701026988239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lolaibrin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lolaibrin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lolaibrin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Lolaibrin/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lorenzkuhn_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lorenzkuhn_en.md new file mode 100644 index 000000000000..e9f3e9b4a77b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lorenzkuhn_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lorenzkuhn DistilBertForQuestionAnswering from lorenzkuhn +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lorenzkuhn +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lorenzkuhn` is an English model originally trained by lorenzkuhn.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lorenzkuhn_en_5.2.0_3.0_1701026585224.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lorenzkuhn_en_5.2.0_3.0_1701026585224.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lorenzkuhn","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lorenzkuhn", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lorenzkuhn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lorenzkuhn/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lqdisme_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lqdisme_en.md new file mode 100644 index 000000000000..9bb65a8540cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lqdisme_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lqdisme DistilBertForQuestionAnswering from lqdisme +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lqdisme +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lqdisme` is an English model originally trained by lqdisme.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lqdisme_en_5.2.0_3.0_1701032021872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lqdisme_en_5.2.0_3.0_1701032021872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lqdisme","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lqdisme", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lqdisme| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lqdisme/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lucasresck_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lucasresck_en.md new file mode 100644 index 000000000000..0eb33ab5d2ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_lucasresck_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_lucasresck DistilBertForQuestionAnswering from lucasresck +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_lucasresck +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_lucasresck` is an English model originally trained by lucasresck.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lucasresck_en_5.2.0_3.0_1701025170785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_lucasresck_en_5.2.0_3.0_1701025170785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_lucasresck","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_lucasresck", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_lucasresck| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lucasresck/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luffyt_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luffyt_en.md new file mode 100644 index 000000000000..b2d8cd4c9263 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luffyt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_luffyt DistilBertForQuestionAnswering from Luffyt +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_luffyt +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_luffyt` is an English model originally trained by Luffyt.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_luffyt_en_5.2.0_3.0_1701021272602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_luffyt_en_5.2.0_3.0_1701021272602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_luffyt","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_luffyt", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_luffyt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Luffyt/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luischir_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luischir_en.md new file mode 100644 index 000000000000..77fc158e55d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_luischir_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_luischir DistilBertForQuestionAnswering from luischir +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_luischir +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_luischir` is an English model originally trained by luischir.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_luischir_en_5.2.0_3.0_1701027611898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_luischir_en_5.2.0_3.0_1701027611898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_luischir","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_luischir", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_luischir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/luischir/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_m4ycon_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_m4ycon_en.md new file mode 100644 index 000000000000..5dec9e9c9ef6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_m4ycon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_m4ycon DistilBertForQuestionAnswering from M4ycon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_m4ycon +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_m4ycon` is an English model originally trained by M4ycon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_m4ycon_en_5.2.0_3.0_1701032367841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_m4ycon_en_5.2.0_3.0_1701032367841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_m4ycon","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_m4ycon", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_m4ycon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/M4ycon/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_machine2049_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_machine2049_en.md new file mode 100644 index 000000000000..d1f3997d6396 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_machine2049_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_machine2049 DistilBertForQuestionAnswering from machine2049 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_machine2049 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_machine2049` is an English model originally trained by machine2049.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_machine2049_en_5.2.0_3.0_1701042608065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_machine2049_en_5.2.0_3.0_1701042608065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_machine2049","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_machine2049", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_machine2049| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/machine2049/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maggiexm_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maggiexm_en.md new file mode 100644 index 000000000000..d16918416d80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maggiexm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_maggiexm DistilBertForQuestionAnswering from MaggieXM +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_maggiexm +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_maggiexm` is an English model originally trained by MaggieXM.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_maggiexm_en_5.2.0_3.0_1701015854420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_maggiexm_en_5.2.0_3.0_1701015854420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_maggiexm","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_maggiexm", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_maggiexm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MaggieXM/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maiyad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maiyad_en.md new file mode 100644 index 000000000000..777f44233148 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_maiyad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_maiyad DistilBertForQuestionAnswering from maiyad +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_maiyad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_maiyad` is an English model originally trained by maiyad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_maiyad_en_5.2.0_3.0_1701018407239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_maiyad_en_5.2.0_3.0_1701018407239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_maiyad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_maiyad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_maiyad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/maiyad/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marcushenriksboe_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marcushenriksboe_en.md new file mode 100644 index 000000000000..b7b1de6b5fc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marcushenriksboe_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_marcushenriksboe DistilBertForQuestionAnswering from Marcushenriksboe +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_marcushenriksboe +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_marcushenriksboe` is an English model originally trained by Marcushenriksboe.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marcushenriksboe_en_5.2.0_3.0_1701020470678.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marcushenriksboe_en_5.2.0_3.0_1701020470678.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_marcushenriksboe","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_marcushenriksboe", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_marcushenriksboe| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Marcushenriksboe/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marioarteaga_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marioarteaga_en.md new file mode 100644 index 000000000000..146aa30fc3d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marioarteaga_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_marioarteaga DistilBertForQuestionAnswering from marioarteaga +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_marioarteaga +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_marioarteaga` is an English model originally trained by marioarteaga.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marioarteaga_en_5.2.0_3.0_1701016875424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marioarteaga_en_5.2.0_3.0_1701016875424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_marioarteaga","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_marioarteaga", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_marioarteaga| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/marioarteaga/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marscen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marscen_en.md new file mode 100644 index 000000000000..ca99d251ef80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_marscen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_marscen DistilBertForQuestionAnswering from Marscen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_marscen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_marscen` is an English model originally trained by Marscen.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marscen_en_5.2.0_3.0_1701031822418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_marscen_en_5.2.0_3.0_1701031822418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_marscen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_marscen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_marscen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Marscen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mbyanfei_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mbyanfei_en.md new file mode 100644 index 000000000000..3c22e68a5740 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mbyanfei_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mbyanfei DistilBertForQuestionAnswering from mbyanfei +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mbyanfei +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mbyanfei` is an English model originally trained by mbyanfei.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mbyanfei_en_5.2.0_3.0_1701033858246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mbyanfei_en_5.2.0_3.0_1701033858246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mbyanfei","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mbyanfei", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mbyanfei| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mbyanfei/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mchandra_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mchandra_en.md new file mode 100644 index 000000000000..be013171c3bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mchandra_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mchandra DistilBertForQuestionAnswering from mchandra +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mchandra +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mchandra` is an English model originally trained by mchandra.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mchandra_en_5.2.0_3.0_1701036722093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mchandra_en_5.2.0_3.0_1701036722093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mchandra","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mchandra", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mchandra| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mchandra/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mda_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mda_en.md new file mode 100644 index 000000000000..cd96bd7be122 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mda_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mda DistilBertForQuestionAnswering from mda +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mda +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mda` is an English model originally trained by mda.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mda_en_5.2.0_3.0_1701016348264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mda_en_5.2.0_3.0_1701016348264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mda","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mda", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mda/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_meghanaanil_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_meghanaanil_en.md new file mode 100644 index 000000000000..b6c322828093 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_meghanaanil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_meghanaanil DistilBertForQuestionAnswering from meghanaanil +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_meghanaanil +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_meghanaanil` is an English model originally trained by meghanaanil.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_meghanaanil_en_5.2.0_3.0_1701029332251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_meghanaanil_en_5.2.0_3.0_1701029332251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_meghanaanil","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_meghanaanil", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_meghanaanil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/meghanaanil/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mengkel_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mengkel_en.md new file mode 100644 index 000000000000..1edef56f2d51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mengkel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mengkel DistilBertForQuestionAnswering from mengkel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mengkel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mengkel` is an English model originally trained by mengkel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mengkel_en_5.2.0_3.0_1701040726384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mengkel_en_5.2.0_3.0_1701040726384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mengkel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mengkel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mengkel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mengkel/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mentatko_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mentatko_en.md new file mode 100644 index 000000000000..2763784430d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mentatko_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mentatko DistilBertForQuestionAnswering from Mentatko +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mentatko +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mentatko` is an English model originally trained by Mentatko.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mentatko_en_5.2.0_3.0_1701024516092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mentatko_en_5.2.0_3.0_1701024516092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mentatko","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mentatko", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mentatko| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Mentatko/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mevsillire_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mevsillire_en.md new file mode 100644 index 000000000000..d8d59897e2d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mevsillire_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mevsillire DistilBertForQuestionAnswering from MevSillire +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mevsillire +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mevsillire` is an English model originally trained by MevSillire.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mevsillire_en_5.2.0_3.0_1701041469414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mevsillire_en_5.2.0_3.0_1701041469414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mevsillire","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mevsillire", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mevsillire| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MevSillire/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mfuchs37_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mfuchs37_en.md new file mode 100644 index 000000000000..d869907d6abd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mfuchs37_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mfuchs37 DistilBertForQuestionAnswering from mfuchs37 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mfuchs37 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mfuchs37` is an English model originally trained by mfuchs37.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mfuchs37_en_5.2.0_3.0_1701041825856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mfuchs37_en_5.2.0_3.0_1701041825856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mfuchs37","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mfuchs37", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mfuchs37| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mfuchs37/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_minhah_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_minhah_en.md new file mode 100644 index 000000000000..a6171f71aea6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_minhah_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_minhah DistilBertForQuestionAnswering from minhah +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_minhah +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_minhah` is an English model originally trained by minhah.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_minhah_en_5.2.0_3.0_1701022153521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_minhah_en_5.2.0_3.0_1701022153521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_minhah","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_minhah", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_minhah| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/minhah/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_miroslawas_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_miroslawas_en.md new file mode 100644 index 000000000000..258685ddc4eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_miroslawas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_miroslawas DistilBertForQuestionAnswering from miroslawas +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_miroslawas +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_miroslawas` is an English model originally trained by miroslawas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_miroslawas_en_5.2.0_3.0_1701026353164.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_miroslawas_en_5.2.0_3.0_1701026353164.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_miroslawas","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_miroslawas", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_miroslawas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/miroslawas/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmars_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmars_en.md new file mode 100644 index 000000000000..ee0bb930fc0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmars_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mmars DistilBertForQuestionAnswering from MMars +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mmars +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mmars` is an English model originally trained by MMars.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mmars_en_5.2.0_3.0_1701016842226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mmars_en_5.2.0_3.0_1701016842226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mmars","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mmars", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mmars| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MMars/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmvos_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmvos_en.md new file mode 100644 index 000000000000..7cd44999c04e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_mmvos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mmvos DistilBertForQuestionAnswering from MMVos +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mmvos +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mmvos` is an English model originally trained by MMVos.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mmvos_en_5.2.0_3.0_1701027128320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mmvos_en_5.2.0_3.0_1701027128320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mmvos","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mmvos", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mmvos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MMVos/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_msms_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_msms_en.md new file mode 100644 index 000000000000..69985fada530 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_msms_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_msms DistilBertForQuestionAnswering from msms +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_msms +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_msms` is an English model originally trained by msms.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_msms_en_5.2.0_3.0_1701031138016.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_msms_en_5.2.0_3.0_1701031138016.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_msms","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_msms", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_msms| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/msms/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_muhtalhakhan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_muhtalhakhan_en.md new file mode 100644 index 000000000000..7dbc29aedde5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_muhtalhakhan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_muhtalhakhan DistilBertForQuestionAnswering from muhtalhakhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_muhtalhakhan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_muhtalhakhan` is an English model originally trained by muhtalhakhan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_muhtalhakhan_en_5.2.0_3.0_1701025676679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_muhtalhakhan_en_5.2.0_3.0_1701025676679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_muhtalhakhan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_muhtalhakhan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_muhtalhakhan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/muhtalhakhan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_my0hesap_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_my0hesap_en.md new file mode 100644 index 000000000000..16375ee378e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_my0hesap_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_my0hesap DistilBertForQuestionAnswering from my0hesap +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_my0hesap +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_my0hesap` is an English model originally trained by my0hesap.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_my0hesap_en_5.2.0_3.0_1701018835120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_my0hesap_en_5.2.0_3.0_1701018835120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_my0hesap","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_my0hesap", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_my0hesap| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/my0hesap/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nandu1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nandu1_en.md new file mode 100644 index 000000000000..953c3c3c03b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nandu1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nandu1 DistilBertForQuestionAnswering from nandu1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nandu1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nandu1` is an English model originally trained by nandu1.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nandu1_en_5.2.0_3.0_1701030821420.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nandu1_en_5.2.0_3.0_1701030821420.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nandu1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nandu1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nandu1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nandu1/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_naveensp_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_naveensp_en.md new file mode 100644 index 000000000000..da9ff689efad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_naveensp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_naveensp DistilBertForQuestionAnswering from naveensp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_naveensp +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_naveensp` is an English model originally trained by naveensp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_naveensp_en_5.2.0_3.0_1701017277430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_naveensp_en_5.2.0_3.0_1701017277430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_naveensp","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_naveensp", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_naveensp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/naveensp/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nehamj_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nehamj_en.md new file mode 100644 index 000000000000..2877d1d165ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nehamj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nehamj DistilBertForQuestionAnswering from nehamj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nehamj +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nehamj` is an English model originally trained by nehamj.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nehamj_en_5.2.0_3.0_1701022884019.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nehamj_en_5.2.0_3.0_1701022884019.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nehamj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nehamj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nehamj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nehamj/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nightlighttw_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nightlighttw_en.md new file mode 100644 index 000000000000..8b27bac3fe21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nightlighttw_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nightlighttw DistilBertForQuestionAnswering from nightlighttw +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nightlighttw +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nightlighttw` is an English model originally trained by nightlighttw.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nightlighttw_en_5.2.0_3.0_1701026198069.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nightlighttw_en_5.2.0_3.0_1701026198069.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nightlighttw","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nightlighttw", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nightlighttw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nightlighttw/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nikcook_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nikcook_en.md new file mode 100644 index 000000000000..310e4a5daf12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nikcook_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nikcook DistilBertForQuestionAnswering from nikcook +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nikcook +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nikcook` is an English model originally trained by nikcook.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nikcook_en_5.2.0_3.0_1701018850638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nikcook_en_5.2.0_3.0_1701018850638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nikcook","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nikcook", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nikcook| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nikcook/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nitishkumar_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nitishkumar_en.md new file mode 100644 index 000000000000..2468989cc83b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nitishkumar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nitishkumar DistilBertForQuestionAnswering from NitishKumar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nitishkumar +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nitishkumar` is an English model originally trained by NitishKumar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nitishkumar_en_5.2.0_3.0_1701028227502.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nitishkumar_en_5.2.0_3.0_1701028227502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nitishkumar","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nitishkumar", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nitishkumar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/NitishKumar/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlp_if_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlp_if_en.md new file mode 100644 index 000000000000..43a78464ecf4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlp_if_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nlp_if DistilBertForQuestionAnswering from nlp-if +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nlp_if +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nlp_if` is an English model originally trained by nlp-if.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlp_if_en_5.2.0_3.0_1701020183977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlp_if_en_5.2.0_3.0_1701020183977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nlp_if","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nlp_if", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nlp_if| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nlp-if/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlphug_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlphug_en.md new file mode 100644 index 000000000000..b875582cd9f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlphug_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nlphug DistilBertForQuestionAnswering from nlphug +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nlphug +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nlphug` is an English model originally trained by nlphug.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlphug_en_5.2.0_3.0_1701029566522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlphug_en_5.2.0_3.0_1701029566522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nlphug","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nlphug", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nlphug| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nlphug/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlplab130_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlplab130_en.md new file mode 100644 index 000000000000..d71cb33cb84c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nlplab130_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nlplab130 DistilBertForQuestionAnswering from nlplab130 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nlplab130 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nlplab130` is an English model originally trained by nlplab130.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlplab130_en_5.2.0_3.0_1701031423969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nlplab130_en_5.2.0_3.0_1701031423969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nlplab130","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nlplab130", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nlplab130| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nlplab130/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnnnm_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnnnm_en.md new file mode 100644 index 000000000000..95daabd2ca33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnnnm_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nnnnm DistilBertForQuestionAnswering from nnnnm +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nnnnm +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nnnnm` is an English model originally trained by nnnnm.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nnnnm_en_5.2.0_3.0_1701035516579.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nnnnm_en_5.2.0_3.0_1701035516579.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nnnnm","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nnnnm", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nnnnm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nnnnm/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnoureddine_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnoureddine_en.md new file mode 100644 index 000000000000..a5118ed0da22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_nnoureddine_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nnoureddine DistilBertForQuestionAnswering from Nnoureddine +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nnoureddine +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nnoureddine` is an English model originally trained by Nnoureddine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nnoureddine_en_5.2.0_3.0_1701033534088.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nnoureddine_en_5.2.0_3.0_1701033534088.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nnoureddine","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nnoureddine", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nnoureddine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Nnoureddine/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_no_one_really_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_no_one_really_en.md new file mode 100644 index 000000000000..e786cd273752 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_no_one_really_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_no_one_really DistilBertForQuestionAnswering from No-one-really +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_no_one_really +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_no_one_really` is an English model originally trained by No-one-really.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_no_one_really_en_5.2.0_3.0_1701037546003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_no_one_really_en_5.2.0_3.0_1701037546003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_no_one_really","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_no_one_really", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_no_one_really| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/No-one-really/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oananovac_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oananovac_en.md new file mode 100644 index 000000000000..04e79c89e88c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oananovac_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_oananovac DistilBertForQuestionAnswering from oananovac +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_oananovac +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_oananovac` is an English model originally trained by oananovac.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_oananovac_en_5.2.0_3.0_1701021973624.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_oananovac_en_5.2.0_3.0_1701021973624.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_oananovac","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_oananovac", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_oananovac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/oananovac/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oo_en.md new file mode 100644 index 000000000000..dfde4535aad4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_oo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_oo DistilBertForQuestionAnswering from oo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_oo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_oo` is an English model originally trained by oo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_oo_en_5.2.0_3.0_1701018994480.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_oo_en_5.2.0_3.0_1701018994480.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_oo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_oo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_oo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/oo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ozsenior13_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ozsenior13_en.md new file mode 100644 index 000000000000..15070881c798 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ozsenior13_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ozsenior13 DistilBertForQuestionAnswering from ozsenior13 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ozsenior13 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ozsenior13` is an English model originally trained by ozsenior13.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ozsenior13_en_5.2.0_3.0_1701020778562.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ozsenior13_en_5.2.0_3.0_1701020778562.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ozsenior13","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ozsenior13", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ozsenior13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ozsenior13/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pabloguinea_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pabloguinea_en.md new file mode 100644 index 000000000000..6867cf1788ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pabloguinea_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pabloguinea DistilBertForQuestionAnswering from PabloGuinea +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pabloguinea +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pabloguinea` is an English model originally trained by PabloGuinea.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pabloguinea_en_5.2.0_3.0_1701042951047.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pabloguinea_en_5.2.0_3.0_1701042951047.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pabloguinea","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pabloguinea", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pabloguinea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/PabloGuinea/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pandeygarima_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pandeygarima_en.md new file mode 100644 index 000000000000..8c257ca692af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pandeygarima_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pandeygarima DistilBertForQuestionAnswering from pandeygarima +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pandeygarima +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pandeygarima` is an English model originally trained by pandeygarima.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pandeygarima_en_5.2.0_3.0_1701039413075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pandeygarima_en_5.2.0_3.0_1701039413075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pandeygarima","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pandeygarima", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pandeygarima| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pandeygarima/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pankajmistry_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pankajmistry_en.md new file mode 100644 index 000000000000..936a002eebbd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pankajmistry_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pankajmistry DistilBertForQuestionAnswering from Pankajmistry +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pankajmistry +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pankajmistry` is an English model originally trained by Pankajmistry.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pankajmistry_en_5.2.0_3.0_1701026642358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pankajmistry_en_5.2.0_3.0_1701026642358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pankajmistry","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pankajmistry", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pankajmistry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pankajmistry/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_paoloca_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_paoloca_en.md new file mode 100644 index 000000000000..ca2494e40c24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_paoloca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_paoloca DistilBertForQuestionAnswering from paoloca +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_paoloca +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_paoloca` is an English model originally trained by paoloca.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_paoloca_en_5.2.0_3.0_1701025035934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_paoloca_en_5.2.0_3.0_1701025035934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_paoloca","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_paoloca", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_paoloca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/paoloca/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_parallelnominded_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_parallelnominded_en.md new file mode 100644 index 000000000000..3b542572edab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_parallelnominded_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_parallelnominded DistilBertForQuestionAnswering from ParallelnoMinded +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_parallelnominded +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_parallelnominded` is an English model originally trained by ParallelnoMinded.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_parallelnominded_en_5.2.0_3.0_1701025751159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_parallelnominded_en_5.2.0_3.0_1701025751159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_parallelnominded","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_parallelnominded", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_parallelnominded| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ParallelnoMinded/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pavlysafwat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pavlysafwat_en.md new file mode 100644 index 000000000000..507a8a3ab2e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pavlysafwat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pavlysafwat DistilBertForQuestionAnswering from PavlySafwat +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pavlysafwat +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pavlysafwat` is an English model originally trained by PavlySafwat.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pavlysafwat_en_5.2.0_3.0_1701035658489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pavlysafwat_en_5.2.0_3.0_1701035658489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pavlysafwat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pavlysafwat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pavlysafwat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/PavlySafwat/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pdx_etm_21_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pdx_etm_21_en.md new file mode 100644 index 000000000000..9bcbdea9403b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pdx_etm_21_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pdx_etm_21 DistilBertForQuestionAnswering from pdx-etm-21 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pdx_etm_21 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pdx_etm_21` is an English model originally trained by pdx-etm-21.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pdx_etm_21_en_5.2.0_3.0_1701027930541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pdx_etm_21_en_5.2.0_3.0_1701027930541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pdx_etm_21","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pdx_etm_21", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pdx_etm_21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pdx-etm-21/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_peng0208_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_peng0208_en.md new file mode 100644 index 000000000000..88cb027d2198 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_peng0208_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_peng0208 DistilBertForQuestionAnswering from peng0208 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_peng0208 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_peng0208` is an English model originally trained by peng0208.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_peng0208_en_5.2.0_3.0_1701041998239.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_peng0208_en_5.2.0_3.0_1701041998239.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_peng0208","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_peng0208", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_peng0208| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/peng0208/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pfsv_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pfsv_en.md new file mode 100644 index 000000000000..81336f5d55b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pfsv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pfsv DistilBertForQuestionAnswering from pfsv +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pfsv +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pfsv` is an English model originally trained by pfsv.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pfsv_en_5.2.0_3.0_1701016348134.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pfsv_en_5.2.0_3.0_1701016348134.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pfsv","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pfsv", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pfsv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pfsv/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_phkag_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_phkag_en.md new file mode 100644 index 000000000000..5103b9bf3a3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_phkag_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_phkag DistilBertForQuestionAnswering from phkag +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_phkag +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_phkag` is an English model originally trained by phkag.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_phkag_en_5.2.0_3.0_1701024686418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_phkag_en_5.2.0_3.0_1701024686418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_phkag","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_phkag", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_phkag| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/phkag/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pixyz_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pixyz_en.md new file mode 100644 index 000000000000..fc062a26a330 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pixyz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pixyz DistilBertForQuestionAnswering from pixyz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pixyz +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pixyz` is an English model originally trained by pixyz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pixyz_en_5.2.0_3.0_1701021852204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pixyz_en_5.2.0_3.0_1701021852204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pixyz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pixyz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pixyz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pixyz/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_plantsandcats_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_plantsandcats_en.md new file mode 100644 index 000000000000..cd7961f5f9a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_plantsandcats_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_plantsandcats DistilBertForQuestionAnswering from plantsANDcats +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_plantsandcats +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_plantsandcats` is an English model originally trained by plantsANDcats.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_plantsandcats_en_5.2.0_3.0_1701017406899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_plantsandcats_en_5.2.0_3.0_1701017406899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_plantsandcats","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_plantsandcats", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_plantsandcats| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/plantsANDcats/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_podulator_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_podulator_en.md new file mode 100644 index 000000000000..d0bbd9f9cc39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_podulator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_podulator DistilBertForQuestionAnswering from podulator +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_podulator +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_podulator` is an English model originally trained by podulator.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_podulator_en_5.2.0_3.0_1701042250103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_podulator_en_5.2.0_3.0_1701042250103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_podulator","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_podulator", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_podulator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/podulator/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pooh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pooh_en.md new file mode 100644 index 000000000000..eab160d4f4ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pooh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pooh DistilBertForQuestionAnswering from pooh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pooh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pooh` is an English model originally trained by pooh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pooh_en_5.2.0_3.0_1701038827298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pooh_en_5.2.0_3.0_1701038827298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pooh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pooh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pooh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pooh/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pozman_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pozman_en.md new file mode 100644 index 000000000000..e9244c1ee086 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pozman_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pozman DistilBertForQuestionAnswering from pozman +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pozman +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pozman` is an English model originally trained by pozman.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pozman_en_5.2.0_3.0_1701028785242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pozman_en_5.2.0_3.0_1701028785242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pozman","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pozman", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pozman| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pozman/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_prahalad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_prahalad_en.md new file mode 100644 index 000000000000..84d1fe7e0e98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_prahalad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_prahalad DistilBertForQuestionAnswering from Prahalad +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_prahalad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_prahalad` is an English model originally trained by Prahalad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_prahalad_en_5.2.0_3.0_1701025192773.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_prahalad_en_5.2.0_3.0_1701025192773.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_prahalad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_prahalad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_prahalad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Prahalad/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranavsilimkhan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranavsilimkhan_en.md new file mode 100644 index 000000000000..fcaf324e66d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranavsilimkhan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pranavsilimkhan DistilBertForQuestionAnswering from pranavsilimkhan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pranavsilimkhan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pranavsilimkhan` is an English model originally trained by pranavsilimkhan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pranavsilimkhan_en_5.2.0_3.0_1701022741073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pranavsilimkhan_en_5.2.0_3.0_1701022741073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pranavsilimkhan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pranavsilimkhan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pranavsilimkhan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pranavsilimkhan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranjalsurana_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranjalsurana_en.md new file mode 100644 index 000000000000..eb02563288ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pranjalsurana_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pranjalsurana DistilBertForQuestionAnswering from pranjalsurana +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pranjalsurana +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pranjalsurana` is an English model originally trained by pranjalsurana.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pranjalsurana_en_5.2.0_3.0_1701016690401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pranjalsurana_en_5.2.0_3.0_1701016690401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# Assumes `data` is a DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pranjalsurana","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// Assumes `data` is a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pranjalsurana", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pranjalsurana| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pranjalsurana/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_princebansal42_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_princebansal42_en.md new file mode 100644 index 000000000000..abf0f48d47b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_princebansal42_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_princebansal42 DistilBertForQuestionAnswering from princebansal42 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_princebansal42 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_princebansal42` is an English model originally trained by princebansal42.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_princebansal42_en_5.2.0_3.0_1701021673092.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_princebansal42_en_5.2.0_3.0_1701021673092.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_princebansal42","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_princebansal42", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_princebansal42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/princebansal42/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_psato_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_psato_en.md new file mode 100644 index 000000000000..ab296298de5d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_psato_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_psato DistilBertForQuestionAnswering from psato +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_psato +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_psato` is an English model originally trained by psato.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_psato_en_5.2.0_3.0_1701024521512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_psato_en_5.2.0_3.0_1701024521512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_psato","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_psato", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_psato| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/psato/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pythy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pythy_en.md new file mode 100644 index 000000000000..b42d42ba19ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_pythy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_pythy DistilBertForQuestionAnswering from Pythy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_pythy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_pythy` is an English model originally trained by Pythy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pythy_en_5.2.0_3.0_1701027799698.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_pythy_en_5.2.0_3.0_1701027799698.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_pythy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_pythy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_pythy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pythy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qa_en.md new file mode 100644 index 000000000000..5014152390d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_qa DistilBertForQuestionAnswering from EricPeter +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_qa` is an English model originally trained by EricPeter.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_qa_en_5.2.0_3.0_1701030395454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_qa_en_5.2.0_3.0_1701030395454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/EricPeter/distilbert-base-uncased-finetuned-squad-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qweasd1122_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qweasd1122_en.md new file mode 100644 index 000000000000..4b8055dfcf7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_qweasd1122_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_qweasd1122 DistilBertForQuestionAnswering from QWEasd1122 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_qweasd1122 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_qweasd1122` is an English model originally trained by QWEasd1122.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_qweasd1122_en_5.2.0_3.0_1701022732253.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_qweasd1122_en_5.2.0_3.0_1701022732253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_qweasd1122","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_qweasd1122", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_qweasd1122| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/QWEasd1122/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_r202004762rf_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_r202004762rf_en.md new file mode 100644 index 000000000000..6139efb739b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_r202004762rf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_r202004762rf DistilBertForQuestionAnswering from r202004762rf +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_r202004762rf +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_r202004762rf` is an English model originally trained by r202004762rf.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_r202004762rf_en_5.2.0_3.0_1701039711469.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_r202004762rf_en_5.2.0_3.0_1701039711469.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_r202004762rf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_r202004762rf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_r202004762rf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/r202004762rf/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_raihan0155_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_raihan0155_en.md new file mode 100644 index 000000000000..9921f644c93a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_raihan0155_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_raihan0155 DistilBertForQuestionAnswering from raihan0155 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_raihan0155 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_raihan0155` is an English model originally trained by raihan0155.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_raihan0155_en_5.2.0_3.0_1701020778338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_raihan0155_en_5.2.0_3.0_1701020778338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_raihan0155","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_raihan0155", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_raihan0155| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/raihan0155/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ramanirudh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ramanirudh_en.md new file mode 100644 index 000000000000..391d9331e7cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ramanirudh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ramanirudh DistilBertForQuestionAnswering from ramanirudh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ramanirudh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ramanirudh` is an English model originally trained by ramanirudh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ramanirudh_en_5.2.0_3.0_1701038298566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ramanirudh_en_5.2.0_3.0_1701038298566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ramanirudh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ramanirudh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ramanirudh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ramanirudh/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rangacharysrinivasan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rangacharysrinivasan_en.md new file mode 100644 index 000000000000..ce8b4278f456 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rangacharysrinivasan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rangacharysrinivasan DistilBertForQuestionAnswering from rangacharysrinivasan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rangacharysrinivasan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rangacharysrinivasan` is an English model originally trained by rangacharysrinivasan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rangacharysrinivasan_en_5.2.0_3.0_1701021114219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rangacharysrinivasan_en_5.2.0_3.0_1701021114219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rangacharysrinivasan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rangacharysrinivasan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rangacharysrinivasan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rangacharysrinivasan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ranjittechie_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ranjittechie_en.md new file mode 100644 index 000000000000..63b45a2aa776 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ranjittechie_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ranjittechie DistilBertForQuestionAnswering from Ranjittechie +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ranjittechie +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ranjittechie` is an English model originally trained by Ranjittechie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ranjittechie_en_5.2.0_3.0_1701042151682.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ranjittechie_en_5.2.0_3.0_1701042151682.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ranjittechie","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ranjittechie", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ranjittechie| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Ranjittechie/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rathodsankul_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rathodsankul_en.md new file mode 100644 index 000000000000..52fd5128baff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rathodsankul_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rathodsankul DistilBertForQuestionAnswering from RathodSankul +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rathodsankul +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rathodsankul` is an English model originally trained by RathodSankul.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rathodsankul_en_5.2.0_3.0_1701024299270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rathodsankul_en_5.2.0_3.0_1701024299270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rathodsankul","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rathodsankul", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rathodsankul| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RathodSankul/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rbiswas4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rbiswas4_en.md new file mode 100644 index 000000000000..bdfa8249fc8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rbiswas4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rbiswas4 DistilBertForQuestionAnswering from rbiswas4 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rbiswas4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rbiswas4` is an English model originally trained by rbiswas4. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rbiswas4_en_5.2.0_3.0_1701030537690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rbiswas4_en_5.2.0_3.0_1701030537690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rbiswas4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rbiswas4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rbiswas4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rbiswas4/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rhakbari_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rhakbari_en.md new file mode 100644 index 000000000000..e2c47d110b2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rhakbari_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rhakbari DistilBertForQuestionAnswering from rhakbari +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rhakbari +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rhakbari` is an English model originally trained by rhakbari. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rhakbari_en_5.2.0_3.0_1701019144085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rhakbari_en_5.2.0_3.0_1701019144085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rhakbari","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rhakbari", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rhakbari| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rhakbari/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rheyaas_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rheyaas_en.md new file mode 100644 index 000000000000..2d6f74768415 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rheyaas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rheyaas DistilBertForQuestionAnswering from rheyaas +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rheyaas +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rheyaas` is an English model originally trained by rheyaas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rheyaas_en_5.2.0_3.0_1701020227799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rheyaas_en_5.2.0_3.0_1701020227799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rheyaas","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rheyaas", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rheyaas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rheyaas/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rickwu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rickwu_en.md new file mode 100644 index 000000000000..e4be38e8610f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rickwu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rickwu DistilBertForQuestionAnswering from RickWu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rickwu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rickwu` is an English model originally trained by RickWu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rickwu_en_5.2.0_3.0_1701020544528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rickwu_en_5.2.0_3.0_1701020544528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rickwu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rickwu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rickwu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RickWu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_robertolc_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_robertolc_en.md new file mode 100644 index 000000000000..78a62ff3038f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_robertolc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_robertolc DistilBertForQuestionAnswering from robertoLC +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_robertolc +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_robertolc` is an English model originally trained by robertoLC. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_robertolc_en_5.2.0_3.0_1701036423421.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_robertolc_en_5.2.0_3.0_1701036423421.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_robertolc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_robertolc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_robertolc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/robertoLC/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rocketq_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rocketq_en.md new file mode 100644 index 000000000000..d4e35f74bfe6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rocketq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rocketq DistilBertForQuestionAnswering from rocketq +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rocketq +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rocketq` is an English model originally trained by rocketq. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rocketq_en_5.2.0_3.0_1701034973102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rocketq_en_5.2.0_3.0_1701034973102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rocketq","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rocketq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rocketq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rocketq/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rohbrian_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rohbrian_en.md new file mode 100644 index 000000000000..c1c19a219e99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rohbrian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rohbrian DistilBertForQuestionAnswering from rohbrian +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rohbrian +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rohbrian` is an English model originally trained by rohbrian. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rohbrian_en_5.2.0_3.0_1701024292946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rohbrian_en_5.2.0_3.0_1701024292946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rohbrian","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rohbrian", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rohbrian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rohbrian/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rpv_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rpv_en.md new file mode 100644 index 000000000000..b800c46dec2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_rpv_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rpv DistilBertForQuestionAnswering from rpv +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rpv +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rpv` is an English model originally trained by rpv. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rpv_en_5.2.0_3.0_1701018101096.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rpv_en_5.2.0_3.0_1701018101096.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rpv","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rpv", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rpv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rpv/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ruddus716_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ruddus716_en.md new file mode 100644 index 000000000000..bec756686ebe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ruddus716_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ruddus716 DistilBertForQuestionAnswering from ruddus716 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ruddus716 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ruddus716` is an English model originally trained by ruddus716. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ruddus716_en_5.2.0_3.0_1701034234498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ruddus716_en_5.2.0_3.0_1701034234498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ruddus716","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ruddus716", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ruddus716| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ruddus716/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saba1881_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saba1881_en.md new file mode 100644 index 000000000000..921616950084 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saba1881_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_saba1881 DistilBertForQuestionAnswering from Saba1881 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_saba1881 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_saba1881` is an English model originally trained by Saba1881. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saba1881_en_5.2.0_3.0_1701026773972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saba1881_en_5.2.0_3.0_1701026773972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_saba1881","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_saba1881", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_saba1881| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Saba1881/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabah17_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabah17_en.md new file mode 100644 index 000000000000..57120eeea982 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabah17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sabah17 DistilBertForQuestionAnswering from sabah17 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sabah17 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sabah17` is an English model originally trained by sabah17. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabah17_en_5.2.0_3.0_1701024118775.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabah17_en_5.2.0_3.0_1701024118775.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sabah17","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sabah17", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sabah17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sabah17/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabasazad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabasazad_en.md new file mode 100644 index 000000000000..6df6f41ca950 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabasazad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sabasazad DistilBertForQuestionAnswering from sabasazad +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sabasazad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sabasazad` is an English model originally trained by sabasazad. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabasazad_en_5.2.0_3.0_1701039116441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabasazad_en_5.2.0_3.0_1701039116441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sabasazad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sabasazad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sabasazad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sabasazad/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabbir29_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabbir29_en.md new file mode 100644 index 000000000000..9851e0150242 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sabbir29_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sabbir29 DistilBertForQuestionAnswering from Sabbir29 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sabbir29 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sabbir29` is an English model originally trained by Sabbir29. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabbir29_en_5.2.0_3.0_1701023161815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sabbir29_en_5.2.0_3.0_1701023161815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sabbir29","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sabbir29", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sabbir29| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sabbir29/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saleemullah_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saleemullah_en.md new file mode 100644 index 000000000000..44c99a92ac7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saleemullah_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_saleemullah DistilBertForQuestionAnswering from SaleemUllah +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_saleemullah +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_saleemullah` is an English model originally trained by SaleemUllah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saleemullah_en_5.2.0_3.0_1701031459278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saleemullah_en_5.2.0_3.0_1701031459278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_saleemullah","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_saleemullah", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_saleemullah| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SaleemUllah/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sam999_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sam999_en.md new file mode 100644 index 000000000000..3793665dad0a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sam999_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sam999 DistilBertForQuestionAnswering from sam999 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sam999 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sam999` is an English model originally trained by sam999. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sam999_en_5.2.0_3.0_1701020557505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sam999_en_5.2.0_3.0_1701020557505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sam999","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sam999", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sam999| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sam999/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sambosis_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sambosis_en.md new file mode 100644 index 000000000000..a4199cd990dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sambosis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sambosis DistilBertForQuestionAnswering from Sambosis +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sambosis +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sambosis` is an English model originally trained by Sambosis. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sambosis_en_5.2.0_3.0_1701028220874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sambosis_en_5.2.0_3.0_1701028220874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sambosis","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sambosis", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sambosis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Sambosis/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_samuel0802_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_samuel0802_en.md new file mode 100644 index 000000000000..dd037755ec2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_samuel0802_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_samuel0802 DistilBertForQuestionAnswering from samuel0802 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_samuel0802 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_samuel0802` is an English model originally trained by samuel0802. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_samuel0802_en_5.2.0_3.0_1701041590863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_samuel0802_en_5.2.0_3.0_1701041590863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_samuel0802","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_samuel0802", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_samuel0802| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/samuel0802/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sandy317_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sandy317_en.md new file mode 100644 index 000000000000..5b07062969db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sandy317_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sandy317 DistilBertForQuestionAnswering from Sandy317 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sandy317 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sandy317` is an English model originally trained by Sandy317. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sandy317_en_5.2.0_3.0_1701024570299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sandy317_en_5.2.0_3.0_1701024570299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sandy317","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sandy317", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sandy317| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sandy317/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sangita_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sangita_en.md new file mode 100644 index 000000000000..9a2c9ee53dd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sangita_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sangita DistilBertForQuestionAnswering from Sangita +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sangita +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sangita` is an English model originally trained by Sangita. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sangita_en_5.2.0_3.0_1701021139972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sangita_en_5.2.0_3.0_1701021139972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sangita","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sangita", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sangita| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sangita/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_santvasu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_santvasu_en.md new file mode 100644 index 000000000000..3755fc597592 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_santvasu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_santvasu DistilBertForQuestionAnswering from santvasu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_santvasu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_santvasu` is an English model originally trained by santvasu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_santvasu_en_5.2.0_3.0_1701041773128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_santvasu_en_5.2.0_3.0_1701041773128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_santvasu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_santvasu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_santvasu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/santvasu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saravanaj_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saravanaj_en.md new file mode 100644 index 000000000000..c6f57ec9d56a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_saravanaj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_saravanaj DistilBertForQuestionAnswering from saravanaj +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_saravanaj +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_saravanaj` is an English model originally trained by saravanaj. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saravanaj_en_5.2.0_3.0_1701035904800.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_saravanaj_en_5.2.0_3.0_1701035904800.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_saravanaj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_saravanaj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_saravanaj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saravanaj/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sasuke_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sasuke_en.md new file mode 100644 index 000000000000..e0af61a4edfc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sasuke_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sasuke DistilBertForQuestionAnswering from sasuke +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sasuke +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sasuke` is an English model originally trained by sasuke. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sasuke_en_5.2.0_3.0_1701023257136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sasuke_en_5.2.0_3.0_1701023257136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sasuke","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sasuke", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sasuke| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sasuke/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_schen1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_schen1_en.md new file mode 100644 index 000000000000..73f8db236272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_schen1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_schen1 DistilBertForQuestionAnswering from schen1 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_schen1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_schen1` is an English model originally trained by schen1. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_schen1_en_5.2.0_3.0_1701039120892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_schen1_en_5.2.0_3.0_1701039120892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_schen1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_schen1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_schen1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/schen1/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_42_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_42_en.md new file mode 100644 index 000000000000..5403c553c6d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_42_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seed_42 DistilBertForQuestionAnswering from htermotto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seed_42 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seed_42` is an English model originally trained by htermotto. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_42_en_5.2.0_3.0_1701021816670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_42_en_5.2.0_3.0_1701021816670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seed_42","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seed_42", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seed_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/htermotto/distilbert-base-uncased-finetuned-squad-seed-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_69_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_69_en.md new file mode 100644 index 000000000000..d249a87cb9c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_69_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seed_69 DistilBertForQuestionAnswering from zates +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seed_69 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seed_69` is an English model originally trained by zates. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_69_en_5.2.0_3.0_1701021973709.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_69_en_5.2.0_3.0_1701021973709.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seed_69","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seed_69", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seed_69| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zates/distilbert-base-uncased-finetuned-squad-seed-69 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_9001_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_9001_en.md new file mode 100644 index 000000000000..60af666bb0c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_9001_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seed_9001 DistilBertForQuestionAnswering from zates +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seed_9001 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seed_9001` is an English model originally trained by zates. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_9001_en_5.2.0_3.0_1701021646111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_9001_en_5.2.0_3.0_1701021646111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seed_9001","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seed_9001", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seed_9001| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zates/distilbert-base-uncased-finetuned-squad-seed-9001 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_999_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_999_en.md new file mode 100644 index 000000000000..17189aa963ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seed_999_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seed_999 DistilBertForQuestionAnswering from htermotto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seed_999 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seed_999` is an English model originally trained by htermotto. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_999_en_5.2.0_3.0_1701023031323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seed_999_en_5.2.0_3.0_1701023031323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seed_999","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seed_999", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seed_999| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/htermotto/distilbert-base-uncased-finetuned-squad-seed-999 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seokheeyam_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seokheeyam_en.md new file mode 100644 index 000000000000..5360ab63ff62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seokheeyam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seokheeyam DistilBertForQuestionAnswering from seokheeyam +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seokheeyam +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seokheeyam` is an English model originally trained by seokheeyam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seokheeyam_en_5.2.0_3.0_1701026370825.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seokheeyam_en_5.2.0_3.0_1701026370825.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seokheeyam","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seokheeyam", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seokheeyam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seokheeyam/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seomh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seomh_en.md new file mode 100644 index 000000000000..1b8dc095cabb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seomh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seomh DistilBertForQuestionAnswering from seomh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seomh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seomh` is an English model originally trained by seomh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seomh_en_5.2.0_3.0_1701017423988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seomh_en_5.2.0_3.0_1701017423988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seomh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seomh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seomh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seomh/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seviladiguzel_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seviladiguzel_en.md new file mode 100644 index 000000000000..81a97a93ead6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_seviladiguzel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_seviladiguzel DistilBertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_seviladiguzel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_seviladiguzel` is an English model originally trained by seviladiguzel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seviladiguzel_en_5.2.0_3.0_1701019569457.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_seviladiguzel_en_5.2.0_3.0_1701019569457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_seviladiguzel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_seviladiguzel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_seviladiguzel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seviladiguzel/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shafa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shafa_en.md new file mode 100644 index 000000000000..8cda1059db38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shafa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shafa DistilBertForQuestionAnswering from shafa +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shafa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shafa` is an English model originally trained by shafa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shafa_en_5.2.0_3.0_1701035067357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shafa_en_5.2.0_3.0_1701035067357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shafa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shafa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shafa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shafa/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shahma_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shahma_en.md new file mode 100644 index 000000000000..d88634279249 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shahma_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shahma DistilBertForQuestionAnswering from shahma +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shahma +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shahma` is an English model originally trained by shahma.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shahma_en_5.2.0_3.0_1701021504936.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shahma_en_5.2.0_3.0_1701021504936.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shahma","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shahma", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shahma| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shahma/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaojie_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaojie_en.md new file mode 100644 index 000000000000..9e89fbce9946 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaojie_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shaojie DistilBertForQuestionAnswering from shaojie +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shaojie +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shaojie` is an English model originally trained by shaojie.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shaojie_en_5.2.0_3.0_1701019290567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shaojie_en_5.2.0_3.0_1701019290567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shaojie","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shaojie", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shaojie| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shaojie/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaoyezh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaoyezh_en.md new file mode 100644 index 000000000000..748d9b22054d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shaoyezh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shaoyezh DistilBertForQuestionAnswering from shaoyezh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shaoyezh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shaoyezh` is an English model originally trained by shaoyezh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shaoyezh_en_5.2.0_3.0_1701019757389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shaoyezh_en_5.2.0_3.0_1701019757389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shaoyezh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shaoyezh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shaoyezh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shaoyezh/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sharonpeng_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sharonpeng_en.md new file mode 100644 index 000000000000..c0ce8852cb72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sharonpeng_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sharonpeng DistilBertForQuestionAnswering from sharonpeng +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sharonpeng +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sharonpeng` is an English model originally trained by sharonpeng.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sharonpeng_en_5.2.0_3.0_1701027922427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sharonpeng_en_5.2.0_3.0_1701027922427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sharonpeng","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sharonpeng", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sharonpeng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sharonpeng/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shayavivi_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shayavivi_en.md new file mode 100644 index 000000000000..09457eaca8c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shayavivi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shayavivi DistilBertForQuestionAnswering from shayavivi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shayavivi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shayavivi` is an English model originally trained by shayavivi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shayavivi_en_5.2.0_3.0_1701040893687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shayavivi_en_5.2.0_3.0_1701040893687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shayavivi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shayavivi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shayavivi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shayavivi/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shelvin_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shelvin_en.md new file mode 100644 index 000000000000..5f03e8a411a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shelvin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shelvin DistilBertForQuestionAnswering from Shelvin +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shelvin +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shelvin` is an English model originally trained by Shelvin.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shelvin_en_5.2.0_3.0_1701026719252.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shelvin_en_5.2.0_3.0_1701026719252.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shelvin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shelvin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shelvin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Shelvin/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sherlockguo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sherlockguo_en.md new file mode 100644 index 000000000000..272849596375 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sherlockguo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sherlockguo DistilBertForQuestionAnswering from SherlockGuo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sherlockguo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sherlockguo` is an English model originally trained by SherlockGuo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sherlockguo_en_5.2.0_3.0_1701017126184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sherlockguo_en_5.2.0_3.0_1701017126184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sherlockguo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sherlockguo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sherlockguo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SherlockGuo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shila_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shila_en.md new file mode 100644 index 000000000000..d482cf040c5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shila_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shila DistilBertForQuestionAnswering from shila +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shila +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shila` is an English model originally trained by shila.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shila_en_5.2.0_3.0_1701027254358.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shila_en_5.2.0_3.0_1701027254358.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a Spark DataFrame with string columns "question" and "context". +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shila","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context". +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shila", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shila| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shila/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shivkumarganesh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shivkumarganesh_en.md new file mode 100644 index 000000000000..86f5e305fe4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shivkumarganesh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shivkumarganesh DistilBertForQuestionAnswering from shivkumarganesh +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shivkumarganesh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shivkumarganesh` is an English model originally trained by shivkumarganesh.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shivkumarganesh_en_5.2.0_3.0_1701019254256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shivkumarganesh_en_5.2.0_3.0_1701019254256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shivkumarganesh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shivkumarganesh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shivkumarganesh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shivkumarganesh/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shizil_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shizil_en.md new file mode 100644 index 000000000000..fa179bcffca9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shizil_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shizil DistilBertForQuestionAnswering from shizil +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shizil +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shizil` is an English model originally trained by shizil. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shizil_en_5.2.0_3.0_1701023392449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shizil_en_5.2.0_3.0_1701023392449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shizil","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shizil", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shizil| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shizil/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shunichiro_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shunichiro_en.md new file mode 100644 index 000000000000..a50a3903e9ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shunichiro_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shunichiro DistilBertForQuestionAnswering from Shunichiro +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shunichiro +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shunichiro` is an English model originally trained by Shunichiro. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shunichiro_en_5.2.0_3.0_1701027524764.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shunichiro_en_5.2.0_3.0_1701027524764.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shunichiro","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shunichiro", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shunichiro| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Shunichiro/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shwetha_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shwetha_en.md new file mode 100644 index 000000000000..de36c35f8e64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_shwetha_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_shwetha DistilBertForQuestionAnswering from shwetha +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_shwetha +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_shwetha` is an English model originally trained by shwetha. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shwetha_en_5.2.0_3.0_1701026229893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_shwetha_en_5.2.0_3.0_1701026229893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_shwetha","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_shwetha", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_shwetha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shwetha/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_silveto_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_silveto_en.md new file mode 100644 index 000000000000..f9afa667d467 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_silveto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_silveto DistilBertForQuestionAnswering from silveto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_silveto +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_silveto` is an English model originally trained by silveto. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_silveto_en_5.2.0_3.0_1701026385778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_silveto_en_5.2.0_3.0_1701026385778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_silveto","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_silveto", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_silveto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/silveto/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_simonli123_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_simonli123_en.md new file mode 100644 index 000000000000..b41317d74002 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_simonli123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_simonli123 DistilBertForQuestionAnswering from SimonLi123 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_simonli123 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_simonli123` is an English model originally trained by SimonLi123. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_simonli123_en_5.2.0_3.0_1701028069526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_simonli123_en_5.2.0_3.0_1701028069526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_simonli123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_simonli123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_simonli123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SimonLi123/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sivakumar_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sivakumar_en.md new file mode 100644 index 000000000000..bb4e5cc3b55b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sivakumar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sivakumar DistilBertForQuestionAnswering from Sivakumar +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sivakumar +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sivakumar` is an English model originally trained by Sivakumar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sivakumar_en_5.2.0_3.0_1701025901238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sivakumar_en_5.2.0_3.0_1701025901238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sivakumar","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sivakumar", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sivakumar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sivakumar/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sjchoure_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sjchoure_en.md new file mode 100644 index 000000000000..2ff3ffda94b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sjchoure_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sjchoure DistilBertForQuestionAnswering from sjchoure +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sjchoure +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sjchoure` is an English model originally trained by sjchoure. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sjchoure_en_5.2.0_3.0_1701029633910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sjchoure_en_5.2.0_3.0_1701029633910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sjchoure","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sjchoure", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sjchoure| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sjchoure/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_smonah_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_smonah_en.md new file mode 100644 index 000000000000..86325f2fe9cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_smonah_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_smonah DistilBertForQuestionAnswering from smonah +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_smonah +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_smonah` is an English model originally trained by smonah. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_smonah_en_5.2.0_3.0_1701035383367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_smonah_en_5.2.0_3.0_1701035383367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_smonah","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_smonah", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_smonah| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/smonah/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sneka_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sneka_en.md new file mode 100644 index 000000000000..015e1f3a993a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sneka_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sneka DistilBertForQuestionAnswering from Sneka +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sneka +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sneka` is an English model originally trained by Sneka. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sneka_en_5.2.0_3.0_1701042401449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sneka_en_5.2.0_3.0_1701042401449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sneka","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sneka", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sneka| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sneka/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sourabhd_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sourabhd_en.md new file mode 100644 index 000000000000..c6f56ca1545d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sourabhd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sourabhd DistilBertForQuestionAnswering from sourabhd +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sourabhd +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sourabhd` is an English model originally trained by sourabhd. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sourabhd_en_5.2.0_3.0_1701027764956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sourabhd_en_5.2.0_3.0_1701027764956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sourabhd","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sourabhd", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sourabhd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sourabhd/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_squamto_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_squamto_en.md new file mode 100644 index 000000000000..21875b4a9938 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_squamto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_squamto DistilBertForQuestionAnswering from Squamto +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_squamto +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_squamto` is an English model originally trained by Squamto.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_squamto_en_5.2.0_3.0_1701023658500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_squamto_en_5.2.0_3.0_1701023658500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_squamto","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_squamto", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_squamto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Squamto/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srmukundb_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srmukundb_en.md new file mode 100644 index 000000000000..7544dd49d2ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srmukundb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_srmukundb DistilBertForQuestionAnswering from srmukundb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_srmukundb +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_srmukundb` is an English model originally trained by srmukundb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_srmukundb_en_5.2.0_3.0_1701030802113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_srmukundb_en_5.2.0_3.0_1701030802113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_srmukundb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_srmukundb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_srmukundb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/srmukundb/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srushti97_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srushti97_en.md new file mode 100644 index 000000000000..4cd08aa992bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_srushti97_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_srushti97 DistilBertForQuestionAnswering from srushti97 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_srushti97 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_srushti97` is an English model originally trained by srushti97.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_srushti97_en_5.2.0_3.0_1701036442093.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_srushti97_en_5.2.0_3.0_1701036442093.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_srushti97","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_srushti97", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_srushti97| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/srushti97/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssarim_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssarim_en.md new file mode 100644 index 000000000000..d5fdbfd8bbc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssarim_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ssarim DistilBertForQuestionAnswering from SSarim +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ssarim +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ssarim` is an English model originally trained by SSarim.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ssarim_en_5.2.0_3.0_1701030237595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ssarim_en_5.2.0_3.0_1701030237595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ssarim","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ssarim", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ssarim| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SSarim/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssunny_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssunny_en.md new file mode 100644 index 000000000000..56bca4b3cba7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ssunny_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ssunny DistilBertForQuestionAnswering from ssunny +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ssunny +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ssunny` is an English model originally trained by ssunny.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ssunny_en_5.2.0_3.0_1701026381335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ssunny_en_5.2.0_3.0_1701026381335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ssunny","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ssunny", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ssunny| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ssunny/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stevemobs_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stevemobs_en.md new file mode 100644 index 000000000000..db9bb8671aff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stevemobs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_stevemobs DistilBertForQuestionAnswering from stevemobs +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_stevemobs +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_stevemobs` is an English model originally trained by stevemobs.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stevemobs_en_5.2.0_3.0_1701022314978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stevemobs_en_5.2.0_3.0_1701022314978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_stevemobs","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_stevemobs", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_stevemobs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/stevemobs/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stig_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stig_en.md new file mode 100644 index 000000000000..d9c18a87a16d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_stig_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_stig DistilBertForQuestionAnswering from stig +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_stig +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_stig` is an English model originally trained by stig.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stig_en_5.2.0_3.0_1701026062116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stig_en_5.2.0_3.0_1701026062116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_stig","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_stig", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_stig| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/stig/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sulphurage_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sulphurage_en.md new file mode 100644 index 000000000000..9674ad9ecf21 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sulphurage_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sulphurage DistilBertForQuestionAnswering from sulphurage +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sulphurage +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sulphurage` is an English model originally trained by sulphurage.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sulphurage_en_5.2.0_3.0_1701034752875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sulphurage_en_5.2.0_3.0_1701034752875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sulphurage","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sulphurage", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sulphurage| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sulphurage/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_summerzhang_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_summerzhang_en.md new file mode 100644 index 000000000000..65372742a5b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_summerzhang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_summerzhang DistilBertForQuestionAnswering from SummerZhang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_summerzhang +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_summerzhang` is an English model originally trained by SummerZhang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_summerzhang_en_5.2.0_3.0_1701017992542.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_summerzhang_en_5.2.0_3.0_1701017992542.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_summerzhang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_summerzhang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_summerzhang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SummerZhang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supachoke44_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supachoke44_en.md new file mode 100644 index 000000000000..eae9109e3867 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supachoke44_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_supachoke44 DistilBertForQuestionAnswering from supachoke44 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_supachoke44 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_supachoke44` is an English model originally trained by supachoke44.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_supachoke44_en_5.2.0_3.0_1701028544192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_supachoke44_en_5.2.0_3.0_1701028544192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_supachoke44", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_supachoke44", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_supachoke44| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/supachoke44/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supriyashri_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supriyashri_en.md new file mode 100644 index 000000000000..b041760b2eaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_supriyashri_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_supriyashri DistilBertForQuestionAnswering from supriyashri +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_supriyashri +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_supriyashri` is an English model originally trained by supriyashri.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_supriyashri_en_5.2.0_3.0_1701025597165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_supriyashri_en_5.2.0_3.0_1701025597165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_supriyashri", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_supriyashri", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_supriyashri| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/supriyashri/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sutd_ai_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sutd_ai_en.md new file mode 100644 index 000000000000..00a2bd260690 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sutd_ai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sutd_ai DistilBertForQuestionAnswering from sutd-ai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sutd_ai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sutd_ai` is an English model originally trained by sutd-ai.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sutd_ai_en_5.2.0_3.0_1701020039077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sutd_ai_en_5.2.0_3.0_1701020039077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sutd_ai", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sutd_ai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sutd_ai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sutd-ai/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_suzuki_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_suzuki_en.md new file mode 100644 index 000000000000..632a8b8b34d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_suzuki_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_suzuki DistilBertForQuestionAnswering from suzuki +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_suzuki +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_suzuki` is an English model originally trained by suzuki.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_suzuki_en_5.2.0_3.0_1701019114553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_suzuki_en_5.2.0_3.0_1701019114553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_suzuki", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_suzuki", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_suzuki| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/suzuki/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swang2000_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swang2000_en.md new file mode 100644 index 000000000000..e07749a0e4ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swang2000_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_swang2000 DistilBertForQuestionAnswering from swang2000 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_swang2000 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_swang2000` is an English model originally trained by swang2000.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swang2000_en_5.2.0_3.0_1701040812713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swang2000_en_5.2.0_3.0_1701040812713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_swang2000", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_swang2000", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_swang2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/swang2000/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swq_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swq_en.md new file mode 100644 index 000000000000..2251eeff643c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_swq DistilBertForQuestionAnswering from SWQ +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_swq +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_swq` is an English model originally trained by SWQ.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swq_en_5.2.0_3.0_1701026077027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swq_en_5.2.0_3.0_1701026077027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_swq", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_swq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_swq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SWQ/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swty_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swty_en.md new file mode 100644 index 000000000000..8dcea6e81348 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_swty_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_swty DistilBertForQuestionAnswering from Swty +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_swty +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_swty` is an English model originally trained by Swty.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swty_en_5.2.0_3.0_1701029760437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_swty_en_5.2.0_3.0_1701029760437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_swty", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_swty", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_swty| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Swty/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sybghat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sybghat_en.md new file mode 100644 index 000000000000..794faeea54cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_sybghat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sybghat DistilBertForQuestionAnswering from Sybghat +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sybghat +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sybghat` is an English model originally trained by Sybghat.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sybghat_en_5.2.0_3.0_1701040595504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sybghat_en_5.2.0_3.0_1701040595504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sybghat", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sybghat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sybghat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sybghat/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_taikunzhang_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_taikunzhang_en.md new file mode 100644 index 000000000000..4bf5242cbcdd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_taikunzhang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_taikunzhang DistilBertForQuestionAnswering from taikunzhang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_taikunzhang +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_taikunzhang` is an English model originally trained by taikunzhang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_taikunzhang_en_5.2.0_3.0_1701017718453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_taikunzhang_en_5.2.0_3.0_1701017718453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_taikunzhang", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_taikunzhang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_taikunzhang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/taikunzhang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tarikul_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tarikul_en.md new file mode 100644 index 000000000000..13a79dff88ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tarikul_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tarikul DistilBertForQuestionAnswering from tarikul +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tarikul +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tarikul` is an English model originally trained by tarikul.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tarikul_en_5.2.0_3.0_1701034439340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tarikul_en_5.2.0_3.0_1701034439340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tarikul", "en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tarikul", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tarikul| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tarikul/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_thamaine_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_thamaine_en.md new file mode 100644 index 000000000000..3666ce12b057 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_thamaine_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_thamaine DistilBertForQuestionAnswering from thamaine +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_thamaine +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_thamaine` is an English model originally trained by thamaine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_thamaine_en_5.2.0_3.0_1701016519920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_thamaine_en_5.2.0_3.0_1701016519920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_thamaine","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_thamaine", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_thamaine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/thamaine/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tiennvcs_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tiennvcs_en.md new file mode 100644 index 000000000000..7d3d0e252ee5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tiennvcs_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tiennvcs DistilBertForQuestionAnswering from tiennvcs +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tiennvcs +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tiennvcs` is an English model originally trained by tiennvcs. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tiennvcs_en_5.2.0_3.0_1701028071969.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tiennvcs_en_5.2.0_3.0_1701028071969.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tiennvcs","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tiennvcs", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tiennvcs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tiennvcs/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_timopixel_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_timopixel_en.md new file mode 100644 index 000000000000..0172691cfd39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_timopixel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_timopixel DistilBertForQuestionAnswering from timopixel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_timopixel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_timopixel` is an English model originally trained by timopixel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_timopixel_en_5.2.0_3.0_1701022026816.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_timopixel_en_5.2.0_3.0_1701022026816.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_timopixel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_timopixel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_timopixel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/timopixel/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tom192180_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tom192180_en.md new file mode 100644 index 000000000000..5b4b627220a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tom192180_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tom192180 DistilBertForQuestionAnswering from tom192180 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tom192180 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tom192180` is an English model originally trained by tom192180. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tom192180_en_5.2.0_3.0_1701032344367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tom192180_en_5.2.0_3.0_1701032344367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tom192180","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tom192180", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tom192180| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tom192180/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tomxbe_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tomxbe_en.md new file mode 100644 index 000000000000..67571a0c6dc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tomxbe_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tomxbe DistilBertForQuestionAnswering from tomXBE +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tomxbe +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tomxbe` is an English model originally trained by tomXBE. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tomxbe_en_5.2.0_3.0_1701023969104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tomxbe_en_5.2.0_3.0_1701023969104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tomxbe","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tomxbe", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tomxbe| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tomXBE/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_trucks_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_trucks_en.md new file mode 100644 index 000000000000..532c115f30c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_trucks_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_trucks DistilBertForQuestionAnswering from trucks +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_trucks +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_trucks` is an English model originally trained by trucks. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_trucks_en_5.2.0_3.0_1701034235888.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_trucks_en_5.2.0_3.0_1701034235888.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_trucks","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_trucks", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_trucks| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/trucks/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tusbaki_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tusbaki_en.md new file mode 100644 index 000000000000..b5b9e734fa51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_tusbaki_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tusbaki DistilBertForQuestionAnswering from tusbaki +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tusbaki +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tusbaki` is an English model originally trained by tusbaki. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tusbaki_en_5.2.0_3.0_1701021356101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tusbaki_en_5.2.0_3.0_1701021356101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tusbaki","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tusbaki", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tusbaki| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tusbaki/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_uditsharma16_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_uditsharma16_en.md new file mode 100644 index 000000000000..99fe0b949f04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_uditsharma16_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_uditsharma16 DistilBertForQuestionAnswering from uditsharma16 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_uditsharma16 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_uditsharma16` is an English model originally trained by uditsharma16. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_uditsharma16_en_5.2.0_3.0_1701039028274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_uditsharma16_en_5.2.0_3.0_1701039028274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_uditsharma16","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_uditsharma16", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_uditsharma16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/uditsharma16/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ulajessen_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ulajessen_en.md new file mode 100644 index 000000000000..9c0a83c6df8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ulajessen_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ulajessen DistilBertForQuestionAnswering from ulajessen +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ulajessen +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ulajessen` is an English model originally trained by ulajessen. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ulajessen_en_5.2.0_3.0_1701022593528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ulajessen_en_5.2.0_3.0_1701022593528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ulajessen","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ulajessen", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ulajessen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ulajessen/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_umarpreet_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_umarpreet_en.md new file mode 100644 index 000000000000..644d3a502e94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_umarpreet_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_umarpreet DistilBertForQuestionAnswering from Umarpreet +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_umarpreet +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_umarpreet` is an English model originally trained by Umarpreet. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_umarpreet_en_5.2.0_3.0_1701021069515.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_umarpreet_en_5.2.0_3.0_1701021069515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_umarpreet","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_umarpreet", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_umarpreet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Umarpreet/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_unbelievable111_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_unbelievable111_en.md new file mode 100644 index 000000000000..42d52001d90a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_unbelievable111_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_unbelievable111 DistilBertForQuestionAnswering from unbelievable111 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_unbelievable111 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_unbelievable111` is an English model originally trained by unbelievable111. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_unbelievable111_en_5.2.0_3.0_1701020856240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_unbelievable111_en_5.2.0_3.0_1701020856240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_unbelievable111","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_unbelievable111", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_unbelievable111| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/unbelievable111/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_usmanawais_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_usmanawais_en.md new file mode 100644 index 000000000000..c7e645f38602 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_usmanawais_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_usmanawais DistilBertForQuestionAnswering from usmanawais +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_usmanawais +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_usmanawais` is an English model originally trained by usmanawais.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_usmanawais_en_5.2.0_3.0_1701041773106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_usmanawais_en_5.2.0_3.0_1701041773106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_usmanawais","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_usmanawais", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_usmanawais| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/usmanawais/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_3_en.md new file mode 100644 index 000000000000..43096528ae0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_3 DistilBertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_3` is an English model originally trained by seviladiguzel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_3_en_5.2.0_3.0_1701020989889.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_3_en_5.2.0_3.0_1701020989889.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seviladiguzel/distilbert-base-uncased-finetuned-squad_v2_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_4_en.md new file mode 100644 index 000000000000..e0cf9290cc5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_4 DistilBertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_4` is an English model originally trained by seviladiguzel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_4_en_5.2.0_3.0_1701037258874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_4_en_5.2.0_3.0_1701037258874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seviladiguzel/distilbert-base-uncased-finetuned-squad_v2_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_5_en.md new file mode 100644 index 000000000000..2882e9d3b8bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_5 DistilBertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_5` is an English model originally trained by seviladiguzel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_5_en_5.2.0_3.0_1701022168049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_5_en_5.2.0_3.0_1701022168049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seviladiguzel/distilbert-base-uncased-finetuned-squad_v2_5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_delofigueiredo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_delofigueiredo_en.md new file mode 100644 index 000000000000..f1aa5ae33f17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_delofigueiredo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_delofigueiredo DistilBertForQuestionAnswering from delofigueiredo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_delofigueiredo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_delofigueiredo` is an English model originally trained by delofigueiredo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_delofigueiredo_en_5.2.0_3.0_1701020384116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_delofigueiredo_en_5.2.0_3.0_1701020384116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_delofigueiredo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_delofigueiredo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_delofigueiredo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/delofigueiredo/distilbert-base-uncased-finetuned-squad-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_pennywise881_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_pennywise881_en.md new file mode 100644 index 000000000000..12e12e6a923f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_pennywise881_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_pennywise881 DistilBertForQuestionAnswering from Pennywise881 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_pennywise881 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_pennywise881` is an English model originally trained by Pennywise881.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_pennywise881_en_5.2.0_3.0_1701036479384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_pennywise881_en_5.2.0_3.0_1701036479384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_pennywise881","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_pennywise881", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_pennywise881| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pennywise881/distilbert-base-uncased-finetuned-squad-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md new file mode 100644 index 000000000000..30ef2d7e5697 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_seviladiguzel_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_seviladiguzel DistilBertForQuestionAnswering from seviladiguzel +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_seviladiguzel +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_seviladiguzel` is an English model originally trained by seviladiguzel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_seviladiguzel_en_5.2.0_3.0_1701018124004.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_seviladiguzel_en_5.2.0_3.0_1701018124004.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_seviladiguzel","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_seviladiguzel", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_seviladiguzel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seviladiguzel/distilbert-base-uncased-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos_en.md new file mode 100644 index 000000000000..17a994fa733f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos DistilBertForQuestionAnswering from wiselinjayajos +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos` is an English model originally trained by wiselinjayajos.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos_en_5.2.0_3.0_1701019548283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos_en_5.2.0_3.0_1701019548283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v2_wiselinjayajos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/wiselinjayajos/distilbert-base-uncased-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vaibhav9_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vaibhav9_en.md new file mode 100644 index 000000000000..2cc6c4884dcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vaibhav9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_vaibhav9 DistilBertForQuestionAnswering from vaibhav9 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_vaibhav9 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_vaibhav9` is an English model originally trained by vaibhav9.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vaibhav9_en_5.2.0_3.0_1701033571539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vaibhav9_en_5.2.0_3.0_1701033571539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_vaibhav9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_vaibhav9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_vaibhav9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vaibhav9/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_veer09_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_veer09_en.md new file mode 100644 index 000000000000..b471f9430165 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_veer09_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_veer09 DistilBertForQuestionAnswering from Veer09 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_veer09 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_veer09` is an English model originally trained by Veer09.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_veer09_en_5.2.0_3.0_1701017365853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_veer09_en_5.2.0_3.0_1701017365853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_veer09","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_veer09", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_veer09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Veer09/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver1_en.md new file mode 100644 index 000000000000..4aed0c37ca51 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ver1 DistilBertForQuestionAnswering from Alred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ver1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ver1` is an English model originally trained by Alred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver1_en_5.2.0_3.0_1701023508386.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver1_en_5.2.0_3.0_1701023508386.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ver1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ver1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ver1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Alred/distilbert-base-uncased-finetuned-squad-ver1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver2_en.md new file mode 100644 index 000000000000..07214330fa23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ver2 DistilBertForQuestionAnswering from Alred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ver2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ver2` is an English model originally trained by Alred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver2_en_5.2.0_3.0_1701027096107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver2_en_5.2.0_3.0_1701027096107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ver2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ver2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ver2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Alred/distilbert-base-uncased-finetuned-squad-ver2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver3_en.md new file mode 100644 index 000000000000..9aed2ac14726 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ver3 DistilBertForQuestionAnswering from Alred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ver3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ver3` is an English model originally trained by Alred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver3_en_5.2.0_3.0_1701022143178.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver3_en_5.2.0_3.0_1701022143178.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ver3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ver3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ver3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Alred/distilbert-base-uncased-finetuned-squad-ver3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver4_en.md new file mode 100644 index 000000000000..7674bd5facf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ver4 DistilBertForQuestionAnswering from Alred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ver4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ver4` is an English model originally trained by Alred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver4_en_5.2.0_3.0_1701027312257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver4_en_5.2.0_3.0_1701027312257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ver4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ver4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ver4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Alred/distilbert-base-uncased-finetuned-squad-ver4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver5_en.md new file mode 100644 index 000000000000..472dcf9c179a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_ver5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ver5 DistilBertForQuestionAnswering from Alred +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ver5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ver5` is an English model originally trained by Alred.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver5_en_5.2.0_3.0_1701028944735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ver5_en_5.2.0_3.0_1701028944735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ver5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ver5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ver5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Alred/distilbert-base-uncased-finetuned-squad-ver5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2_en.md new file mode 100644 index 000000000000..4afacb61adff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2 DistilBertForQuestionAnswering from choz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2` is an English model originally trained by choz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2_en_5.2.0_3.0_1701038114363.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2_en_5.2.0_3.0_1701038114363.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_vervio_finetuned_team2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/choz/distilbert-base-uncased-finetuned-squad-vervio-finetuned-team2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_victorlee071200_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_victorlee071200_en.md new file mode 100644 index 000000000000..e750c5355d64 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_victorlee071200_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_victorlee071200 DistilBertForQuestionAnswering from victorlee071200 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_victorlee071200 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_victorlee071200` is an English model originally trained by victorlee071200.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_victorlee071200_en_5.2.0_3.0_1701022310390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_victorlee071200_en_5.2.0_3.0_1701022310390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_victorlee071200","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_victorlee071200", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_victorlee071200| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/victorlee071200/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_viennawagner_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_viennawagner_en.md new file mode 100644 index 000000000000..997d55a87979 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_viennawagner_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_viennawagner DistilBertForQuestionAnswering from ViennaWagner +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_viennawagner +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_viennawagner` is an English model originally trained by ViennaWagner.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_viennawagner_en_5.2.0_3.0_1701026637022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_viennawagner_en_5.2.0_3.0_1701026637022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_viennawagner","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_viennawagner", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_viennawagner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ViennaWagner/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vincenzodeleo_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vincenzodeleo_en.md new file mode 100644 index 000000000000..232add4353c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vincenzodeleo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_vincenzodeleo DistilBertForQuestionAnswering from vincenzodeleo +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_vincenzodeleo +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_vincenzodeleo` is an English model originally trained by vincenzodeleo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vincenzodeleo_en_5.2.0_3.0_1701026925109.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vincenzodeleo_en_5.2.0_3.0_1701026925109.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_vincenzodeleo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_vincenzodeleo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_vincenzodeleo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vincenzodeleo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vnktrmnb_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vnktrmnb_en.md new file mode 100644 index 000000000000..bec15d7f7d72 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vnktrmnb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_vnktrmnb DistilBertForQuestionAnswering from vnktrmnb +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_vnktrmnb +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_vnktrmnb` is an English model originally trained by vnktrmnb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vnktrmnb_en_5.2.0_3.0_1701027091376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vnktrmnb_en_5.2.0_3.0_1701027091376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_vnktrmnb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_vnktrmnb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_vnktrmnb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vnktrmnb/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_votr_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_votr_en.md new file mode 100644 index 000000000000..457a9d691942 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_votr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_votr DistilBertForQuestionAnswering from votr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_votr +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_votr` is an English model originally trained by votr.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_votr_en_5.2.0_3.0_1701037270275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_votr_en_5.2.0_3.0_1701037270275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_votr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_votr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_votr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/votr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vr513_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vr513_en.md new file mode 100644 index 000000000000..1ff80447453a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_vr513_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_vr513 DistilBertForQuestionAnswering from vr513 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_vr513 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_vr513` is an English model originally trained by vr513.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vr513_en_5.2.0_3.0_1701043036391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_vr513_en_5.2.0_3.0_1701043036391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_vr513","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_vr513", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_vr513| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vr513/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_walterchamy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_walterchamy_en.md new file mode 100644 index 000000000000..8bb78ffdb4cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_walterchamy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_walterchamy DistilBertForQuestionAnswering from Walterchamy +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_walterchamy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_walterchamy` is an English model originally trained by Walterchamy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_walterchamy_en_5.2.0_3.0_1701016013689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_walterchamy_en_5.2.0_3.0_1701016013689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_walterchamy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_walterchamy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_walterchamy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Walterchamy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wilsonmarasigan_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wilsonmarasigan_en.md new file mode 100644 index 000000000000..261ee258814f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wilsonmarasigan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_wilsonmarasigan DistilBertForQuestionAnswering from wilsonmarasigan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_wilsonmarasigan +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_wilsonmarasigan` is an English model originally trained by wilsonmarasigan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_wilsonmarasigan_en_5.2.0_3.0_1701040253172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_wilsonmarasigan_en_5.2.0_3.0_1701040253172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_wilsonmarasigan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_wilsonmarasigan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_wilsonmarasigan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/wilsonmarasigan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wizofavalon_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wizofavalon_en.md new file mode 100644 index 000000000000..ba9b9bb3158d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_wizofavalon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_wizofavalon DistilBertForQuestionAnswering from wizofavalon +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_wizofavalon +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_wizofavalon` is an English model originally trained by wizofavalon.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_wizofavalon_en_5.2.0_3.0_1701025503338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_wizofavalon_en_5.2.0_3.0_1701025503338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_wizofavalon","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_wizofavalon", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_wizofavalon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/wizofavalon/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yashwantk_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yashwantk_en.md new file mode 100644 index 000000000000..e702b5ade6c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yashwantk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_yashwantk DistilBertForQuestionAnswering from yashwantk +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_yashwantk +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_yashwantk` is an English model originally trained by yashwantk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yashwantk_en_5.2.0_3.0_1701022171053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yashwantk_en_5.2.0_3.0_1701022171053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_yashwantk","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_yashwantk", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_yashwantk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yashwantk/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yeihc_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yeihc_en.md new file mode 100644 index 000000000000..87d8425b85e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yeihc_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_yeihc DistilBertForQuestionAnswering from yeihc +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_yeihc +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_yeihc` is an English model originally trained by yeihc.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yeihc_en_5.2.0_3.0_1701025797231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yeihc_en_5.2.0_3.0_1701025797231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_yeihc","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_yeihc", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_yeihc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yeihc/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yogesh0502_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yogesh0502_en.md new file mode 100644 index 000000000000..8ecbba35b34e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yogesh0502_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_yogesh0502 DistilBertForQuestionAnswering from yogesh0502 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_yogesh0502 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_yogesh0502` is an English model originally trained by yogesh0502.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yogesh0502_en_5.2.0_3.0_1701020200941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yogesh0502_en_5.2.0_3.0_1701020200941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_yogesh0502","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_yogesh0502", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_yogesh0502| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yogesh0502/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yohein_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yohein_en.md new file mode 100644 index 000000000000..941ba4fb9613 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yohein_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_yohein DistilBertForQuestionAnswering from yohein +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_yohein +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_yohein` is an English model originally trained by yohein.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yohein_en_5.2.0_3.0_1701027872282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yohein_en_5.2.0_3.0_1701027872282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_yohein","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_yohein", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_yohein| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yohein/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yshubham8419_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yshubham8419_en.md new file mode 100644 index 000000000000..5f46f9643fe0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_yshubham8419_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_yshubham8419 DistilBertForQuestionAnswering from yshubham8419 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_yshubham8419 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_yshubham8419` is an English model originally trained by yshubham8419.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yshubham8419_en_5.2.0_3.0_1701029260533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_yshubham8419_en_5.2.0_3.0_1701029260533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_yshubham8419","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_yshubham8419", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_yshubham8419| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/yshubham8419/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zeroro80_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zeroro80_en.md new file mode 100644 index 000000000000..01a76d5b2f4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zeroro80_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zeroro80 DistilBertForQuestionAnswering from zeroro80 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zeroro80 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zeroro80` is an English model originally trained by zeroro80. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zeroro80_en_5.2.0_3.0_1701026510073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zeroro80_en_5.2.0_3.0_1701026510073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zeroro80","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zeroro80", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zeroro80| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zeroro80/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhangfx7_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhangfx7_en.md new file mode 100644 index 000000000000..1368fce35bea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhangfx7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zhangfx7 DistilBertForQuestionAnswering from zhangfx7 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zhangfx7 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zhangfx7` is an English model originally trained by zhangfx7. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhangfx7_en_5.2.0_3.0_1701023395190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhangfx7_en_5.2.0_3.0_1701023395190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zhangfx7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zhangfx7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zhangfx7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zhangfx7/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhenyueyu_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhenyueyu_en.md new file mode 100644 index 000000000000..70aedc7de467 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhenyueyu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zhenyueyu DistilBertForQuestionAnswering from zhenyueyu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zhenyueyu +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zhenyueyu` is an English model originally trained by zhenyueyu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhenyueyu_en_5.2.0_3.0_1701020326085.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhenyueyu_en_5.2.0_3.0_1701020326085.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zhenyueyu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zhenyueyu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zhenyueyu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zhenyueyu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhiqin818_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhiqin818_en.md new file mode 100644 index 000000000000..fc779f460e2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zhiqin818_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zhiqin818 DistilBertForQuestionAnswering from ZhiQin818 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zhiqin818 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zhiqin818` is an English model originally trained by ZhiQin818. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhiqin818_en_5.2.0_3.0_1701029402550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zhiqin818_en_5.2.0_3.0_1701029402550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zhiqin818","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zhiqin818", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zhiqin818| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ZhiQin818/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zirongh_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zirongh_en.md new file mode 100644 index 000000000000..eb668b9caec3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zirongh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zirongh DistilBertForQuestionAnswering from ZirongH +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zirongh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zirongh` is an English model originally trained by ZirongH. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zirongh_en_5.2.0_3.0_1701035047238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zirongh_en_5.2.0_3.0_1701035047238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zirongh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zirongh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zirongh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ZirongH/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zshang3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zshang3_en.md new file mode 100644 index 000000000000..2d54f247c26d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squad_zshang3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_zshang3 DistilBertForQuestionAnswering from zshang3 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_zshang3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_zshang3` is an English model originally trained by zshang3. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zshang3_en_5.2.0_3.0_1701021215622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_zshang3_en_5.2.0_3.0_1701021215622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_zshang3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_zshang3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_zshang3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zshang3/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squado_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squado_en.md new file mode 100644 index 000000000000..6cb00c4d2b32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squado_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squado DistilBertForQuestionAnswering from HASAN55 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squado +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squado` is an English model originally trained by HASAN55. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squado_en_5.2.0_3.0_1701037691237.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squado_en_5.2.0_3.0_1701037691237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squado","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squado", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squado| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HASAN55/distilbert-base-uncased-finetuned-squado \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadtr_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadtr_en.md new file mode 100644 index 000000000000..3cb2ef957590 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadtr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squadtr DistilBertForQuestionAnswering from HASAN55 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squadtr +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squadtr` is an English model originally trained by HASAN55. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squadtr_en_5.2.0_3.0_1701042158257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squadtr_en_5.2.0_3.0_1701042158257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squadtr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squadtr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squadtr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HASAN55/distilbert-base-uncased-finetuned-squadtr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadv1_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadv1_1_en.md new file mode 100644 index 000000000000..bed18502d202 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_squadv1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squadv1_1 DistilBertForQuestionAnswering from lauraparra28 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squadv1_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squadv1_1` is an English model originally trained by lauraparra28. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squadv1_1_en_5.2.0_3.0_1701040458378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squadv1_1_en_5.2.0_3.0_1701040458378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squadv1_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squadv1_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squadv1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lauraparra28/Distilbert-base-uncased-finetuned-SQuADv1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_triviaqa_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_triviaqa_finetuned_squad_en.md new file mode 100644 index 000000000000..3d85fc206519 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_finetuned_triviaqa_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_triviaqa_finetuned_squad DistilBertForQuestionAnswering from FabianWillner +author: John Snow Labs +name: distilbert_base_uncased_finetuned_triviaqa_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_triviaqa_finetuned_squad` is an English model originally trained by FabianWillner. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_triviaqa_finetuned_squad_en_5.2.0_3.0_1701019962460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_triviaqa_finetuned_squad_en_5.2.0_3.0_1701019962460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_triviaqa_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_triviaqa_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_triviaqa_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FabianWillner/distilbert-base-uncased-finetuned-triviaqa-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_hansollll_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_hansollll_en.md new file mode 100644 index 000000000000..a7cc05e39618 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_hansollll_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_hansollll DistilBertForQuestionAnswering from Hansollll +author: John Snow Labs +name: distilbert_base_uncased_hansollll +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_hansollll` is an English model originally trained by Hansollll. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_hansollll_en_5.2.0_3.0_1701027226955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_hansollll_en_5.2.0_3.0_1701027226955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_hansollll","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_hansollll", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_hansollll| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Hansollll/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_indonesia_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_indonesia_squadv2_en.md new file mode 100644 index 000000000000..0f885ddd973f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_indonesia_squadv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_indonesia_squadv2 DistilBertForQuestionAnswering from asaduas +author: John Snow Labs +name: distilbert_base_uncased_indonesia_squadv2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_indonesia_squadv2` is an English model originally trained by asaduas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_indonesia_squadv2_en_5.2.0_3.0_1701016066866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_indonesia_squadv2_en_5.2.0_3.0_1701016066866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_indonesia_squadv2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_indonesia_squadv2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_indonesia_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/asaduas/distilbert-base-uncased-indonesia-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_meded_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_meded_en.md new file mode 100644 index 000000000000..375352b1c454 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_meded_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_meded DistilBertForQuestionAnswering from mdineshk +author: John Snow Labs +name: distilbert_base_uncased_meded +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_meded` is an English model originally trained by mdineshk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_meded_en_5.2.0_3.0_1701031763373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_meded_en_5.2.0_3.0_1701031763373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_meded","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_meded", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_meded| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mdineshk/distilbert-base-uncased-meded \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_en.md new file mode 100644 index 000000000000..ca84bbd554a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_mod DistilBertForQuestionAnswering from damapika +author: John Snow Labs +name: distilbert_base_uncased_mod +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mod` is an English model originally trained by damapika. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mod_en_5.2.0_3.0_1701015881625.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mod_en_5.2.0_3.0_1701015881625.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_mod","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_mod", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_mod| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/damapika/distilbert-base-uncased_mod \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_squad_en.md new file mode 100644 index 000000000000..ccdfc38a8b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_mod_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_mod_squad DistilBertForQuestionAnswering from damapika +author: John Snow Labs +name: distilbert_base_uncased_mod_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_mod_squad` is an English model originally trained by damapika. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mod_squad_en_5.2.0_3.0_1701018854246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_mod_squad_en_5.2.0_3.0_1701018854246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_mod_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_mod_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_mod_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/damapika/distilbert-base-uncased_mod_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_1_en.md new file mode 100644 index 000000000000..49ed0d81b932 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_model_1 DistilBertForQuestionAnswering from evelynerhuan +author: John Snow Labs +name: distilbert_base_uncased_model_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_model_1` is an English model originally trained by evelynerhuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_1_en_5.2.0_3.0_1701021486761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_1_en_5.2.0_3.0_1701021486761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_model_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_model_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/evelynerhuan/distilbert-base-uncased-model-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_2_en.md new file mode 100644 index 000000000000..4c1943126798 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_model_2 DistilBertForQuestionAnswering from evelynerhuan +author: John Snow Labs +name: distilbert_base_uncased_model_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_model_2` is an English model originally trained by evelynerhuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_2_en_5.2.0_3.0_1701026492031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_2_en_5.2.0_3.0_1701026492031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_model_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_model_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_model_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/evelynerhuan/distilbert-base-uncased-model-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_3_en.md new file mode 100644 index 000000000000..069b421bfcd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_model_3 DistilBertForQuestionAnswering from evelynerhuan +author: John Snow Labs +name: distilbert_base_uncased_model_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_model_3` is an English model originally trained by evelynerhuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_3_en_5.2.0_3.0_1701032395871.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_3_en_5.2.0_3.0_1701032395871.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_model_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_model_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_model_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/evelynerhuan/distilbert-base-uncased-model-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_4_en.md new file mode 100644 index 000000000000..6af4d891dfe3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_model_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_model_4 DistilBertForQuestionAnswering from evelynerhuan +author: John Snow Labs +name: distilbert_base_uncased_model_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_model_4` is an English model originally trained by evelynerhuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_4_en_5.2.0_3.0_1701025449120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_model_4_en_5.2.0_3.0_1701025449120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_model_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_model_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_model_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/evelynerhuan/distilbert-base-uncased-model-4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_modelo_becas0_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_modelo_becas0_en.md new file mode 100644 index 000000000000..e9dd1c9eb959 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_modelo_becas0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_modelo_becas0 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_modelo_becas0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_modelo_becas0` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_modelo_becas0_en_5.2.0_3.0_1701030393275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_modelo_becas0_en_5.2.0_3.0_1701030393275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_modelo_becas0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_modelo_becas0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_modelo_becas0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-modelo-becas0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_naturalquestions_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_naturalquestions_en.md new file mode 100644 index 000000000000..40f241052574 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_naturalquestions_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_naturalquestions DistilBertForQuestionAnswering from AsmaAwad +author: John Snow Labs +name: distilbert_base_uncased_naturalquestions +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_naturalquestions` is an English model originally trained by AsmaAwad.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_naturalquestions_en_5.2.0_3.0_1701022441242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_naturalquestions_en_5.2.0_3.0_1701022441242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_naturalquestions","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_naturalquestions", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_naturalquestions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AsmaAwad/distilbert-base-uncased-NaturalQuestions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_en.md new file mode 100644 index 000000000000..b396145d72b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_nq_short DistilBertForQuestionAnswering from dl4nlp +author: John Snow Labs +name: distilbert_base_uncased_nq_short +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_nq_short` is an English model originally trained by dl4nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nq_short_en_5.2.0_3.0_1701024100998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nq_short_en_5.2.0_3.0_1701024100998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_nq_short","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_nq_short", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_nq_short| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/dl4nlp/distilbert-base-uncased-nq-short \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_for_square_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_for_square_en.md new file mode 100644 index 000000000000..9b5dff84cc02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_nq_short_for_square_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_nq_short_for_square DistilBertForQuestionAnswering from dl4nlp +author: John Snow Labs +name: distilbert_base_uncased_nq_short_for_square +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_nq_short_for_square` is an English model originally trained by dl4nlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nq_short_for_square_en_5.2.0_3.0_1701042607827.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_nq_short_for_square_en_5.2.0_3.0_1701042607827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_nq_short_for_square","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_nq_short_for_square", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_nq_short_for_square| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.5 MB| + +## References + +https://huggingface.co/dl4nlp/distilbert-base-uncased-nq-short-for-square \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_original_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_original_finetuned_squad_en.md new file mode 100644 index 000000000000..f22cfe6a9cf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_original_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_original_finetuned_squad DistilBertForQuestionAnswering from evelynerhuan +author: John Snow Labs +name: distilbert_base_uncased_original_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_original_finetuned_squad` is an English model originally trained by evelynerhuan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_original_finetuned_squad_en_5.2.0_3.0_1701030979574.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_original_finetuned_squad_en_5.2.0_3.0_1701030979574.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_original_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_original_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_original_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/evelynerhuan/distilbert-base-uncased-original-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_pennywise881_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_pennywise881_en.md new file mode 100644 index 000000000000..de60abddb642 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_pennywise881_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_pennywise881 DistilBertForQuestionAnswering from Pennywise881 +author: John Snow Labs +name: distilbert_base_uncased_pennywise881 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_pennywise881` is an English model originally trained by Pennywise881.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pennywise881_en_5.2.0_3.0_1701036790554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_pennywise881_en_5.2.0_3.0_1701036790554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_pennywise881","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_pennywise881", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_pennywise881| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pennywise881/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba2_en.md new file mode 100644 index 000000000000..5b2f69ce7fbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_prueba2 DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_prueba2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_prueba2` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_prueba2_en_5.2.0_3.0_1701020038522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_prueba2_en_5.2.0_3.0_1701020038522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_prueba2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_prueba2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_prueba2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-prueba2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba_en.md new file mode 100644 index 000000000000..161f4d60fd30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_prueba_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_prueba DistilBertForQuestionAnswering from Evelyn18 +author: John Snow Labs +name: distilbert_base_uncased_prueba +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_prueba` is an English model originally trained by Evelyn18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_prueba_en_5.2.0_3.0_1701026615077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_prueba_en_5.2.0_3.0_1701026615077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_prueba","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_prueba", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_prueba| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Evelyn18/distilbert-base-uncased-prueba \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_qa_en.md new file mode 100644 index 000000000000..c7e19c1bff98 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_qa DistilBertForQuestionAnswering from badokorach +author: John Snow Labs +name: distilbert_base_uncased_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_qa` is an English model originally trained by badokorach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qa_en_5.2.0_3.0_1701033995731.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_qa_en_5.2.0_3.0_1701033995731.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/badokorach/distilbert-base-uncased-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_en.md new file mode 100644 index 000000000000..e32ec2fbe0eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squad DistilBertForQuestionAnswering from tkarr +author: John Snow Labs +name: distilbert_base_uncased_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squad` is an English model originally trained by tkarr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_en_5.2.0_3.0_1701034214833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_en_5.2.0_3.0_1701034214833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tkarr/distilbert-base-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_v2_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_v2_finetuned_en.md new file mode 100644 index 000000000000..fd1b9299ccb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squad_v2_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squad_v2_finetuned DistilBertForQuestionAnswering from skyskuy +author: John Snow Labs +name: distilbert_base_uncased_squad_v2_finetuned +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squad_v2_finetuned` is an English model originally trained by skyskuy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_v2_finetuned_en_5.2.0_3.0_1701033417497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_v2_finetuned_en_5.2.0_3.0_1701033417497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squad_v2_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squad_v2_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squad_v2_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/skyskuy/distilbert-base-uncased-squad_v2-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadmodeldistiluncased_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadmodeldistiluncased_en.md new file mode 100644 index 000000000000..9e8fd0e86954 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadmodeldistiluncased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squadmodeldistiluncased DistilBertForQuestionAnswering from HASAN55 +author: John Snow Labs +name: distilbert_base_uncased_squadmodeldistiluncased +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squadmodeldistiluncased` is an English model originally trained by HASAN55.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadmodeldistiluncased_en_5.2.0_3.0_1701025310331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadmodeldistiluncased_en_5.2.0_3.0_1701025310331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squadmodeldistiluncased","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squadmodeldistiluncased", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squadmodeldistiluncased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HASAN55/distilbert-base-uncased-SQUADMODELDISTILUNCASED \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_en.md new file mode 100644 index 000000000000..20d7c39a8982 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squadv1_1_sparse_80_1x4_block DistilBertForQuestionAnswering from Intel +author: John Snow Labs +name: distilbert_base_uncased_squadv1_1_sparse_80_1x4_block +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squadv1_1_sparse_80_1x4_block` is an English model originally trained by Intel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_en_5.2.0_3.0_1701018245706.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_en_5.2.0_3.0_1701018245706.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squadv1_1_sparse_80_1x4_block","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squadv1_1_sparse_80_1x4_block", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squadv1_1_sparse_80_1x4_block| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|125.9 MB| + +## References + +https://huggingface.co/Intel/distilbert-base-uncased-squadv1.1-sparse-80-1X4-block \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa_en.md new file mode 100644 index 000000000000..9def5cd9a592 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa DistilBertForQuestionAnswering from Intel +author: John Snow Labs +name: distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa` is an English model originally trained by Intel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1701016179972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa_en_5.2.0_3.0_1701016179972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squadv1_1_sparse_80_1x4_block_pruneofa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|125.9 MB| + +## References + +https://huggingface.co/Intel/distilbert-base-uncased-squadv1.1-sparse-80-1x4-block-pruneofa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trained_on_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trained_on_squad_en.md new file mode 100644 index 000000000000..138872fa4c3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trained_on_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_trained_on_squad DistilBertForQuestionAnswering from Joschi3 +author: John Snow Labs +name: distilbert_base_uncased_trained_on_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_trained_on_squad` is an English model originally trained by Joschi3.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_trained_on_squad_en_5.2.0_3.0_1701021215695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_trained_on_squad_en_5.2.0_3.0_1701021215695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_trained_on_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_trained_on_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_trained_on_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Joschi3/distilbert-base-uncased_trained_on_SQuAD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trivia_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trivia_qa_en.md new file mode 100644 index 000000000000..b6f545e9fdd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_trivia_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_trivia_qa DistilBertForQuestionAnswering from sheldon297 +author: John Snow Labs +name: distilbert_base_uncased_trivia_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_trivia_qa` is an English model originally trained by sheldon297. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_trivia_qa_en_5.2.0_3.0_1701025929274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_trivia_qa_en_5.2.0_3.0_1701025929274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_trivia_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_trivia_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_trivia_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sheldon297/distilbert-base-uncased_trivia-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_vaibhav9_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_vaibhav9_en.md new file mode 100644 index 000000000000..190ed57d22f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_base_uncased_vaibhav9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_vaibhav9 DistilBertForQuestionAnswering from vaibhav9 +author: John Snow Labs +name: distilbert_base_uncased_vaibhav9 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_vaibhav9` is an English model originally trained by vaibhav9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_vaibhav9_en_5.2.0_3.0_1701028918545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_vaibhav9_en_5.2.0_3.0_1701028918545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_vaibhav9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_vaibhav9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_vaibhav9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|84.8 MB| + +## References + +https://huggingface.co/vaibhav9/distilbert-base-uncased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_accelerate_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_accelerate_en.md new file mode 100644 index 000000000000..9838b64c3878 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_accelerate_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_cased_finetuned_newsqa_accelerate DistilBertForQuestionAnswering from m3kkasi +author: John Snow Labs +name: distilbert_cased_finetuned_newsqa_accelerate +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cased_finetuned_newsqa_accelerate` is an English model originally trained by m3kkasi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cased_finetuned_newsqa_accelerate_en_5.2.0_3.0_1701036977915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cased_finetuned_newsqa_accelerate_en_5.2.0_3.0_1701036977915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_cased_finetuned_newsqa_accelerate","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_cased_finetuned_newsqa_accelerate", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cased_finetuned_newsqa_accelerate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|123.5 MB| + +## References + +https://huggingface.co/m3kkasi/distilbert-cased-finetuned-newsqa-accelerate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_en.md new file mode 100644 index 000000000000..e725a9719021 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_cased_finetuned_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_cased_finetuned_newsqa DistilBertForQuestionAnswering from m3kkasi +author: John Snow Labs +name: distilbert_cased_finetuned_newsqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_cased_finetuned_newsqa` is an English model originally trained by m3kkasi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_cased_finetuned_newsqa_en_5.2.0_3.0_1701024530524.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_cased_finetuned_newsqa_en_5.2.0_3.0_1701024530524.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_cased_finetuned_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_cased_finetuned_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_cased_finetuned_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/m3kkasi/distilbert-cased-finetuned-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_convolutional_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_convolutional_classifier_en.md new file mode 100644 index 000000000000..90f08f8c214f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_convolutional_classifier_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_convolutional_classifier DistilBertForQuestionAnswering from nlpunibo +author: John Snow Labs +name: distilbert_convolutional_classifier +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_convolutional_classifier` is an English model originally trained by nlpunibo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_convolutional_classifier_en_5.2.0_3.0_1701036801771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_convolutional_classifier_en_5.2.0_3.0_1701036801771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_convolutional_classifier","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_convolutional_classifier", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_convolutional_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nlpunibo/distilbert_convolutional_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetune_musique_test_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetune_musique_test_1_en.md new file mode 100644 index 000000000000..b1faffc8d74c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetune_musique_test_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetune_musique_test_1 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_finetune_musique_test_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetune_musique_test_1` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetune_musique_test_1_en_5.2.0_3.0_1701017994877.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetune_musique_test_1_en_5.2.0_3.0_1701017994877.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetune_musique_test_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetune_musique_test_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetune_musique_test_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|236.0 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_finetune_musique_test_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_covidqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_covidqa_en.md new file mode 100644 index 000000000000..fdf3377f861e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_covidqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_covidqa DistilBertForQuestionAnswering from dqduong2003 +author: John Snow Labs +name: distilbert_finetuned_covidqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_covidqa` is an English model originally trained by dqduong2003. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_covidqa_en_5.2.0_3.0_1701041248298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_covidqa_en_5.2.0_3.0_1701041248298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_covidqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_covidqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_covidqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/dqduong2003/distilbert-finetuned-covidqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs10_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs10_en.md new file mode 100644 index 000000000000..8ed439c4d506 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_05_epochs10 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_05_epochs10 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_05_epochs10` is an English model originally trained by gallyamovi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs10_en_5.2.0_3.0_1701027742362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs10_en_5.2.0_3.0_1701027742362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_05_epochs10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_05_epochs10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_05_epochs10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-05-epochs10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs15_en.md new file mode 100644 index 000000000000..e916abdadf33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_05_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_05_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_05_epochs15` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs15_en_5.2.0_3.0_1701027410784.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs15_en_5.2.0_3.0_1701027410784.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_05_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_05_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_05_epochs15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-05-epochs15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs25_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs25_en.md new file mode 100644 index 000000000000..3668ff4bc0ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs25_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_05_epochs25 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_05_epochs25 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_05_epochs25` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs25_en_5.2.0_3.0_1701030975899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs25_en_5.2.0_3.0_1701030975899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_05_epochs25","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_05_epochs25", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_05_epochs25| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-05-epochs25 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs50_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs50_en.md new file mode 100644 index 000000000000..e33b23557325 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_05_epochs50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_05_epochs50 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_05_epochs50 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_05_epochs50` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs50_en_5.2.0_3.0_1701018239456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_05_epochs50_en_5.2.0_3.0_1701018239456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_05_epochs50","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_05_epochs50", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_05_epochs50| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-05-epochs50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs10_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs10_en.md new file mode 100644 index 000000000000..49c479cfa809 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_06_epochs10 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_06_epochs10 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_06_epochs10` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs10_en_5.2.0_3.0_1701018406652.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs10_en_5.2.0_3.0_1701018406652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_06_epochs10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_06_epochs10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_06_epochs10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-06-epochs10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs15_en.md new file mode 100644 index 000000000000..e864ae8f2c4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_06_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_06_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_06_epochs15` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs15_en_5.2.0_3.0_1701024121916.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs15_en_5.2.0_3.0_1701024121916.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_06_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_06_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_06_epochs15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-06-epochs15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs25_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs25_en.md new file mode 100644 index 000000000000..30480b816020 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs25_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_06_epochs25 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_06_epochs25 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_06_epochs25` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs25_en_5.2.0_3.0_1701026972086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs25_en_5.2.0_3.0_1701026972086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_06_epochs25","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_06_epochs25", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_06_epochs25| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-06-epochs25 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs50_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs50_en.md new file mode 100644 index 000000000000..43cb0b873026 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_06_epochs50_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_06_epochs50 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_06_epochs50 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_06_epochs50` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs50_en_5.2.0_3.0_1701027658626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_06_epochs50_en_5.2.0_3.0_1701027658626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_06_epochs50","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_06_epochs50", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_06_epochs50| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-06-epochs50 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs10_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs10_en.md new file mode 100644 index 000000000000..d203295f7a9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_07_epochs10 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_07_epochs10 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_07_epochs10` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs10_en_5.2.0_3.0_1701021240738.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs10_en_5.2.0_3.0_1701021240738.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_07_epochs10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_07_epochs10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned_lr1e_07_epochs10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-07-epochs10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs15_en.md new file mode 100644 index 000000000000..6a84999574d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs15_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned_lr1e_07_epochs15 DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: distilbert_finetuned_lr1e_07_epochs15 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_07_epochs15` is an English model originally trained by gallyamovi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs15_en_5.2.0_3.0_1701022735285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs15_en_5.2.0_3.0_1701022735285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +import sparknlp +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_07_epochs15","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned_lr1e_07_epochs15", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_lr1e_07_epochs15|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-07-epochs15
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs25_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs25_en.md
new file mode 100644
index 000000000000..85f6777cd4ad
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs25_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_lr1e_07_epochs25 DistilBertForQuestionAnswering from gallyamovi
+author: John Snow Labs
+name: distilbert_finetuned_lr1e_07_epochs25
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_07_epochs25` is an English model originally trained by gallyamovi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs25_en_5.2.0_3.0_1701028385968.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs25_en_5.2.0_3.0_1701028385968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_07_epochs25","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_lr1e_07_epochs25", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_lr1e_07_epochs25|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-07-epochs25
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs50_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs50_en.md
new file mode 100644
index 000000000000..42b025bcdc8d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_lr1e_07_epochs50_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_lr1e_07_epochs50 DistilBertForQuestionAnswering from gallyamovi
+author: John Snow Labs
+name: distilbert_finetuned_lr1e_07_epochs50
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_lr1e_07_epochs50` is an English model originally trained by gallyamovi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs50_en_5.2.0_3.0_1701022458796.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_lr1e_07_epochs50_en_5.2.0_3.0_1701022458796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_lr1e_07_epochs50","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_lr1e_07_epochs50", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_lr1e_07_epochs50|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/gallyamovi/distilbert-finetuned-lr1e-07-epochs50
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_ru.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_ru.md
new file mode 100644
index 000000000000..680c0f97942d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_ru.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Russian distilbert_finetuned DistilBertForQuestionAnswering from GeorgeKhlestov
+author: John Snow Labs
+name: distilbert_finetuned
+date: 2023-11-26
+tags: [distilbert, ru, open_source, question_answering, onnx]
+task: Question Answering
+language: ru
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned` is a Russian model originally trained by GeorgeKhlestov.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ru_5.2.0_3.0_1701015441470.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_ru_5.2.0_3.0_1701015441470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned","ru") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned", "ru")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|ru|
+|Size:|505.4 MB|
+
+## References
+
+https://huggingface.co/GeorgeKhlestov/distilbert_finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_covidqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_covidqa_en.md
new file mode 100644
index 000000000000..93af2d296077
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_covidqa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_squad_covidqa DistilBertForQuestionAnswering from dqduong2003
+author: John Snow Labs
+name: distilbert_finetuned_squad_covidqa
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_squad_covidqa` is an English model originally trained by dqduong2003.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_covidqa_en_5.2.0_3.0_1701039194206.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_covidqa_en_5.2.0_3.0_1701039194206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_squad_covidqa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_squad_covidqa", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_squad_covidqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/dqduong2003/distilbert-finetuned-squad-covidqa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_darkbloodevil_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_darkbloodevil_en.md
new file mode 100644
index 000000000000..43ee87ba756f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_darkbloodevil_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_squad_darkbloodevil DistilBertForQuestionAnswering from darkbloodevil
+author: John Snow Labs
+name: distilbert_finetuned_squad_darkbloodevil
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_squad_darkbloodevil` is an English model originally trained by darkbloodevil.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_darkbloodevil_en_5.2.0_3.0_1701021812492.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_darkbloodevil_en_5.2.0_3.0_1701021812492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_squad_darkbloodevil","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_squad_darkbloodevil", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_squad_darkbloodevil|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/darkbloodevil/distilbert-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_test2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_test2_en.md
new file mode 100644
index 000000000000..49b48ca01986
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_squad_test2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_squad_test2 DistilBertForQuestionAnswering from qkrwnstj
+author: John Snow Labs
+name: distilbert_finetuned_squad_test2
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_squad_test2` is an English model originally trained by qkrwnstj.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_test2_en_5.2.0_3.0_1701021501215.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_squad_test2_en_5.2.0_3.0_1701021501215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_squad_test2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_squad_test2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_squad_test2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/qkrwnstj/distilbert-finetuned-squad-test2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_uncased_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_uncased_squad_v2_en.md
new file mode 100644
index 000000000000..4d79a1ef3019
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_finetuned_uncased_squad_v2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_finetuned_uncased_squad_v2 DistilBertForQuestionAnswering from aai520-group6
+author: John Snow Labs
+name: distilbert_finetuned_uncased_squad_v2
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned_uncased_squad_v2` is an English model originally trained by aai520-group6.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_uncased_squad_v2_en_5.2.0_3.0_1701032498454.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned_uncased_squad_v2_en_5.2.0_3.0_1701032498454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned_uncased_squad_v2","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_finetuned_uncased_squad_v2", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_finetuned_uncased_squad_v2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/aai520-group6/distilbert-finetuned-uncased-squad_v2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_indonesian_squad_id.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_indonesian_squad_id.md
new file mode 100644
index 000000000000..202ebf6738a7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_indonesian_squad_id.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: Indonesian distilbert_indonesian_squad DistilBertForQuestionAnswering from boimbukanbaim
+author: John Snow Labs
+name: distilbert_indonesian_squad
+date: 2023-11-26
+tags: [distilbert, id, open_source, question_answering, onnx]
+task: Question Answering
+language: id
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_indonesian_squad` is an Indonesian model originally trained by boimbukanbaim.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_indonesian_squad_id_5.2.0_3.0_1701016057582.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_indonesian_squad_id_5.2.0_3.0_1701016057582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_indonesian_squad","id") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_indonesian_squad", "id")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_indonesian_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|id|
+|Size:|253.0 MB|
+
+## References
+
+https://huggingface.co/boimbukanbaim/distilbert-indonesian-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_medical_question_answer_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_medical_question_answer_en.md
new file mode 100644
index 000000000000..dca5fa0e7ff6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_medical_question_answer_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_medical_question_answer DistilBertForQuestionAnswering from OnePoint16
+author: John Snow Labs
+name: distilbert_medical_question_answer
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_medical_question_answer` is an English model originally trained by OnePoint16.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_medical_question_answer_en_5.2.0_3.0_1701014058763.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_medical_question_answer_en_5.2.0_3.0_1701014058763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_medical_question_answer","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_medical_question_answer", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_medical_question_answer|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/OnePoint16/distilbert-medical-question_answer
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_model_en.md
new file mode 100644
index 000000000000..61bc59b2a91f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_model_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_model DistilBertForQuestionAnswering from dennischan
+author: John Snow Labs
+name: distilbert_model
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_model` is an English model originally trained by dennischan.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_model_en_5.2.0_3.0_1701017832400.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_model_en_5.2.0_3.0_1701017832400.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_model","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_model", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/dennischan/distilbert_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_nd_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_nd_squad_en.md
new file mode 100644
index 000000000000..e222ebf0aa08
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_nd_squad_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_nd_squad DistilBertForQuestionAnswering from Trisert
+author: John Snow Labs
+name: distilbert_nd_squad
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_nd_squad` is an English model originally trained by Trisert.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_nd_squad_en_5.2.0_3.0_1701034366259.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_nd_squad_en_5.2.0_3.0_1701034366259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_nd_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_nd_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_nd_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Trisert/distilbert-nd-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_newsqa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_newsqa_model_en.md new file mode 100644 index 000000000000..64677c10c2dd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_newsqa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_newsqa_model DistilBertForQuestionAnswering from sophiebottani +author: John Snow Labs +name: distilbert_newsqa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_newsqa_model` is an English model originally trained by sophiebottani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_newsqa_model_en_5.2.0_3.0_1701016209436.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_newsqa_model_en_5.2.0_3.0_1701016209436.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_newsqa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_newsqa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_newsqa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|198.3 MB| + +## References + +https://huggingface.co/sophiebottani/distilbert_NewsQA_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad_en.md new file mode 100644 index 000000000000..677cfbb85d4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad DistilBertForQuestionAnswering from Gholamreza +author: John Snow Labs +name: distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad` is an English model originally trained by Gholamreza. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad_en_5.2.0_3.0_1701028546406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad_en_5.2.0_3.0_1701028546406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_persian_farsi_zwnj_base_finetuned_2epoch_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|282.3 MB| + +## References + +https://huggingface.co/Gholamreza/distilbert-fa-zwnj-base-finetuned-2epoch-pquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_pquad_fa.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_pquad_fa.md new file mode 100644 index 000000000000..66c80066bf11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_finetuned_pquad_fa.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Persian distilbert_persian_farsi_zwnj_base_finetuned_pquad DistilBertForQuestionAnswering from Gholamreza +author: John Snow Labs +name: distilbert_persian_farsi_zwnj_base_finetuned_pquad +date: 2023-11-26 +tags: [distilbert, fa, open_source, question_answering, onnx] +task: Question Answering +language: fa +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_persian_farsi_zwnj_base_finetuned_pquad` is a Persian model originally trained by Gholamreza. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_finetuned_pquad_fa_5.2.0_3.0_1701014203111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_finetuned_pquad_fa_5.2.0_3.0_1701014203111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_persian_farsi_zwnj_base_finetuned_pquad","fa") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_persian_farsi_zwnj_base_finetuned_pquad", "fa") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_persian_farsi_zwnj_base_finetuned_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|fa| +|Size:|282.3 MB| + +## References + +https://huggingface.co/Gholamreza/distilbert-fa-zwnj-base-finetuned-pquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad_en.md new file mode 100644 index 000000000000..dfad3ce653d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad DistilBertForQuestionAnswering from Gholamreza +author: John Snow Labs +name: distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad` is an English model originally trained by Gholamreza. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad_en_5.2.0_3.0_1701015525231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad_en_5.2.0_3.0_1701015525231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data`: a DataFrame with string columns "question" and "context" (not defined in this snippet) + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_persian_farsi_zwnj_base_mlm_pquad_finetuned_pquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|281.8 MB| + +## References + +https://huggingface.co/Gholamreza/distilbert-fa-zwnj-base-MLM-pquad-finetuned-pquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_21iridescent_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_21iridescent_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..52e332ab79c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_21iridescent_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from 21iridescent) +author: John Snow Labs +name: distilbert_qa_21iridescent_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `21iridescent`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_21iridescent_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010360347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_21iridescent_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010360347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_21iridescent_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_21iridescent_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_21iridescent").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_21iridescent_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/21iridescent/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_aaraki_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_aaraki_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..d9cd0e86bb6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_aaraki_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from aaraki) +author: John Snow Labs +name: distilbert_qa_aaraki_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `aaraki`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_aaraki_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011155118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_aaraki_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011155118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_aaraki_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_aaraki_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_aaraki").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_aaraki_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aaraki/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_adrian_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_adrian_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..5dc0d71afe80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_adrian_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Adrian) +author: John Snow Labs +name: distilbert_qa_adrian_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `Adrian`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_adrian_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010373023.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_adrian_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010373023.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_adrian_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_adrian_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_adrian_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Adrian/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_akr_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_akr_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..e6096bb2510c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_akr_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from akr) +author: John Snow Labs +name: distilbert_qa_akr_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `akr`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_akr_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011561084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_akr_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011561084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_akr_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_akr_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_akr").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_akr_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/akr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_andi611_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_andi611_base_uncased_squad_en.md new file mode 100644 index 000000000000..25bc131a252b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_andi611_base_uncased_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad +author: John Snow Labs +name: distilbert_qa_andi611_base_uncased_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_andi611_base_uncased_squad_en_5.2.0_3.0_1701010993857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_andi611_base_uncased_squad_en_5.2.0_3.0_1701010993857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_andi611_base_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_andi611_base_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_andi611_base_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad2_en.md new file mode 100644 index 000000000000..cce504a80b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from anurag0077) +author: John Snow Labs +name: distilbert_qa_anurag0077_base_uncased_finetuned_squad2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad2` is an English model originally trained by `anurag0077`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_anurag0077_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701011556255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_anurag0077_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701011556255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_anurag0077_base_uncased_finetuned_squad2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_anurag0077_base_uncased_finetuned_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base_uncased.by_anurag0077").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_anurag0077_base_uncased_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anurag0077/distilbert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..b2f7a6b8156b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_anurag0077_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from anurag0077) +author: John Snow Labs +name: distilbert_qa_anurag0077_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `anurag0077`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_anurag0077_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011318521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_anurag0077_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011318521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_anurag0077_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_anurag0077_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_v3.by_anurag0077").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_anurag0077_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anurag0077/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_arvalinno_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_arvalinno_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..ce97b422a928 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_arvalinno_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from arvalinno) +author: John Snow Labs +name: distilbert_qa_arvalinno_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `arvalinno`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_arvalinno_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011776100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_arvalinno_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011776100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_arvalinno_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_arvalinno_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_arvalinno").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_arvalinno_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/arvalinno/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_avioo1_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_avioo1_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..a50651432ba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_avioo1_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from avioo1) +author: John Snow Labs +name: distilbert_qa_avioo1_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `avioo1`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_avioo1_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011139859.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_avioo1_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011139859.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_avioo1_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_avioo1_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_avioo1").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_avioo1_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/avioo1/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_chaii_en.md new file mode 100644 index 000000000000..991c1a295a4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_chaii_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from SauravMaheshkar) +author: John Snow Labs +name: distilbert_qa_base_cased_distilled_chaii +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-distilled-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_chaii_en_5.2.0_3.0_1701011185049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_chaii_en_5.2.0_3.0_1701011185049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.distil_bert.base_cased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_cased_distilled_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/distilbert-base-cased-distilled-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_small_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_small_en.md new file mode 100644 index 000000000000..da51dff91f60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_small_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Small model (from ncduy) +author: John Snow Labs +name: distilbert_qa_base_cased_distilled_squad_finetuned_squad_small +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-distilled-squad-finetuned-squad-small` is an English model originally trained by `ncduy`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_squad_finetuned_squad_small_en_5.2.0_3.0_1701011322815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_squad_finetuned_squad_small_en_5.2.0_3.0_1701011322815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_squad_finetuned_squad_small","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_squad_finetuned_squad_small","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_small_cased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_cased_distilled_squad_finetuned_squad_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ncduy/distilbert-base-cased-distilled-squad-finetuned-squad-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_test_en.md new file mode 100644 index 000000000000..97fc9cf90e8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_cased_distilled_squad_finetuned_squad_test_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from ncduy) +author: John Snow Labs +name: distilbert_qa_base_cased_distilled_squad_finetuned_squad_test +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-cased-distilled-squad-finetuned-squad-test` is an English model originally trained by `ncduy`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_squad_finetuned_squad_test_en_5.2.0_3.0_1701011553029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_cased_distilled_squad_finetuned_squad_test_en_5.2.0_3.0_1701011553029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_squad_finetuned_squad_test","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_cased_distilled_squad_finetuned_squad_test","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_cased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_cased_distilled_squad_finetuned_squad_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ncduy/distilbert-base-cased-distilled-squad-finetuned-squad-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config1_en.md new file mode 100644 index 000000000000..cc7a666c3068 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Cased model (from nlpunibo) +author: John Snow Labs +name: distilbert_qa_base_config1 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_config1` is an English model originally trained by `nlpunibo`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config1_en_5.2.0_3.0_1701011433269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config1_en_5.2.0_3.0_1701011433269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config1","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_config1.by_nlpunibo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_config1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nlpunibo/distilbert_base_config1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config2_en.md new file mode 100644 index 000000000000..578256796c99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Cased model (from nlpunibo) +author: John Snow Labs +name: distilbert_qa_base_config2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_config2` is an English model originally trained by `nlpunibo`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config2_en_5.2.0_3.0_1701011776227.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config2_en_5.2.0_3.0_1701011776227.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_config2.by_nlpunibo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_config2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nlpunibo/distilbert_base_config2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config3_en.md new file mode 100644 index 000000000000..f1cd80e9a1da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_config3_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Cased model (from nlpunibo) +author: John Snow Labs +name: distilbert_qa_base_config3 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_config3` is an English model originally trained by `nlpunibo`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config3_en_5.2.0_3.0_1701012026976.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_config3_en_5.2.0_3.0_1701012026976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_config3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_config3.by_nlpunibo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_config3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/nlpunibo/distilbert_base_config3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_squad2_custom_dataset_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_squad2_custom_dataset_en.md new file mode 100644 index 000000000000..00537ca6d473 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_squad2_custom_dataset_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Cased model (from superspray) +author: John Snow Labs +name: distilbert_qa_base_squad2_custom_dataset +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_squad2_custom_dataset` is an English model originally trained by `superspray`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_squad2_custom_dataset_en_5.2.0_3.0_1701011599794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_squad2_custom_dataset_en_5.2.0_3.0_1701011599794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_squad2_custom_dataset","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_squad2_custom_dataset","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_squad2_custom_dataset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/superspray/distilbert_base_squad2_custom_dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_3feb_2022_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_3feb_2022_finetuned_squad_en.md new file mode 100644 index 000000000000..e10a393630be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_3feb_2022_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from sunitha) +author: John Snow Labs +name: distilbert_qa_base_uncased_3feb_2022_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-3feb-2022-finetuned-squad` is an English model originally trained by `sunitha`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_3feb_2022_finetuned_squad_en_5.2.0_3.0_1701012022260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_3feb_2022_finetuned_squad_en_5.2.0_3.0_1701012022260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_3feb_2022_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_3feb_2022_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_sunitha").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_3feb_2022_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sunitha/distilbert-base-uncased-3feb-2022-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_chaii_en.md new file mode 100644 index 000000000000..8a999458f2d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_chaii_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Uncased model (from SauravMaheshkar) +author: John Snow Labs +name: distilbert_qa_base_uncased_distilled_chaii +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-distilled-chaii` is an English model originally trained by `SauravMaheshkar`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_chaii_en_5.2.0_3.0_1701012218734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_chaii_en_5.2.0_3.0_1701012218734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_chaii","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_chaii","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.chaii.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_distilled_chaii| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SauravMaheshkar/distilbert-base-uncased-distilled-chaii \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_en.md new file mode 100644 index 000000000000..96c16df6544a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Uncased model +author: John Snow Labs +name: distilbert_qa_base_uncased_distilled_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-distilled-squad` is an English model originally trained by HuggingFace. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_squad_en_5.2.0_3.0_1701011782917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_squad_en_5.2.0_3.0_1701011782917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_uploaded by huggingface").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_distilled_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/distilbert-base-uncased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_finetuned_squad_en.md new file mode 100644 index 000000000000..6b7c61519e73 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_distilled_squad_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from deepakvk) +author: John Snow Labs +name: distilbert_qa_base_uncased_distilled_squad_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-distilled-squad-finetuned-squad` is an English model originally trained by `deepakvk`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_squad_finetuned_squad_en_5.2.0_3.0_1701011780408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_distilled_squad_finetuned_squad_en_5.2.0_3.0_1701011780408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_squad_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_distilled_squad_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_deepakvk").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_distilled_squad_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/deepakvk/distilbert-base-uncased-distilled-squad-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_advers_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_advers_en.md new file mode 100644 index 000000000000..aadebf93208e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_advers_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from T-qualizer) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_advers +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-advers` is an English model originally trained by `T-qualizer`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_advers_en_5.2.0_3.0_1701012418747.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_advers_en_5.2.0_3.0_1701012418747.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_advers","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_advers","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_advers| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/T-qualizer/distilbert-base-uncased-finetuned-advers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_duorc_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_duorc_en.md new file mode 100644 index 000000000000..b91aa0832568 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_duorc_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from machine2049) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_duorc +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-duorc_distilbert` is an English model originally trained by `machine2049`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_duorc_en_5.2.0_3.0_1701012032139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_duorc_en_5.2.0_3.0_1701012032139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_duorc","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_duorc","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_duorc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/machine2049/distilbert-base-uncased-finetuned-duorc_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_indosquad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_indosquad_v2_en.md new file mode 100644 index 000000000000..34002a19c4e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_indosquad_v2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from arvalinno) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_indosquad_v2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-indosquad-v2` is an English model originally trained by `arvalinno`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_indosquad_v2_en_5.2.0_3.0_1701011217942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_indosquad_v2_en_5.2.0_3.0_1701011217942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_indosquad_v2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_indosquad_v2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_v2").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_indosquad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/arvalinno/distilbert-base-uncased-finetuned-indosquad-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_infovqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_infovqa_en.md new file mode 100644 index 000000000000..34a20f6dd8e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_infovqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from tiennvcs) Infovqa +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_infovqa +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-infovqa` is an English model originally trained by `tiennvcs`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_infovqa_en_5.2.0_3.0_1701012029724.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_infovqa_en_5.2.0_3.0_1701012029724.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_infovqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_infovqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_uncased.by_tiennvcs").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_infovqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tiennvcs/distilbert-base-uncased-finetuned-infovqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_jumbling_squad_15_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_jumbling_squad_15_en.md new file mode 100644 index 000000000000..91fc1736643a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_jumbling_squad_15_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from huxxx657) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_jumbling_squad_15 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-jumbling-squad-15` is an English model originally trained by `huxxx657`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_jumbling_squad_15_en_5.2.0_3.0_1701012223728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_jumbling_squad_15_en_5.2.0_3.0_1701012223728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_jumbling_squad_15","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_jumbling_squad_15","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_v2.by_huxxx657").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_jumbling_squad_15| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/huxxx657/distilbert-base-uncased-finetuned-jumbling-squad-15 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_natural_questions_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_natural_questions_en.md new file mode 100644 index 000000000000..c18f6ddeae40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_natural_questions_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from datarpit) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_natural_questions +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-natural-questions` is an English model originally trained by `datarpit`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_natural_questions_en_5.2.0_3.0_1701012203834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_natural_questions_en_5.2.0_3.0_1701012203834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_natural_questions","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_natural_questions","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_uncased.by_datarpit").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_natural_questions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/datarpit/distilbert-base-uncased-finetuned-natural-questions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad3_en.md new file mode 100644 index 000000000000..984172d9393b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad3_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from anurag0077) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_squad3 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad3` is an English model originally trained by `anurag0077`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad3_en_5.2.0_3.0_1701011402173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad3_en_5.2.0_3.0_1701011402173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_anurag0077").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_squad3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/anurag0077/distilbert-base-uncased-finetuned-squad3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_colab_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_colab_en.md new file mode 100644 index 000000000000..7f6af7a03b41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_colab_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from Adrian) Squad2 +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_squad_colab +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad-colab` is an English model originally trained by `Adrian`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_colab_en_5.2.0_3.0_1701012391972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_colab_en_5.2.0_3.0_1701012391972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_colab","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_colab","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_colab.by_Adrian").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_squad_colab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Adrian/distilbert-base-uncased-finetuned-squad-colab \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..e7d0963c6ddb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_base_uncased_finetuned_squad DistilBertForQuestionAnswering from machine2049 +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_base_uncased_finetuned_squad` is an English model originally trained by machine2049.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011576854.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011576854.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") # sample question/context pair + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // sample question/context pair + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/machine2049/distilbert-base-uncased-finetuned-squad_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_frozen_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_frozen_v2_en.md new file mode 100644 index 000000000000..78f2da8841a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_frozen_v2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from ericRosello) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_squad_frozen_v2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad-frozen-v2` is an English model originally trained by `ericRosello`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_frozen_v2_en_5.2.0_3.0_1701012420797.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_frozen_v2_en_5.2.0_3.0_1701012420797.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_frozen_v2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_frozen_v2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_v2.by_ericRosello").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_squad_frozen_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/ericRosello/distilbert-base-uncased-finetuned-squad-frozen-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_v1_en.md new file mode 100644 index 000000000000..32f0855e77eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_squad_v1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from lewtun) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_squad_v1 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad-v1` is an English model originally trained by `lewtun`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_v1_en_5.2.0_3.0_1701012571504.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_squad_v1_en_5.2.0_3.0_1701012571504.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_v1","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_squad_v1","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_lewtun").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/lewtun/distilbert-base-uncased-finetuned-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_triviaqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_triviaqa_en.md new file mode 100644 index 000000000000..94bb9e03c9f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_finetuned_triviaqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from FabianWillner) +author: John Snow Labs +name: distilbert_qa_base_uncased_finetuned_triviaqa +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-triviaqa` is an English model originally trained by `FabianWillner`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_triviaqa_en_5.2.0_3.0_1701012767125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_finetuned_triviaqa_en_5.2.0_3.0_1701012767125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_triviaqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_finetuned_triviaqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.trivia.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_finetuned_triviaqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FabianWillner/distilbert-base-uncased-finetuned-triviaqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_gradient_clinic_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_gradient_clinic_en.md new file mode 100644 index 000000000000..5b1699d9be4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_gradient_clinic_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from charlieoneill) +author: John Snow Labs +name: distilbert_qa_base_uncased_gradient_clinic +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-gradient-clinic` is an English model originally trained by `charlieoneill`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_gradient_clinic_en_5.2.0_3.0_1701012221147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_gradient_clinic_en_5.2.0_3.0_1701012221147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_gradient_clinic","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_gradient_clinic","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.base_uncased.by_charlieoneill").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_gradient_clinic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/charlieoneill/distilbert-base-uncased-gradient-clinic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_full_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_full_squad_en.md new file mode 100644 index 000000000000..4634dab428eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_full_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from holtin) +author: John Snow Labs +name: distilbert_qa_base_uncased_holtin_finetuned_full_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-holtin-finetuned-full-squad` is an English model originally trained by `holtin`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_holtin_finetuned_full_squad_en_5.2.0_3.0_1701011771017.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_holtin_finetuned_full_squad_en_5.2.0_3.0_1701011771017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_holtin_finetuned_full_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_holtin_finetuned_full_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_full.by_holtin").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_holtin_finetuned_full_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/holtin/distilbert-base-uncased-holtin-finetuned-full-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_squad_en.md new file mode 100644 index 000000000000..a3340ec82b5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_holtin_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from holtin) +author: John Snow Labs +name: distilbert_qa_base_uncased_holtin_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-holtin-finetuned-squad` is an English model originally trained by `holtin`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_holtin_finetuned_squad_en_5.2.0_3.0_1701011996097.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_holtin_finetuned_squad_en_5.2.0_3.0_1701011996097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_holtin_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_holtin_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_holtin").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_holtin_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/holtin/distilbert-base-uncased-holtin-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_qa_with_ner_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_qa_with_ner_en.md new file mode 100644 index 000000000000..82a5c33f16d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_qa_with_ner_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) +author: John Snow Labs +name: distilbert_qa_base_uncased_qa_with_ner +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-qa-with-ner` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_qa_with_ner_en_5.2.0_3.0_1701012988072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_qa_with_ner_en_5.2.0_3.0_1701012988072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_qa_with_ner","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_qa_with_ner","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.conll.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_qa_with_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-qa-with-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_covid_qa_deepset_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_covid_qa_deepset_en.md new file mode 100644 index 000000000000..664dfda28346 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_covid_qa_deepset_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from armageddon) +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_covid_qa_deepset +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-covid-qa-deepset` is an English model originally trained by `armageddon`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1701012414763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_covid_qa_deepset_en_5.2.0_3.0_1701012414763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_covid_qa_deepset","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_covid_qa_deepset","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_covid.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_covid_qa_deepset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/armageddon/distilbert-base-uncased-squad2-covid-qa-deepset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_en.md new file mode 100644 index 000000000000..84d62c9489c3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from twmkn9) +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2` is an English model originally trained by `twmkn9`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_en_5.2.0_3.0_1701012607215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_en_5.2.0_3.0_1701012607215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base_uncased.by_twmkn9").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/twmkn9/distilbert-base-uncased-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_en.md new file mode 100644 index 000000000000..1897e990e2ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_en_5.2.0_3.0_1701013182615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_en_5.2.0_3.0_1701013182615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_conll.distil_bert.base_uncased.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..2719c3869b6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 with Restaurant, Neg, Repeat +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner-mit-restaurant-with-neg-with-repeat` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en_5.2.0_3.0_1701012609666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat_en_5.2.0_3.0_1701012609666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base_uncased.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner_mit_restaurant_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner-mit-restaurant-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_en.md new file mode 100644 index 000000000000..fa8ca9d572d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 with Neg +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner_with_neg +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner-with-neg` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_en_5.2.0_3.0_1701012588981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_en_5.2.0_3.0_1701012588981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_conll.distil_bert.base_uncased_with_neg.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner_with_neg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner-with-neg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_en.md new file mode 100644 index 000000000000..c30a1b2dbaf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 with Neg, Multi +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner-with-neg-with-multi` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_en_5.2.0_3.0_1701012793453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_en_5.2.0_3.0_1701012793453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_conll.distil_bert.base_uncased_with_neg_with_multi.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner-with-neg-with-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat_en.md new file mode 100644 index 000000000000..48af8638614a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 with Neg, Multi, Repeat +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner-with-neg-with-multi-with-repeat` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat_en_5.2.0_3.0_1701012184049.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat_en_5.2.0_3.0_1701012184049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_conll.distil_bert.base_uncased_with_neg_with_multi_with_repeat.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_multi_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner-with-neg-with-multi-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat_en.md new file mode 100644 index 000000000000..6e6d3593530c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from andi611) Squad2 with Neg, Repeat +author: John Snow Labs +name: distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad2-with-ner-with-neg-with-repeat` is an English model originally trained by `andi611`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat_en_5.2.0_3.0_1701013387027.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat_en_5.2.0_3.0_1701013387027.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2_conll.distil_bert.base_uncased_with_neg_with_repeat.by_andi611").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_base_uncased_squad2_with_ner_with_neg_with_repeat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/andi611/distilbert-base-uncased-squad2-with-ner-with-neg-with-repeat \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_bdickson_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_bdickson_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..bd19d2b272fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_bdickson_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from bdickson) +author: John Snow Labs +name: distilbert_qa_bdickson_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `bdickson`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_bdickson_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012784364.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_bdickson_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012784364.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_bdickson_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_bdickson_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_bdickson").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_bdickson_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/bdickson/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_caiosantillo_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_caiosantillo_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..1a9464c7dde8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_caiosantillo_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from caiosantillo) +author: John Snow Labs +name: distilbert_qa_caiosantillo_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `caiosantillo`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_caiosantillo_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012980623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_caiosantillo_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012980623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_caiosantillo_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_caiosantillo_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_caiosantillo").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_caiosantillo_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/caiosantillo/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_en.md new file mode 100644 index 000000000000..8dd06930ce97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: distilbert_qa_checkpoint +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_en_5.2.0_3.0_1701021510073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_en_5.2.0_3.0_1701021510073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/distilbert-qa-checkpoint \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v2_en.md new file mode 100644 index 000000000000..c8a3aa3667c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint_v2 DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: distilbert_qa_checkpoint_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint_v2` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v2_en_5.2.0_3.0_1701017956770.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v2_en_5.2.0_3.0_1701017956770.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/distilbert-qa-checkpoint-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v3_en.md new file mode 100644 index 000000000000..adbb52398094 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint_v3 DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: distilbert_qa_checkpoint_v3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint_v3` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v3_en_5.2.0_3.0_1701025277789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v3_en_5.2.0_3.0_1701025277789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint_v3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint_v3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint_v3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/distilbert-qa-checkpoint-v3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v4_en.md new file mode 100644 index 000000000000..ee2228d30712 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint_v4 DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: distilbert_qa_checkpoint_v4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint_v4` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v4_en_5.2.0_3.0_1701021139441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v4_en_5.2.0_3.0_1701021139441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint_v4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint_v4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint_v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/distilbert-qa-checkpoint-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_eitanli_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_eitanli_en.md new file mode 100644 index 000000000000..6af5d52f442d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_eitanli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint_v5_eitanli DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: distilbert_qa_checkpoint_v5_eitanli +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint_v5_eitanli` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v5_eitanli_en_5.2.0_3.0_1701015742273.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v5_eitanli_en_5.2.0_3.0_1701015742273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint_v5_eitanli","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint_v5_eitanli", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint_v5_eitanli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/distilbert-qa-checkpoint-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_thefoodprocessor_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_thefoodprocessor_en.md new file mode 100644 index 000000000000..011d4e84b5f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_checkpoint_v5_thefoodprocessor_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_checkpoint_v5_thefoodprocessor DistilBertForQuestionAnswering from Thefoodprocessor +author: John Snow Labs +name: distilbert_qa_checkpoint_v5_thefoodprocessor +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_checkpoint_v5_thefoodprocessor` is an English model originally trained by Thefoodprocessor. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v5_thefoodprocessor_en_5.2.0_3.0_1701019902780.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_checkpoint_v5_thefoodprocessor_en_5.2.0_3.0_1701019902780.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_checkpoint_v5_thefoodprocessor","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_checkpoint_v5_thefoodprocessor", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_checkpoint_v5_thefoodprocessor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Thefoodprocessor/distilbert-qa-checkpoint-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom3_en.md new file mode 100644 index 000000000000..dd0e869bda0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom3_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from aszidon) +author: John Snow Labs +name: distilbert_qa_custom3 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertcustom3` is an English model originally trained by `aszidon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom3_en_5.2.0_3.0_1701012437555.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom3_en_5.2.0_3.0_1701012437555.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom3","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom3","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.custom3.by_aszidon").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_custom3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aszidon/distilbertcustom3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom4_en.md new file mode 100644 index 000000000000..8614fa6f65c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom4_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from aszidon) +author: John Snow Labs +name: distilbert_qa_custom4 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertcustom4` is an English model originally trained by `aszidon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom4_en_5.2.0_3.0_1701012985599.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom4_en_5.2.0_3.0_1701012985599.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom4","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom4","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.custom4.by_aszidon").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_custom4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aszidon/distilbertcustom4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom5_en.md new file mode 100644 index 000000000000..4f8dbdd745be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom5_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from aszidon) +author: John Snow Labs +name: distilbert_qa_custom5 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertcustom5` is an English model originally trained by `aszidon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom5_en_5.2.0_3.0_1701013157247.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom5_en_5.2.0_3.0_1701013157247.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom5","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom5","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.custom5.by_aszidon").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_custom5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aszidon/distilbertcustom5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom_en.md new file mode 100644 index 000000000000..153cced94e46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_custom_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from aszidon) +author: John Snow Labs +name: distilbert_qa_custom +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertcustom` is an English model originally trained by `aszidon`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom_en_5.2.0_3.0_1701013603195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_custom_en_5.2.0_3.0_1701013603195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_custom","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.custom.by_aszidon").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/aszidon/distilbertcustom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_dbert_3epoch_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_dbert_3epoch_en.md new file mode 100644 index 000000000000..855ba937fb9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_dbert_3epoch_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from Ifenna) +author: John Snow Labs +name: distilbert_qa_dbert_3epoch +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbert-3epoch` is an English model originally trained by `Ifenna`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_dbert_3epoch_en_5.2.0_3.0_1701013183278.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_dbert_3epoch_en_5.2.0_3.0_1701013183278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_dbert_3epoch","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_dbert_3epoch","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.by_Ifenna").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_dbert_3epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Ifenna/dbert-3epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa_es.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa_es.md new file mode 100644 index 000000000000..5617a2ef76a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish DistilBertForQuestionAnswering model (from CenIA) MLQA +author: John Snow Labs +name: distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa +date: 2023-11-26 +tags: [es, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert-base-spanish-uncased-finetuned-qa-mlqa` is a Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa_es_5.2.0_3.0_1701013391697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa_es_5.2.0_3.0_1701013391697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.mlqa.distil_bert.base_uncased").predict("""¿Cuál es mi nombre?|||Mi nombre es Clara y vivo en Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_mlqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|250.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/distillbert-base-spanish-uncased-finetuned-qa-mlqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac_es.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac_es.md new file mode 100644 index 000000000000..0fa1c6cedea1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish DistilBertForQuestionAnswering model (from CenIA) SQAC +author: John Snow Labs +name: distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac +date: 2023-11-26 +tags: [es, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert-base-spanish-uncased-finetuned-qa-sqac` is a Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac_es_5.2.0_3.0_1701012604518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac_es_5.2.0_3.0_1701012604518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.sqac.distil_bert.base_uncased").predict("""¿Cuál es mi nombre?|||Mi nombre es Clara y vivo en Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_sqac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|250.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/distillbert-base-spanish-uncased-finetuned-qa-sqac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar_es.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar_es.md new file mode 100644 index 000000000000..56922194e186 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar_es.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Spanish DistilBertForQuestionAnswering model (from CenIA) TAR +author: John Snow Labs +name: distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar +date: 2023-11-26 +tags: [es, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert-base-spanish-uncased-finetuned-qa-tar` is a Spanish model originally trained by `CenIA`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar_es_5.2.0_3.0_1701012796157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar_es_5.2.0_3.0_1701012796157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar","es") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar","es") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("¿Cuál es mi nombre?", "Mi nombre es Clara y vivo en Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.answer_question.distil_bert.base_uncased").predict("""¿Cuál es mi nombre?|||Mi nombre es Clara y vivo en Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_distillbert_base_spanish_uncased_finetuned_qa_tar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|es| +|Size:|250.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/CenIA/distillbert-base-spanish-uncased-finetuned-qa-tar \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_emre_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_emre_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..8cd0d8d1dbcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_emre_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from emre) +author: John Snow Labs +name: distilbert_qa_emre_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `emre`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_emre_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013840123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_emre_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013840123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_emre_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_emre_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_emre").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_emre_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/emre/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_english_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_english_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..90eb9a015c7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_english_base_uncased_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_qa_english_base_uncased_finetuned_squad DistilBertForQuestionAnswering from en +author: John Snow Labs +name: distilbert_qa_english_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_english_base_uncased_finetuned_squad` is an English model originally trained by en.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_english_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013523533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_english_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013523533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_english_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_english_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_english_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/en/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fadhilarkan_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fadhilarkan_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..1629a22cdaee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fadhilarkan_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from fadhilarkan) +author: John Snow Labs +name: distilbert_qa_fadhilarkan_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `fadhilarkan`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_fadhilarkan_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014053754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_fadhilarkan_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014053754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_fadhilarkan_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_fadhilarkan_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_fadhilarkan").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_fadhilarkan_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/fadhilarkan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_finetuned_squad_pt.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_finetuned_squad_pt.md new file mode 100644 index 000000000000..c30518161622 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_finetuned_squad_pt.md @@ -0,0 +1,100 @@ +--- +layout: model +title: Portuguese DistilBertForQuestionAnswering Cased model (from mrm8488) +author: John Snow Labs +name: distilbert_qa_finetuned_squad +date: 2023-11-26 +tags: [open_source, distilbert, question_answering, pt, onnx] +task: Question Answering +language: pt +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBERT Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-multi-finedtuned-squad-pt` is a Portuguese model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_finetuned_squad_pt_5.2.0_3.0_1701014426202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_finetuned_squad_pt_5.2.0_3.0_1701014426202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_finetuned_squad","pt") \ + .setInputCols(["document_question", "document_context"]) \ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["PUT YOUR 'QUESTION' STRING HERE?", "PUT YOUR 'CONTEXT' STRING HERE"]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_finetuned_squad","pt") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("PUT YOUR 'QUESTION' STRING HERE?", "PUT YOUR 'CONTEXT' STRING HERE")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("pt.answer_question.distil_bert.squad.finetuned").predict("""PUT YOUR 'QUESTION' STRING HERE?|||PUT YOUR 'CONTEXT' STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|pt| +|Size:|505.4 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +https://huggingface.co/mrm8488/distilbert-multi-finedtuned-squad-pt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_firat_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_firat_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..1e79b0735a79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_firat_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Firat) +author: John Snow Labs +name: distilbert_qa_firat_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `Firat`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_firat_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010378270.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_firat_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010378270.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_firat_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_firat_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_firat_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Firat/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fofer_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fofer_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..6437ca855dac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_fofer_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from FOFer) +author: John Snow Labs +name: distilbert_qa_fofer_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `FOFer`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_fofer_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010378251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_fofer_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010378251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_fofer_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_fofer_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_fofer_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/FOFer/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gayathri_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gayathri_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..9b8c7f6d9588 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gayathri_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English distilbert_qa_gayathri_base_uncased_finetuned_squad DistilBertForQuestionAnswering from Gayathri +author: John Snow Labs +name: distilbert_qa_gayathri_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_gayathri_base_uncased_finetuned_squad` is an English model originally trained by Gayathri.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_gayathri_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010549020.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_gayathri_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010549020.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # illustrative sample input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_gayathri_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // illustrative sample input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_gayathri_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_gayathri_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Gayathri/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gokulkarthik_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gokulkarthik_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..e6c61fdad4a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_gokulkarthik_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from gokulkarthik) +author: John Snow Labs +name: distilbert_qa_gokulkarthik_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `gokulkarthik`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_gokulkarthik_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012985604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_gokulkarthik_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012985604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_gokulkarthik_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_gokulkarthik_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_gokulkarthik").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_gokulkarthik_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/gokulkarthik/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_graviraja_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_graviraja_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..72de211e8636 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_graviraja_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from graviraja) +author: John Snow Labs +name: distilbert_qa_graviraja_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `graviraja`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_graviraja_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012769552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_graviraja_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012769552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_graviraja_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_graviraja_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_graviraja").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_graviraja_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/graviraja/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_guhuawuli_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_guhuawuli_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..4a70e800acd8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_guhuawuli_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from guhuawuli) +author: John Snow Labs +name: distilbert_qa_guhuawuli_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `guhuawuli`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_guhuawuli_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013702865.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_guhuawuli_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013702865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_guhuawuli_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_guhuawuli_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_guhuawuli").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_guhuawuli_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/guhuawuli/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hark99_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hark99_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..770f47af92de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hark99_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from hark99) +author: John Snow Labs +name: distilbert_qa_hark99_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `hark99`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_hark99_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013182236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_hark99_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013182236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hark99_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hark99_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_hark99").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_hark99_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hark99/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hcy11_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hcy11_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..28b7c0893489 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hcy11_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from hcy11) +author: John Snow Labs +name: distilbert_qa_hcy11_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `hcy11`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_hcy11_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013359484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_hcy11_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013359484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hcy11_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hcy11_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_hcy11").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_hcy11_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hcy11/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hiiii23_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hiiii23_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..c745684951fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hiiii23_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from hiiii23) +author: John Snow Labs +name: distilbert_qa_hiiii23_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `hiiii23`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_hiiii23_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014603585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_hiiii23_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014603585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hiiii23_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hiiii23_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_hiiii23").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_hiiii23_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/hiiii23/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hoang_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hoang_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..4f51d4e290d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_hoang_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English distilbert_qa_hoang_base_uncased_finetuned_squad DistilBertForQuestionAnswering from Hoang +author: John Snow Labs +name: distilbert_qa_hoang_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_hoang_base_uncased_finetuned_squad` is an English model originally trained by Hoang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_hoang_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010584329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_hoang_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010584329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # example input (assumed) +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_hoang_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // example input (assumed) +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_hoang_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_hoang_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Hoang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_holtin_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_holtin_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..788d42758596 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_holtin_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from holtin) +author: John Snow Labs +name: distilbert_qa_holtin_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `holtin`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_holtin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012963299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_holtin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701012963299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_holtin_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_holtin_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased_v2.by_holtin").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_holtin_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/holtin/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..725b692d833d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from huggingfaceepita) +author: John Snow Labs +name: distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `huggingfaceepita`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013123018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013123018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_huggingfaceepita").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_huggingfaceepita_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/huggingfaceepita/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huxxx657_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huxxx657_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..625f375ff3b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_huxxx657_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from huxxx657) +author: John Snow Labs +name: distilbert_qa_huxxx657_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `huxxx657`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_huxxx657_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013865891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_huxxx657_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013865891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_huxxx657_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_huxxx657_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDS.toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_huxxx657").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_huxxx657_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/huxxx657/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jgammack_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jgammack_base_uncased_squad_en.md new file mode 100644 index 000000000000..ab934d6fac2a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jgammack_base_uncased_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from jgammack) +author: John Snow Labs +name: distilbert_qa_jgammack_base_uncased_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-squad` is an English model originally trained by `jgammack`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_jgammack_base_uncased_squad_en_5.2.0_3.0_1701014077513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_jgammack_base_uncased_squad_en_5.2.0_3.0_1701014077513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jgammack_base_uncased_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jgammack_base_uncased_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_jgammack").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_jgammack_base_uncased_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/jgammack/distilbert-base-uncased-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jhoonk_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jhoonk_base_uncased_finetuned_squad_en.md
new file mode 100644
index 000000000000..a405742590ce
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jhoonk_base_uncased_finetuned_squad_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from jhoonk)
+author: John Snow Labs
+name: distilbert_qa_jhoonk_base_uncased_finetuned_squad
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `jhoonk`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_jhoonk_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013376380.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_jhoonk_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013376380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jhoonk_base_uncased_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jhoonk_base_uncased_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_jhoonk").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_jhoonk_base_uncased_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/jhoonk/distilbert-base-uncased-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jsunster_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jsunster_base_uncased_finetuned_squad_en.md
new file mode 100644
index 000000000000..c5b66b374929
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_jsunster_base_uncased_finetuned_squad_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from jsunster)
+author: John Snow Labs
+name: distilbert_qa_jsunster_base_uncased_finetuned_squad
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `jsunster`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_jsunster_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013301704.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_jsunster_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013301704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jsunster_base_uncased_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_jsunster_base_uncased_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_jsunster").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_jsunster_base_uncased_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/jsunster/distilbert-base-uncased-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_kaggleodin_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_kaggleodin_base_uncased_finetuned_squad_en.md
new file mode 100644
index 000000000000..80283b4646ae
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_kaggleodin_base_uncased_finetuned_squad_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from kaggleodin)
+author: John Snow Labs
+name: distilbert_qa_kaggleodin_base_uncased_finetuned_squad
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `kaggleodin`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_kaggleodin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014269487.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_kaggleodin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014269487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_kaggleodin_base_uncased_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_kaggleodin_base_uncased_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_kaggleodin").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_kaggleodin_base_uncased_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/kaggleodin/distilbert-base-uncased-finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mtl_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mtl_base_uncased_squad_en.md
new file mode 100644
index 000000000000..0b5fd1569a95
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mtl_base_uncased_squad_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from jgammack)
+author: John Snow Labs
+name: distilbert_qa_mtl_base_uncased_squad
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MTL-distilbert-base-uncased-squad` is an English model originally trained by `jgammack`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_mtl_base_uncased_squad_en_5.2.0_3.0_1701010618908.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_mtl_base_uncased_squad_en_5.2.0_3.0_1701010618908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_mtl_base_uncased_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_mtl_base_uncased_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_mtl_base_uncased_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/jgammack/MTL-distilbert-base-uncased-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finedtuned_squad_pt.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finedtuned_squad_pt.md
new file mode 100644
index 000000000000..442fadfb0193
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finedtuned_squad_pt.md
@@ -0,0 +1,96 @@
+---
+layout: model
+title: Portuguese DistilBertForQuestionAnswering model (from mrm8488)
+author: John Snow Labs
+name: distilbert_qa_multi_finedtuned_squad
+date: 2023-11-26
+tags: [question_answering, distilbert, pt, open_source, onnx]
+task: Question Answering
+language: pt
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-multi-finedtuned-squad-pt` is a Portuguese model originally trained by `mrm8488`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finedtuned_squad_pt_5.2.0_3.0_1701014838666.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finedtuned_squad_pt_5.2.0_3.0_1701014838666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finedtuned_squad", "pt") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+data = spark.createDataFrame([["Qual é o meu nome?", "Meu nome é Clara e moro em Berkeley."]]).toDF("question", "context")
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finedtuned_squad", "pt")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+val data = Seq(("Qual é o meu nome?", "Meu nome é Clara e moro em Berkeley.")).toDF("question", "context")
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("pt.answer_question.squad.distil_bert").predict("""Qual é o meu nome?|||Meu nome é Clara e moro em Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_multi_finedtuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|pt|
+|Size:|247.2 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/tli8hf/unqover-distilbert-base-uncased-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_chaii_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_chaii_en.md
new file mode 100644
index 000000000000..2369e8f75898
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_chaii_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering model (from SauravMaheshkar) Xqua
+author: John Snow Labs
+name: distilbert_qa_multi_finetuned_for_xqua_on_chaii
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-multi-finetuned-for-xqua-on-chaii` is an English model originally trained by `SauravMaheshkar`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finetuned_for_xqua_on_chaii_en_5.2.0_3.0_1701013635234.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finetuned_for_xqua_on_chaii_en_5.2.0_3.0_1701013635234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finetuned_for_xqua_on_chaii","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finetuned_for_xqua_on_chaii","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.chaii.distil_bert").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_multi_finetuned_for_xqua_on_chaii|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|505.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/SauravMaheshkar/distilbert-multi-finetuned-for-xqua-on-chaii
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_tydiqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_tydiqa_en.md
new file mode 100644
index 000000000000..e201d6e26099
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_multi_finetuned_for_xqua_on_tydiqa_en.md
@@ -0,0 +1,103 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering model (from mrm8488) Xqua
+author: John Snow Labs
+name: distilbert_qa_multi_finetuned_for_xqua_on_tydiqa
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-multi-finetuned-for-xqua-on-tydiqa` is an English model originally trained by `mrm8488`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finetuned_for_xqua_on_tydiqa_en_5.2.0_3.0_1701013640158.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_multi_finetuned_for_xqua_on_tydiqa_en_5.2.0_3.0_1701013640158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finetuned_for_xqua_on_tydiqa","en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer") \
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[documentAssembler, spanClassifier])
+
+data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_multi_finetuned_for_xqua_on_tydiqa","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+
+val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.tydiqa.distil_bert").predict("""What is my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_multi_finetuned_for_xqua_on_tydiqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|505.4 MB|
+|Case sensitive:|true|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa
+- https://ai.google.com/research/tydiqa
+- https://github.com/google-research-datasets/tydiqa/blob/master/README.md#the-tasks
+- https://twitter.com/mrm8488
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mvonwyl_base_uncased_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mvonwyl_base_uncased_finetuned_squad2_en.md
new file mode 100644
index 000000000000..5c4bff7a99fe
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_mvonwyl_base_uncased_finetuned_squad2_en.md
@@ -0,0 +1,100 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from mvonwyl)
+author: John Snow Labs
+name: distilbert_qa_mvonwyl_base_uncased_finetuned_squad2
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad2` is an English model originally trained by `mvonwyl`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_mvonwyl_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701014437462.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_mvonwyl_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701014437462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_mvonwyl_base_uncased_finetuned_squad2","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_mvonwyl_base_uncased_finetuned_squad2","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.answer_question.squadv2.distil_bert.base_uncased").predict("""What's my name?|||My name is Clara and I live in Berkeley.""")
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_qa_mvonwyl_base_uncased_finetuned_squad2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+|Case sensitive:|false|
+|Max sentence length:|512|
+
+## References
+
+References
+
+- https://huggingface.co/mvonwyl/distilbert-base-uncased-finetuned-squad2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_myx4567_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_myx4567_base_uncased_finetuned_squad_en.md
new file mode 100644
index 000000000000..304a490dcf92
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_myx4567_base_uncased_finetuned_squad_en.md
@@ -0,0 +1,94 @@
+---
+layout: model
+title: English DistilBertForQuestionAnswering Base Uncased model (from MYX4567)
+author: John Snow Labs
+name: distilbert_qa_myx4567_base_uncased_finetuned_squad
+date: 2023-11-26
+tags: [en, open_source, distilbert, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `MYX4567`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_myx4567_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010621345.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_myx4567_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010621345.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+Document_Assembler = MultiDocumentAssembler()\
+    .setInputCols(["question", "context"])\
+    .setOutputCols(["document_question", "document_context"])
+
+Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_myx4567_base_uncased_finetuned_squad","en")\
+    .setInputCols(["document_question", "document_context"])\
+    .setOutputCol("answer")\
+    .setCaseSensitive(True)
+
+pipeline = Pipeline(stages=[Document_Assembler, Question_Answering])
+
+data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val Document_Assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_myx4567_base_uncased_finetuned_squad","en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+    .setCaseSensitive(true)
+
+val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering))
+
+val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context")
+
+val result = pipeline.fit(data).transform(data)
+```
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_myx4567_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/MYX4567/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_nadhiya_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_nadhiya_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..a74f589b284c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_nadhiya_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Nadhiya) +author: John Snow Labs +name: distilbert_qa_nadhiya_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is a English model originally trained by `Nadhiya`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_nadhiya_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010753501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_nadhiya_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010753501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_nadhiya_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_nadhiya_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_nadhiya_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Nadhiya/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_plimpton_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_plimpton_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..5e24d41ddcc0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_plimpton_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Plimpton) +author: John Snow Labs +name: distilbert_qa_plimpton_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is a English model originally trained by `Plimpton`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_plimpton_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010581396.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_plimpton_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010581396.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_plimpton_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_plimpton_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_plimpton_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Plimpton/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_projectmodel_bert_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_projectmodel_bert_en.md new file mode 100644 index 000000000000..6dd1acb75814 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_projectmodel_bert_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from Sarmad) +author: John Snow Labs +name: distilbert_qa_projectmodel_bert +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `projectmodel-bert` is a English model originally trained by `Sarmad`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_projectmodel_bert_en_5.2.0_3.0_1701013822824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_projectmodel_bert_en_5.2.0_3.0_1701013822824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_projectmodel_bert","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_projectmodel_bert","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_projectmodel_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Sarmad/projectmodel-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_qa_en.md new file mode 100644 index 000000000000..b30ba667bf44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_qa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from poom-sci) +author: John Snow Labs +name: distilbert_qa_qa +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-qa` is a English model originally trained by `poom-sci`. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_qa_en_5.2.0_3.0_1701014615598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_qa_en_5.2.0_3.0_1701014615598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_qa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_qa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.distil_bert.by_poom-sci").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/poom-sci/distilbert-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_raphaelg9_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_raphaelg9_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..ea2574e04834 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_raphaelg9_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Raphaelg9) +author: John Snow Labs +name: distilbert_qa_raphaelg9_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is a English model originally trained by `Raphaelg9`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_raphaelg9_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010843509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_raphaelg9_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010843509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_raphaelg9_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_raphaelg9_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_raphaelg9_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Raphaelg9/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sae_base_uncased_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sae_base_uncased_squad_en.md new file mode 100644 index 000000000000..0874e8214a33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sae_base_uncased_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from jgammack) +author: John Snow Labs +name: distilbert_qa_sae_base_uncased_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SAE-distilbert-base-uncased-squad` is a English model originally trained by `jgammack`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_sae_base_uncased_squad_en_5.2.0_3.0_1701010804429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_sae_base_uncased_squad_en_5.2.0_3.0_1701010804429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_sae_base_uncased_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_sae_base_uncased_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_sae_base_uncased_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/jgammack/SAE-distilbert-base-uncased-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_seishin_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_seishin_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..5236f3cfa53d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_seishin_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from SEISHIN) +author: John Snow Labs +name: distilbert_qa_seishin_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is a English model originally trained by `SEISHIN`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_seishin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010753456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_seishin_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010753456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_seishin_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_seishin_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_seishin_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SEISHIN/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_shashidhar_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_shashidhar_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..3f785609fb68 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_shashidhar_base_uncased_finetuned_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English distilbert_qa_shashidhar_base_uncased_finetuned_squad DistilBertForQuestionAnswering from Shashidhar +author: John Snow Labs +name: distilbert_qa_shashidhar_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`distilbert_qa_shashidhar_base_uncased_finetuned_squad` is a English model originally trained by Shashidhar. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_shashidhar_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010801796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_shashidhar_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010801796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_shashidhar_base_uncased_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_shashidhar_base_uncased_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_shashidhar_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/Shashidhar/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_single_label_n_max_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_single_label_n_max_en.md new file mode 100644 index 000000000000..4eaa6c2aaf54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_single_label_n_max_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from mcurmei) +author: John Snow Labs +name: distilbert_qa_single_label_n_max +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `single_label_N_max` is a English model originally trained by `mcurmei`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_single_label_n_max_en_5.2.0_3.0_1701013820377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_single_label_n_max_en_5.2.0_3.0_1701013820377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_single_label_n_max","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_single_label_n_max","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_single_label_n_max| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/mcurmei/single_label_N_max \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sourabh714_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sourabh714_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..94d440737a7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_sourabh714_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Sourabh714) +author: John Snow Labs +name: distilbert_qa_sourabh714_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is a English model originally trained by `Sourabh714`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_sourabh714_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010999287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_sourabh714_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010999287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_sourabh714_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_sourabh714_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_sourabh714_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Sourabh714/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squad_slp_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squad_slp_en.md new file mode 100644 index 000000000000..cb02144514cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squad_slp_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from rowan1224) +author: John Snow Labs +name: distilbert_qa_squad_slp +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-squad-slp` is an English model originally trained by `rowan1224`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_squad_slp_en_5.2.0_3.0_1701014042295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_squad_slp_en_5.2.0_3.0_1701014042295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_squad_slp","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_squad_slp","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.by_rowan1224").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_squad_slp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/rowan1224/distilbert-squad-slp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squadv1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squadv1_en.md new file mode 100644 index 000000000000..2c103a50b716 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_squadv1_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Cased model (from abhilash1910) +author: John Snow Labs +name: distilbert_qa_squadv1 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-squadv1` is an English model originally trained by `abhilash1910`. + +## Predicted Entities + + + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_squadv1_en_5.2.0_3.0_1701014822159.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_squadv1_en_5.2.0_3.0_1701014822159.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_squadv1","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_squadv1","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.by_abhilash1910").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_squadv1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/abhilash1910/distilbert-squadv1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_supriyaarun_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_supriyaarun_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..b12af0961448 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_supriyaarun_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from SupriyaArun) +author: John Snow Labs +name: distilbert_qa_supriyaarun_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `SupriyaArun`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_supriyaarun_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010975060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_supriyaarun_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010975060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_supriyaarun_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_supriyaarun_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_supriyaarun_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/SupriyaArun/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tabo_base_uncased_finetuned_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tabo_base_uncased_finetuned_squad2_en.md new file mode 100644 index 000000000000..21dc72b455ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tabo_base_uncased_finetuned_squad2_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from tabo) +author: John Snow Labs +name: distilbert_qa_tabo_base_uncased_finetuned_squad2 +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad2` is an English model originally trained by `tabo`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_tabo_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701013475212.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_tabo_base_uncased_finetuned_squad2_en_5.2.0_3.0_1701013475212.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tabo_base_uncased_finetuned_squad2","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tabo_base_uncased_finetuned_squad2","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squadv2.distil_bert.base_uncased.by_tabo").predict("""What's my name?|||"My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_tabo_base_uncased_finetuned_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tabo/distilbert-base-uncased-finetuned-squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_thitaree_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_thitaree_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..a3e81856a46d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_thitaree_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Thitaree) +author: John Snow Labs +name: distilbert_qa_thitaree_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `Thitaree`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_thitaree_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011003522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_thitaree_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011003522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_thitaree_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_thitaree_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_thitaree_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Thitaree/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tianle_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tianle_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..8a0338d8676b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tianle_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Tianle) +author: John Snow Labs +name: distilbert_qa_tianle_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `Tianle`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_tianle_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011197267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_tianle_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011197267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tianle_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tianle_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_tianle_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Tianle/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tiny_base_cased_distilled_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tiny_base_cased_distilled_squad_en.md new file mode 100644 index 000000000000..7a58653214e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tiny_base_cased_distilled_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from sshleifer) +author: John Snow Labs +name: distilbert_qa_tiny_base_cased_distilled_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny-distilbert-base-cased-distilled-squad` is an English model originally trained by `sshleifer`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_tiny_base_cased_distilled_squad_en_5.2.0_3.0_1701014956119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_tiny_base_cased_distilled_squad_en_5.2.0_3.0_1701014956119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tiny_base_cased_distilled_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tiny_base_cased_distilled_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_tiny_cased").predict("""What is my name?|||"My name is Clara and I live in Berkeley.""") +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_tiny_base_cased_distilled_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|523.3 KB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/sshleifer/tiny-distilbert-base-cased-distilled-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_transformers_base_uncased_finetuneqa_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_transformers_base_uncased_finetuneqa_squad_en.md new file mode 100644 index 000000000000..822063e67dcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_transformers_base_uncased_finetuneqa_squad_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English distilbert_qa_transformers_base_uncased_finetuneqa_squad DistilBertForQuestionAnswering from manudotc +author: John Snow Labs +name: distilbert_qa_transformers_base_uncased_finetuneqa_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_transformers_base_uncased_finetuneqa_squad` is an English model originally trained by manudotc.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_transformers_base_uncased_finetuneqa_squad_en_5.2.0_3.0_1701015015271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_transformers_base_uncased_finetuneqa_squad_en_5.2.0_3.0_1701015015271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_transformers_base_uncased_finetuneqa_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_transformers_base_uncased_finetuneqa_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_transformers_base_uncased_finetuneqa_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +https://huggingface.co/manudotc/transformers_distilbert-base-uncased_finetuneQA_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tucan9389_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tucan9389_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..76e4deec1ede --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_tucan9389_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from tucan9389) +author: John Snow Labs +name: distilbert_qa_tucan9389_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `tucan9389`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_tucan9389_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013650762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_tucan9389_base_uncased_finetuned_squad_en_5.2.0_3.0_1701013650762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tucan9389_base_uncased_finetuned_squad","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_tucan9389_base_uncased_finetuned_squad","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_tucan9389").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_tucan9389_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tucan9389/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unique_n_max_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unique_n_max_en.md new file mode 100644 index 000000000000..0ff6b812ac60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unique_n_max_en.md @@ -0,0 +1,95 @@ +--- +layout: model +title: English distilbert_qa_unique_n_max DistilBertForQuestionAnswering from mcurmei +author: John Snow Labs +name: distilbert_qa_unique_n_max +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_qa_unique_n_max` is an English model originally trained by mcurmei.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_unique_n_max_en_5.2.0_3.0_1701015134223.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_unique_n_max_en_5.2.0_3.0_1701015134223.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context")  # example input + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_unique_n_max","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") // example input + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_qa_unique_n_max", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_unique_n_max| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|true| +|Max sentence length:|512| + +## References + +https://huggingface.co/mcurmei/unique_N_max \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unqover_base_uncased_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unqover_base_uncased_newsqa_en.md new file mode 100644 index 000000000000..b6e685205b56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_unqover_base_uncased_newsqa_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering model (from tli8hf) Newsqa +author: John Snow Labs +name: distilbert_qa_unqover_base_uncased_newsqa +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained Question Answering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unqover-distilbert-base-uncased-newsqa` is an English model originally trained by `tli8hf`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_unqover_base_uncased_newsqa_en_5.2.0_3.0_1701015205435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_unqover_base_uncased_newsqa_en_5.2.0_3.0_1701015205435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = MultiDocumentAssembler() \ +.setInputCols(["question", "context"]) \ +.setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_unqover_base_uncased_newsqa","en") \ +.setInputCols(["document_question", "document_context"]) \ +.setOutputCol("answer")\ +.setCaseSensitive(True) + +pipeline = Pipeline(stages=[documentAssembler, spanClassifier]) + +data = spark.createDataFrame([["What is my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new MultiDocumentAssembler() +.setInputCols(Array("question", "context")) +.setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_qa_unqover_base_uncased_newsqa","en") +.setInputCols(Array("document_question", "document_context")) +.setOutputCol("answer") +.setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier)) + +val data = Seq(("What is my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.news.distil_bert.base_uncased").predict("""What is my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_unqover_base_uncased_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/tli8hf/unqover-distilbert-base-uncased-newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_usami_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_usami_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..c0c1d0e1c6f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_usami_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from usami) +author: John Snow Labs +name: distilbert_qa_usami_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `usami`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_usami_base_uncased_finetuned_squad_en_5.2.0_3.0_1701015399201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_usami_base_uncased_finetuned_squad_en_5.2.0_3.0_1701015399201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_usami_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_usami_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_usami").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_usami_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/usami/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_v3rx2000_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_v3rx2000_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..ce05791dbd0c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_v3rx2000_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from V3RX2000) +author: John Snow Labs +name: distilbert_qa_v3rx2000_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `V3RX2000`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_v3rx2000_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011359032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_v3rx2000_base_uncased_finetuned_squad_en_5.2.0_3.0_1701011359032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_v3rx2000_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_v3rx2000_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_v3rx2000_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/V3RX2000/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vitusya_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vitusya_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..4ebcd80974b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vitusya_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from vitusya) +author: John Snow Labs +name: distilbert_qa_vitusya_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `vitusya`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_vitusya_base_uncased_finetuned_squad_en_5.2.0_3.0_1701015562582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_vitusya_base_uncased_finetuned_squad_en_5.2.0_3.0_1701015562582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vitusya_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vitusya_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_vitusya").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_vitusya_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vitusya/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkmr_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkmr_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..87fb9bae7833 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkmr_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from vkmr) +author: John Snow Labs +name: distilbert_qa_vkmr_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `vkmr`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_vkmr_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014045661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_vkmr_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014045661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vkmr_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vkmr_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_vkmr").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_vkmr_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vkmr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..c3a7d2cf6dc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad_en.md @@ -0,0 +1,100 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from vkrishnamoorthy) +author: John Snow Labs +name: distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `vkrishnamoorthy`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014246078.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad_en_5.2.0_3.0_1701014246078.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.answer_question.squad.distil_bert.base_uncased.by_vkrishnamoorthy").predict("""What's my name?|||My name is Clara and I live in Berkeley.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_vkrishnamoorthy_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/vkrishnamoorthy/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_wiam_base_uncased_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_wiam_base_uncased_finetuned_squad_en.md new file mode 100644 index 000000000000..a8eb106cc48a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_qa_wiam_base_uncased_finetuned_squad_en.md @@ -0,0 +1,94 @@ +--- +layout: model +title: English DistilBertForQuestionAnswering Base Uncased model (from Wiam) +author: John Snow Labs +name: distilbert_qa_wiam_base_uncased_finetuned_squad +date: 2023-11-26 +tags: [en, open_source, distilbert, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert-base-uncased-finetuned-squad` is an English model originally trained by `Wiam`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_qa_wiam_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010948782.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_qa_wiam_base_uncased_finetuned_squad_en_5.2.0_3.0_1701010948782.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +Document_Assembler = MultiDocumentAssembler()\ + .setInputCols(["question", "context"])\ + .setOutputCols(["document_question", "document_context"]) + +Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_wiam_base_uncased_finetuned_squad","en")\ + .setInputCols(["document_question", "document_context"])\ + .setOutputCol("answer")\ + .setCaseSensitive(True) + +pipeline = Pipeline(stages=[Document_Assembler, Question_Answering]) + +data = spark.createDataFrame([["What's my name?","My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +result = pipeline.fit(data).transform(data) +``` +```scala +val Document_Assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val Question_Answering = DistilBertForQuestionAnswering.pretrained("distilbert_qa_wiam_base_uncased_finetuned_squad","en") + .setInputCols(Array("document_question", "document_context")) + .setOutputCol("answer") + .setCaseSensitive(true) + +val pipeline = new Pipeline().setStages(Array(Document_Assembler, Question_Answering)) + +val data = Seq(("What's my name?","My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_qa_wiam_base_uncased_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| +|Case sensitive:|false| +|Max sentence length:|512| + +## References + +References + +- https://huggingface.co/Wiam/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_custom_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_custom_en.md new file mode 100644 index 000000000000..a82d84d58537 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_custom_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad2_custom DistilBertForQuestionAnswering from arver +author: John Snow Labs +name: distilbert_squad2_custom +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad2_custom` is an English model originally trained by arver. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad2_custom_en_5.2.0_3.0_1701020032947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad2_custom_en_5.2.0_3.0_1701020032947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad2_custom","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad2_custom", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad2_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/arver/distilbert_squad2_custom \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_en.md new file mode 100644 index 000000000000..42b643e1b9a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad2 DistilBertForQuestionAnswering from johnjose223 +author: John Snow Labs +name: distilbert_squad2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad2` is an English model originally trained by johnjose223. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad2_en_5.2.0_3.0_1701033432676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad2_en_5.2.0_3.0_1701033432676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/johnjose223/distilbert_squad2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_256seq_8batch_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_256seq_8batch_test_en.md new file mode 100644 index 000000000000..0d94c2afa4c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_256seq_8batch_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_256seq_8batch_test DistilBertForQuestionAnswering from manishiitg +author: John Snow Labs +name: distilbert_squad_256seq_8batch_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_256seq_8batch_test` is an English model originally trained by manishiitg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_256seq_8batch_test_en_5.2.0_3.0_1701040595350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_256seq_8batch_test_en_5.2.0_3.0_1701040595350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_256seq_8batch_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_256seq_8batch_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_256seq_8batch_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/manishiitg/distilbert-squad-256seq-8batch-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_abrarelidrisi_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_abrarelidrisi_en.md new file mode 100644 index 000000000000..5bbee66a409b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_abrarelidrisi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_abrarelidrisi DistilBertForQuestionAnswering from AbrarElidrisi +author: John Snow Labs +name: distilbert_squad_abrarelidrisi +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_abrarelidrisi` is an English model originally trained by AbrarElidrisi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_abrarelidrisi_en_5.2.0_3.0_1701023389872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_abrarelidrisi_en_5.2.0_3.0_1701023389872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_abrarelidrisi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_abrarelidrisi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_abrarelidrisi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AbrarElidrisi/distilbert-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_colab_finetuned_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_colab_finetuned_model_en.md new file mode 100644 index 000000000000..26483311c1d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_colab_finetuned_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_colab_finetuned_model DistilBertForQuestionAnswering from mpinedaa +author: John Snow Labs +name: distilbert_squad_colab_finetuned_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_colab_finetuned_model` is an English model originally trained by mpinedaa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_colab_finetuned_model_en_5.2.0_3.0_1701033025370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_colab_finetuned_model_en_5.2.0_3.0_1701033025370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_colab_finetuned_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_colab_finetuned_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_colab_finetuned_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mpinedaa/distilbert_squad_colab_finetuned_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_2_en.md new file mode 100644 index 000000000000..2be93489b710 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_2 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_2` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_2_en_5.2.0_3.0_1701025640702.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_2_en_5.2.0_3.0_1701025640702.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|235.8 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_3_en.md new file mode 100644 index 000000000000..acdda43b027e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_3 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_3` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_3_en_5.2.0_3.0_1701022573249.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_3_en_5.2.0_3.0_1701022573249.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|236.7 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_4_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_4_en.md new file mode 100644 index 000000000000..1816cdffdca2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_4 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_4` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_4_en_5.2.0_3.0_1701026854728.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_4_en_5.2.0_3.0_1701026854728.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|233.1 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_5_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_5_en.md new file mode 100644 index 000000000000..3c55af01c7f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_5 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_5` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_5_en_5.2.0_3.0_1701038591161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_5_en_5.2.0_3.0_1701038591161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|238.5 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_6_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_6_en.md new file mode 100644 index 000000000000..ea7370ac7128 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_6 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_6 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_6` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_6_en_5.2.0_3.0_1701023815906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_6_en_5.2.0_3.0_1701023815906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_6","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_6", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|237.3 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_7_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_7_en.md new file mode 100644 index 000000000000..525fdaba1ffe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique_7 DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique_7 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique_7` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_7_en_5.2.0_3.0_1701031735084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_7_en_5.2.0_3.0_1701031735084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +data = spark.createDataFrame([["What's my name?", "My name is Clara and I live in Berkeley."]]).toDF("question", "context") + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique_7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +val data = Seq(("What's my name?", "My name is Clara and I live in Berkeley.")).toDF("question", "context") + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique_7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique_7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|237.1 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique_7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_en.md new file mode 100644 index 000000000000..b642089500a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_for_musique_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_for_musique DistilBertForQuestionAnswering from keremnazliel +author: John Snow Labs +name: distilbert_squad_for_musique +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_for_musique` is an English model originally trained by keremnazliel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_en_5.2.0_3.0_1701032159269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_for_musique_en_5.2.0_3.0_1701032159269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_for_musique","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_for_musique", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_for_musique| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|236.1 MB| + +## References + +https://huggingface.co/keremnazliel/distilbert_squad_for_musique \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_johnjose223_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_johnjose223_en.md new file mode 100644 index 000000000000..d3d3196c103f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_johnjose223_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_johnjose223 DistilBertForQuestionAnswering from johnjose223 +author: John Snow Labs +name: distilbert_squad_johnjose223 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_johnjose223` is an English model originally trained by johnjose223. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_johnjose223_en_5.2.0_3.0_1701019672455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_johnjose223_en_5.2.0_3.0_1701019672455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_johnjose223","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_johnjose223", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_johnjose223| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/johnjose223/distilbert_squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_newsqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_newsqa_en.md new file mode 100644 index 000000000000..96824f73ca4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_newsqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_newsqa DistilBertForQuestionAnswering from sophiebottani +author: John Snow Labs +name: distilbert_squad_newsqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_newsqa` is an English model originally trained by sophiebottani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_newsqa_en_5.2.0_3.0_1701016705276.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_newsqa_en_5.2.0_3.0_1701016705276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_newsqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_newsqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_newsqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|195.3 MB| + +## References + +https://huggingface.co/sophiebottani/distilbert_squad_newsqa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_sample_finetuned_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_sample_finetuned_model_en.md new file mode 100644 index 000000000000..bf137f689962 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_sample_finetuned_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_sample_finetuned_model DistilBertForQuestionAnswering from mpinedaa +author: John Snow Labs +name: distilbert_squad_sample_finetuned_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_sample_finetuned_model` is an English model originally trained by mpinedaa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_sample_finetuned_model_en_5.2.0_3.0_1701028385955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_sample_finetuned_model_en_5.2.0_3.0_1701028385955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_sample_finetuned_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_sample_finetuned_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_sample_finetuned_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mpinedaa/distilbert_squad_sample_finetuned_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_skyskuy_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_skyskuy_en.md new file mode 100644 index 000000000000..c34f4e099dcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_skyskuy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_skyskuy DistilBertForQuestionAnswering from skyskuy +author: John Snow Labs +name: distilbert_squad_skyskuy +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_skyskuy` is an English model originally trained by skyskuy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_skyskuy_en_5.2.0_3.0_1701020627104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_skyskuy_en_5.2.0_3.0_1701020627104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_skyskuy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_skyskuy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_skyskuy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/skyskuy/distilbert-squad-skyskuy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_v1_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_v1_en.md new file mode 100644 index 000000000000..b92427f4787b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_squad_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_squad_v1 DistilBertForQuestionAnswering from mihirinamdar +author: John Snow Labs +name: distilbert_squad_v1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_squad_v1` is an English model originally trained by mihirinamdar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_squad_v1_en_5.2.0_3.0_1701040134146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_squad_v1_en_5.2.0_3.0_1701040134146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_squad_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_squad_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_squad_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mihirinamdar/distilbert-squad-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_test_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_test_2_en.md new file mode 100644 index 000000000000..0fc7efef61d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_test_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_test_2 DistilBertForQuestionAnswering from jeffnjy +author: John Snow Labs +name: distilbert_test_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_test_2` is an English model originally trained by jeffnjy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_test_2_en_5.2.0_3.0_1701014826681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_test_2_en_5.2.0_3.0_1701014826681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_test_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_test_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_test_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jeffnjy/distilbert-test-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbert_v1_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbert_v1_finetuned_squad_en.md new file mode 100644 index 000000000000..250037ecc035 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbert_v1_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_v1_finetuned_squad DistilBertForQuestionAnswering from gunold +author: John Snow Labs +name: distilbert_v1_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_v1_finetuned_squad` is an English model originally trained by gunold. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_v1_finetuned_squad_en_5.2.0_3.0_1701023534422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_v1_finetuned_squad_en_5.2.0_3.0_1701023534422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_v1_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_v1_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_v1_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|276.0 MB| + +## References + +https://huggingface.co/gunold/distilbert_v1-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilbertcustom2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilbertcustom2_en.md new file mode 100644 index 000000000000..f9aeac2b26ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilbertcustom2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbertcustom2 DistilBertForQuestionAnswering from aszidon +author: John Snow Labs +name: distilbertcustom2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertcustom2` is an English model originally trained by aszidon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbertcustom2_en_5.2.0_3.0_1701024418837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbertcustom2_en_5.2.0_3.0_1701024418837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbertcustom2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbertcustom2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbertcustom2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aszidon/distilbertcustom2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilebert_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilebert_squad_v2_en.md new file mode 100644 index 000000000000..88723403f16b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilebert_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilebert_squad_v2 DistilBertForQuestionAnswering from fahmiaziz +author: John Snow Labs +name: distilebert_squad_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilebert_squad_v2` is an English model originally trained by fahmiaziz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilebert_squad_v2_en_5.2.0_3.0_1701033126847.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilebert_squad_v2_en_5.2.0_3.0_1701033126847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilebert_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilebert_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilebert_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fahmiaziz/distilebert-squad-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distiled_bert_finetuned_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distiled_bert_finetuned_squad_v2_en.md new file mode 100644 index 000000000000..314ed3e835a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distiled_bert_finetuned_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distiled_bert_finetuned_squad_v2 DistilBertForQuestionAnswering from johnjose223 +author: John Snow Labs +name: distiled_bert_finetuned_squad_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distiled_bert_finetuned_squad_v2` is an English model originally trained by johnjose223. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distiled_bert_finetuned_squad_v2_en_5.2.0_3.0_1701020249866.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distiled_bert_finetuned_squad_v2_en_5.2.0_3.0_1701020249866.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distiled_bert_finetuned_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distiled_bert_finetuned_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distiled_bert_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/johnjose223/distiled_bert-finetuned-squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distill_bert_uncase_finetuned_squad_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distill_bert_uncase_finetuned_squad_v2_en.md new file mode 100644 index 000000000000..920eb12a0d8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distill_bert_uncase_finetuned_squad_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distill_bert_uncase_finetuned_squad_v2 DistilBertForQuestionAnswering from ahmadbinshafiq +author: John Snow Labs +name: distill_bert_uncase_finetuned_squad_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distill_bert_uncase_finetuned_squad_v2` is an English model originally trained by ahmadbinshafiq. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distill_bert_uncase_finetuned_squad_v2_en_5.2.0_3.0_1701023966835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distill_bert_uncase_finetuned_squad_v2_en_5.2.0_3.0_1701023966835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distill_bert_uncase_finetuned_squad_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distill_bert_uncase_finetuned_squad_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distill_bert_uncase_finetuned_squad_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ahmadbinshafiq/distill_bert_uncase_finetuned_squad_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squad_en.md new file mode 100644 index 000000000000..31944f614322 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distillbert_base_uncased_fine_tuned_squad DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distillbert_base_uncased_fine_tuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_fine_tuned_squad` is an English model originally trained by monakth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_fine_tuned_squad_en_5.2.0_3.0_1701016175422.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_fine_tuned_squad_en_5.2.0_3.0_1701016175422.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distillbert_base_uncased_fine_tuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distillbert_base_uncased_fine_tuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_fine_tuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/monakth/distillbert-base-uncased-fine-tuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squadv2_en.md b/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squadv2_en.md new file mode 100644 index 000000000000..bd9f58991465 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distillbert_base_uncased_fine_tuned_squadv2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distillbert_base_uncased_fine_tuned_squadv2 DistilBertForQuestionAnswering from monakth +author: John Snow Labs +name: distillbert_base_uncased_fine_tuned_squadv2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_base_uncased_fine_tuned_squadv2` is an English model originally trained by monakth. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_fine_tuned_squadv2_en_5.2.0_3.0_1701017511673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_base_uncased_fine_tuned_squadv2_en_5.2.0_3.0_1701017511673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distillbert_base_uncased_fine_tuned_squadv2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distillbert_base_uncased_fine_tuned_squadv2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_base_uncased_fine_tuned_squadv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/monakth/distillbert-base-uncased-fine-tuned-squadv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distillbert_custom_answer_en.md b/docs/_posts/ahmedlone127/2023-11-26-distillbert_custom_answer_en.md new file mode 100644 index 000000000000..7bf7818fa81e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distillbert_custom_answer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distillbert_custom_answer DistilBertForQuestionAnswering from Valli +author: John Snow Labs +name: distillbert_custom_answer +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_custom_answer` is an English model originally trained by Valli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_custom_answer_en_5.2.0_3.0_1701027477479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_custom_answer_en_5.2.0_3.0_1701027477479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distillbert_custom_answer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distillbert_custom_answer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_custom_answer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Valli/distillbert_custom_answer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distillbert_for_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-26-distillbert_for_question_answering_en.md new file mode 100644 index 000000000000..00c8f2a90db4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distillbert_for_question_answering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distillbert_for_question_answering DistilBertForQuestionAnswering from Zamachi +author: John Snow Labs +name: distillbert_for_question_answering +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distillbert_for_question_answering` is an English model originally trained by Zamachi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distillbert_for_question_answering_en_5.2.0_3.0_1701039470343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distillbert_for_question_answering_en_5.2.0_3.0_1701039470343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distillbert_for_question_answering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distillbert_for_question_answering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distillbert_for_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|248.7 MB| + +## References + +https://huggingface.co/Zamachi/distillbert-for-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilled_bert_covidqa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-distilled_bert_covidqa_model_en.md new file mode 100644 index 000000000000..63984351fba7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilled_bert_covidqa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilled_bert_covidqa_model DistilBertForQuestionAnswering from goodcoffee +author: John Snow Labs +name: distilled_bert_covidqa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilled_bert_covidqa_model` is an English model originally trained by goodcoffee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilled_bert_covidqa_model_en_5.2.0_3.0_1701014355643.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilled_bert_covidqa_model_en_5.2.0_3.0_1701014355643.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilled_bert_covidqa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilled_bert_covidqa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilled_bert_covidqa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/goodcoffee/Distilled_bert_CovidQA_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-distilrubert_tiny_cased_conversational_ru.md b/docs/_posts/ahmedlone127/2023-11-26-distilrubert_tiny_cased_conversational_ru.md new file mode 100644 index 000000000000..ee1db70efd6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-distilrubert_tiny_cased_conversational_ru.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Russian distilrubert_tiny_cased_conversational DistilBertForQuestionAnswering from kisa-misa +author: John Snow Labs +name: distilrubert_tiny_cased_conversational +date: 2023-11-26 +tags: [distilbert, ru, open_source, question_answering, onnx] +task: Question Answering +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilrubert_tiny_cased_conversational` is a Russian model originally trained by kisa-misa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_ru_5.2.0_3.0_1701014559809.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilrubert_tiny_cased_conversational_ru_5.2.0_3.0_1701014559809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilrubert_tiny_cased_conversational","ru") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilrubert_tiny_cased_conversational", "ru") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilrubert_tiny_cased_conversational| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|ru| +|Size:|397.8 MB| + +## References + +https://huggingface.co/kisa-misa/distilrubert-tiny-cased-conversational \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-dummy_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-dummy_qa_model_en.md new file mode 100644 index 000000000000..8d926adc9781 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-dummy_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dummy_qa_model DistilBertForQuestionAnswering from amrutha3899 +author: John Snow Labs +name: dummy_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_qa_model` is an English model originally trained by amrutha3899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_qa_model_en_5.2.0_3.0_1701035247746.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_qa_model_en_5.2.0_3.0_1701035247746.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("dummy_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("dummy_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dummy_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/amrutha3899/dummy_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-edtechchatbot_en.md b/docs/_posts/ahmedlone127/2023-11-26-edtechchatbot_en.md new file mode 100644 index 000000000000..bf40ba1dc83e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-edtechchatbot_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English edtechchatbot DistilBertForQuestionAnswering from phanimvsk +author: John Snow Labs +name: edtechchatbot +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `edtechchatbot` is an English model originally trained by phanimvsk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/edtechchatbot_en_5.2.0_3.0_1701033688242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/edtechchatbot_en_5.2.0_3.0_1701033688242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("edtechchatbot","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("edtechchatbot", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|edtechchatbot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|248.0 MB| + +## References + +https://huggingface.co/phanimvsk/EdtechChatbot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-epochs_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-epochs_finetuned_squad_en.md new file mode 100644 index 000000000000..f6d726f08a66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-epochs_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English epochs_finetuned_squad DistilBertForQuestionAnswering from arshiya20 +author: John Snow Labs +name: epochs_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `epochs_finetuned_squad` is an English model originally trained by arshiya20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/epochs_finetuned_squad_en_5.2.0_3.0_1701035390374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/epochs_finetuned_squad_en_5.2.0_3.0_1701035390374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("epochs_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("epochs_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|epochs_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/arshiya20/epochs-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-eurekaqa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-eurekaqa_model_en.md new file mode 100644 index 000000000000..6f14003ece62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-eurekaqa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English eurekaqa_model DistilBertForQuestionAnswering from Kaludi +author: John Snow Labs +name: eurekaqa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eurekaqa_model` is an English model originally trained by Kaludi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eurekaqa_model_en_5.2.0_3.0_1701029088460.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eurekaqa_model_en_5.2.0_3.0_1701029088460.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("eurekaqa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("eurekaqa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eurekaqa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Kaludi/eurekaQA-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_en.md new file mode 100644 index 000000000000..d4679114641b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English extractive_question_answering DistilBertForQuestionAnswering from mrbach +author: John Snow Labs +name: extractive_question_answering +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extractive_question_answering` is an English model originally trained by mrbach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/extractive_question_answering_en_5.2.0_3.0_1701030138740.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/extractive_question_answering_en_5.2.0_3.0_1701030138740.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("extractive_question_answering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("extractive_question_answering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|extractive_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mrbach/extractive_question_answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_not_evaluated_en.md b/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_not_evaluated_en.md new file mode 100644 index 000000000000..56e0437976fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-extractive_question_answering_not_evaluated_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English extractive_question_answering_not_evaluated DistilBertForQuestionAnswering from autoevaluate +author: John Snow Labs +name: extractive_question_answering_not_evaluated +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extractive_question_answering_not_evaluated` is an English model originally trained by autoevaluate.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/extractive_question_answering_not_evaluated_en_5.2.0_3.0_1701016520802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/extractive_question_answering_not_evaluated_en_5.2.0_3.0_1701016520802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("extractive_question_answering_not_evaluated","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("extractive_question_answering_not_evaluated", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|extractive_question_answering_not_evaluated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/autoevaluate/extractive-question-answering-not-evaluated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-find_mention_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-find_mention_finetuned_squad_en.md new file mode 100644 index 000000000000..c66b64dcc0ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-find_mention_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English find_mention_finetuned_squad DistilBertForQuestionAnswering from AvishayDev +author: John Snow Labs +name: find_mention_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `find_mention_finetuned_squad` is an English model originally trained by AvishayDev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/find_mention_finetuned_squad_en_5.2.0_3.0_1701031220491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/find_mention_finetuned_squad_en_5.2.0_3.0_1701031220491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("find_mention_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("find_mention_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|find_mention_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AvishayDev/find-mention-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-g5_hw4_sentiment_part2_en.md b/docs/_posts/ahmedlone127/2023-11-26-g5_hw4_sentiment_part2_en.md new file mode 100644 index 000000000000..bff45c508b67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-g5_hw4_sentiment_part2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English g5_hw4_sentiment_part2 DistilBertForQuestionAnswering from parsi-ai-nlpclass +author: John Snow Labs +name: g5_hw4_sentiment_part2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `g5_hw4_sentiment_part2` is an English model originally trained by parsi-ai-nlpclass. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/g5_hw4_sentiment_part2_en_5.2.0_3.0_1701019646992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/g5_hw4_sentiment_part2_en_5.2.0_3.0_1701019646992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("g5_hw4_sentiment_part2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("g5_hw4_sentiment_part2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|g5_hw4_sentiment_part2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/parsi-ai-nlpclass/G5_HW4_sentiment_part2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_1_en.md new file mode 100644 index 000000000000..1ea428411e9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English helical_quest_ans_1 DistilBertForQuestionAnswering from pravinandhale +author: John Snow Labs +name: helical_quest_ans_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `helical_quest_ans_1` is an English model originally trained by pravinandhale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/helical_quest_ans_1_en_5.2.0_3.0_1701037427999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/helical_quest_ans_1_en_5.2.0_3.0_1701037427999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("helical_quest_ans_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("helical_quest_ans_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|helical_quest_ans_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pravinandhale/Helical_Quest_Ans_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_en.md b/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_en.md new file mode 100644 index 000000000000..cbe918e082b2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-helical_quest_ans_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English helical_quest_ans DistilBertForQuestionAnswering from pravinandhale +author: John Snow Labs +name: helical_quest_ans +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `helical_quest_ans` is an English model originally trained by pravinandhale. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/helical_quest_ans_en_5.2.0_3.0_1701042398947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/helical_quest_ans_en_5.2.0_3.0_1701042398947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("helical_quest_ans","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("helical_quest_ans", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|helical_quest_ans| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pravinandhale/Helical_quest_ans \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-hkp24_en.md b/docs/_posts/ahmedlone127/2023-11-26-hkp24_en.md new file mode 100644 index 000000000000..7c5edf30fb71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-hkp24_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hkp24 DistilBertForQuestionAnswering from harikp20 +author: John Snow Labs +name: hkp24 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hkp24` is an English model originally trained by harikp20. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hkp24_en_5.2.0_3.0_1701030685191.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hkp24_en_5.2.0_3.0_1701030685191.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("hkp24","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("hkp24", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hkp24| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harikp20/hkp24 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_en.md b/docs/_posts/ahmedlone127/2023-11-26-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_en.md new file mode 100644 index 000000000000..a1737915edee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks DistilBertForQuestionAnswering from Tural +author: John Snow Labs +name: how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks` is an English model originally trained by Tural.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_en_5.2.0_3.0_1701030103384.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks_en_5.2.0_3.0_1701030103384.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|how_tonga_tonga_islands_fine_tune_a_model_for_common_downstream_tasks| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Tural/How_to_fine-tune_a_model_for_common_downstream_tasks \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-hruthik_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-hruthik_qa_en.md new file mode 100644 index 000000000000..b1e3b2f11787 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-hruthik_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hruthik_qa DistilBertForQuestionAnswering from hruthik122 +author: John Snow Labs +name: hruthik_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hruthik_qa` is an English model originally trained by hruthik122. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hruthik_qa_en_5.2.0_3.0_1701014553901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hruthik_qa_en_5.2.0_3.0_1701014553901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("hruthik_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("hruthik_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hruthik_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hruthik122/Hruthik_QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companylocation_extraction_qa_model_1_1_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companylocation_extraction_qa_model_1_1_distilbert_en.md new file mode 100644 index 000000000000..a1e958ba3af5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companylocation_extraction_qa_model_1_1_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companylocation_extraction_qa_model_1_1_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companylocation_extraction_qa_model_1_1_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companylocation_extraction_qa_model_1_1_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companylocation_extraction_qa_model_1_1_distilbert_en_5.2.0_3.0_1701029616762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companylocation_extraction_qa_model_1_1_distilbert_en_5.2.0_3.0_1701029616762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companylocation_extraction_qa_model_1_1_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companylocation_extraction_qa_model_1_1_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companylocation_extraction_qa_model_1_1_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyLocation_Extraction_QA_Model_1.1_Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert_en.md new file mode 100644 index 000000000000..0a3c05817323 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert_en_5.2.0_3.0_1701038854918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert_en_5.2.0_3.0_1701038854918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_5_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.5_DistilBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_en.md new file mode 100644 index 000000000000..fba0c1088cad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_en_5.2.0_3.0_1701033259305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_en_5.2.0_3.0_1701033259305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.6_DistilBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest_en.md new file mode 100644 index 000000000000..fadad757bbf9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest_en_5.2.0_3.0_1701015865776.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest_en_5.2.0_3.0_1701015865776.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_6_distilbert_unk_retest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.6_DistilBert_UNK_RETEST \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3_en.md new file mode 100644 index 000000000000..615a5873179d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3 DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3` is an English model
originally trained by chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3_en_5.2.0_3.0_1701034624997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3_en_5.2.0_3.0_1701034624997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.7_DistilBert_DIFFERENT_UNK_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_en.md new file mode 100644 index 000000000000..ce8513d7fc5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk` is an English model
originally trained by chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_en_5.2.0_3.0_1701040407805.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_en_5.2.0_3.0_1701040407805.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.7_DistilBert_DIFFERENT_UNK \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset_en.md new file mode 100644 index 000000000000..1b0e2c931839 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset` is an English model originally trained by
chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset_en_5.2.0_3.0_1701013924015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset_en_5.2.0_3.0_1701013924015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_9_distilbert_unk_dataset| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.9_DistilBert_UNK_DATASET \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert_en.md new file mode 100644 index 000000000000..5222dd85b233 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert_en_5.2.0_3.0_1701028943917.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert_en_5.2.0_3.0_1701028943917.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_extraction_qa_model_1_3_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_Extraction_QA_Model_1.3_Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert_en.md new file mode 100644 index 000000000000..0a321f1f0e76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert_en_5.2.0_3.0_1701025555844.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert_en_5.2.0_3.0_1701025555844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_extraction_qa_model_1_4_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_Extraction_QA_Model_1.4_Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_1_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_1_en.md new file mode 100644 index 000000000000..76cb7938dbf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_extraction_qa_model_1_1 DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_extraction_qa_model_1_1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_extraction_qa_model_1_1` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_1_1_en_5.2.0_3.0_1701023031269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_1_1_en_5.2.0_3.0_1701023031269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_extraction_qa_model_1_1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_extraction_qa_model_1_1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_extraction_qa_model_1_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_Extraction_QA_Model_1.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_2_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_2_distilbert_en.md new file mode 100644 index 000000000000..2d7c961faa32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_1_2_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_extraction_qa_model_1_2_distilbert DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_extraction_qa_model_1_2_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_extraction_qa_model_1_2_distilbert` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_1_2_distilbert_en_5.2.0_3.0_1701020985102.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_1_2_distilbert_en_5.2.0_3.0_1701020985102.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_extraction_qa_model_1_2_distilbert","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("iotnation_companyname_extraction_qa_model_1_2_distilbert", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_extraction_qa_model_1_2_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_Extraction_QA_Model_1.2_Distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_unk_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_unk_test_en.md new file mode 100644 index 000000000000..ec54f69fc660 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_companyname_extraction_qa_model_unk_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_extraction_qa_model_unk_test DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_extraction_qa_model_unk_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_extraction_qa_model_unk_test` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_unk_test_en_5.2.0_3.0_1701039414043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_extraction_qa_model_unk_test_en_5.2.0_3.0_1701039414043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_extraction_qa_model_unk_test","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("iotnation_companyname_extraction_qa_model_unk_test", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_extraction_qa_model_unk_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_Extraction_QA_Model_UNK_Test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase_en.md b/docs/_posts/ahmedlone127/2023-11-26-iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase_en.md new file mode 100644 index 000000000000..5a9bbcd46a8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase` is an English model originally trained by chriskim2273.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase_en_5.2.0_3.0_1701028076153.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase_en_5.2.0_3.0_1701028076153.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_1_95_distilbert_unk_dataset_no_paraphrase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_1.95_DistilBert_UNK_DATASET_NO_PARAPHRASE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-jane_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-jane_model_en.md new file mode 100644 index 000000000000..01372b0645e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-jane_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jane_model DistilBertForQuestionAnswering from janecai1825 +author: John Snow Labs +name: jane_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jane_model` is an English model originally trained by janecai1825. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jane_model_en_5.2.0_3.0_1701042823181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jane_model_en_5.2.0_3.0_1701042823181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("jane_model","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("jane_model", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jane_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/janecai1825/jane_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-javad_mohammadzadeh_en.md b/docs/_posts/ahmedlone127/2023-11-26-javad_mohammadzadeh_en.md new file mode 100644 index 000000000000..43b18e3380ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-javad_mohammadzadeh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English javad_mohammadzadeh DistilBertForQuestionAnswering from mshkhabis +author: John Snow Labs +name: javad_mohammadzadeh +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `javad_mohammadzadeh` is an English model originally trained by mshkhabis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/javad_mohammadzadeh_en_5.2.0_3.0_1701039257492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/javad_mohammadzadeh_en_5.2.0_3.0_1701039257492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("javad_mohammadzadeh","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("javad_mohammadzadeh", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|javad_mohammadzadeh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mshkhabis/javad_mohammadzadeh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_jungnerd_en.md b/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_jungnerd_en.md new file mode 100644 index 000000000000..5dc584c377e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_jungnerd_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jungnerd_qa_model_jungnerd DistilBertForQuestionAnswering from jungnerd +author: John Snow Labs +name: jungnerd_qa_model_jungnerd +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jungnerd_qa_model_jungnerd` is an English model originally trained by jungnerd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jungnerd_qa_model_jungnerd_en_5.2.0_3.0_1701029930600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jungnerd_qa_model_jungnerd_en_5.2.0_3.0_1701029930600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("jungnerd_qa_model_jungnerd","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("jungnerd_qa_model_jungnerd", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jungnerd_qa_model_jungnerd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jungnerd/jungnerd_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_sunmin_dev_en.md b/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_sunmin_dev_en.md new file mode 100644 index 000000000000..14a49b960597 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-jungnerd_qa_model_sunmin_dev_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English jungnerd_qa_model_sunmin_dev DistilBertForQuestionAnswering from Sunmin-dev +author: John Snow Labs +name: jungnerd_qa_model_sunmin_dev +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jungnerd_qa_model_sunmin_dev` is an English model originally trained by Sunmin-dev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jungnerd_qa_model_sunmin_dev_en_5.2.0_3.0_1701035228194.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jungnerd_qa_model_sunmin_dev_en_5.2.0_3.0_1701035228194.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("jungnerd_qa_model_sunmin_dev","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("jungnerd_qa_model_sunmin_dev", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jungnerd_qa_model_sunmin_dev| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sunmin-dev/jungnerd_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-kaggle_llm_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-kaggle_llm_qa_model_en.md new file mode 100644 index 000000000000..139f1d8bf40c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-kaggle_llm_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kaggle_llm_qa_model DistilBertForQuestionAnswering from anirbanbanerjee170577 +author: John Snow Labs +name: kaggle_llm_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kaggle_llm_qa_model` is an English model originally trained by anirbanbanerjee170577. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kaggle_llm_qa_model_en_5.2.0_3.0_1701015194718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kaggle_llm_qa_model_en_5.2.0_3.0_1701015194718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("kaggle_llm_qa_model","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("kaggle_llm_qa_model", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kaggle_llm_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anirbanbanerjee170577/kaggle_llm_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-kaggle_q_a_en.md b/docs/_posts/ahmedlone127/2023-11-26-kaggle_q_a_en.md new file mode 100644 index 000000000000..dc60674accf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-kaggle_q_a_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kaggle_q_a DistilBertForQuestionAnswering from mrbach +author: John Snow Labs +name: kaggle_q_a +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kaggle_q_a` is an English model originally trained by mrbach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kaggle_q_a_en_5.2.0_3.0_1701015438029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kaggle_q_a_en_5.2.0_3.0_1701015438029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("kaggle_q_a","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("kaggle_q_a", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kaggle_q_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mrbach/kaggle_q_a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-limburgish_tis_tuned_2_en.md b/docs/_posts/ahmedlone127/2023-11-26-limburgish_tis_tuned_2_en.md new file mode 100644 index 000000000000..17567aa3ee06 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-limburgish_tis_tuned_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English limburgish_tis_tuned_2 DistilBertForQuestionAnswering from alexkueck +author: John Snow Labs +name: limburgish_tis_tuned_2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `limburgish_tis_tuned_2` is an English model originally trained by alexkueck. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/limburgish_tis_tuned_2_en_5.2.0_3.0_1701033561489.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/limburgish_tis_tuned_2_en_5.2.0_3.0_1701033561489.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("limburgish_tis_tuned_2","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("limburgish_tis_tuned_2", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|limburgish_tis_tuned_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/alexkueck/li-tis-tuned-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-m_distilbert_large_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-m_distilbert_large_qa_model_en.md new file mode 100644 index 000000000000..7fa9fab759e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-m_distilbert_large_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English m_distilbert_large_qa_model DistilBertForQuestionAnswering from Chetna19 +author: John Snow Labs +name: m_distilbert_large_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: +  type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `m_distilbert_large_qa_model` is an English model originally trained by Chetna19. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/m_distilbert_large_qa_model_en_5.2.0_3.0_1701014824329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/m_distilbert_large_qa_model_en_5.2.0_3.0_1701014824329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ +    .setInputCols(["question", "context"]) \ +    .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("m_distilbert_large_qa_model","en") \ +    .setInputCols(["document_question","document_context"]) \ +    .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() +    .setInputCols(Array("question", "context")) +    .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering +    .pretrained("m_distilbert_large_qa_model", "en") +    .setInputCols(Array("document_question","document_context")) +    .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|m_distilbert_large_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Chetna19/m_distilbert_large_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-machinehack_updated_en.md b/docs/_posts/ahmedlone127/2023-11-26-machinehack_updated_en.md new file mode 100644 index 000000000000..bee3d35906f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-machinehack_updated_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English machinehack_updated DistilBertForQuestionAnswering from sivapriyakumar16 +author: John Snow Labs +name: machinehack_updated +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `machinehack_updated` is an English model originally trained by sivapriyakumar16. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/machinehack_updated_en_5.2.0_3.0_1701014682935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/machinehack_updated_en_5.2.0_3.0_1701014682935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("machinehack_updated","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("machinehack_updated", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|machinehack_updated| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/sivapriyakumar16/machinehack_updated \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-metacade_queston_answer_distlbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-metacade_queston_answer_distlbert_en.md new file mode 100644 index 000000000000..4f55a2e61239 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-metacade_queston_answer_distlbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English metacade_queston_answer_distlbert DistilBertForQuestionAnswering from whalesdotxyz +author: John Snow Labs +name: metacade_queston_answer_distlbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `metacade_queston_answer_distlbert` is an English model originally trained by whalesdotxyz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/metacade_queston_answer_distlbert_en_5.2.0_3.0_1701039579131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/metacade_queston_answer_distlbert_en_5.2.0_3.0_1701039579131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("metacade_queston_answer_distlbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("metacade_queston_answer_distlbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|metacade_queston_answer_distlbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/whalesdotxyz/metacade_queston_answer_distlbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-model_qa_5_epoch_ru_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-model_qa_5_epoch_ru_finetuned_squad_en.md new file mode 100644 index 000000000000..9a3113a7bd82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-model_qa_5_epoch_ru_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_qa_5_epoch_ru_finetuned_squad DistilBertForQuestionAnswering from gallyamovi +author: John Snow Labs +name: model_qa_5_epoch_ru_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_qa_5_epoch_ru_finetuned_squad` is an English model originally trained by gallyamovi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_qa_5_epoch_ru_finetuned_squad_en_5.2.0_3.0_1701041219834.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_qa_5_epoch_ru_finetuned_squad_en_5.2.0_3.0_1701041219834.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("model_qa_5_epoch_ru_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("model_qa_5_epoch_ru_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_qa_5_epoch_ru_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/gallyamovi/model-QA-5-epoch-RU-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-multipleqg_full_ctxt_only_filtered_0_15_bertqa_en.md b/docs/_posts/ahmedlone127/2023-11-26-multipleqg_full_ctxt_only_filtered_0_15_bertqa_en.md new file mode 100644 index 000000000000..4f26713a7d2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-multipleqg_full_ctxt_only_filtered_0_15_bertqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English multipleqg_full_ctxt_only_filtered_0_15_bertqa DistilBertForQuestionAnswering from LeWince +author: John Snow Labs +name: multipleqg_full_ctxt_only_filtered_0_15_bertqa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multipleqg_full_ctxt_only_filtered_0_15_bertqa` is an English model originally trained by LeWince.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multipleqg_full_ctxt_only_filtered_0_15_bertqa_en_5.2.0_3.0_1701037636484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multipleqg_full_ctxt_only_filtered_0_15_bertqa_en_5.2.0_3.0_1701037636484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("multipleqg_full_ctxt_only_filtered_0_15_bertqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("multipleqg_full_ctxt_only_filtered_0_15_bertqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multipleqg_full_ctxt_only_filtered_0_15_bertqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LeWince/MultipleQG-Full_Ctxt_Only-filtered_0_15_BertQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-nepal_bhasa_dummy_file_en.md b/docs/_posts/ahmedlone127/2023-11-26-nepal_bhasa_dummy_file_en.md new file mode 100644 index 000000000000..099442fd9c99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-nepal_bhasa_dummy_file_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nepal_bhasa_dummy_file DistilBertForQuestionAnswering from sophchoe +author: John Snow Labs +name: nepal_bhasa_dummy_file +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nepal_bhasa_dummy_file` is an English model originally trained by sophchoe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nepal_bhasa_dummy_file_en_5.2.0_3.0_1701036296200.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nepal_bhasa_dummy_file_en_5.2.0_3.0_1701036296200.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("nepal_bhasa_dummy_file","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("nepal_bhasa_dummy_file", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nepal_bhasa_dummy_file| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sophchoe/new_dummy_file \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-nlp4web_en.md b/docs/_posts/ahmedlone127/2023-11-26-nlp4web_en.md new file mode 100644 index 000000000000..e86c58182ac8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-nlp4web_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp4web DistilBertForQuestionAnswering from Danghor +author: John Snow Labs +name: nlp4web +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp4web` is an English model originally trained by Danghor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp4web_en_5.2.0_3.0_1701041210146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp4web_en_5.2.0_3.0_1701041210146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("nlp4web","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("nlp4web", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp4web| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Danghor/NLP4Web \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-nlp4webexcercise_en.md b/docs/_posts/ahmedlone127/2023-11-26-nlp4webexcercise_en.md new file mode 100644 index 000000000000..0ef7b182935c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-nlp4webexcercise_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English nlp4webexcercise DistilBertForQuestionAnswering from IDontWantAnAccount +author: John Snow Labs +name: nlp4webexcercise +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp4webexcercise` is an English model originally trained by IDontWantAnAccount. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp4webexcercise_en_5.2.0_3.0_1701035657068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp4webexcercise_en_5.2.0_3.0_1701035657068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("nlp4webexcercise","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("nlp4webexcercise", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp4webexcercise| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/IDontWantAnAccount/NLP4WebExcercise \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-oneapi_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-oneapi_qa_model_en.md new file mode 100644 index 000000000000..87c8b09b43fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-oneapi_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English oneapi_qa_model DistilBertForQuestionAnswering from badalsahani +author: John Snow Labs +name: oneapi_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oneapi_qa_model` is an English model originally trained by badalsahani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oneapi_qa_model_en_5.2.0_3.0_1701015288954.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oneapi_qa_model_en_5.2.0_3.0_1701015288954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("oneapi_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("oneapi_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|oneapi_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/badalsahani/oneAPI_QA_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-output_byte0_en.md b/docs/_posts/ahmedlone127/2023-11-26-output_byte0_en.md new file mode 100644 index 000000000000..240a426859b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-output_byte0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English output_byte0 DistilBertForQuestionAnswering from Byte0 +author: John Snow Labs +name: output_byte0 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output_byte0` is an English model originally trained by Byte0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/output_byte0_en_5.2.0_3.0_1701042051553.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/output_byte0_en_5.2.0_3.0_1701042051553.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("output_byte0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("output_byte0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|output_byte0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Byte0/output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-output_gaohuakai_en.md b/docs/_posts/ahmedlone127/2023-11-26-output_gaohuakai_en.md new file mode 100644 index 000000000000..774d2ab3ae00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-output_gaohuakai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English output_gaohuakai DistilBertForQuestionAnswering from gaohuakai +author: John Snow Labs +name: output_gaohuakai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output_gaohuakai` is an English model originally trained by gaohuakai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/output_gaohuakai_en_5.2.0_3.0_1701038741130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/output_gaohuakai_en_5.2.0_3.0_1701038741130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("output_gaohuakai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("output_gaohuakai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|output_gaohuakai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/gaohuakai/output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-psy_q_a_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-psy_q_a_test_en.md new file mode 100644 index 000000000000..26b3e2031ba4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-psy_q_a_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English psy_q_a_test DistilBertForQuestionAnswering from plgrm720 +author: John Snow Labs +name: psy_q_a_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psy_q_a_test` is an English model originally trained by plgrm720. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psy_q_a_test_en_5.2.0_3.0_1701038441999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psy_q_a_test_en_5.2.0_3.0_1701038441999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# Assumes `data` is a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("psy_q_a_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// Assumes `data` is a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("psy_q_a_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|psy_q_a_test|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/plgrm720/psy_q_a_test
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-psychic_en.md b/docs/_posts/ahmedlone127/2023-11-26-psychic_en.md
new file mode 100644
index 000000000000..996c41e78969
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-psychic_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English psychic DistilBertForQuestionAnswering from HannaAbiAkl
+author: John Snow Labs
+name: psychic
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psychic` is an English model originally trained by HannaAbiAkl.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psychic_en_5.2.0_3.0_1701030085097.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psychic_en_5.2.0_3.0_1701030085097.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("psychic","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("psychic", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|psychic|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/HannaAbiAkl/psychic
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_bert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_bert_finetuned_squad_en.md
new file mode 100644
index 000000000000..93818dc3c747
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_bert_finetuned_squad_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_bert_finetuned_squad DistilBertForQuestionAnswering from jmparejaz
+author: John Snow Labs
+name: qa_bert_finetuned_squad
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_bert_finetuned_squad` is an English model originally trained by jmparejaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_bert_finetuned_squad_en_5.2.0_3.0_1701021653839.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_bert_finetuned_squad_en_5.2.0_3.0_1701021653839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_bert_finetuned_squad","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_bert_finetuned_squad", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_bert_finetuned_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/jmparejaz/qa_bert_finetuned-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_base_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_base_squad_en.md
new file mode 100644
index 000000000000..04f6b3e2f397
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_base_squad_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_distilbert_base_squad DistilBertForQuestionAnswering from ayoubkirouane
+author: John Snow Labs
+name: qa_distilbert_base_squad
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_distilbert_base_squad` is an English model originally trained by ayoubkirouane.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_distilbert_base_squad_en_5.2.0_3.0_1701016725870.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_distilbert_base_squad_en_5.2.0_3.0_1701016725870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_distilbert_base_squad","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_distilbert_base_squad", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_distilbert_base_squad|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ayoubkirouane/QA-DistilBERT-base-squad
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_en.md
new file mode 100644
index 000000000000..d1c3f62f24a4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_distilbert_finetuned DistilBertForQuestionAnswering from qiqiquq
+author: John Snow Labs
+name: qa_distilbert_finetuned
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_distilbert_finetuned` is an English model originally trained by qiqiquq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_distilbert_finetuned_en_5.2.0_3.0_1701021071741.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_distilbert_finetuned_en_5.2.0_3.0_1701021071741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_distilbert_finetuned","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_distilbert_finetuned", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_distilbert_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/qiqiquq/qa_distilbert_finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_test_en.md
new file mode 100644
index 000000000000..d8b4610f52b7
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_distilbert_finetuned_test_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_distilbert_finetuned_test DistilBertForQuestionAnswering from qiqiquq
+author: John Snow Labs
+name: qa_distilbert_finetuned_test
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_distilbert_finetuned_test` is an English model originally trained by qiqiquq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_distilbert_finetuned_test_en_5.2.0_3.0_1701024570278.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_distilbert_finetuned_test_en_5.2.0_3.0_1701024570278.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_distilbert_finetuned_test","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_distilbert_finetuned_test", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_distilbert_finetuned_test|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/qiqiquq/qa_distilbert_finetuned_test
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_english_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_english_en.md
new file mode 100644
index 000000000000..2ab9a299b173
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_english_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_english DistilBertForQuestionAnswering from mathildeparlo
+author: John Snow Labs
+name: qa_english
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_english` is an English model originally trained by mathildeparlo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_english_en_5.2.0_3.0_1701014972150.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_english_en_5.2.0_3.0_1701014972150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_english","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_english", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_english|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/mathildeparlo/qa_en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_model_aharneish_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_model_aharneish_en.md
new file mode 100644
index 000000000000..b45d873f4cd5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_model_aharneish_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_model_aharneish DistilBertForQuestionAnswering from Aharneish
+author: John Snow Labs
+name: qa_model_aharneish
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_aharneish` is an English model originally trained by Aharneish.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_aharneish_en_5.2.0_3.0_1701038296862.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_aharneish_en_5.2.0_3.0_1701038296862.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_aharneish","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_model_aharneish", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_model_aharneish|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Aharneish/qa-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_model_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_model_distilbert_en.md
new file mode 100644
index 000000000000..8a5ecde37959
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_model_distilbert_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_model_distilbert DistilBertForQuestionAnswering from SMD00
+author: John Snow Labs
+name: qa_model_distilbert
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_distilbert` is an English model originally trained by SMD00.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_distilbert_en_5.2.0_3.0_1701024853970.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_distilbert_en_5.2.0_3.0_1701024853970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_distilbert","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_model_distilbert", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_model_distilbert|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/SMD00/QA_model-distilbert
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_model_madhura_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_model_madhura_en.md
new file mode 100644
index 000000000000..0d6c803f74bb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_model_madhura_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_model_madhura DistilBertForQuestionAnswering from Madhura
+author: John Snow Labs
+name: qa_model_madhura
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_madhura` is an English model originally trained by Madhura.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_madhura_en_5.2.0_3.0_1701033137127.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_madhura_en_5.2.0_3.0_1701033137127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_madhura","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_model_madhura", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_model_madhura|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Madhura/qa-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_model_ptamm_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_model_ptamm_en.md
new file mode 100644
index 000000000000..b007c607d8e5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_model_ptamm_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_model_ptamm DistilBertForQuestionAnswering from ptamm
+author: John Snow Labs
+name: qa_model_ptamm
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_ptamm` is an English model originally trained by ptamm.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_ptamm_en_5.2.0_3.0_1701018970484.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_ptamm_en_5.2.0_3.0_1701018970484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_ptamm","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_model_ptamm", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|qa_model_ptamm|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/ptamm/qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_model_punnam_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_model_punnam_en.md
new file mode 100644
index 000000000000..3603b8da3cab
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-26-qa_model_punnam_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English qa_model_punnam DistilBertForQuestionAnswering from punnam
+author: John Snow Labs
+name: qa_model_punnam
+date: 2023-11-26
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_punnam` is an English model originally trained by punnam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_punnam_en_5.2.0_3.0_1701030257548.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_punnam_en_5.2.0_3.0_1701030257548.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_punnam","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("qa_model_punnam", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_punnam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/punnam/qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_nlp_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_nlp_model_en.md new file mode 100644 index 000000000000..97315c0e316f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qa_nlp_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_nlp_model DistilBertForQuestionAnswering from jolual2747 +author: John Snow Labs +name: qa_nlp_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_nlp_model` is an English model originally trained by jolual2747. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_nlp_model_en_5.2.0_3.0_1701037856991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_nlp_model_en_5.2.0_3.0_1701037856991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_nlp_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_nlp_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_nlp_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jolual2747/qa_nlp_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde_en.md new file mode 100644 index 000000000000..582a84b4d6a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde DistilBertForQuestionAnswering from prajwalJumde +author: John Snow Labs +name: qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde` is an English model originally trained by prajwalJumde.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde_en_5.2.0_3.0_1701038971992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde_en_5.2.0_3.0_1701038971992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_synthetic_data_only_18_aug_distilbert_base_prajwaljumde| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/prajwalJumde/QA_SYNTHETIC_DATA_ONLY_18_AUG_distilbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_tquad_distilbert_base_turkish_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_tquad_distilbert_base_turkish_en.md new file mode 100644 index 000000000000..1b848b9deeb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qa_tquad_distilbert_base_turkish_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_tquad_distilbert_base_turkish DistilBertForQuestionAnswering from Izzet +author: John Snow Labs +name: qa_tquad_distilbert_base_turkish +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_tquad_distilbert_base_turkish` is an English model originally trained by Izzet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_tquad_distilbert_base_turkish_en_5.2.0_3.0_1701016849913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_tquad_distilbert_base_turkish_en_5.2.0_3.0_1701016849913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_tquad_distilbert_base_turkish","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_tquad_distilbert_base_turkish", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_tquad_distilbert_base_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|251.8 MB| + +## References + +https://huggingface.co/Izzet/qa_tquad_distilbert-base-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qa_ytu_distilbert_base_turkish_en.md b/docs/_posts/ahmedlone127/2023-11-26-qa_ytu_distilbert_base_turkish_en.md new file mode 100644 index 000000000000..449c3f35b11a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qa_ytu_distilbert_base_turkish_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_ytu_distilbert_base_turkish DistilBertForQuestionAnswering from Izzet +author: John Snow Labs +name: qa_ytu_distilbert_base_turkish +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_ytu_distilbert_base_turkish` is an English model originally trained by Izzet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_ytu_distilbert_base_turkish_en_5.2.0_3.0_1701021650629.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_ytu_distilbert_base_turkish_en_5.2.0_3.0_1701021650629.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_ytu_distilbert_base_turkish","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_ytu_distilbert_base_turkish", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_ytu_distilbert_base_turkish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|251.8 MB| + +## References + +https://huggingface.co/Izzet/qa_ytu_distilbert-base-turkish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qabert_small_en.md b/docs/_posts/ahmedlone127/2023-11-26-qabert_small_en.md new file mode 100644 index 000000000000..1f7c3094e4b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qabert_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qabert_small DistilBertForQuestionAnswering from SRDdev +author: John Snow Labs +name: qabert_small +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qabert_small` is an English model originally trained by SRDdev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qabert_small_en_5.2.0_3.0_1701014181595.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qabert_small_en_5.2.0_3.0_1701014181595.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qabert_small","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qabert_small", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qabert_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SRDdev/QABERT-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-qamodel_distillbert_squad_small_en.md b/docs/_posts/ahmedlone127/2023-11-26-qamodel_distillbert_squad_small_en.md new file mode 100644 index 000000000000..f2d9135ff239 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-qamodel_distillbert_squad_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qamodel_distillbert_squad_small DistilBertForQuestionAnswering from rashmikamath01 +author: John Snow Labs +name: qamodel_distillbert_squad_small +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qamodel_distillbert_squad_small` is an English model originally trained by rashmikamath01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qamodel_distillbert_squad_small_en_5.2.0_3.0_1701025912106.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qamodel_distillbert_squad_small_en_5.2.0_3.0_1701025912106.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qamodel_distillbert_squad_small","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qamodel_distillbert_squad_small", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qamodel_distillbert_squad_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rashmikamath01/qamodel-distillbert-squad-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-question_answer_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-question_answer_model_en.md new file mode 100644 index 000000000000..5553563018c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-question_answer_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answer_model DistilBertForQuestionAnswering from hameersiddique +author: John Snow Labs +name: question_answer_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answer_model` is an English model originally trained by hameersiddique. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answer_model_en_5.2.0_3.0_1701033997105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answer_model_en_5.2.0_3.0_1701033997105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answer_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answer_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answer_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hameersiddique/question_answer_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-question_answering_falconsai_en.md b/docs/_posts/ahmedlone127/2023-11-26-question_answering_falconsai_en.md new file mode 100644 index 000000000000..4f6467fc836b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-question_answering_falconsai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_falconsai DistilBertForQuestionAnswering from Falconsai +author: John Snow Labs +name: question_answering_falconsai +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_falconsai` is an English model originally trained by Falconsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_falconsai_en_5.2.0_3.0_1701015692327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_falconsai_en_5.2.0_3.0_1701015692327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answering_falconsai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answering_falconsai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_falconsai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Falconsai/question_answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-question_answering_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-question_answering_model_en.md new file mode 100644 index 000000000000..e223eac60c82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-question_answering_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_model DistilBertForQuestionAnswering from Areeb123 +author: John Snow Labs +name: question_answering_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_model` is an English model originally trained by Areeb123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_model_en_5.2.0_3.0_1701016991440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_model_en_5.2.0_3.0_1701016991440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answering_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answering_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Areeb123/Question_Answering_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-question_answering_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-question_answering_test_en.md new file mode 100644 index 000000000000..5fc6f3757066 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-question_answering_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_test DistilBertForQuestionAnswering from SalmonAI123 +author: John Snow Labs +name: question_answering_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_test` is an English model originally trained by SalmonAI123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_test_en_5.2.0_3.0_1701042788080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_test_en_5.2.0_3.0_1701042788080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answering_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answering_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SalmonAI123/Question_answering_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-question_answering_v2_en.md b/docs/_posts/ahmedlone127/2023-11-26-question_answering_v2_en.md new file mode 100644 index 000000000000..e37eea88952c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-question_answering_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_v2 DistilBertForQuestionAnswering from Falconsai +author: John Snow Labs +name: question_answering_v2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_v2` is an English model originally trained by Falconsai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_v2_en_5.2.0_3.0_1701014380274.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_v2_en_5.2.0_3.0_1701014380274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answering_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answering_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Falconsai/question_answering_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_distilbert_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_distilbert_squad_en.md new file mode 100644 index 000000000000..f0894b6324a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_distilbert_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_distilbert_squad DistilBertForQuestionAnswering from StatsGary +author: John Snow Labs +name: questionanswering_distilbert_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_distilbert_squad` is an English model originally trained by StatsGary. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_distilbert_squad_en_5.2.0_3.0_1701025599772.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_distilbert_squad_en_5.2.0_3.0_1701025599772.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_distilbert_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_distilbert_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_distilbert_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/StatsGary/questionanswering-distilbert-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v1_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v1_en.md new file mode 100644 index 000000000000..79e0b062191e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_v1 DistilBertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: questionanswering_v1 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_v1` is an English model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_v1_en_5.2.0_3.0_1701029463793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_v1_en_5.2.0_3.0_1701029463793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/abdalrahmanshahrour/questionanswering-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v4_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v4_en.md new file mode 100644 index 000000000000..b3f9ef9f2094 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_v4 DistilBertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: questionanswering_v4 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_v4` is an English model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_v4_en_5.2.0_3.0_1701025343717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_v4_en_5.2.0_3.0_1701025343717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_v4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_v4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abdalrahmanshahrour/questionanswering-v4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v5_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v5_en.md new file mode 100644 index 000000000000..fd0e5a550d99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_v5 DistilBertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: questionanswering_v5 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_v5` is an English model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_v5_en_5.2.0_3.0_1701022590100.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_v5_en_5.2.0_3.0_1701022590100.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_v5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_v5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_v5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/abdalrahmanshahrour/questionanswering-v5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v7_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v7_en.md new file mode 100644 index 000000000000..a468c871c114 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_v7 DistilBertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: questionanswering_v7 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_v7` is an English model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_v7_en_5.2.0_3.0_1701025017346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_v7_en_5.2.0_3.0_1701025017346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_v7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_v7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_v7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abdalrahmanshahrour/questionanswering-v7 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v8_en.md b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v8_en.md new file mode 100644 index 000000000000..617bbef1867e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-questionanswering_v8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English questionanswering_v8 DistilBertForQuestionAnswering from abdalrahmanshahrour +author: John Snow Labs +name: questionanswering_v8 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `questionanswering_v8` is an English model originally trained by abdalrahmanshahrour. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/questionanswering_v8_en_5.2.0_3.0_1701017973673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/questionanswering_v8_en_5.2.0_3.0_1701017973673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("questionanswering_v8","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("questionanswering_v8", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|questionanswering_v8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abdalrahmanshahrour/questionanswering-v8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-shiyan2_en.md b/docs/_posts/ahmedlone127/2023-11-26-shiyan2_en.md new file mode 100644 index 000000000000..f683cc99aa71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-shiyan2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English shiyan2 DistilBertForQuestionAnswering from wbxlala +author: John Snow Labs +name: shiyan2 +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `shiyan2` is an English model originally trained by wbxlala. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/shiyan2_en_5.2.0_3.0_1701017272285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/shiyan2_en_5.2.0_3.0_1701017272285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("shiyan2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("shiyan2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|shiyan2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/wbxlala/shiyan2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-simpletransformer_qa_distilbert_base_cased_en.md b/docs/_posts/ahmedlone127/2023-11-26-simpletransformer_qa_distilbert_base_cased_en.md new file mode 100644 index 000000000000..2896c17a2d02 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-simpletransformer_qa_distilbert_base_cased_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English simpletransformer_qa_distilbert_base_cased DistilBertForQuestionAnswering from mtanzi +author: John Snow Labs +name: simpletransformer_qa_distilbert_base_cased +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `simpletransformer_qa_distilbert_base_cased` is an English model originally trained by mtanzi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/simpletransformer_qa_distilbert_base_cased_en_5.2.0_3.0_1701033865972.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/simpletransformer_qa_distilbert_base_cased_en_5.2.0_3.0_1701033865972.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("simpletransformer_qa_distilbert_base_cased","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("simpletransformer_qa_distilbert_base_cased", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|simpletransformer_qa_distilbert_base_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/mtanzi/simpletransformer-qa-distilbert-base-cased \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-synthetic_qa_en.md b/docs/_posts/ahmedlone127/2023-11-26-synthetic_qa_en.md new file mode 100644 index 000000000000..ed4cca8cd53c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-synthetic_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English synthetic_qa DistilBertForQuestionAnswering from cesare-spinoso +author: John Snow Labs +name: synthetic_qa +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `synthetic_qa` is an English model originally trained by cesare-spinoso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/synthetic_qa_en_5.2.0_3.0_1701038491257.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/synthetic_qa_en_5.2.0_3.0_1701038491257.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("synthetic_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("synthetic_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|synthetic_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cesare-spinoso/synthetic_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-task_qa_distilbert_en.md b/docs/_posts/ahmedlone127/2023-11-26-task_qa_distilbert_en.md new file mode 100644 index 000000000000..7e2eb5a88cbf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-task_qa_distilbert_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English task_qa_distilbert DistilBertForQuestionAnswering from smile367 +author: John Snow Labs +name: task_qa_distilbert +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `task_qa_distilbert` is an English model originally trained by smile367. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/task_qa_distilbert_en_5.2.0_3.0_1701022315012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/task_qa_distilbert_en_5.2.0_3.0_1701022315012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("task_qa_distilbert","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("task_qa_distilbert", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|task_qa_distilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/smile367/task_qa_distilbert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-test_q_a_1st_en.md b/docs/_posts/ahmedlone127/2023-11-26-test_q_a_1st_en.md new file mode 100644 index 000000000000..b3bbf8fa8463 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-test_q_a_1st_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_q_a_1st DistilBertForQuestionAnswering from mrbach +author: John Snow Labs +name: test_q_a_1st +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_q_a_1st` is an English model originally trained by mrbach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_q_a_1st_en_5.2.0_3.0_1701025154982.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_q_a_1st_en_5.2.0_3.0_1701025154982.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("test_q_a_1st","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("test_q_a_1st", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_q_a_1st| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mrbach/test_q_a_1st \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-tiny_random_distilbertforquestionanswering_en.md b/docs/_posts/ahmedlone127/2023-11-26-tiny_random_distilbertforquestionanswering_en.md new file mode 100644 index 000000000000..ed092ab64875 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-tiny_random_distilbertforquestionanswering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tiny_random_distilbertforquestionanswering DistilBertForQuestionAnswering from hf-tiny-model-private +author: John Snow Labs +name: tiny_random_distilbertforquestionanswering +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tiny_random_distilbertforquestionanswering` is an English model originally trained by hf-tiny-model-private. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforquestionanswering_en_5.2.0_3.0_1701041229814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tiny_random_distilbertforquestionanswering_en_5.2.0_3.0_1701041229814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("tiny_random_distilbertforquestionanswering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("tiny_random_distilbertforquestionanswering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tiny_random_distilbertforquestionanswering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|343.9 KB| + +## References + +https://huggingface.co/hf-tiny-model-private/tiny-random-DistilBertForQuestionAnswering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-ub_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-ub_model_en.md new file mode 100644 index 000000000000..c190aa346069 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-ub_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ub_model DistilBertForQuestionAnswering from ratno +author: John Snow Labs +name: ub_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ub_model` is an English model originally trained by ratno. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ub_model_en_5.2.0_3.0_1701036504920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ub_model_en_5.2.0_3.0_1701036504920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("ub_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("ub_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ub_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ratno/ub_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-upload_test_en.md b/docs/_posts/ahmedlone127/2023-11-26-upload_test_en.md new file mode 100644 index 000000000000..de27f3f2e15a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-upload_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English upload_test DistilBertForQuestionAnswering from 96harsh56 +author: John Snow Labs +name: upload_test +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `upload_test` is an English model originally trained by 96harsh56. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/upload_test_en_5.2.0_3.0_1701033279131.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/upload_test_en_5.2.0_3.0_1701033279131.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("upload_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("upload_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|upload_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/96harsh56/upload_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-valrad_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-26-valrad_qa_model_en.md new file mode 100644 index 000000000000..9fbf119ba2ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-valrad_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English valrad_qa_model DistilBertForQuestionAnswering from radyad +author: John Snow Labs +name: valrad_qa_model +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `valrad_qa_model` is an English model originally trained by radyad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/valrad_qa_model_en_5.2.0_3.0_1701036344346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/valrad_qa_model_en_5.2.0_3.0_1701036344346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("valrad_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("valrad_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|valrad_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/radyad/valrad_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-vmehlin_distilbert_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-26-vmehlin_distilbert_finetuned_squad_en.md new file mode 100644 index 000000000000..9b972e2edcdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-vmehlin_distilbert_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English vmehlin_distilbert_finetuned_squad DistilBertForQuestionAnswering from vanme +author: John Snow Labs +name: vmehlin_distilbert_finetuned_squad +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vmehlin_distilbert_finetuned_squad` is an English model originally trained by vanme. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vmehlin_distilbert_finetuned_squad_en_5.2.0_3.0_1701017577302.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vmehlin_distilbert_finetuned_squad_en_5.2.0_3.0_1701017577302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("vmehlin_distilbert_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("vmehlin_distilbert_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vmehlin_distilbert_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vanme/vmehlin_distilbert-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-work_questions_en.md b/docs/_posts/ahmedlone127/2023-11-26-work_questions_en.md new file mode 100644 index 000000000000..9f826ac9c8e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-work_questions_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English work_questions DistilBertForQuestionAnswering from Jornt +author: John Snow Labs +name: work_questions +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `work_questions` is an English model originally trained by Jornt. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/work_questions_en_5.2.0_3.0_1701038593846.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/work_questions_en_5.2.0_3.0_1701038593846.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("work_questions","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("work_questions", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|work_questions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Jornt/work-questions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-yb_model_dir_en.md b/docs/_posts/ahmedlone127/2023-11-26-yb_model_dir_en.md new file mode 100644 index 000000000000..1d602dc07586 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-yb_model_dir_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English yb_model_dir DistilBertForQuestionAnswering from suhcrates +author: John Snow Labs +name: yb_model_dir +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yb_model_dir` is an English model originally trained by suhcrates. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yb_model_dir_en_5.2.0_3.0_1701039193485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yb_model_dir_en_5.2.0_3.0_1701039193485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("yb_model_dir","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("yb_model_dir", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yb_model_dir| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/suhcrates/yb_model_dir \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-26-zgbot_en.md b/docs/_posts/ahmedlone127/2023-11-26-zgbot_en.md new file mode 100644 index 000000000000..8f274ccca626 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-26-zgbot_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English zgbot DistilBertForQuestionAnswering from FranzderPapst +author: John Snow Labs +name: zgbot +date: 2023-11-26 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `zgbot` is an English model originally trained by FranzderPapst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/zgbot_en_5.2.0_3.0_1701026084346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/zgbot_en_5.2.0_3.0_1701026084346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("zgbot","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("zgbot", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|zgbot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FranzderPapst/ZGBot \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-1000length_en.md b/docs/_posts/ahmedlone127/2023-11-27-1000length_en.md new file mode 100644 index 000000000000..f94bbee363ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-1000length_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English 1000length DistilBertForQuestionAnswering from kevinbror +author: John Snow Labs +name: 1000length +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `1000length` is an English model originally trained by kevinbror. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/1000length_en_5.2.0_3.0_1701073906898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/1000length_en_5.2.0_3.0_1701073906898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("1000length","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("1000length", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|1000length| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kevinbror/1000length \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-adhd_test_qa_model_linatarasenko99_en.md b/docs/_posts/ahmedlone127/2023-11-27-adhd_test_qa_model_linatarasenko99_en.md new file mode 100644 index 000000000000..0e23c8d37f4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-adhd_test_qa_model_linatarasenko99_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English adhd_test_qa_model_linatarasenko99 DistilBertForQuestionAnswering from LinaTarasenko99 +author: John Snow Labs +name: adhd_test_qa_model_linatarasenko99 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adhd_test_qa_model_linatarasenko99` is an English model originally trained by LinaTarasenko99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adhd_test_qa_model_linatarasenko99_en_5.2.0_3.0_1701046541404.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adhd_test_qa_model_linatarasenko99_en_5.2.0_3.0_1701046541404.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("adhd_test_qa_model_linatarasenko99","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("adhd_test_qa_model_linatarasenko99", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adhd_test_qa_model_linatarasenko99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LinaTarasenko99/ADHD_Test_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127337_en.md b/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127337_en.md new file mode 100644 index 000000000000..174e8e64a301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127337_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_college4_54345127337 DistilBertForQuestionAnswering from harshith34 +author: John Snow Labs +name: autotrain_college4_54345127337 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_college4_54345127337` is an English model originally trained by harshith34. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_college4_54345127337_en_5.2.0_3.0_1701044662857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_college4_54345127337_en_5.2.0_3.0_1701044662857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_college4_54345127337","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("autotrain_college4_54345127337", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_college4_54345127337| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/harshith34/autotrain-college4-54345127337 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127338_en.md b/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127338_en.md new file mode 100644 index 000000000000..296b7ffbb2af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-autotrain_college4_54345127338_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_college4_54345127338 DistilBertForQuestionAnswering from harshith34 +author: John Snow Labs +name: autotrain_college4_54345127338 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_college4_54345127338` is an English model originally trained by harshith34. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_college4_54345127338_en_5.2.0_3.0_1701052322753.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_college4_54345127338_en_5.2.0_3.0_1701052322753.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_college4_54345127338","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("autotrain_college4_54345127338", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_college4_54345127338| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/harshith34/autotrain-college4-54345127338 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-autotrain_data_fac_3_55097128679_en.md b/docs/_posts/ahmedlone127/2023-11-27-autotrain_data_fac_3_55097128679_en.md new file mode 100644 index 000000000000..20bd2fd418e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-autotrain_data_fac_3_55097128679_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_data_fac_3_55097128679 DistilBertForQuestionAnswering from harshith34 +author: John Snow Labs +name: autotrain_data_fac_3_55097128679 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_data_fac_3_55097128679` is an English model originally trained by harshith34. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_data_fac_3_55097128679_en_5.2.0_3.0_1701060501231.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_data_fac_3_55097128679_en_5.2.0_3.0_1701060501231.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_data_fac_3_55097128679","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("autotrain_data_fac_3_55097128679", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_data_fac_3_55097128679| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/harshith34/autotrain-data-fac-3-55097128679 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-autotrain_ltu_distil_68049137171_en.md b/docs/_posts/ahmedlone127/2023-11-27-autotrain_ltu_distil_68049137171_en.md new file mode 100644 index 000000000000..5065aaf48e9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-autotrain_ltu_distil_68049137171_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English autotrain_ltu_distil_68049137171 DistilBertForQuestionAnswering from Inashamad +author: John Snow Labs +name: autotrain_ltu_distil_68049137171 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_ltu_distil_68049137171` is an English model originally trained by Inashamad. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_ltu_distil_68049137171_en_5.2.0_3.0_1701045384968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_ltu_distil_68049137171_en_5.2.0_3.0_1701045384968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("autotrain_ltu_distil_68049137171","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("autotrain_ltu_distil_68049137171", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_ltu_distil_68049137171| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Inashamad/autotrain-ltu-distil-68049137171 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-azam_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-azam_qa_model_en.md new file mode 100644 index 000000000000..1cae793dea8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-azam_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English azam_qa_model DistilBertForQuestionAnswering from Azamsayeed +author: John Snow Labs +name: azam_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `azam_qa_model` is an English model originally trained by Azamsayeed. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/azam_qa_model_en_5.2.0_3.0_1701043372970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/azam_qa_model_en_5.2.0_3.0_1701043372970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("azam_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("azam_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|azam_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Azamsayeed/azam_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver1_en.md b/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver1_en.md new file mode 100644 index 000000000000..7c42e6d43566 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bankstatementmodelver1 DistilBertForQuestionAnswering from Souvik123 +author: John Snow Labs +name: bankstatementmodelver1 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bankstatementmodelver1` is an English model originally trained by Souvik123. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bankstatementmodelver1_en_5.2.0_3.0_1701064935279.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bankstatementmodelver1_en_5.2.0_3.0_1701064935279.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bankstatementmodelver1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bankstatementmodelver1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bankstatementmodelver1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.3 MB| + +## References + +https://huggingface.co/Souvik123/bankstatementmodelver1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver2_en.md b/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver2_en.md new file mode 100644 index 000000000000..64cd0814e2ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-bankstatementmodelver2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bankstatementmodelver2 DistilBertForQuestionAnswering from Souvik123 +author: John Snow Labs +name: bankstatementmodelver2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bankstatementmodelver2` is an English model originally trained by Souvik123. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bankstatementmodelver2_en_5.2.0_3.0_1701053341298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bankstatementmodelver2_en_5.2.0_3.0_1701053341298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bankstatementmodelver2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bankstatementmodelver2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bankstatementmodelver2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.0 MB| + +## References + +https://huggingface.co/Souvik123/bankstatementmodelver2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-baseline_en.md b/docs/_posts/ahmedlone127/2023-11-27-baseline_en.md new file mode 100644 index 000000000000..7e45e2fec4fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-baseline_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English baseline DistilBertForQuestionAnswering from leslielleslles +author: John Snow Labs +name: baseline +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `baseline` is an English model originally trained by leslielleslles. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/baseline_en_5.2.0_3.0_1701058856080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/baseline_en_5.2.0_3.0_1701058856080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("baseline","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("baseline", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|baseline| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/leslielleslles/baseline \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-bert_poa_en.md b/docs/_posts/ahmedlone127/2023-11-27-bert_poa_en.md new file mode 100644 index 000000000000..db1be5afcf7f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-bert_poa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_poa DistilBertForQuestionAnswering from sabrinah +author: John Snow Labs +name: bert_poa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_poa` is an English model originally trained by sabrinah. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_poa_en_5.2.0_3.0_1701084068585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_poa_en_5.2.0_3.0_1701084068585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_poa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_poa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_poa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sabrinah/BERT-PoA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-bert_qa_model_mabecerra100_en.md b/docs/_posts/ahmedlone127/2023-11-27-bert_qa_model_mabecerra100_en.md new file mode 100644 index 000000000000..d8649b3e5b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-bert_qa_model_mabecerra100_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English bert_qa_model_mabecerra100 DistilBertForQuestionAnswering from mabecerra100 +author: John Snow Labs +name: bert_qa_model_mabecerra100 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_qa_model_mabecerra100` is an English model originally trained by mabecerra100. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_qa_model_mabecerra100_en_5.2.0_3.0_1701043703536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_qa_model_mabecerra100_en_5.2.0_3.0_1701043703536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("bert_qa_model_mabecerra100","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("bert_qa_model_mabecerra100", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_qa_model_mabecerra100| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mabecerra100/bert_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_model_classification_w_adapter_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_model_classification_w_adapter_en.md new file mode 100644 index 000000000000..d7a9639722f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_model_classification_w_adapter_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_model_classification_w_adapter DistilBertForQuestionAnswering from houdi +author: John Snow Labs +name: burmese_awesome_model_classification_w_adapter +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_classification_w_adapter` is an English model originally trained by houdi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_classification_w_adapter_en_5.2.0_3.0_1701044176649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_classification_w_adapter_en_5.2.0_3.0_1701044176649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_model_classification_w_adapter","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// `data` is assumed to be a Spark DataFrame with string columns "question" and "context" + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_model_classification_w_adapter", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_classification_w_adapter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/houdi/my_awesome_model_classification_w_adapter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_en.md new file mode 100644 index 000000000000..4a939d1540be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa DistilBertForQuestionAnswering from junkmind +author: John Snow Labs +name: burmese_awesome_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa` is an English model originally trained by junkmind. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_en_5.2.0_3.0_1701062937508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_en_5.2.0_3.0_1701062937508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/junkmind/my_awesome_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model1_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model1_en.md new file mode 100644 index 000000000000..030a4ff1de0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model1 DistilBertForQuestionAnswering from kkkh1 +author: John Snow Labs +name: burmese_awesome_qa_model1 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model1` is an English model originally trained by kkkh1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model1_en_5.2.0_3.0_1701073038960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model1_en_5.2.0_3.0_1701073038960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kkkh1/my_awesome_qa_model1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_aanchalsatyan_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_aanchalsatyan_en.md new file mode 100644 index 000000000000..31fda9bb2a90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_aanchalsatyan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_aanchalsatyan DistilBertForQuestionAnswering from aanchalsatyan +author: John Snow Labs +name: burmese_awesome_qa_model_aanchalsatyan +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_aanchalsatyan` is an English model originally trained by aanchalsatyan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_aanchalsatyan_en_5.2.0_3.0_1701083606464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_aanchalsatyan_en_5.2.0_3.0_1701083606464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_aanchalsatyan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_aanchalsatyan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_aanchalsatyan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aanchalsatyan/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abhinav_reddy_hugging_face_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abhinav_reddy_hugging_face_en.md new file mode 100644 index 000000000000..78db0cee473e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abhinav_reddy_hugging_face_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_abhinav_reddy_hugging_face DistilBertForQuestionAnswering from abhinav-reddy-hugging-face +author: John Snow Labs +name: burmese_awesome_qa_model_abhinav_reddy_hugging_face +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_abhinav_reddy_hugging_face` is an English model originally trained by abhinav-reddy-hugging-face.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_abhinav_reddy_hugging_face_en_5.2.0_3.0_1701043733715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_abhinav_reddy_hugging_face_en_5.2.0_3.0_1701043733715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_abhinav_reddy_hugging_face","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_abhinav_reddy_hugging_face", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_abhinav_reddy_hugging_face| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abhinav-reddy-hugging-face/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abishines_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abishines_en.md new file mode 100644 index 000000000000..240570e6a8db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_abishines_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_abishines DistilBertForQuestionAnswering from abishines +author: John Snow Labs +name: burmese_awesome_qa_model_abishines +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_abishines` is an English model originally trained by abishines. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_abishines_en_5.2.0_3.0_1701078597170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_abishines_en_5.2.0_3.0_1701078597170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_abishines","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_abishines", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_abishines| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abishines/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_advaiths7857_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_advaiths7857_en.md new file mode 100644 index 000000000000..18a781ed4a30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_advaiths7857_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_advaiths7857 DistilBertForQuestionAnswering from advaithS7857 +author: John Snow Labs +name: burmese_awesome_qa_model_advaiths7857 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_advaiths7857` is an English model originally trained by advaithS7857. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_advaiths7857_en_5.2.0_3.0_1701048787685.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_advaiths7857_en_5.2.0_3.0_1701048787685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_advaiths7857","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_advaiths7857", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_advaiths7857| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/advaithS7857/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_afinucci_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_afinucci_en.md new file mode 100644 index 000000000000..87382c8ed9ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_afinucci_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_afinucci DistilBertForQuestionAnswering from Afinucci +author: John Snow Labs +name: burmese_awesome_qa_model_afinucci +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_afinucci` is an English model originally trained by Afinucci. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_afinucci_en_5.2.0_3.0_1701044029580.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_afinucci_en_5.2.0_3.0_1701044029580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_afinucci","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_afinucci", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_afinucci| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Afinucci/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_akashlinux_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_akashlinux_en.md new file mode 100644 index 000000000000..b8d1b50f6c1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_akashlinux_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_akashlinux DistilBertForQuestionAnswering from akashlinux +author: John Snow Labs +name: burmese_awesome_qa_model_akashlinux +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_akashlinux` is an English model originally trained by akashlinux. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_akashlinux_en_5.2.0_3.0_1701057449398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_akashlinux_en_5.2.0_3.0_1701057449398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_akashlinux","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_akashlinux", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_akashlinux| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/akashlinux/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_al123_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_al123_en.md new file mode 100644 index 000000000000..9c2c8a11325d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_al123_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_al123 DistilBertForQuestionAnswering from al123 +author: John Snow Labs +name: burmese_awesome_qa_model_al123 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_al123` is an English model originally trained by al123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_al123_en_5.2.0_3.0_1701054349289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_al123_en_5.2.0_3.0_1701054349289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_al123","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_al123", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_al123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/al123/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexperkin_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexperkin_en.md new file mode 100644 index 000000000000..b913163bc649 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexperkin_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_alexperkin DistilBertForQuestionAnswering from AlexPerkin +author: John Snow Labs +name: burmese_awesome_qa_model_alexperkin +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_alexperkin` is an English model originally trained by AlexPerkin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_alexperkin_en_5.2.0_3.0_1701043738522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_alexperkin_en_5.2.0_3.0_1701043738522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_alexperkin","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_alexperkin", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_alexperkin| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AlexPerkin/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexrider_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexrider_en.md new file mode 100644 index 000000000000..5b7dae2b4619 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_alexrider_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_alexrider DistilBertForQuestionAnswering from alexrider +author: John Snow Labs +name: burmese_awesome_qa_model_alexrider +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_alexrider` is an English model originally trained by alexrider. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_alexrider_en_5.2.0_3.0_1701079019117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_alexrider_en_5.2.0_3.0_1701079019117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_alexrider","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_alexrider", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_alexrider| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/alexrider/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_allancuenca_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_allancuenca_en.md new file mode 100644 index 000000000000..4b9fe6ca782b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_allancuenca_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_allancuenca DistilBertForQuestionAnswering from allancuenca +author: John Snow Labs +name: burmese_awesome_qa_model_allancuenca +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_allancuenca` is an English model originally trained by allancuenca. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_allancuenca_en_5.2.0_3.0_1701086611913.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_allancuenca_en_5.2.0_3.0_1701086611913.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_allancuenca","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_allancuenca", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_allancuenca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/allancuenca/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_altamirosantos_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_altamirosantos_en.md new file mode 100644 index 000000000000..585fbd0a0f5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_altamirosantos_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_altamirosantos DistilBertForQuestionAnswering from altamirosantos +author: John Snow Labs +name: burmese_awesome_qa_model_altamirosantos +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_altamirosantos` is an English model originally trained by altamirosantos. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_altamirosantos_en_5.2.0_3.0_1701084581208.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_altamirosantos_en_5.2.0_3.0_1701084581208.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_altamirosantos","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_altamirosantos", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_altamirosantos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/altamirosantos/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amoghnagunoori_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amoghnagunoori_en.md new file mode 100644 index 000000000000..887f069e7baf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amoghnagunoori_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_amoghnagunoori DistilBertForQuestionAnswering from AmoghNagunoori +author: John Snow Labs +name: burmese_awesome_qa_model_amoghnagunoori +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_amoghnagunoori` is an English model originally trained by AmoghNagunoori. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_amoghnagunoori_en_5.2.0_3.0_1701064237220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_amoghnagunoori_en_5.2.0_3.0_1701064237220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_amoghnagunoori","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_amoghnagunoori", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_amoghnagunoori| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AmoghNagunoori/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amrutha3899_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amrutha3899_en.md new file mode 100644 index 000000000000..dd1c4a759b3c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_amrutha3899_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_amrutha3899 DistilBertForQuestionAnswering from amrutha3899 +author: John Snow Labs +name: burmese_awesome_qa_model_amrutha3899 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_amrutha3899` is an English model originally trained by amrutha3899. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_amrutha3899_en_5.2.0_3.0_1701059422445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_amrutha3899_en_5.2.0_3.0_1701059422445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_amrutha3899","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_amrutha3899", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_amrutha3899| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/amrutha3899/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anhoang_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anhoang_en.md new file mode 100644 index 000000000000..8a026dcecfa9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anhoang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_anhoang DistilBertForQuestionAnswering from AnHoang +author: John Snow Labs +name: burmese_awesome_qa_model_anhoang +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_anhoang` is an English model originally trained by AnHoang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anhoang_en_5.2.0_3.0_1701091182521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anhoang_en_5.2.0_3.0_1701091182521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_anhoang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_anhoang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_anhoang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AnHoang/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ani123456789_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ani123456789_en.md new file mode 100644 index 000000000000..c3343cc16ba7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ani123456789_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ani123456789 DistilBertForQuestionAnswering from ani123456789 +author: John Snow Labs +name: burmese_awesome_qa_model_ani123456789 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ani123456789` is an English model originally trained by ani123456789. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ani123456789_en_5.2.0_3.0_1701046283158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ani123456789_en_5.2.0_3.0_1701046283158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ani123456789","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ani123456789", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ani123456789| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ani123456789/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anirbankgec_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anirbankgec_en.md new file mode 100644 index 000000000000..2b8b5333e665 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anirbankgec_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_anirbankgec DistilBertForQuestionAnswering from anirbankgec +author: John Snow Labs +name: burmese_awesome_qa_model_anirbankgec +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_anirbankgec` is an English model originally trained by anirbankgec. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anirbankgec_en_5.2.0_3.0_1701044347328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anirbankgec_en_5.2.0_3.0_1701044347328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_anirbankgec","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_anirbankgec", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_anirbankgec| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anirbankgec/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anjalidagar_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anjalidagar_en.md new file mode 100644 index 000000000000..74a9bcfd7d5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anjalidagar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_anjalidagar DistilBertForQuestionAnswering from anjalidagar +author: John Snow Labs +name: burmese_awesome_qa_model_anjalidagar +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_anjalidagar` is an English model originally trained by anjalidagar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anjalidagar_en_5.2.0_3.0_1701083482158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anjalidagar_en_5.2.0_3.0_1701083482158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_anjalidagar","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_anjalidagar", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_anjalidagar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anjalidagar/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anyuanay_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anyuanay_en.md new file mode 100644 index 000000000000..a72a277b3de0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_anyuanay_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_anyuanay DistilBertForQuestionAnswering from anyuanay +author: John Snow Labs +name: burmese_awesome_qa_model_anyuanay +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_anyuanay` is an English model originally trained by anyuanay. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anyuanay_en_5.2.0_3.0_1701065214214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_anyuanay_en_5.2.0_3.0_1701065214214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_anyuanay","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_anyuanay", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_anyuanay| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anyuanay/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arkash_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arkash_en.md new file mode 100644 index 000000000000..f1fd66c1abda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arkash_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_arkash DistilBertForQuestionAnswering from Arkash +author: John Snow Labs +name: burmese_awesome_qa_model_arkash +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_arkash` is an English model originally trained by Arkash. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_arkash_en_5.2.0_3.0_1701052445985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_arkash_en_5.2.0_3.0_1701052445985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_arkash","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_arkash", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_arkash| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Arkash/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arminazizi59_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arminazizi59_en.md new file mode 100644 index 000000000000..5052ae01d264 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_arminazizi59_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_arminazizi59 DistilBertForQuestionAnswering from arminazizi59 +author: John Snow Labs +name: burmese_awesome_qa_model_arminazizi59 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_arminazizi59` is an English model originally trained by arminazizi59. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_arminazizi59_en_5.2.0_3.0_1701084068666.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_arminazizi59_en_5.2.0_3.0_1701084068666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_arminazizi59","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_arminazizi59", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_arminazizi59| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/arminazizi59/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_asingh3_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_asingh3_en.md new file mode 100644 index 000000000000..1b4934e7811c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_asingh3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_asingh3 DistilBertForQuestionAnswering from asingh3 +author: John Snow Labs +name: burmese_awesome_qa_model_asingh3 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_asingh3` is an English model originally trained by asingh3. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_asingh3_en_5.2.0_3.0_1701058679853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_asingh3_en_5.2.0_3.0_1701058679853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_asingh3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_asingh3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_asingh3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/asingh3/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_azuelsdorf_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_azuelsdorf_en.md new file mode 100644 index 000000000000..af208b8c661c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_azuelsdorf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_azuelsdorf DistilBertForQuestionAnswering from azuelsdorf +author: John Snow Labs +name: burmese_awesome_qa_model_azuelsdorf +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_azuelsdorf` is an English model originally trained by azuelsdorf. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_azuelsdorf_en_5.2.0_3.0_1701088289245.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_azuelsdorf_en_5.2.0_3.0_1701088289245.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_azuelsdorf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_azuelsdorf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_azuelsdorf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/azuelsdorf/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_balaji16_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_balaji16_en.md new file mode 100644 index 000000000000..0fc5d96bace0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_balaji16_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_balaji16 DistilBertForQuestionAnswering from Balaji16 +author: John Snow Labs +name: burmese_awesome_qa_model_balaji16 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_balaji16` is an English model originally trained by Balaji16. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_balaji16_en_5.2.0_3.0_1701084837478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_balaji16_en_5.2.0_3.0_1701084837478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_balaji16","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_balaji16", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_balaji16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Balaji16/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_benjaminlhr_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_benjaminlhr_en.md new file mode 100644 index 000000000000..d8d12435d640 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_benjaminlhr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_benjaminlhr DistilBertForQuestionAnswering from BenjaminLHR +author: John Snow Labs +name: burmese_awesome_qa_model_benjaminlhr +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_benjaminlhr` is an English model originally trained by BenjaminLHR. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_benjaminlhr_en_5.2.0_3.0_1701081118757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_benjaminlhr_en_5.2.0_3.0_1701081118757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_benjaminlhr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_benjaminlhr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_benjaminlhr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/BenjaminLHR/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_bobbyone_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_bobbyone_en.md new file mode 100644 index 000000000000..bcde881616bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_bobbyone_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_bobbyone DistilBertForQuestionAnswering from bobbyOne +author: John Snow Labs +name: burmese_awesome_qa_model_bobbyone +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_bobbyone` is an English model originally trained by bobbyOne. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bobbyone_en_5.2.0_3.0_1701086226674.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_bobbyone_en_5.2.0_3.0_1701086226674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_bobbyone","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_bobbyone", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_bobbyone| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bobbyOne/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_borey456_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_borey456_en.md new file mode 100644 index 000000000000..ae3f48b09334 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_borey456_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_borey456 DistilBertForQuestionAnswering from borey456 +author: John Snow Labs +name: burmese_awesome_qa_model_borey456 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_borey456` is an English model originally trained by borey456. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_borey456_en_5.2.0_3.0_1701051497453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_borey456_en_5.2.0_3.0_1701051497453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_borey456","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_borey456", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_borey456| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|505.4 MB| + +## References + +https://huggingface.co/borey456/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_braden99_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_braden99_en.md new file mode 100644 index 000000000000..d4652ef6d177 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_braden99_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_braden99 DistilBertForQuestionAnswering from Braden99 +author: John Snow Labs +name: burmese_awesome_qa_model_braden99 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_braden99` is an English model originally trained by Braden99. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_braden99_en_5.2.0_3.0_1701077978715.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_braden99_en_5.2.0_3.0_1701077978715.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_braden99","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_braden99", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_braden99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Braden99/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_browak_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_browak_en.md new file mode 100644 index 000000000000..c4c2095cb039 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_browak_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_browak DistilBertForQuestionAnswering from browak +author: John Snow Labs +name: burmese_awesome_qa_model_browak +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_browak` is an English model originally trained by browak. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_browak_en_5.2.0_3.0_1701082713619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_browak_en_5.2.0_3.0_1701082713619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_browak","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_browak", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_browak| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/browak/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_cmelende_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_cmelende_en.md new file mode 100644 index 000000000000..646236a1e194 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_cmelende_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_cmelende DistilBertForQuestionAnswering from cmelende +author: John Snow Labs +name: burmese_awesome_qa_model_cmelende +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_cmelende` is an English model originally trained by cmelende. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_cmelende_en_5.2.0_3.0_1701070080433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_cmelende_en_5.2.0_3.0_1701070080433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_cmelende","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_cmelende", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_cmelende| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/cmelende/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dafini8_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dafini8_en.md new file mode 100644 index 000000000000..4b4764de7447 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dafini8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_dafini8 DistilBertForQuestionAnswering from dafini8 +author: John Snow Labs +name: burmese_awesome_qa_model_dafini8 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_dafini8` is an English model originally trained by dafini8. + +{:.btn-box} +<button class="button button-orange" disabled>Live Demo</button> +<button class="button button-orange" disabled>Open in Colab</button> +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_dafini8_en_5.2.0_3.0_1701077234794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_dafini8_en_5.2.0_3.0_1701077234794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_dafini8","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_dafini8", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +</div>
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_dafini8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dafini8/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darius_expanded_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darius_expanded_en.md new file mode 100644 index 000000000000..23c765d1a469 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darius_expanded_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_darius_expanded DistilBertForQuestionAnswering from DariusStaugas +author: John Snow Labs +name: burmese_awesome_qa_model_darius_expanded +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_darius_expanded` is an English model originally trained by DariusStaugas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_darius_expanded_en_5.2.0_3.0_1701087515355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_darius_expanded_en_5.2.0_3.0_1701087515355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_darius_expanded","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_darius_expanded", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_darius_expanded| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DariusStaugas/my_awesome_qa_model_Darius_expanded \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dariusstaugas_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dariusstaugas_en.md new file mode 100644 index 000000000000..3eac3dd30e6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_dariusstaugas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_dariusstaugas DistilBertForQuestionAnswering from DariusStaugas +author: John Snow Labs +name: burmese_awesome_qa_model_dariusstaugas +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_dariusstaugas` is an English model originally trained by DariusStaugas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_dariusstaugas_en_5.2.0_3.0_1701084837471.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_dariusstaugas_en_5.2.0_3.0_1701084837471.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_dariusstaugas","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_dariusstaugas", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_dariusstaugas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DariusStaugas/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darkwarren_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darkwarren_en.md new file mode 100644 index 000000000000..d9449ddeadc5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_darkwarren_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_darkwarren DistilBertForQuestionAnswering from DarkWarren +author: John Snow Labs +name: burmese_awesome_qa_model_darkwarren +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_darkwarren` is an English model originally trained by DarkWarren. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_darkwarren_en_5.2.0_3.0_1701044550644.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_darkwarren_en_5.2.0_3.0_1701044550644.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_darkwarren","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_darkwarren", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_darkwarren| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DarkWarren/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_duytu_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_duytu_en.md new file mode 100644 index 000000000000..95cf25634593 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_duytu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_duytu DistilBertForQuestionAnswering from duytu +author: John Snow Labs +name: burmese_awesome_qa_model_duytu +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_duytu` is an English model originally trained by duytu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_duytu_en_5.2.0_3.0_1701081028385.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_duytu_en_5.2.0_3.0_1701081028385.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_duytu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_duytu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_duytu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/duytu/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eawang_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eawang_en.md new file mode 100644 index 000000000000..0c251445940c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eawang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_eawang DistilBertForQuestionAnswering from eawang +author: John Snow Labs +name: burmese_awesome_qa_model_eawang +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_eawang` is an English model originally trained by eawang. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_eawang_en_5.2.0_3.0_1701061557906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_eawang_en_5.2.0_3.0_1701061557906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_eawang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_eawang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_eawang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/eawang/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_egu0_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_egu0_en.md new file mode 100644 index 000000000000..a709b37e45fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_egu0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_egu0 DistilBertForQuestionAnswering from egu0 +author: John Snow Labs +name: burmese_awesome_qa_model_egu0 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_egu0` is an English model originally trained by egu0. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_egu0_en_5.2.0_3.0_1701044593051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_egu0_en_5.2.0_3.0_1701044593051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_egu0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_egu0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_egu0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/egu0/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eitanli_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eitanli_en.md new file mode 100644 index 000000000000..a022a0e0ab9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_eitanli_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_eitanli DistilBertForQuestionAnswering from Eitanli +author: John Snow Labs +name: burmese_awesome_qa_model_eitanli +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_eitanli` is an English model originally trained by Eitanli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_eitanli_en_5.2.0_3.0_1701078768552.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_eitanli_en_5.2.0_3.0_1701078768552.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_eitanli","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_eitanli", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_eitanli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Eitanli/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_en.md new file mode 100644 index 000000000000..1ac9740b1527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_elis DistilBertForQuestionAnswering from DariusStaugas +author: John Snow Labs +name: burmese_awesome_qa_model_elis +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_elis` is an English model originally trained by DariusStaugas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_elis_en_5.2.0_3.0_1701090386308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_elis_en_5.2.0_3.0_1701090386308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_elis","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_elis", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_elis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DariusStaugas/my_awesome_qa_model_Elis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_expanded_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_expanded_en.md new file mode 100644 index 000000000000..af3b0a2e984a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_elis_expanded_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_elis_expanded DistilBertForQuestionAnswering from DariusStaugas +author: John Snow Labs +name: burmese_awesome_qa_model_elis_expanded +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_elis_expanded` is an English model originally trained by DariusStaugas.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_elis_expanded_en_5.2.0_3.0_1701081966169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_elis_expanded_en_5.2.0_3.0_1701081966169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_elis_expanded","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_elis_expanded", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_elis_expanded| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DariusStaugas/my_awesome_qa_model_Elis_expanded \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_emresefer_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_emresefer_en.md new file mode 100644 index 000000000000..11b10a515dd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_emresefer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_emresefer DistilBertForQuestionAnswering from emresefer +author: John Snow Labs +name: burmese_awesome_qa_model_emresefer +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_emresefer` is an English model originally trained by emresefer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_emresefer_en_5.2.0_3.0_1701087515156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_emresefer_en_5.2.0_3.0_1701087515156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_emresefer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) +pipelineModel = pipeline.fit(data) +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a Spark DataFrame with "question" and "context" columns. +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_emresefer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) +val pipelineModel = pipeline.fit(data) +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_emresefer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/emresefer/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ezrawilliam_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ezrawilliam_en.md new file mode 100644 index 000000000000..77f763104f1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ezrawilliam_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ezrawilliam DistilBertForQuestionAnswering from EzraWilliam +author: John Snow Labs +name: burmese_awesome_qa_model_ezrawilliam +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ezrawilliam` is an English model originally trained by EzraWilliam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ezrawilliam_en_5.2.0_3.0_1701053253520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ezrawilliam_en_5.2.0_3.0_1701053253520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ezrawilliam","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ezrawilliam", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ezrawilliam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/EzraWilliam/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fardinbh_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fardinbh_en.md new file mode 100644 index 000000000000..23a9fd1c7ecd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fardinbh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_fardinbh DistilBertForQuestionAnswering from fardinbh +author: John Snow Labs +name: burmese_awesome_qa_model_fardinbh +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_fardinbh` is an English model originally trained by fardinbh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_fardinbh_en_5.2.0_3.0_1701075992794.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_fardinbh_en_5.2.0_3.0_1701075992794.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_fardinbh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_fardinbh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_fardinbh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fardinbh/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fathyshalab_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fathyshalab_en.md new file mode 100644 index 000000000000..ffb9fe8a3e48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_fathyshalab_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_fathyshalab DistilBertForQuestionAnswering from fathyshalab +author: John Snow Labs +name: burmese_awesome_qa_model_fathyshalab +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_fathyshalab` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_fathyshalab_en_5.2.0_3.0_1701059280013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_fathyshalab_en_5.2.0_3.0_1701059280013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_fathyshalab","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_fathyshalab", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_fathyshalab| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fathyshalab/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_frogwang2000_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_frogwang2000_en.md new file mode 100644 index 000000000000..3a16f773a49a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_frogwang2000_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_frogwang2000 DistilBertForQuestionAnswering from frogwang2000 +author: John Snow Labs +name: burmese_awesome_qa_model_frogwang2000 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_frogwang2000` is an English model originally trained by frogwang2000. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_frogwang2000_en_5.2.0_3.0_1701052445802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_frogwang2000_en_5.2.0_3.0_1701052445802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_frogwang2000","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_frogwang2000", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_frogwang2000| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/frogwang2000/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ganga12_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ganga12_en.md new file mode 100644 index 000000000000..700d8bd4d519 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ganga12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ganga12 DistilBertForQuestionAnswering from Ganga12 +author: John Snow Labs +name: burmese_awesome_qa_model_ganga12 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ganga12` is an English model originally trained by Ganga12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ganga12_en_5.2.0_3.0_1701087438556.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ganga12_en_5.2.0_3.0_1701087438556.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ganga12","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ganga12", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ganga12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Ganga12/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_gnanaprakash2004_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_gnanaprakash2004_en.md new file mode 100644 index 000000000000..4247c5a16e60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_gnanaprakash2004_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_gnanaprakash2004 DistilBertForQuestionAnswering from GnanaPrakash2004 +author: John Snow Labs +name: burmese_awesome_qa_model_gnanaprakash2004 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_gnanaprakash2004` is an English model originally trained by GnanaPrakash2004.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_gnanaprakash2004_en_5.2.0_3.0_1701071003932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_gnanaprakash2004_en_5.2.0_3.0_1701071003932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_gnanaprakash2004","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_gnanaprakash2004", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_gnanaprakash2004| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/GnanaPrakash2004/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_godlikeheheda_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_godlikeheheda_en.md new file mode 100644 index 000000000000..e0aa00ac3c42 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_godlikeheheda_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_godlikeheheda DistilBertForQuestionAnswering from godlikeheheda +author: John Snow Labs +name: burmese_awesome_qa_model_godlikeheheda +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_godlikeheheda` is an English model originally trained by godlikeheheda.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_godlikeheheda_en_5.2.0_3.0_1701074881033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_godlikeheheda_en_5.2.0_3.0_1701074881033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_godlikeheheda","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_godlikeheheda", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_godlikeheheda| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/godlikeheheda/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hansollll_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hansollll_en.md new file mode 100644 index 000000000000..ca33177271f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hansollll_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_hansollll DistilBertForQuestionAnswering from Hansollll +author: John Snow Labs +name: burmese_awesome_qa_model_hansollll +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_hansollll` is an English model originally trained by Hansollll. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hansollll_en_5.2.0_3.0_1701055108774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hansollll_en_5.2.0_3.0_1701055108774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_hansollll","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_hansollll", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_hansollll| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Hansollll/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_heydars_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_heydars_en.md new file mode 100644 index 000000000000..bd24c284bf1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_heydars_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_heydars DistilBertForQuestionAnswering from HeydarS +author: John Snow Labs +name: burmese_awesome_qa_model_heydars +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_heydars` is an English model originally trained by HeydarS. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_heydars_en_5.2.0_3.0_1701090484410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_heydars_en_5.2.0_3.0_1701090484410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_heydars","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_heydars", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_heydars| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/HeydarS/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hfease_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hfease_en.md new file mode 100644 index 000000000000..516c25984731 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hfease_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_hfease DistilBertForQuestionAnswering from hfease +author: John Snow Labs +name: burmese_awesome_qa_model_hfease +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_hfease` is an English model originally trained by hfease. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hfease_en_5.2.0_3.0_1701081209977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hfease_en_5.2.0_3.0_1701081209977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_hfease","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_hfease", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_hfease| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hfease/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hunguyen3525_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hunguyen3525_en.md new file mode 100644 index 000000000000..e5b8e5f7f6ac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hunguyen3525_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_hunguyen3525 DistilBertForQuestionAnswering from hunguyen3525 +author: John Snow Labs +name: burmese_awesome_qa_model_hunguyen3525 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_hunguyen3525` is an English model originally trained by hunguyen3525. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hunguyen3525_en_5.2.0_3.0_1701044389071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hunguyen3525_en_5.2.0_3.0_1701044389071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_hunguyen3525","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_hunguyen3525", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_hunguyen3525| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hunguyen3525/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hxtheone_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hxtheone_en.md new file mode 100644 index 000000000000..c565b067d2aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_hxtheone_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_hxtheone DistilBertForQuestionAnswering from hxtheone +author: John Snow Labs +name: burmese_awesome_qa_model_hxtheone +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_hxtheone` is an English model originally trained by hxtheone. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hxtheone_en_5.2.0_3.0_1701072970876.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_hxtheone_en_5.2.0_3.0_1701072970876.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_hxtheone","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_hxtheone", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_hxtheone| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hxtheone/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_idriska_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_idriska_en.md new file mode 100644 index 000000000000..d41bea0d3e1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_idriska_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_idriska DistilBertForQuestionAnswering from Idriska +author: John Snow Labs +name: burmese_awesome_qa_model_idriska +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_idriska` is an English model originally trained by Idriska. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_idriska_en_5.2.0_3.0_1701074900416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_idriska_en_5.2.0_3.0_1701074900416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_idriska","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_idriska", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_idriska| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Idriska/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jackkidding_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jackkidding_en.md new file mode 100644 index 000000000000..d9b1ba492a8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jackkidding_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jackkidding DistilBertForQuestionAnswering from jackkidding +author: John Snow Labs +name: burmese_awesome_qa_model_jackkidding +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jackkidding` is an English model originally trained by jackkidding. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jackkidding_en_5.2.0_3.0_1701092118124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jackkidding_en_5.2.0_3.0_1701092118124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jackkidding","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jackkidding", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jackkidding| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jackkidding/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jaskmankan_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jaskmankan_en.md new file mode 100644 index 000000000000..050e49bef088 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jaskmankan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jaskmankan DistilBertForQuestionAnswering from jaskmanKan +author: John Snow Labs +name: burmese_awesome_qa_model_jaskmankan +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jaskmankan` is an English model originally trained by jaskmanKan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jaskmankan_en_5.2.0_3.0_1701090714379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jaskmankan_en_5.2.0_3.0_1701090714379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jaskmankan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jaskmankan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jaskmankan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jaskmanKan/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jasscr_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jasscr_en.md new file mode 100644 index 000000000000..030a03b745d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jasscr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jasscr DistilBertForQuestionAnswering from jasscr +author: John Snow Labs +name: burmese_awesome_qa_model_jasscr +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jasscr` is an English model originally trained by jasscr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jasscr_en_5.2.0_3.0_1701072331729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jasscr_en_5.2.0_3.0_1701072331729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jasscr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jasscr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jasscr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jasscr/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jennndexter_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jennndexter_en.md new file mode 100644 index 000000000000..b8dc5929ceba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jennndexter_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jennndexter DistilBertForQuestionAnswering from JennnDexter +author: John Snow Labs +name: burmese_awesome_qa_model_jennndexter +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jennndexter` is an English model originally trained by JennnDexter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jennndexter_en_5.2.0_3.0_1701089642722.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jennndexter_en_5.2.0_3.0_1701089642722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jennndexter","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jennndexter", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jennndexter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JennnDexter/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jlines_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jlines_en.md new file mode 100644 index 000000000000..b9ecb8e725a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jlines_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jlines DistilBertForQuestionAnswering from jlines +author: John Snow Labs +name: burmese_awesome_qa_model_jlines +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jlines` is an English model originally trained by jlines. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jlines_en_5.2.0_3.0_1701089838029.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jlines_en_5.2.0_3.0_1701089838029.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jlines","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jlines", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jlines| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jlines/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jmartell_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jmartell_en.md new file mode 100644 index 000000000000..8986f04d5a66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jmartell_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jmartell DistilBertForQuestionAnswering from jmartell +author: John Snow Labs +name: burmese_awesome_qa_model_jmartell +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jmartell` is an English model originally trained by jmartell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jmartell_en_5.2.0_3.0_1701043612292.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jmartell_en_5.2.0_3.0_1701043612292.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jmartell","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jmartell", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jmartell| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jmartell/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_joddiy_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_joddiy_en.md new file mode 100644 index 000000000000..240831c9058f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_joddiy_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_joddiy DistilBertForQuestionAnswering from joddiy +author: John Snow Labs +name: burmese_awesome_qa_model_joddiy +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_joddiy` is an English model originally trained by joddiy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_joddiy_en_5.2.0_3.0_1701092755483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_joddiy_en_5.2.0_3.0_1701092755483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_joddiy","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_joddiy", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_joddiy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/joddiy/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jord_hanus_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jord_hanus_en.md new file mode 100644 index 000000000000..bb425ab3a221 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_jord_hanus_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_jord_hanus DistilBertForQuestionAnswering from jord-hanus +author: John Snow Labs +name: burmese_awesome_qa_model_jord_hanus +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_jord_hanus` is an English model originally trained by jord-hanus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jord_hanus_en_5.2.0_3.0_1701074880226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_jord_hanus_en_5.2.0_3.0_1701074880226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_jord_hanus","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_jord_hanus", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_jord_hanus| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jord-hanus/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_josalpho_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_josalpho_en.md new file mode 100644 index 000000000000..6a81f9715025 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_josalpho_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_josalpho DistilBertForQuestionAnswering from JOSALPHO +author: John Snow Labs +name: burmese_awesome_qa_model_josalpho +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_josalpho` is an English model originally trained by JOSALPHO. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_josalpho_en_5.2.0_3.0_1701050209483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_josalpho_en_5.2.0_3.0_1701050209483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_josalpho","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_josalpho", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_josalpho|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/JOSALPHO/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kaanah_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kaanah_en.md
new file mode 100644
index 000000000000..a462cbce4875
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kaanah_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kaanah DistilBertForQuestionAnswering from kaanah
+author: John Snow Labs
+name: burmese_awesome_qa_model_kaanah
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kaanah` is an English model originally trained by kaanah.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kaanah_en_5.2.0_3.0_1701056319309.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kaanah_en_5.2.0_3.0_1701056319309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kaanah","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kaanah", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kaanah|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/kaanah/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kalaiarasi24_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kalaiarasi24_en.md
new file mode 100644
index 000000000000..6d2a53136bae
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kalaiarasi24_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kalaiarasi24 DistilBertForQuestionAnswering from Kalaiarasi24
+author: John Snow Labs
+name: burmese_awesome_qa_model_kalaiarasi24
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kalaiarasi24` is an English model originally trained by Kalaiarasi24.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kalaiarasi24_en_5.2.0_3.0_1701063872129.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kalaiarasi24_en_5.2.0_3.0_1701063872129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kalaiarasi24","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kalaiarasi24", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kalaiarasi24|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Kalaiarasi24/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kazisami_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kazisami_en.md
new file mode 100644
index 000000000000..b4089fb40e0e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kazisami_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kazisami DistilBertForQuestionAnswering from kazisami
+author: John Snow Labs
+name: burmese_awesome_qa_model_kazisami
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kazisami` is an English model originally trained by kazisami.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kazisami_en_5.2.0_3.0_1701067798824.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kazisami_en_5.2.0_3.0_1701067798824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kazisami","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kazisami", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kazisami|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/kazisami/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kevinhemsig_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kevinhemsig_en.md
new file mode 100644
index 000000000000..43c595fea3c3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kevinhemsig_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kevinhemsig DistilBertForQuestionAnswering from KevinHemsig
+author: John Snow Labs
+name: burmese_awesome_qa_model_kevinhemsig
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kevinhemsig` is an English model originally trained by KevinHemsig.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kevinhemsig_en_5.2.0_3.0_1701043622841.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kevinhemsig_en_5.2.0_3.0_1701043622841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kevinhemsig","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kevinhemsig", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kevinhemsig|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/KevinHemsig/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kimshine_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kimshine_en.md
new file mode 100644
index 000000000000..98e2efe9b776
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kimshine_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kimshine DistilBertForQuestionAnswering from KimSHine
+author: John Snow Labs
+name: burmese_awesome_qa_model_kimshine
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kimshine` is an English model originally trained by KimSHine.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kimshine_en_5.2.0_3.0_1701045784940.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kimshine_en_5.2.0_3.0_1701045784940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kimshine","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kimshine", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kimshine|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/KimSHine/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kolodach_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kolodach_en.md
new file mode 100644
index 000000000000..c057d8946cbb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kolodach_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kolodach DistilBertForQuestionAnswering from kolodach
+author: John Snow Labs
+name: burmese_awesome_qa_model_kolodach
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kolodach` is an English model originally trained by kolodach.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kolodach_en_5.2.0_3.0_1701080463487.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kolodach_en_5.2.0_3.0_1701080463487.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kolodach","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kolodach", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kolodach|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/kolodach/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_koushik000_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_koushik000_en.md
new file mode 100644
index 000000000000..4f912d16f841
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_koushik000_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_koushik000 DistilBertForQuestionAnswering from Koushik000
+author: John Snow Labs
+name: burmese_awesome_qa_model_koushik000
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_koushik000` is an English model originally trained by Koushik000.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_koushik000_en_5.2.0_3.0_1701058856061.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_koushik000_en_5.2.0_3.0_1701058856061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_koushik000","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_koushik000", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_koushik000|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/Koushik000/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kymsa_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kymsa_en.md
new file mode 100644
index 000000000000..3815920cfdf6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_kymsa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_kymsa DistilBertForQuestionAnswering from kymsa
+author: John Snow Labs
+name: burmese_awesome_qa_model_kymsa
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_kymsa` is an English model originally trained by kymsa.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kymsa_en_5.2.0_3.0_1701091403987.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_kymsa_en_5.2.0_3.0_1701091403987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_kymsa","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_kymsa", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_kymsa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/kymsa/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leesb_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leesb_en.md
new file mode 100644
index 000000000000..41289edcc1c4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leesb_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_leesb DistilBertForQuestionAnswering from LeeSB
+author: John Snow Labs
+name: burmese_awesome_qa_model_leesb
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_leesb` is an English model originally trained by LeeSB.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_leesb_en_5.2.0_3.0_1701065792349.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_leesb_en_5.2.0_3.0_1701065792349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_leesb","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_leesb", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_qa_model_leesb|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/LeeSB/my_awesome_qa_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leslielleslles_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leslielleslles_en.md
new file mode 100644
index 000000000000..f771d6f5ca2e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_leslielleslles_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English burmese_awesome_qa_model_leslielleslles DistilBertForQuestionAnswering from leslielleslles
+author: John Snow Labs
+name: burmese_awesome_qa_model_leslielleslles
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_leslielleslles` is an English model originally trained by leslielleslles.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_leslielleslles_en_5.2.0_3.0_1701071092254.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_leslielleslles_en_5.2.0_3.0_1701071092254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+
+# `data` is assumed to be a DataFrame with "question" and "context" columns.
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_leslielleslles","en") \
+    .setInputCols(["document_question","document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+// `data` is assumed to be a DataFrame with "question" and "context" columns.
+val document_assembler = new MultiDocumentAssembler()
+  .setInputCols(Array("question", "context"))
+  .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+  .pretrained("burmese_awesome_qa_model_leslielleslles", "en")
+  .setInputCols(Array("document_question","document_context"))
+  .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_leslielleslles| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/leslielleslles/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lina_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lina_en.md new file mode 100644 index 000000000000..8b590cfba969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lina_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_lina DistilBertForQuestionAnswering from DariusStaugas +author: John Snow Labs +name: burmese_awesome_qa_model_lina +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_lina` is an English model originally trained by DariusStaugas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lina_en_5.2.0_3.0_1701083938439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lina_en_5.2.0_3.0_1701083938439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_lina","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_lina", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_lina| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/DariusStaugas/my_awesome_qa_model_Lina \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lofelix_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lofelix_en.md new file mode 100644 index 000000000000..5c1d5b1a360b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_lofelix_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_lofelix DistilBertForQuestionAnswering from lofelix +author: John Snow Labs +name: burmese_awesome_qa_model_lofelix +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_lofelix` is an English model originally trained by lofelix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lofelix_en_5.2.0_3.0_1701088419348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_lofelix_en_5.2.0_3.0_1701088419348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_lofelix","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_lofelix", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_lofelix| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lofelix/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_loony_user_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_loony_user_en.md new file mode 100644 index 000000000000..5222159a9ca9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_loony_user_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_loony_user DistilBertForQuestionAnswering from loony-user +author: John Snow Labs +name: burmese_awesome_qa_model_loony_user +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_loony_user` is an English model originally trained by loony-user. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_loony_user_en_5.2.0_3.0_1701079560090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_loony_user_en_5.2.0_3.0_1701079560090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_loony_user","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_loony_user", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_loony_user| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/loony-user/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_manish5678_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_manish5678_en.md new file mode 100644 index 000000000000..f68d2d7cc023 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_manish5678_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_manish5678 DistilBertForQuestionAnswering from manish5678 +author: John Snow Labs +name: burmese_awesome_qa_model_manish5678 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_manish5678` is an English model originally trained by manish5678. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_manish5678_en_5.2.0_3.0_1701046076907.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_manish5678_en_5.2.0_3.0_1701046076907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_manish5678","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_manish5678", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_manish5678| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/manish5678/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_michael_kingston_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_michael_kingston_en.md new file mode 100644 index 000000000000..e46d25467fa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_michael_kingston_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_michael_kingston DistilBertForQuestionAnswering from michael-kingston +author: John Snow Labs +name: burmese_awesome_qa_model_michael_kingston +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_michael_kingston` is an English model originally trained by michael-kingston. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_michael_kingston_en_5.2.0_3.0_1701044592938.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_michael_kingston_en_5.2.0_3.0_1701044592938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_michael_kingston","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_michael_kingston", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_michael_kingston| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/michael-kingston/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_minggz_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_minggz_en.md new file mode 100644 index 000000000000..29735f444fef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_minggz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_minggz DistilBertForQuestionAnswering from Minggz +author: John Snow Labs +name: burmese_awesome_qa_model_minggz +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_minggz` is an English model originally trained by Minggz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_minggz_en_5.2.0_3.0_1701069086902.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_minggz_en_5.2.0_3.0_1701069086902.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_minggz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_minggz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_minggz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Minggz/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_miroslawas_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_miroslawas_en.md new file mode 100644 index 000000000000..0c0eae63bc56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_miroslawas_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_miroslawas DistilBertForQuestionAnswering from miroslawas +author: John Snow Labs +name: burmese_awesome_qa_model_miroslawas +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_miroslawas` is an English model originally trained by miroslawas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_miroslawas_en_5.2.0_3.0_1701077979615.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_miroslawas_en_5.2.0_3.0_1701077979615.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_miroslawas","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_miroslawas", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_miroslawas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/miroslawas/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mrbach_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mrbach_en.md new file mode 100644 index 000000000000..8069972944d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mrbach_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_mrbach DistilBertForQuestionAnswering from mrbach +author: John Snow Labs +name: burmese_awesome_qa_model_mrbach +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_mrbach` is an English model originally trained by mrbach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mrbach_en_5.2.0_3.0_1701091087466.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mrbach_en_5.2.0_3.0_1701091087466.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_mrbach","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_mrbach", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_mrbach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mrbach/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mwilsonreyes_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mwilsonreyes_en.md new file mode 100644 index 000000000000..6b8b035c2b7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_mwilsonreyes_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_mwilsonreyes DistilBertForQuestionAnswering from mwilsonreyes +author: John Snow Labs +name: burmese_awesome_qa_model_mwilsonreyes +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_mwilsonreyes` is an English model originally trained by mwilsonreyes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mwilsonreyes_en_5.2.0_3.0_1701077641286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_mwilsonreyes_en_5.2.0_3.0_1701077641286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_mwilsonreyes","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_mwilsonreyes", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_mwilsonreyes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mwilsonreyes/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_myle0901_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_myle0901_en.md new file mode 100644 index 000000000000..b0484243151c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_myle0901_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_myle0901 DistilBertForQuestionAnswering from myle0901 +author: John Snow Labs +name: burmese_awesome_qa_model_myle0901 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_myle0901` is an English model originally trained by myle0901. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_myle0901_en_5.2.0_3.0_1701054306710.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_myle0901_en_5.2.0_3.0_1701054306710.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_myle0901","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_myle0901", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_myle0901| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/myle0901/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nicolesarvasi_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nicolesarvasi_en.md new file mode 100644 index 000000000000..c578911c49d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nicolesarvasi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_nicolesarvasi DistilBertForQuestionAnswering from NicoleSarvasi +author: John Snow Labs +name: burmese_awesome_qa_model_nicolesarvasi +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_nicolesarvasi` is an English model originally trained by NicoleSarvasi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nicolesarvasi_en_5.2.0_3.0_1701064935426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nicolesarvasi_en_5.2.0_3.0_1701064935426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_nicolesarvasi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_nicolesarvasi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_nicolesarvasi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/NicoleSarvasi/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nik12_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nik12_en.md new file mode 100644 index 000000000000..41874daf8091 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_nik12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_nik12 DistilBertForQuestionAnswering from nik12 +author: John Snow Labs +name: burmese_awesome_qa_model_nik12 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_nik12` is an English model originally trained by nik12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nik12_en_5.2.0_3.0_1701062824894.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_nik12_en_5.2.0_3.0_1701062824894.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_nik12","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_nik12", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_nik12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nik12/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_noushath_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_noushath_en.md new file mode 100644 index 000000000000..d6f869c1b6cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_noushath_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_noushath DistilBertForQuestionAnswering from Noushath +author: John Snow Labs +name: burmese_awesome_qa_model_noushath +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_noushath` is an English model originally trained by Noushath. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_noushath_en_5.2.0_3.0_1701044337336.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_noushath_en_5.2.0_3.0_1701044337336.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_noushath","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_noushath", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_noushath| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Noushath/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_onzi_suba_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_onzi_suba_en.md new file mode 100644 index 000000000000..1819dafe0479 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_onzi_suba_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_onzi_suba DistilBertForQuestionAnswering from onzi-suba +author: John Snow Labs +name: burmese_awesome_qa_model_onzi_suba +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_onzi_suba` is an English model originally trained by onzi-suba. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_onzi_suba_en_5.2.0_3.0_1701048217774.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_onzi_suba_en_5.2.0_3.0_1701048217774.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_onzi_suba","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_onzi_suba", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_onzi_suba| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/onzi-suba/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_padu98_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_padu98_en.md new file mode 100644 index 000000000000..cb6ac4c96c61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_padu98_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_padu98 DistilBertForQuestionAnswering from Padu98 +author: John Snow Labs +name: burmese_awesome_qa_model_padu98 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_padu98` is an English model originally trained by Padu98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_padu98_en_5.2.0_3.0_1701081751412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_padu98_en_5.2.0_3.0_1701081751412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_padu98","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_padu98", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_padu98| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Padu98/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pascalleone51_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pascalleone51_en.md new file mode 100644 index 000000000000..3c70cb06127d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pascalleone51_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pascalleone51 DistilBertForQuestionAnswering from pascalleone51 +author: John Snow Labs +name: burmese_awesome_qa_model_pascalleone51 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pascalleone51` is an English model originally trained by pascalleone51. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pascalleone51_en_5.2.0_3.0_1701054322711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pascalleone51_en_5.2.0_3.0_1701054322711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pascalleone51","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pascalleone51", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pascalleone51| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pascalleone51/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_paytonray_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_paytonray_en.md new file mode 100644 index 000000000000..357775192239 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_paytonray_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_paytonray DistilBertForQuestionAnswering from paytonray +author: John Snow Labs +name: burmese_awesome_qa_model_paytonray +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_paytonray` is an English model originally trained by paytonray. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_paytonray_en_5.2.0_3.0_1701077722490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_paytonray_en_5.2.0_3.0_1701077722490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_paytonray","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_paytonray", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_paytonray| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/paytonray/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pdp_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pdp_en.md new file mode 100644 index 000000000000..1fe2e6961f94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pdp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pdp DistilBertForQuestionAnswering from pradeepie +author: John Snow Labs +name: burmese_awesome_qa_model_pdp +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pdp` is an English model originally trained by pradeepie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pdp_en_5.2.0_3.0_1701055098791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pdp_en_5.2.0_3.0_1701055098791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pdp","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pdp", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pdp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pradeepie/my_awesome_qa_model_pdp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_peng0208_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_peng0208_en.md new file mode 100644 index 000000000000..13ac5ffa39d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_peng0208_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_peng0208 DistilBertForQuestionAnswering from peng0208 +author: John Snow Labs +name: burmese_awesome_qa_model_peng0208 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_peng0208` is an English model originally trained by peng0208. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_peng0208_en_5.2.0_3.0_1701052445788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_peng0208_en_5.2.0_3.0_1701052445788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_peng0208","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_peng0208", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_peng0208| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/peng0208/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinak1297_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinak1297_en.md new file mode 100644 index 000000000000..78d700953310 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinak1297_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pinak1297 DistilBertForQuestionAnswering from Pinak1297 +author: John Snow Labs +name: burmese_awesome_qa_model_pinak1297 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pinak1297` is an English model originally trained by Pinak1297. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pinak1297_en_5.2.0_3.0_1701059733925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pinak1297_en_5.2.0_3.0_1701059733925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pinak1297","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pinak1297", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pinak1297| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Pinak1297/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinelope_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinelope_en.md new file mode 100644 index 000000000000..a466bc4d61ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pinelope_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pinelope DistilBertForQuestionAnswering from pinelope +author: John Snow Labs +name: burmese_awesome_qa_model_pinelope +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pinelope` is an English model originally trained by pinelope. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pinelope_en_5.2.0_3.0_1701061328501.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pinelope_en_5.2.0_3.0_1701061328501.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import MultiDocumentAssembler +from sparknlp.annotator import DistilBertForQuestionAnswering +from pyspark.ml import Pipeline + +# `data` must be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pinelope","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.MultiDocumentAssembler +import com.johnsnowlabs.nlp.annotators.classifier.dl.DistilBertForQuestionAnswering +import org.apache.spark.ml.Pipeline + +// `data` must be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pinelope", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pinelope| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pinelope/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pkhanna2_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pkhanna2_en.md new file mode 100644 index 000000000000..1034152fa1c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_pkhanna2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_pkhanna2 DistilBertForQuestionAnswering from pkhanna2 +author: John Snow Labs +name: burmese_awesome_qa_model_pkhanna2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_pkhanna2` is an English model originally trained by pkhanna2. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pkhanna2_en_5.2.0_3.0_1701056220754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_pkhanna2_en_5.2.0_3.0_1701056220754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_pkhanna2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_pkhanna2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_pkhanna2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/pkhanna2/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_prahalad_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_prahalad_en.md new file mode 100644 index 000000000000..c1d4cd25d0b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_prahalad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_prahalad DistilBertForQuestionAnswering from Prahalad +author: John Snow Labs +name: burmese_awesome_qa_model_prahalad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_prahalad` is an English model originally trained by Prahalad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_prahalad_en_5.2.0_3.0_1701073960018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_prahalad_en_5.2.0_3.0_1701073960018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_prahalad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_prahalad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_prahalad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Prahalad/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_qkrwnstj_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_qkrwnstj_en.md new file mode 100644 index 000000000000..45776a0215d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_qkrwnstj_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_qkrwnstj DistilBertForQuestionAnswering from qkrwnstj +author: John Snow Labs +name: burmese_awesome_qa_model_qkrwnstj +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_qkrwnstj` is an English model originally trained by qkrwnstj. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_qkrwnstj_en_5.2.0_3.0_1701048217781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_qkrwnstj_en_5.2.0_3.0_1701048217781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_qkrwnstj","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_qkrwnstj", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_qkrwnstj| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/qkrwnstj/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ramani2002_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ramani2002_en.md new file mode 100644 index 000000000000..6dbd12b627e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ramani2002_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ramani2002 DistilBertForQuestionAnswering from ramani2002 +author: John Snow Labs +name: burmese_awesome_qa_model_ramani2002 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ramani2002` is an English model originally trained by ramani2002. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ramani2002_en_5.2.0_3.0_1701080042557.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ramani2002_en_5.2.0_3.0_1701080042557.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ramani2002","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ramani2002", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ramani2002| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ramani2002/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rasdani_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rasdani_en.md new file mode 100644 index 000000000000..46458f13a0ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rasdani_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_rasdani DistilBertForQuestionAnswering from rasdani +author: John Snow Labs +name: burmese_awesome_qa_model_rasdani +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_rasdani` is an English model originally trained by rasdani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rasdani_en_5.2.0_3.0_1701066736437.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rasdani_en_5.2.0_3.0_1701066736437.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_rasdani","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_rasdani", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_rasdani| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rasdani/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ravi00ei51_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ravi00ei51_en.md new file mode 100644 index 000000000000..477de8d15220 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ravi00ei51_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ravi00ei51 DistilBertForQuestionAnswering from ravi00ei51 +author: John Snow Labs +name: burmese_awesome_qa_model_ravi00ei51 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ravi00ei51` is an English model originally trained by ravi00ei51. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ravi00ei51_en_5.2.0_3.0_1701071931550.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ravi00ei51_en_5.2.0_3.0_1701071931550.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ravi00ei51","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ravi00ei51", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ravi00ei51| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ravi00ei51/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_renukakakasaheb_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_renukakakasaheb_en.md new file mode 100644 index 000000000000..caa462fa1245 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_renukakakasaheb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_renukakakasaheb DistilBertForQuestionAnswering from renukakakasaheb +author: John Snow Labs +name: burmese_awesome_qa_model_renukakakasaheb +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_renukakakasaheb` is an English model originally trained by renukakakasaheb. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_renukakakasaheb_en_5.2.0_3.0_1701084921507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_renukakakasaheb_en_5.2.0_3.0_1701084921507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_renukakakasaheb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_renukakakasaheb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_renukakakasaheb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/renukakakasaheb/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_revanth117_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_revanth117_en.md new file mode 100644 index 000000000000..ee428c851014 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_revanth117_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_revanth117 DistilBertForQuestionAnswering from revanth117 +author: John Snow Labs +name: burmese_awesome_qa_model_revanth117 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_revanth117` is an English model originally trained by revanth117. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_revanth117_en_5.2.0_3.0_1701063872139.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_revanth117_en_5.2.0_3.0_1701063872139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_revanth117","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_revanth117", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_revanth117| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/revanth117/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_robinsonml_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_robinsonml_en.md new file mode 100644 index 000000000000..d6f1db8666f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_robinsonml_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_robinsonml DistilBertForQuestionAnswering from RobinsonML +author: John Snow Labs +name: burmese_awesome_qa_model_robinsonml +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_robinsonml` is an English model originally trained by RobinsonML. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_robinsonml_en_5.2.0_3.0_1701062824860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_robinsonml_en_5.2.0_3.0_1701062824860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_robinsonml","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_robinsonml", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_robinsonml| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RobinsonML/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rohitnair212_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rohitnair212_en.md new file mode 100644 index 000000000000..4fc9560cc8b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rohitnair212_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_rohitnair212 DistilBertForQuestionAnswering from rohitnair212 +author: John Snow Labs +name: burmese_awesome_qa_model_rohitnair212 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_rohitnair212` is an English model originally trained by rohitnair212. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rohitnair212_en_5.2.0_3.0_1701047106645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rohitnair212_en_5.2.0_3.0_1701047106645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_rohitnair212","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_rohitnair212", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_rohitnair212| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rohitnair212/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ronit01_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ronit01_en.md new file mode 100644 index 000000000000..eba41007a2ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ronit01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ronit01 DistilBertForQuestionAnswering from ronit01 +author: John Snow Labs +name: burmese_awesome_qa_model_ronit01 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ronit01` is an English model originally trained by ronit01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ronit01_en_5.2.0_3.0_1701049111391.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ronit01_en_5.2.0_3.0_1701049111391.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ronit01","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ronit01", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ronit01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ronit01/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rpkarnavat_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rpkarnavat_en.md new file mode 100644 index 000000000000..01a6cf969fe2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rpkarnavat_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_rpkarnavat DistilBertForQuestionAnswering from rpkarnavat +author: John Snow Labs +name: burmese_awesome_qa_model_rpkarnavat +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_rpkarnavat` is an English model originally trained by rpkarnavat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rpkarnavat_en_5.2.0_3.0_1701089241545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rpkarnavat_en_5.2.0_3.0_1701089241545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_rpkarnavat","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_rpkarnavat", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_rpkarnavat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rpkarnavat/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rugvedabodke_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rugvedabodke_en.md new file mode 100644 index 000000000000..af055570826b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_rugvedabodke_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_rugvedabodke DistilBertForQuestionAnswering from rugvedabodke +author: John Snow Labs +name: burmese_awesome_qa_model_rugvedabodke +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_rugvedabodke` is an English model originally trained by rugvedabodke. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rugvedabodke_en_5.2.0_3.0_1701083168343.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_rugvedabodke_en_5.2.0_3.0_1701083168343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_rugvedabodke","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_rugvedabodke", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_rugvedabodke| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rugvedabodke/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_samlansley_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_samlansley_en.md new file mode 100644 index 000000000000..5125d7796cc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_samlansley_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_samlansley DistilBertForQuestionAnswering from samlansley +author: John Snow Labs +name: burmese_awesome_qa_model_samlansley +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_samlansley` is an English model originally trained by samlansley. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_samlansley_en_5.2.0_3.0_1701090336493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_samlansley_en_5.2.0_3.0_1701090336493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_samlansley","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_samlansley", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_samlansley| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/samlansley/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_san007_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_san007_en.md new file mode 100644 index 000000000000..be40b4248046 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_san007_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_san007 DistilBertForQuestionAnswering from san007 +author: John Snow Labs +name: burmese_awesome_qa_model_san007 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_san007` is an English model originally trained by san007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_san007_en_5.2.0_3.0_1701061557906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_san007_en_5.2.0_3.0_1701061557906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_san007","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_san007", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_san007| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/san007/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sandeep8021_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sandeep8021_en.md new file mode 100644 index 000000000000..d3e3e45b016d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sandeep8021_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sandeep8021 DistilBertForQuestionAnswering from Sandeep8021 +author: John Snow Labs +name: burmese_awesome_qa_model_sandeep8021 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sandeep8021` is an English model originally trained by Sandeep8021. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sandeep8021_en_5.2.0_3.0_1701060438820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sandeep8021_en_5.2.0_3.0_1701060438820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sandeep8021","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sandeep8021", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sandeep8021| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sandeep8021/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_saraaa9675_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_saraaa9675_en.md new file mode 100644 index 000000000000..4da488c36db4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_saraaa9675_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_saraaa9675 DistilBertForQuestionAnswering from saraaa9675 +author: John Snow Labs +name: burmese_awesome_qa_model_saraaa9675 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_saraaa9675` is an English model originally trained by saraaa9675. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saraaa9675_en_5.2.0_3.0_1701075902344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_saraaa9675_en_5.2.0_3.0_1701075902344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_saraaa9675","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_saraaa9675", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_saraaa9675| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/saraaa9675/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarthak4497_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarthak4497_en.md new file mode 100644 index 000000000000..4c7cf3096b79 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarthak4497_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sarthak4497 DistilBertForQuestionAnswering from sarthak4497 +author: John Snow Labs +name: burmese_awesome_qa_model_sarthak4497 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sarthak4497` is an English model originally trained by sarthak4497. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sarthak4497_en_5.2.0_3.0_1701062078425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sarthak4497_en_5.2.0_3.0_1701062078425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sarthak4497","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sarthak4497", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sarthak4497| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sarthak4497/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarvagna_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarvagna_en.md new file mode 100644 index 000000000000..dcde75719610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sarvagna_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sarvagna DistilBertForQuestionAnswering from sarvagna +author: John Snow Labs +name: burmese_awesome_qa_model_sarvagna +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sarvagna` is an English model originally trained by sarvagna. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sarvagna_en_5.2.0_3.0_1701082068908.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sarvagna_en_5.2.0_3.0_1701082068908.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sarvagna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sarvagna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sarvagna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sarvagna/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_schelle7_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_schelle7_en.md new file mode 100644 index 000000000000..3d54a6cbda8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_schelle7_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_schelle7 DistilBertForQuestionAnswering from Schelle7 +author: John Snow Labs +name: burmese_awesome_qa_model_schelle7 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_schelle7` is an English model originally trained by Schelle7. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_schelle7_en_5.2.0_3.0_1701059733942.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_schelle7_en_5.2.0_3.0_1701059733942.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_schelle7","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_schelle7", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_schelle7| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Schelle7/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_seemorebricks_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_seemorebricks_en.md new file mode 100644 index 000000000000..9c82a6a2c90f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_seemorebricks_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_seemorebricks DistilBertForQuestionAnswering from seemorebricks +author: John Snow Labs +name: burmese_awesome_qa_model_seemorebricks +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_seemorebricks` is an English model originally trained by seemorebricks.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_seemorebricks_en_5.2.0_3.0_1701078768676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_seemorebricks_en_5.2.0_3.0_1701078768676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_seemorebricks","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_seemorebricks", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_seemorebricks| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/seemorebricks/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_shokhjakhon_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_shokhjakhon_en.md new file mode 100644 index 000000000000..357e5d56636d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_shokhjakhon_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_shokhjakhon DistilBertForQuestionAnswering from shokhjakhon +author: John Snow Labs +name: burmese_awesome_qa_model_shokhjakhon +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_shokhjakhon` is an English model originally trained by shokhjakhon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_shokhjakhon_en_5.2.0_3.0_1701059284577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_shokhjakhon_en_5.2.0_3.0_1701059284577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_shokhjakhon","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_shokhjakhon", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_shokhjakhon| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/shokhjakhon/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_silviaks_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_silviaks_en.md new file mode 100644 index 000000000000..98f77c0059e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_silviaks_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_silviaks DistilBertForQuestionAnswering from SilviaKs +author: John Snow Labs +name: burmese_awesome_qa_model_silviaks +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_silviaks` is an English model originally trained by SilviaKs. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_silviaks_en_5.2.0_3.0_1701092115331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_silviaks_en_5.2.0_3.0_1701092115331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_silviaks","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_silviaks", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_silviaks| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SilviaKs/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_socratesvak_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_socratesvak_en.md new file mode 100644 index 000000000000..188ec00ab641 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_socratesvak_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_socratesvak DistilBertForQuestionAnswering from SocratesVak +author: John Snow Labs +name: burmese_awesome_qa_model_socratesvak +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_socratesvak` is an English model originally trained by SocratesVak. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_socratesvak_en_5.2.0_3.0_1701057022114.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_socratesvak_en_5.2.0_3.0_1701057022114.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_socratesvak","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_socratesvak", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_socratesvak| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SocratesVak/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sonaal_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sonaal_en.md new file mode 100644 index 000000000000..9247f0519e00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sonaal_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sonaal DistilBertForQuestionAnswering from sonaal +author: John Snow Labs +name: burmese_awesome_qa_model_sonaal +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sonaal` is an English model originally trained by sonaal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sonaal_en_5.2.0_3.0_1701047972448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sonaal_en_5.2.0_3.0_1701047972448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sonaal","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sonaal", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sonaal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sonaal/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_spleonard1_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_spleonard1_en.md new file mode 100644 index 000000000000..db1dc0f0d391 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_spleonard1_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_spleonard1 DistilBertForQuestionAnswering from Spleonard1 +author: John Snow Labs +name: burmese_awesome_qa_model_spleonard1 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_spleonard1` is an English model originally trained by Spleonard1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_spleonard1_en_5.2.0_3.0_1701071931692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_spleonard1_en_5.2.0_3.0_1701071931692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_spleonard1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_spleonard1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_spleonard1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Spleonard1/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sravanipilla_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sravanipilla_en.md new file mode 100644 index 000000000000..619af00a932e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sravanipilla_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sravanipilla DistilBertForQuestionAnswering from Sravanipilla +author: John Snow Labs +name: burmese_awesome_qa_model_sravanipilla +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sravanipilla` is an English model originally trained by Sravanipilla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sravanipilla_en_5.2.0_3.0_1701088419362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sravanipilla_en_5.2.0_3.0_1701088419362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sravanipilla","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sravanipilla", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sravanipilla| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sravanipilla/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sreeja12_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sreeja12_en.md new file mode 100644 index 000000000000..841e2f53e004 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sreeja12_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sreeja12 DistilBertForQuestionAnswering from Sreeja12 +author: John Snow Labs +name: burmese_awesome_qa_model_sreeja12 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sreeja12` is an English model originally trained by Sreeja12. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sreeja12_en_5.2.0_3.0_1701070160228.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sreeja12_en_5.2.0_3.0_1701070160228.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sreeja12","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sreeja12", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sreeja12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sreeja12/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sruschetta_peptone_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sruschetta_peptone_en.md new file mode 100644 index 000000000000..93c4c58d1239 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sruschetta_peptone_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sruschetta_peptone DistilBertForQuestionAnswering from sruschetta-peptone +author: John Snow Labs +name: burmese_awesome_qa_model_sruschetta_peptone +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sruschetta_peptone` is an English model originally trained by sruschetta-peptone.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sruschetta_peptone_en_5.2.0_3.0_1701073906949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sruschetta_peptone_en_5.2.0_3.0_1701073906949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sruschetta_peptone","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sruschetta_peptone", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sruschetta_peptone| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sruschetta-peptone/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_stoemb_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_stoemb_en.md new file mode 100644 index 000000000000..28295dae8cba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_stoemb_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_stoemb DistilBertForQuestionAnswering from Stoemb +author: John Snow Labs +name: burmese_awesome_qa_model_stoemb +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_stoemb` is an English model originally trained by Stoemb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_stoemb_en_5.2.0_3.0_1701089310271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_stoemb_en_5.2.0_3.0_1701089310271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_stoemb","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_stoemb", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_stoemb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Stoemb/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sudiyanto_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sudiyanto_en.md new file mode 100644 index 000000000000..b6152e0faef0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sudiyanto_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sudiyanto DistilBertForQuestionAnswering from Sudiyanto +author: John Snow Labs +name: burmese_awesome_qa_model_sudiyanto +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sudiyanto` is an English model originally trained by Sudiyanto. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sudiyanto_en_5.2.0_3.0_1701074438988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sudiyanto_en_5.2.0_3.0_1701074438988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +# `data`: a Spark DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sudiyanto","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + +// `data`: a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sudiyanto", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sudiyanto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sudiyanto/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_suhrud_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_suhrud_en.md new file mode 100644 index 000000000000..b8bdf15444d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_suhrud_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_suhrud DistilBertForQuestionAnswering from suhrud +author: John Snow Labs +name: burmese_awesome_qa_model_suhrud +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_suhrud` is an English model originally trained by suhrud. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_suhrud_en_5.2.0_3.0_1701057066430.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_suhrud_en_5.2.0_3.0_1701057066430.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_suhrud","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_suhrud", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_suhrud| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/suhrud/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sumeyyecelik_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sumeyyecelik_en.md new file mode 100644 index 000000000000..b4152099b841 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_sumeyyecelik_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_sumeyyecelik DistilBertForQuestionAnswering from sumeyyecelik +author: John Snow Labs +name: burmese_awesome_qa_model_sumeyyecelik +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_sumeyyecelik` is an English model originally trained by sumeyyecelik. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sumeyyecelik_en_5.2.0_3.0_1701078088232.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_sumeyyecelik_en_5.2.0_3.0_1701078088232.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_sumeyyecelik","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_sumeyyecelik", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_sumeyyecelik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sumeyyecelik/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_tdjey33_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_tdjey33_en.md new file mode 100644 index 000000000000..de5b7c857aaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_tdjey33_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_tdjey33 DistilBertForQuestionAnswering from tdjey33 +author: John Snow Labs +name: burmese_awesome_qa_model_tdjey33 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_tdjey33` is an English model originally trained by tdjey33. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_tdjey33_en_5.2.0_3.0_1701044235611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_tdjey33_en_5.2.0_3.0_1701044235611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_tdjey33","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_tdjey33", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_tdjey33| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tdjey33/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_teckx_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_teckx_en.md new file mode 100644 index 000000000000..ddf54e551e28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_teckx_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_teckx DistilBertForQuestionAnswering from Teckx +author: John Snow Labs +name: burmese_awesome_qa_model_teckx +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_teckx` is an English model originally trained by Teckx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_teckx_en_5.2.0_3.0_1701057967338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_teckx_en_5.2.0_3.0_1701057967338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_teckx","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_teckx", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_teckx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Teckx/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_thejohndecosta_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_thejohndecosta_en.md new file mode 100644 index 000000000000..0cada70ed731 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_thejohndecosta_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_thejohndecosta DistilBertForQuestionAnswering from theJohndecosta +author: John Snow Labs +name: burmese_awesome_qa_model_thejohndecosta +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_thejohndecosta` is an English model originally trained by theJohndecosta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_thejohndecosta_en_5.2.0_3.0_1701070079768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_thejohndecosta_en_5.2.0_3.0_1701070079768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_thejohndecosta","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_thejohndecosta", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_thejohndecosta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/theJohndecosta/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vanshkodesia21_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vanshkodesia21_en.md new file mode 100644 index 000000000000..c968a5341468 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vanshkodesia21_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_vanshkodesia21 DistilBertForQuestionAnswering from VanshKodesia21 +author: John Snow Labs +name: burmese_awesome_qa_model_vanshkodesia21 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_vanshkodesia21` is an English model originally trained by VanshKodesia21. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vanshkodesia21_en_5.2.0_3.0_1701044378960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vanshkodesia21_en_5.2.0_3.0_1701044378960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_vanshkodesia21","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_vanshkodesia21", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_vanshkodesia21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/VanshKodesia21/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vasu07_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vasu07_en.md new file mode 100644 index 000000000000..e2e52e9561ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vasu07_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_vasu07 DistilBertForQuestionAnswering from Vasu07 +author: John Snow Labs +name: burmese_awesome_qa_model_vasu07 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_vasu07` is an English model originally trained by Vasu07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vasu07_en_5.2.0_3.0_1701047386721.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vasu07_en_5.2.0_3.0_1701047386721.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_vasu07","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_vasu07", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_vasu07| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Vasu07/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_veer09_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_veer09_en.md new file mode 100644 index 000000000000..811dc0a05551 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_veer09_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_veer09 DistilBertForQuestionAnswering from Veer09 +author: John Snow Labs +name: burmese_awesome_qa_model_veer09 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_veer09` is an English model originally trained by Veer09. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_veer09_en_5.2.0_3.0_1701084921520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_veer09_en_5.2.0_3.0_1701084921520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_veer09","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_veer09", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_veer09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Veer09/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_viv_san_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_viv_san_en.md new file mode 100644 index 000000000000..504dbcbe2947 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_viv_san_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_viv_san DistilBertForQuestionAnswering from viv-san +author: John Snow Labs +name: burmese_awesome_qa_model_viv_san +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_viv_san` is an English model originally trained by viv-san. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_viv_san_en_5.2.0_3.0_1701075221315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_viv_san_en_5.2.0_3.0_1701075221315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_viv_san","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_viv_san", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_viv_san| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/viv-san/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vlso_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vlso_en.md new file mode 100644 index 000000000000..3f79e25b50e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_vlso_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_vlso DistilBertForQuestionAnswering from vlso +author: John Snow Labs +name: burmese_awesome_qa_model_vlso +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_vlso` is an English model originally trained by vlso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vlso_en_5.2.0_3.0_1701055114258.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_vlso_en_5.2.0_3.0_1701055114258.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_vlso","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_vlso", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_vlso| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vlso/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_w_adapter_erbvjnad_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_w_adapter_erbvjnad_en.md new file mode 100644 index 000000000000..d476b7330db1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_w_adapter_erbvjnad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_w_adapter_erbvjnad DistilBertForQuestionAnswering from erbvjnad +author: John Snow Labs +name: burmese_awesome_qa_model_w_adapter_erbvjnad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_w_adapter_erbvjnad` is an English model originally trained by erbvjnad. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_erbvjnad_en_5.2.0_3.0_1701055022544.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_w_adapter_erbvjnad_en_5.2.0_3.0_1701055022544.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_w_adapter_erbvjnad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_w_adapter_erbvjnad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_w_adapter_erbvjnad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/erbvjnad/my_awesome_qa_model_w_adapter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wanghao2022_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wanghao2022_en.md new file mode 100644 index 000000000000..b550692bb151 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wanghao2022_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_wanghao2022 DistilBertForQuestionAnswering from Wanghao2022 +author: John Snow Labs +name: burmese_awesome_qa_model_wanghao2022 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_wanghao2022` is an English model originally trained by Wanghao2022. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_wanghao2022_en_5.2.0_3.0_1701068732869.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_wanghao2022_en_5.2.0_3.0_1701068732869.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_wanghao2022","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_wanghao2022", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_wanghao2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Wanghao2022/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_warrior1127_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_warrior1127_en.md new file mode 100644 index 000000000000..e1ee65f94c49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_warrior1127_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_warrior1127 DistilBertForQuestionAnswering from warrior1127 +author: John Snow Labs +name: burmese_awesome_qa_model_warrior1127 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_warrior1127` is an English model originally trained by warrior1127. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_warrior1127_en_5.2.0_3.0_1701065799472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_warrior1127_en_5.2.0_3.0_1701065799472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_warrior1127","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_warrior1127", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_warrior1127| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/warrior1127/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_whyadd9_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_whyadd9_en.md new file mode 100644 index 000000000000..563a6af09a17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_whyadd9_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_whyadd9 DistilBertForQuestionAnswering from whyadd9 +author: John Snow Labs +name: burmese_awesome_qa_model_whyadd9 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_whyadd9` is an English model originally trained by whyadd9. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_whyadd9_en_5.2.0_3.0_1701057855254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_whyadd9_en_5.2.0_3.0_1701057855254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_whyadd9","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_whyadd9", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_whyadd9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/whyadd9/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_willaay3d_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_willaay3d_en.md new file mode 100644 index 000000000000..6f6f7a6f4413 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_willaay3d_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_willaay3d DistilBertForQuestionAnswering from willaay3d +author: John Snow Labs +name: burmese_awesome_qa_model_willaay3d +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_willaay3d` is an English model originally trained by willaay3d. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_willaay3d_en_5.2.0_3.0_1701086613308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_willaay3d_en_5.2.0_3.0_1701086613308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_willaay3d","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_willaay3d", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_willaay3d| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/willaay3d/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wool_peach_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wool_peach_en.md new file mode 100644 index 000000000000..94cea9a28cf5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_wool_peach_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_wool_peach DistilBertForQuestionAnswering from Wool-Peach +author: John Snow Labs +name: burmese_awesome_qa_model_wool_peach +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_wool_peach` is an English model originally trained by Wool-Peach. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_wool_peach_en_5.2.0_3.0_1701082620318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_wool_peach_en_5.2.0_3.0_1701082620318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_wool_peach","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_wool_peach", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_wool_peach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Wool-Peach/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_xiaolongbao888_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_xiaolongbao888_en.md new file mode 100644 index 000000000000..cd498334e5ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_xiaolongbao888_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_xiaolongbao888 DistilBertForQuestionAnswering from xiaolongbao888 +author: John Snow Labs +name: burmese_awesome_qa_model_xiaolongbao888 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_xiaolongbao888` is an English model originally trained by xiaolongbao888. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xiaolongbao888_en_5.2.0_3.0_1701092898600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_xiaolongbao888_en_5.2.0_3.0_1701092898600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_xiaolongbao888","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_xiaolongbao888", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_xiaolongbao888| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/xiaolongbao888/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ziangzhang10_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ziangzhang10_en.md new file mode 100644 index 000000000000..a29f896ecade --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_ziangzhang10_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_ziangzhang10 DistilBertForQuestionAnswering from ziangzhang10 +author: John Snow Labs +name: burmese_awesome_qa_model_ziangzhang10 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_ziangzhang10` is an English model originally trained by ziangzhang10. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ziangzhang10_en_5.2.0_3.0_1701082798322.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_ziangzhang10_en_5.2.0_3.0_1701082798322.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_ziangzhang10","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_ziangzhang10", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_ziangzhang10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ziangzhang10/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_zycckz_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_zycckz_en.md new file mode 100644 index 000000000000..123c50e14a7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_awesome_qa_model_zycckz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_awesome_qa_model_zycckz DistilBertForQuestionAnswering from ZycckZ +author: John Snow Labs +name: burmese_awesome_qa_model_zycckz +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_qa_model_zycckz` is an English model originally trained by ZycckZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_zycckz_en_5.2.0_3.0_1701072970860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_qa_model_zycckz_en_5.2.0_3.0_1701072970860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_awesome_qa_model_zycckz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_awesome_qa_model_zycckz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_qa_model_zycckz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ZycckZ/my_awesome_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_h_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_h_qa_model_en.md new file mode 100644 index 000000000000..47c42f7b41cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_h_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_h_qa_model DistilBertForQuestionAnswering from himanimaheshwari3 +author: John Snow Labs +name: burmese_h_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_h_qa_model` is an English model originally trained by himanimaheshwari3. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_h_qa_model_en_5.2.0_3.0_1701066763241.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_h_qa_model_en_5.2.0_3.0_1701066763241.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_h_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_h_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_h_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/himanimaheshwari3/my_h_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_fardinbh_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_fardinbh_en.md new file mode 100644 index 000000000000..8b2764109888 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_fardinbh_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_fardinbh DistilBertForQuestionAnswering from fardinbh +author: John Snow Labs +name: burmese_qa_model_fardinbh +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_fardinbh` is an English model originally trained by fardinbh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_fardinbh_en_5.2.0_3.0_1701075992850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_fardinbh_en_5.2.0_3.0_1701075992850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_fardinbh","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_fardinbh", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_fardinbh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fardinbh/my_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_pytorch_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_pytorch_en.md new file mode 100644 index 000000000000..45a8feafb265 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_qa_model_pytorch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_qa_model_pytorch DistilBertForQuestionAnswering from parasgopani94 +author: John Snow Labs +name: burmese_qa_model_pytorch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_qa_model_pytorch` is an English model originally trained by parasgopani94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_qa_model_pytorch_en_5.2.0_3.0_1701076947417.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_qa_model_pytorch_en_5.2.0_3.0_1701076947417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_qa_model_pytorch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_qa_model_pytorch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_qa_model_pytorch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/parasgopani94/my_qa_model_pytorch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-burmese_test_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-burmese_test_qa_model_en.md new file mode 100644 index 000000000000..15c47d68adbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-burmese_test_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English burmese_test_qa_model DistilBertForQuestionAnswering from oagn +author: John Snow Labs +name: burmese_test_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_test_qa_model` is an English model originally trained by oagn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_test_qa_model_en_5.2.0_3.0_1701063490802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_test_qa_model_en_5.2.0_3.0_1701063490802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("burmese_test_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("burmese_test_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_test_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/oagn/my_test_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-colab_dolly_ft_gpt2_en.md b/docs/_posts/ahmedlone127/2023-11-27-colab_dolly_ft_gpt2_en.md new file mode 100644 index 000000000000..686be735b36c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-colab_dolly_ft_gpt2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English colab_dolly_ft_gpt2 DistilBertForQuestionAnswering from 01GangaPutraBheeshma +author: John Snow Labs +name: colab_dolly_ft_gpt2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `colab_dolly_ft_gpt2` is an English model originally trained by 01GangaPutraBheeshma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/colab_dolly_ft_gpt2_en_5.2.0_3.0_1701091819526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/colab_dolly_ft_gpt2_en_5.2.0_3.0_1701091819526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("colab_dolly_ft_gpt2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("colab_dolly_ft_gpt2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|colab_dolly_ft_gpt2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/01GangaPutraBheeshma/colab_dolly_FT_gpt2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-content_norhanswar_en.md b/docs/_posts/ahmedlone127/2023-11-27-content_norhanswar_en.md new file mode 100644 index 000000000000..99fdd7a01269 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-content_norhanswar_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English content_norhanswar DistilBertForQuestionAnswering from norhanswar +author: John Snow Labs +name: content_norhanswar +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `content_norhanswar` is an English model originally trained by norhanswar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/content_norhanswar_en_5.2.0_3.0_1701047222965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/content_norhanswar_en_5.2.0_3.0_1701047222965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("content_norhanswar","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("content_norhanswar", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|content_norhanswar| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/norhanswar/content \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-dbt8_en.md b/docs/_posts/ahmedlone127/2023-11-27-dbt8_en.md new file mode 100644 index 000000000000..7bf232fef089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-dbt8_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dbt8 DistilBertForQuestionAnswering from SUTS102779289 +author: John Snow Labs +name: dbt8 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbt8` is an English model originally trained by SUTS102779289. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbt8_en_5.2.0_3.0_1701049111429.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbt8_en_5.2.0_3.0_1701049111429.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("dbt8","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("dbt8", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbt8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SUTS102779289/dbt8 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-dbt_en.md b/docs/_posts/ahmedlone127/2023-11-27-dbt_en.md new file mode 100644 index 000000000000..5a48c6c4a08e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-dbt_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English dbt DistilBertForQuestionAnswering from SUTS102779289 +author: John Snow Labs +name: dbt +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dbt` is an English model originally trained by SUTS102779289. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dbt_en_5.2.0_3.0_1701068241898.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dbt_en_5.2.0_3.0_1701068241898.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("dbt","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("dbt", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|dbt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SUTS102779289/dbt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_1epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_1epoch_en.md new file mode 100644 index 000000000000..eea8fa71bcc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_1epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_1epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_1epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_1epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_1epoch_en_5.2.0_3.0_1701085756981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_1epoch_en_5.2.0_3.0_1701085756981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_1epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("devang_qna_model_1epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|devang_qna_model_1epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herMaster/devang-qna-model-1epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_2epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_2epoch_en.md new file mode 100644 index 000000000000..7e7b940bbd43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_2epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_2epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_2epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_2epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_2epoch_en_5.2.0_3.0_1701050209508.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_2epoch_en_5.2.0_3.0_1701050209508.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_2epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("devang_qna_model_2epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|devang_qna_model_2epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herMaster/devang-qna-model-2epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_3epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_3epoch_en.md new file mode 100644 index 000000000000..ac8fab405c9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_3epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_3epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_3epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_3epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_3epoch_en_5.2.0_3.0_1701080895439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_3epoch_en_5.2.0_3.0_1701080895439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_3epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("devang_qna_model_3epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|devang_qna_model_3epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herMaster/devang-qna-model-3epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_4epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_4epoch_en.md new file mode 100644 index 000000000000..9a175c51ce4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_4epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_4epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_4epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_4epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_4epoch_en_5.2.0_3.0_1701086613688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_4epoch_en_5.2.0_3.0_1701086613688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_4epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("devang_qna_model_4epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|devang_qna_model_4epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herMaster/devang-qna-model-4epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_5epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_5epoch_en.md new file mode 100644 index 000000000000..dafab373007c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_5epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_5epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_5epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_5epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_5epoch_en_5.2.0_3.0_1701062824892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_5epoch_en_5.2.0_3.0_1701062824892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_5epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("devang_qna_model_5epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|devang_qna_model_5epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/herMaster/devang-qna-model-5epoch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_8epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_8epoch_en.md new file mode 100644 index 000000000000..a99fe6380836 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-devang_qna_model_8epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English devang_qna_model_8epoch DistilBertForQuestionAnswering from herMaster +author: John Snow Labs +name: devang_qna_model_8epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `devang_qna_model_8epoch` is an English model originally trained by herMaster. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/devang_qna_model_8epoch_en_5.2.0_3.0_1701079727732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/devang_qna_model_8epoch_en_5.2.0_3.0_1701079727732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("devang_qna_model_8epoch", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("devang_qna_model_8epoch", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|devang_qna_model_8epoch|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/herMaster/devang-qna-model-8epoch
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tune_2_en.md b/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tune_2_en.md
new file mode 100644
index 000000000000..6e206c78413a
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tune_2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distil_bert_fine_tune_2 DistilBertForQuestionAnswering from satyamverma
+author: John Snow Labs
+name: distil_bert_fine_tune_2
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_fine_tune_2` is an English model originally trained by satyamverma.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tune_2_en_5.2.0_3.0_1701067764477.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tune_2_en_5.2.0_3.0_1701067764477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distil_bert_fine_tune_2", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distil_bert_fine_tune_2", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_bert_fine_tune_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/satyamverma/Distil_BERT_Fine_Tune_2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tuned_2_en.md b/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tuned_2_en.md
new file mode 100644
index 000000000000..4cc94d6a1b2d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distil_bert_fine_tuned_2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distil_bert_fine_tuned_2 DistilBertForQuestionAnswering from satyamverma
+author: John Snow Labs
+name: distil_bert_fine_tuned_2
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distil_bert_fine_tuned_2` is an English model originally trained by satyamverma.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tuned_2_en_5.2.0_3.0_1701066737907.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distil_bert_fine_tuned_2_en_5.2.0_3.0_1701066737907.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distil_bert_fine_tuned_2", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distil_bert_fine_tuned_2", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distil_bert_fine_tuned_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/satyamverma/Distil_BERT_Fine_tuned_2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented1_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented1_en.md
new file mode 100644
index 000000000000..9b92928486ce
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_distilled_squad_augmented1 DistilBertForQuestionAnswering from christti
+author: John Snow Labs
+name: distilbert_base_cased_distilled_squad_augmented1
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_augmented1` is an English model originally trained by christti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented1_en_5.2.0_3.0_1701086928139.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented1_en_5.2.0_3.0_1701086928139.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_augmented1", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_cased_distilled_squad_augmented1", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_distilled_squad_augmented1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/christti/distilbert-base-cased-distilled-squad-augmented1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented2_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented2_en.md
new file mode 100644
index 000000000000..b020dc575611
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented2_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_distilled_squad_augmented2 DistilBertForQuestionAnswering from christti
+author: John Snow Labs
+name: distilbert_base_cased_distilled_squad_augmented2
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_augmented2` is an English model originally trained by christti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented2_en_5.2.0_3.0_1701085690785.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented2_en_5.2.0_3.0_1701085690785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_augmented2", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_cased_distilled_squad_augmented2", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_distilled_squad_augmented2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/christti/distilbert-base-cased-distilled-squad-augmented2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented3_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented3_en.md
new file mode 100644
index 000000000000..a3d187669be6
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented3_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_distilled_squad_augmented3 DistilBertForQuestionAnswering from christti
+author: John Snow Labs
+name: distilbert_base_cased_distilled_squad_augmented3
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_augmented3` is an English model originally trained by christti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented3_en_5.2.0_3.0_1701081964507.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented3_en_5.2.0_3.0_1701081964507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_augmented3", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_cased_distilled_squad_augmented3", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_distilled_squad_augmented3|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/christti/distilbert-base-cased-distilled-squad-augmented3
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented4_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented4_en.md
new file mode 100644
index 000000000000..f3cffa4cd20b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented4_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_distilled_squad_augmented4 DistilBertForQuestionAnswering from christti
+author: John Snow Labs
+name: distilbert_base_cased_distilled_squad_augmented4
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_augmented4` is an English model originally trained by christti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented4_en_5.2.0_3.0_1701085756847.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented4_en_5.2.0_3.0_1701085756847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_augmented4", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_cased_distilled_squad_augmented4", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_distilled_squad_augmented4|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/christti/distilbert-base-cased-distilled-squad-augmented4
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented_grammarly1_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented_grammarly1_en.md
new file mode 100644
index 000000000000..701fd1991c45
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_cased_distilled_squad_augmented_grammarly1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_cased_distilled_squad_augmented_grammarly1 DistilBertForQuestionAnswering from christti
+author: John Snow Labs
+name: distilbert_base_cased_distilled_squad_augmented_grammarly1
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_cased_distilled_squad_augmented_grammarly1` is an English model originally trained by christti.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented_grammarly1_en_5.2.0_3.0_1701088084444.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_cased_distilled_squad_augmented_grammarly1_en_5.2.0_3.0_1701088084444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_cased_distilled_squad_augmented_grammarly1", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_cased_distilled_squad_augmented_grammarly1", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_cased_distilled_squad_augmented_grammarly1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|247.2 MB|
+
+## References
+
+https://huggingface.co/christti/distilbert-base-cased-distilled-squad-augmented-grammarly1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_spanish_uncased_finetuned_qa_tar_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_spanish_uncased_finetuned_qa_tar_en.md
new file mode 100644
index 000000000000..61cffd001f03
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_spanish_uncased_finetuned_qa_tar_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_spanish_uncased_finetuned_qa_tar DistilBertForQuestionAnswering from dccuchile
+author: John Snow Labs
+name: distilbert_base_spanish_uncased_finetuned_qa_tar
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_spanish_uncased_finetuned_qa_tar` is an English model originally trained by dccuchile.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_tar_en_5.2.0_3.0_1701045784313.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_spanish_uncased_finetuned_qa_tar_en_5.2.0_3.0_1701045784313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_spanish_uncased_finetuned_qa_tar", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_spanish_uncased_finetuned_qa_tar", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_spanish_uncased_finetuned_qa_tar|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|250.2 MB|
+
+## References
+
+https://huggingface.co/dccuchile/distilbert-base-spanish-uncased-finetuned-qa-tar
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_coqa_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_coqa_en.md
new file mode 100644
index 000000000000..7e5f7d474172
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_coqa_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_coqa DistilBertForQuestionAnswering from peggyhuang
+author: John Snow Labs
+name: distilbert_base_uncased_coqa
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_coqa` is an English model originally trained by peggyhuang.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_coqa_en_5.2.0_3.0_1701043478221.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_coqa_en_5.2.0_3.0_1701043478221.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+
+document_assembler = MultiDocumentAssembler() \
+    .setInputCols(["question", "context"]) \
+    .setOutputCols(["document_question", "document_context"])
+
+
+spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_coqa", "en") \
+    .setInputCols(["document_question", "document_context"]) \
+    .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([document_assembler, spanClassifier])
+
+pipelineModel = pipeline.fit(data)
+
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
+
+
+val document_assembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = DistilBertForQuestionAnswering
+    .pretrained("distilbert_base_uncased_coqa", "en")
+    .setInputCols(Array("document_question", "document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier))
+
+val pipelineModel = pipeline.fit(data)
+
+val pipelineDF = pipelineModel.transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_uncased_coqa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|243.8 MB|
+
+## References
+
+https://huggingface.co/peggyhuang/distilbert-base-uncased-coqa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_diabetes_v1_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_diabetes_v1_en.md
new file mode 100644
index 000000000000..9031fe74f94e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_diabetes_v1_en.md
@@ -0,0 +1,93 @@
+---
+layout: model
+title: English distilbert_base_uncased_finetuned_diabetes_v1 DistilBertForQuestionAnswering from LeWince
+author: John Snow Labs
+name: distilbert_base_uncased_finetuned_diabetes_v1
+date: 2023-11-27
+tags: [distilbert, en, open_source, question_answering, onnx]
+task: Question Answering
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: DistilBertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_diabetes_v1` is an English model originally trained by LeWince.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_v1_en_5.2.0_3.0_1701044179708.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_diabetes_v1_en_5.2.0_3.0_1701044179708.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_diabetes_v1","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_diabetes_v1", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_diabetes_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/LeWince/distilbert-base-uncased-finetuned-diabetes-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_17_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_17_en.md new file mode 100644 index 000000000000..ee84c4cc2a38 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_17_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_17 DistilBertForQuestionAnswering from badokorach +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_17 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_17` is an English model originally trained by badokorach. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_17_en_5.2.0_3.0_1701043475126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_17_en_5.2.0_3.0_1701043475126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_17","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_17", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_17| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/badokorach/distilbert-base-uncased-finetuned-squad-17 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_adam_wein_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_adam_wein_en.md new file mode 100644 index 000000000000..4c6aa8237775 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_adam_wein_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_adam_wein DistilBertForQuestionAnswering from adam-wein +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_adam_wein +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_adam_wein` is an English model originally trained by adam-wein. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_adam_wein_en_5.2.0_3.0_1701043218832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_adam_wein_en_5.2.0_3.0_1701043218832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_adam_wein","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_adam_wein", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_adam_wein| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adam-wein/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_amool_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_amool_en.md new file mode 100644 index 000000000000..393fbceaab86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_amool_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_amool DistilBertForQuestionAnswering from Amool +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_amool +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_amool` is an English model originally trained by Amool. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_amool_en_5.2.0_3.0_1701048361368.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_amool_en_5.2.0_3.0_1701048361368.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_amool","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_amool", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_amool| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Amool/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_anhvth5_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_anhvth5_en.md new file mode 100644 index 000000000000..57172422b4a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_anhvth5_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_anhvth5 DistilBertForQuestionAnswering from anhvth5 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_anhvth5 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_anhvth5` is an English model originally trained by anhvth5. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anhvth5_en_5.2.0_3.0_1701065796083.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_anhvth5_en_5.2.0_3.0_1701065796083.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_anhvth5","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_anhvth5", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_anhvth5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anhvth5/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ashtrevi_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ashtrevi_en.md new file mode 100644 index 000000000000..5756985bf6ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ashtrevi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ashtrevi DistilBertForQuestionAnswering from ashtrevi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ashtrevi +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ashtrevi` is an English model originally trained by ashtrevi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashtrevi_en_5.2.0_3.0_1701056020117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ashtrevi_en_5.2.0_3.0_1701056020117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ashtrevi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ashtrevi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ashtrevi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ashtrevi/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_bornadr_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_bornadr_en.md new file mode 100644 index 000000000000..9c31294b504b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_bornadr_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_bornadr DistilBertForQuestionAnswering from bornadr +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_bornadr +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_bornadr` is an English model originally trained by bornadr. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bornadr_en_5.2.0_3.0_1701091087248.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_bornadr_en_5.2.0_3.0_1701091087248.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_bornadr","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_bornadr", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_bornadr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/bornadr/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_chrislunger_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_chrislunger_en.md new file mode 100644 index 000000000000..7925a5650868 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_chrislunger_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_chrislunger DistilBertForQuestionAnswering from ChrisLunger +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_chrislunger +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_chrislunger` is an English model originally trained by ChrisLunger. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chrislunger_en_5.2.0_3.0_1701066718778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_chrislunger_en_5.2.0_3.0_1701066718778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_chrislunger","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_chrislunger", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_chrislunger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ChrisLunger/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_david_xu_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_david_xu_en.md new file mode 100644 index 000000000000..52509d0d7610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_david_xu_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_david_xu DistilBertForQuestionAnswering from David-Xu +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_david_xu +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_david_xu` is an English model originally trained by David-Xu. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_david_xu_en_5.2.0_3.0_1701053292201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_david_xu_en_5.2.0_3.0_1701053292201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_david_xu","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_david_xu", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_david_xu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/David-Xu/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_dhiruhf_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_dhiruhf_en.md new file mode 100644 index 000000000000..505a78d5eebf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_dhiruhf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_dhiruhf DistilBertForQuestionAnswering from dhiruHF +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_dhiruhf +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_dhiruhf` is an English model originally trained by dhiruHF. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dhiruhf_en_5.2.0_3.0_1701057857060.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_dhiruhf_en_5.2.0_3.0_1701057857060.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_dhiruhf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_dhiruhf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_dhiruhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/dhiruHF/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_doobopp_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_doobopp_en.md new file mode 100644 index 000000000000..d8aa658d4ce3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_doobopp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_doobopp DistilBertForQuestionAnswering from doobopp +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_doobopp +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_doobopp` is an English model originally trained by doobopp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_doobopp_en_5.2.0_3.0_1701050570989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_doobopp_en_5.2.0_3.0_1701050570989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_doobopp","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_doobopp", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_doobopp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/doobopp/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_emilie_amandine_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_emilie_amandine_en.md new file mode 100644 index 000000000000..44610b4fb353 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_emilie_amandine_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_emilie_amandine DistilBertForQuestionAnswering from Emilie-Amandine +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_emilie_amandine +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_emilie_amandine` is an English model originally trained by Emilie-Amandine.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_emilie_amandine_en_5.2.0_3.0_1701086671631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_emilie_amandine_en_5.2.0_3.0_1701086671631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_emilie_amandine","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_emilie_amandine", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_emilie_amandine| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Emilie-Amandine/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ephemer1s_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ephemer1s_en.md new file mode 100644 index 000000000000..36323a7510ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_ephemer1s_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_ephemer1s DistilBertForQuestionAnswering from ephemer1s +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_ephemer1s +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_ephemer1s` is an English model originally trained by ephemer1s.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ephemer1s_en_5.2.0_3.0_1701043370534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_ephemer1s_en_5.2.0_3.0_1701043370534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_ephemer1s","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_ephemer1s", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_ephemer1s| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ephemer1s/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_finetuned_team2_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_finetuned_team2_en.md new file mode 100644 index 000000000000..b50bdf2c964e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_finetuned_team2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_finetuned_team2 DistilBertForQuestionAnswering from choz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_finetuned_team2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_finetuned_team2` is an English model originally trained by choz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_team2_en_5.2.0_3.0_1701066686133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_finetuned_team2_en_5.2.0_3.0_1701066686133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_finetuned_team2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_finetuned_team2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_finetuned_team2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/choz/distilbert-base-uncased-finetuned-squad-finetuned-team2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fmartinmonier_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fmartinmonier_en.md new file mode 100644 index 000000000000..77e73252e893 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fmartinmonier_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_fmartinmonier DistilBertForQuestionAnswering from fmartinmonier +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_fmartinmonier +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_fmartinmonier` is an English model originally trained by fmartinmonier.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fmartinmonier_en_5.2.0_3.0_1701050209478.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fmartinmonier_en_5.2.0_3.0_1701050209478.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_fmartinmonier","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_fmartinmonier", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_fmartinmonier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/fmartinmonier/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fuutoru_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fuutoru_en.md new file mode 100644 index 000000000000..0291dac37a60 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_fuutoru_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_fuutoru DistilBertForQuestionAnswering from FuuToru +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_fuutoru +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_fuutoru` is an English model originally trained by FuuToru.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fuutoru_en_5.2.0_3.0_1701065082923.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_fuutoru_en_5.2.0_3.0_1701065082923.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_fuutoru","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_fuutoru", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_fuutoru| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FuuToru/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_hadalee_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_hadalee_en.md new file mode 100644 index 000000000000..37860c720c6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_hadalee_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_hadalee DistilBertForQuestionAnswering from hadalee +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_hadalee +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_hadalee` is an English model originally trained by hadalee.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hadalee_en_5.2.0_3.0_1701043915435.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_hadalee_en_5.2.0_3.0_1701043915435.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_hadalee","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_hadalee", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_hadalee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/hadalee/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_harishbolem_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_harishbolem_en.md new file mode 100644 index 000000000000..be5fd13bcabc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_harishbolem_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_harishbolem DistilBertForQuestionAnswering from harishbolem +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_harishbolem +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_harishbolem` is an English model originally trained by harishbolem.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harishbolem_en_5.2.0_3.0_1701060904594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_harishbolem_en_5.2.0_3.0_1701060904594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_harishbolem","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_harishbolem", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_harishbolem| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/harishbolem/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeffrey1963_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeffrey1963_en.md new file mode 100644 index 000000000000..b4ce376f6709 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeffrey1963_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jeffrey1963 DistilBertForQuestionAnswering from jeffrey1963 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jeffrey1963 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jeffrey1963` is an English model originally trained by jeffrey1963.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jeffrey1963_en_5.2.0_3.0_1701045384971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jeffrey1963_en_5.2.0_3.0_1701045384971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jeffrey1963","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jeffrey1963", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jeffrey1963| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jeffrey1963/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeukhwang_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeukhwang_en.md new file mode 100644 index 000000000000..a95d34ead279 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_jeukhwang_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_jeukhwang DistilBertForQuestionAnswering from JeukHwang +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_jeukhwang +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_jeukhwang` is an English model originally trained by JeukHwang.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jeukhwang_en_5.2.0_3.0_1701091954946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_jeukhwang_en_5.2.0_3.0_1701091954946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_jeukhwang","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_jeukhwang", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_jeukhwang| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JeukHwang/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_martyyz_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_martyyz_en.md new file mode 100644 index 000000000000..f1a47147f641 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_martyyz_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_martyyz DistilBertForQuestionAnswering from martyyz +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_martyyz +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_martyyz` is an English model originally trained by martyyz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_martyyz_en_5.2.0_3.0_1701044789054.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_martyyz_en_5.2.0_3.0_1701044789054.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_martyyz","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_martyyz", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_martyyz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/martyyz/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mikesharp01_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mikesharp01_en.md new file mode 100644 index 000000000000..a19927e67485 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mikesharp01_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mikesharp01 DistilBertForQuestionAnswering from MikeSharp01 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mikesharp01 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mikesharp01` is an English model originally trained by MikeSharp01. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mikesharp01_en_5.2.0_3.0_1701045264348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mikesharp01_en_5.2.0_3.0_1701045264348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mikesharp01","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mikesharp01", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mikesharp01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MikeSharp01/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_muditash_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_muditash_en.md new file mode 100644 index 000000000000..1fe595f2fb52 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_muditash_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_muditash DistilBertForQuestionAnswering from muditash +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_muditash +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_muditash` is an English model originally trained by muditash. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_muditash_en_5.2.0_3.0_1701043913050.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_muditash_en_5.2.0_3.0_1701043913050.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_muditash","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_muditash", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_muditash| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/muditash/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mufaawan_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mufaawan_en.md new file mode 100644 index 000000000000..639b9d747846 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_mufaawan_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_mufaawan DistilBertForQuestionAnswering from mufaawan +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_mufaawan +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_mufaawan` is an English model originally trained by mufaawan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mufaawan_en_5.2.0_3.0_1701063707910.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_mufaawan_en_5.2.0_3.0_1701063707910.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_mufaawan","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_mufaawan", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_mufaawan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mufaawan/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nathanbourq_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nathanbourq_en.md new file mode 100644 index 000000000000..f6cd1d5bb65a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nathanbourq_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nathanbourq DistilBertForQuestionAnswering from nathanbourq +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nathanbourq +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nathanbourq` is an English model originally trained by nathanbourq. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nathanbourq_en_5.2.0_3.0_1701051615833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nathanbourq_en_5.2.0_3.0_1701051615833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nathanbourq","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nathanbourq", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nathanbourq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nathanbourq/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nszknao_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nszknao_en.md new file mode 100644 index 000000000000..94bcbac31915 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_nszknao_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_nszknao DistilBertForQuestionAnswering from nszknao +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_nszknao +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_nszknao` is an English model originally trained by nszknao. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nszknao_en_5.2.0_3.0_1701065860203.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_nszknao_en_5.2.0_3.0_1701065860203.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_nszknao","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_nszknao", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_nszknao| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nszknao/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_rambodghandiparsi_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_rambodghandiparsi_en.md new file mode 100644 index 000000000000..08068cb5e9f9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_rambodghandiparsi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_rambodghandiparsi DistilBertForQuestionAnswering from RambodGhandiparsi +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_rambodghandiparsi +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_rambodghandiparsi` is an English model originally trained by RambodGhandiparsi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rambodghandiparsi_en_5.2.0_3.0_1701071996354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_rambodghandiparsi_en_5.2.0_3.0_1701071996354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_rambodghandiparsi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_rambodghandiparsi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_rambodghandiparsi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/RambodGhandiparsi/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sornpichai_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sornpichai_en.md new file mode 100644 index 000000000000..f8fc30b3da03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sornpichai_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sornpichai DistilBertForQuestionAnswering from Sornpichai +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sornpichai +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sornpichai` is an English model originally trained by Sornpichai. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sornpichai_en_5.2.0_3.0_1701044033986.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sornpichai_en_5.2.0_3.0_1701044033986.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sornpichai","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sornpichai", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sornpichai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sornpichai/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_stkf_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_stkf_en.md new file mode 100644 index 000000000000..5b7c369fb492 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_stkf_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_stkf DistilBertForQuestionAnswering from stkf +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_stkf +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_stkf` is an English model originally trained by stkf. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stkf_en_5.2.0_3.0_1701044592970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_stkf_en_5.2.0_3.0_1701044592970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_stkf","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_stkf", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_stkf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/stkf/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sunhface_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sunhface_en.md new file mode 100644 index 000000000000..d76db99aec67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_sunhface_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_sunhface DistilBertForQuestionAnswering from sunhface +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_sunhface +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_sunhface` is an English model originally trained by sunhface. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sunhface_en_5.2.0_3.0_1701043204509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_sunhface_en_5.2.0_3.0_1701043204509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_sunhface","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_sunhface", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_sunhface| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/sunhface/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tmobaggins_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tmobaggins_en.md new file mode 100644 index 000000000000..0aa9b69f8dbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tmobaggins_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tmobaggins DistilBertForQuestionAnswering from tmobaggins +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tmobaggins +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tmobaggins` is an English model originally trained by tmobaggins. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tmobaggins_en_5.2.0_3.0_1701043359051.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tmobaggins_en_5.2.0_3.0_1701043359051.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tmobaggins","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tmobaggins", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tmobaggins| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|684.9 KB| + +## References + +https://huggingface.co/tmobaggins/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tonnysasse_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tonnysasse_en.md new file mode 100644 index 000000000000..5ab055635f04 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_tonnysasse_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_tonnysasse DistilBertForQuestionAnswering from tonnysasse +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_tonnysasse +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_tonnysasse` is an English model originally trained by tonnysasse.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tonnysasse_en_5.2.0_3.0_1701043885321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_tonnysasse_en_5.2.0_3.0_1701043885321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_tonnysasse","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_tonnysasse", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_tonnysasse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/tonnysasse/distilbert-base-uncased-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_v_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_v_en.md new file mode 100644 index 000000000000..f0fc6f3a9546 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_finetuned_squad_v_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_finetuned_squad_v DistilBertForQuestionAnswering from lauraparra28 +author: John Snow Labs +name: distilbert_base_uncased_finetuned_squad_v +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_finetuned_squad_v` is an English model originally trained by lauraparra28.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v_en_5.2.0_3.0_1701044037349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_finetuned_squad_v_en_5.2.0_3.0_1701044037349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_finetuned_squad_v","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_finetuned_squad_v", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_finetuned_squad_v| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/lauraparra28/distilbert-base-uncased-finetuned-squad-v \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_gladiator_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_gladiator_en.md new file mode 100644 index 000000000000..0adc500efc2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_gladiator_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squad_qa_gladiator DistilBertForQuestionAnswering from Gladiator +author: John Snow Labs +name: distilbert_base_uncased_squad_qa_gladiator +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squad_qa_gladiator` is an English model originally trained by Gladiator.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_qa_gladiator_en_5.2.0_3.0_1701044652577.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_qa_gladiator_en_5.2.0_3.0_1701044652577.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squad_qa_gladiator","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squad_qa_gladiator", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squad_qa_gladiator| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Gladiator/distilbert-base-uncased_squad_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_prithvirajg_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_prithvirajg_en.md new file mode 100644 index 000000000000..e77c7820b7f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_base_uncased_squad_qa_prithvirajg_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_base_uncased_squad_qa_prithvirajg DistilBertForQuestionAnswering from PrithvirajG +author: John Snow Labs +name: distilbert_base_uncased_squad_qa_prithvirajg +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_base_uncased_squad_qa_prithvirajg` is an English model originally trained by PrithvirajG.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_qa_prithvirajg_en_5.2.0_3.0_1701043482587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_uncased_squad_qa_prithvirajg_en_5.2.0_3.0_1701043482587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_base_uncased_squad_qa_prithvirajg","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_base_uncased_squad_qa_prithvirajg", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_base_uncased_squad_qa_prithvirajg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/PrithvirajG/distilbert-base-uncased_squad_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_finetuned0_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_finetuned0_en.md new file mode 100644 index 000000000000..ceea753a797c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_finetuned0_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_finetuned0 DistilBertForQuestionAnswering from theArif +author: John Snow Labs +name: distilbert_finetuned0 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_finetuned0` is an English model originally trained by theArif. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_finetuned0_en_5.2.0_3.0_1701050571864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_finetuned0_en_5.2.0_3.0_1701050571864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_finetuned0","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_finetuned0", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_finetuned0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/theArif/distilBert_finetuned0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_multilingual_xx.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_multilingual_xx.md new file mode 100644 index 000000000000..8ea913d9b0e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_multilingual_xx.md @@ -0,0 +1,93 @@ +--- +layout: model +title: Multilingual distilbert_multilingual DistilBertForQuestionAnswering from Timostrijbis +author: John Snow Labs +name: distilbert_multilingual +date: 2023-11-27 +tags: [distilbert, xx, open_source, question_answering, onnx] +task: Question Answering +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_multilingual` is a multilingual model originally trained by Timostrijbis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_xx_5.2.0_3.0_1701055300473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_multilingual_xx_5.2.0_3.0_1701055300473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_multilingual","xx") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_multilingual", "xx") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_multilingual| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|xx| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Timostrijbis/distilBERT-Multilingual \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_pst_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_pst_en.md new file mode 100644 index 000000000000..01cab8267ce4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_pst_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_pst DistilBertForQuestionAnswering from zaidbhatti +author: John Snow Labs +name: distilbert_pst +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_pst` is an English model originally trained by zaidbhatti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_pst_en_5.2.0_3.0_1701088419323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_pst_en_5.2.0_3.0_1701088419323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_pst","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_pst", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_pst| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zaidbhatti/distilbert-pst \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-distilbert_slp_en.md b/docs/_posts/ahmedlone127/2023-11-27-distilbert_slp_en.md new file mode 100644 index 000000000000..c70a54fb22ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-distilbert_slp_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English distilbert_slp DistilBertForQuestionAnswering from rowan1224 +author: John Snow Labs +name: distilbert_slp +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbert_slp` is an English model originally trained by rowan1224. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_slp_en_5.2.0_3.0_1701047388956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_slp_en_5.2.0_3.0_1701047388956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("distilbert_slp","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("distilbert_slp", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbert_slp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/rowan1224/distilbert-slp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-edtech_v2_en.md b/docs/_posts/ahmedlone127/2023-11-27-edtech_v2_en.md new file mode 100644 index 000000000000..1bfd28ca583e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-edtech_v2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English edtech_v2 DistilBertForQuestionAnswering from phanimvsk +author: John Snow Labs +name: edtech_v2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `edtech_v2` is an English model originally trained by phanimvsk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/edtech_v2_en_5.2.0_3.0_1701068639446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/edtech_v2_en_5.2.0_3.0_1701068639446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("edtech_v2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("edtech_v2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|edtech_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/phanimvsk/Edtech_v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-experiment_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-experiment_qa_en.md new file mode 100644 index 000000000000..39237cfdffec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-experiment_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English experiment_qa DistilBertForQuestionAnswering from Amal17 +author: John Snow Labs +name: experiment_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `experiment_qa` is an English model originally trained by Amal17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/experiment_qa_en_5.2.0_3.0_1701062697436.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/experiment_qa_en_5.2.0_3.0_1701062697436.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# `data` is assumed to be a DataFrame with string columns "question" and "context" +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + +spanClassifier = DistilBertForQuestionAnswering.pretrained("experiment_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// `data` is assumed to be a DataFrame with string columns "question" and "context" +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("experiment_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|experiment_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Amal17/experiment-qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-extraction_en.md b/docs/_posts/ahmedlone127/2023-11-27-extraction_en.md new file mode 100644 index 000000000000..a4cf0295f791 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-extraction_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English extraction DistilBertForQuestionAnswering from Deopusi +author: John Snow Labs +name: extraction +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `extraction` is an English model originally trained by Deopusi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/extraction_en_5.2.0_3.0_1701058321711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/extraction_en_5.2.0_3.0_1701058321711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("extraction","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("extraction", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Deopusi/extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-fine_tuned_bert_pariwisata_bali_en.md b/docs/_posts/ahmedlone127/2023-11-27-fine_tuned_bert_pariwisata_bali_en.md new file mode 100644 index 000000000000..0a182f42123d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-fine_tuned_bert_pariwisata_bali_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English fine_tuned_bert_pariwisata_bali DistilBertForQuestionAnswering from SwastyMaharani +author: John Snow Labs +name: fine_tuned_bert_pariwisata_bali +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tuned_bert_pariwisata_bali` is an English model originally trained by SwastyMaharani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_pariwisata_bali_en_5.2.0_3.0_1701090594389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tuned_bert_pariwisata_bali_en_5.2.0_3.0_1701090594389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("fine_tuned_bert_pariwisata_bali","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("fine_tuned_bert_pariwisata_bali", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fine_tuned_bert_pariwisata_bali| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SwastyMaharani/fine-tuned-bert-pariwisata-bali \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-finetuned_qna_en.md b/docs/_posts/ahmedlone127/2023-11-27-finetuned_qna_en.md new file mode 100644 index 000000000000..fc5baacfd506 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-finetuned_qna_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English finetuned_qna DistilBertForQuestionAnswering from Sarthak7777 +author: John Snow Labs +name: finetuned_qna +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_qna` is an English model originally trained by Sarthak7777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_qna_en_5.2.0_3.0_1701056223720.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_qna_en_5.2.0_3.0_1701056223720.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("finetuned_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("finetuned_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sarthak7777/finetuned-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-germanic_languages_qna_en.md b/docs/_posts/ahmedlone127/2023-11-27-germanic_languages_qna_en.md new file mode 100644 index 000000000000..f1e56bf6d148 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-germanic_languages_qna_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English germanic_languages_qna DistilBertForQuestionAnswering from zuu +author: John Snow Labs +name: germanic_languages_qna +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `germanic_languages_qna` is an English model originally trained by zuu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/germanic_languages_qna_en_5.2.0_3.0_1701060162046.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/germanic_languages_qna_en_5.2.0_3.0_1701060162046.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("germanic_languages_qna","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("germanic_languages_qna", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|germanic_languages_qna| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zuu/gem-qna \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-health_question_en.md b/docs/_posts/ahmedlone127/2023-11-27-health_question_en.md new file mode 100644 index 000000000000..b92012376b1f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-health_question_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English health_question DistilBertForQuestionAnswering from Sarthak7777 +author: John Snow Labs +name: health_question +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `health_question` is an English model originally trained by Sarthak7777. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/health_question_en_5.2.0_3.0_1701071003926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/health_question_en_5.2.0_3.0_1701071003926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("health_question","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("health_question", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|health_question| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Sarthak7777/health_question \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-hugging_face_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-hugging_face_qa_en.md new file mode 100644 index 000000000000..25b62a0a364f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-hugging_face_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English hugging_face_qa DistilBertForQuestionAnswering from zyacub +author: John Snow Labs +name: hugging_face_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hugging_face_qa` is an English model originally trained by zyacub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hugging_face_qa_en_5.2.0_3.0_1701060530306.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hugging_face_qa_en_5.2.0_3.0_1701060530306.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("hugging_face_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("hugging_face_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hugging_face_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/zyacub/hugging_face_QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2_en.md new file mode 100644 index 000000000000..8ec4eff3945b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2 DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2` is an English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2_en_5.2.0_3.0_1701044039802.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2_en_5.2.0_3.0_1701044039802.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.7_DistilBert_DIFFERENT_UNK_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4_en.md new file mode 100644 index 000000000000..16b22a9ca1d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4 DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4` is an English 
model originally trained by chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4_en_5.2.0_3.0_1701043741397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4_en_5.2.0_3.0_1701043741397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_7_distilbert_different_unk_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.7_DistilBert_DIFFERENT_UNK_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test_en.md new file mode 100644 index 000000000000..55c190534412 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test` is an English model originally 
trained by chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test_en_5.2.0_3.0_1701052087002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test_en_5.2.0_3.0_1701052087002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_companyname_and_location_and_series_extraction_qa_model_1_8_distilbert_paraphr_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_CompanyName_AND_Location_AND_Series_Extraction_QA_Model_1.8_DistilBert_PARAPHR_TEST \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_distilbert_model_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_distilbert_model_qa_en.md new file mode 100644 index 000000000000..686e20415d44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_distilbert_model_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_distilbert_model_qa DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_distilbert_model_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iotnation_distilbert_model_qa` is an English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_distilbert_model_qa_en_5.2.0_3.0_1701065799412.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_distilbert_model_qa_en_5.2.0_3.0_1701065799412.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +# Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_distilbert_model_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala +// Assumes `data` is a Spark DataFrame with string columns "question" and "context". + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_distilbert_model_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_distilbert_model_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_DistilBert_Model_QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison_en.md new file mode 100644 index 000000000000..0e701ba52355 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison_en_5.2.0_3.0_1701073055473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison_en_5.2.0_3.0_1701073055473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_2_01_distilbert_no_unk_dataset_for_comparison| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_2.01_DistilBert_NO_UNK_DATASET_FOR_COMPARISON \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries_en.md new file mode 100644 index 000000000000..f8c911e5332e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries_en_5.2.0_3.0_1701078596522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries_en_5.2.0_3.0_1701078596522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_2_0_distilbert_unk_dataset_50_entries| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_2.0_DistilBert_UNK_DATASET_50_ENTRIES \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch_en.md new file mode 100644 index 000000000000..ceff4f46237f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch_en_5.2.0_3.0_1701069495645.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch_en_5.2.0_3.0_1701069495645.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_2_6_distilbert_original_3e_5_512_length_5_epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_2.6_DISTILBERT_ORIGINAL_3e-5_512_LENGTH_5_EPOCH \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch_en.md new file mode 100644 index 000000000000..ed5e57077d30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch_en_5.2.0_3.0_1701071078991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch_en_5.2.0_3.0_1701071078991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_2_6_distilbert_original_4e_5_384_length_5_epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_2.6_DISTILBERT_ORIGINAL_4e-5_384_LENGTH_5_EPOCH \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch_en.md new file mode 100644 index 000000000000..a3431648bbbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch_en_5.2.0_3.0_1701069202850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch_en_5.2.0_3.0_1701069202850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_2_6_distilbert_original_5e_5_512_length_5_epoch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_2.6_DISTILBERT_ORIGINAL_5e-5_512_LENGTH_5_EPOCH \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_3_0_bert_original_3e_5_384_length_en.md b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_3_0_bert_original_3e_5_384_length_en.md new file mode 100644 index 000000000000..b26d3a0b770f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-iotnation_qa_model_3_0_bert_original_3e_5_384_length_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English iotnation_qa_model_3_0_bert_original_3e_5_384_length DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: iotnation_qa_model_3_0_bert_original_3e_5_384_length +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`iotnation_qa_model_3_0_bert_original_3e_5_384_length` is a English model originally trained by chriskim2273. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_3_0_bert_original_3e_5_384_length_en_5.2.0_3.0_1701044758850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iotnation_qa_model_3_0_bert_original_3e_5_384_length_en_5.2.0_3.0_1701044758850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("iotnation_qa_model_3_0_bert_original_3e_5_384_length","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("iotnation_qa_model_3_0_bert_original_3e_5_384_length", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iotnation_qa_model_3_0_bert_original_3e_5_384_length| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/IOTNation_QA_Model_3.0_BERT_ORIGINAL_3e-5_384_LENGTH \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-kannada_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-kannada_qa_model_en.md new file mode 100644 index 000000000000..3c079713ebd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-kannada_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English kannada_qa_model DistilBertForQuestionAnswering from someonegg +author: John Snow Labs +name: kannada_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`kannada_qa_model` is a English model originally trained by someonegg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kannada_qa_model_en_5.2.0_3.0_1701071003921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kannada_qa_model_en_5.2.0_3.0_1701071003921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("kannada_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("kannada_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kannada_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/someonegg/kn_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-ma_saemi_en.md b/docs/_posts/ahmedlone127/2023-11-27-ma_saemi_en.md new file mode 100644 index 000000000000..73430c564664 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-ma_saemi_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English ma_saemi DistilBertForQuestionAnswering from FranzderPapst +author: John Snow Labs +name: ma_saemi +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ma_saemi` is a English model originally trained by FranzderPapst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ma_saemi_en_5.2.0_3.0_1701054349255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ma_saemi_en_5.2.0_3.0_1701054349255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("ma_saemi","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("ma_saemi", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ma_saemi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/FranzderPapst/MA-saemi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-mlg_en.md b/docs/_posts/ahmedlone127/2023-11-27-mlg_en.md new file mode 100644 index 000000000000..d6d0e95bc1f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-mlg_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English mlg DistilBertForQuestionAnswering from JorgeUDG +author: John Snow Labs +name: mlg +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`mlg` is a English model originally trained by JorgeUDG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mlg_en_5.2.0_3.0_1701076947424.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mlg_en_5.2.0_3.0_1701076947424.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("mlg","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("mlg", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mlg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/JorgeUDG/MLG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-model1_en.md b/docs/_posts/ahmedlone127/2023-11-27-model1_en.md new file mode 100644 index 000000000000..20e50df75699 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-model1_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English model1 BertEmbeddings from flymushroom +author: John Snow Labs +name: model1 +date: 2023-11-27 +tags: [bert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`model1` is a English model originally trained by flymushroom. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model1_en_5.2.0_3.0_1701043602895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model1_en_5.2.0_3.0_1701043602895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = BertEmbeddings.pretrained("model1","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = BertEmbeddings + .pretrained("model1", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +References + +https://huggingface.co/flymushroom/model1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-model3_en.md b/docs/_posts/ahmedlone127/2023-11-27-model3_en.md new file mode 100644 index 000000000000..82742ad6502f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-model3_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model3 DistilBertForQuestionAnswering from Vasu07 +author: John Snow Labs +name: model3 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`model3` is a English model originally trained by Vasu07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model3_en_5.2.0_3.0_1701044181600.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model3_en_5.2.0_3.0_1701044181600.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("model3","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("model3", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Vasu07/model3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-model4_en.md b/docs/_posts/ahmedlone127/2023-11-27-model4_en.md new file mode 100644 index 000000000000..c009f626ce1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-model4_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model4 DistilBertForQuestionAnswering from Vasu07 +author: John Snow Labs +name: model4 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model4` is an English model originally trained by Vasu07. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model4_en_5.2.0_3.0_1701072882074.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model4_en_5.2.0_3.0_1701072882074.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("model4","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("model4", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Vasu07/model4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-model_mrcl0ud_en.md b/docs/_posts/ahmedlone127/2023-11-27-model_mrcl0ud_en.md new file mode 100644 index 000000000000..e5f006c9ddc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-model_mrcl0ud_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_mrcl0ud DistilBertForQuestionAnswering from MrCl0ud +author: John Snow Labs +name: model_mrcl0ud +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_mrcl0ud` is an English model originally trained by MrCl0ud. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_mrcl0ud_en_5.2.0_3.0_1701069495618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_mrcl0ud_en_5.2.0_3.0_1701069495618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("model_mrcl0ud","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("model_mrcl0ud", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_mrcl0ud| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/MrCl0ud/model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-model_ywoo_en.md b/docs/_posts/ahmedlone127/2023-11-27-model_ywoo_en.md new file mode 100644 index 000000000000..62bd6c5ce93f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-model_ywoo_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English model_ywoo DistilBertForQuestionAnswering from ywoo +author: John Snow Labs +name: model_ywoo +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_ywoo` is an English model originally trained by ywoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_ywoo_en_5.2.0_3.0_1701085756672.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_ywoo_en_5.2.0_3.0_1701085756672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("model_ywoo","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("model_ywoo", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_ywoo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/ywoo/model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-olonok_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-olonok_qa_model_en.md new file mode 100644 index 000000000000..bb16feb5c659 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-olonok_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English olonok_qa_model DistilBertForQuestionAnswering from olonok +author: John Snow Labs +name: olonok_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `olonok_qa_model` is an English model originally trained by olonok. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/olonok_qa_model_en_5.2.0_3.0_1701060704467.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/olonok_qa_model_en_5.2.0_3.0_1701060704467.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("olonok_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("olonok_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|olonok_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/olonok/olonok_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-oneapi_qa_model_kaggle_en.md b/docs/_posts/ahmedlone127/2023-11-27-oneapi_qa_model_kaggle_en.md new file mode 100644 index 000000000000..fff0b8e71481 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-oneapi_qa_model_kaggle_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English oneapi_qa_model_kaggle DistilBertForQuestionAnswering from badalsahani +author: John Snow Labs +name: oneapi_qa_model_kaggle +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oneapi_qa_model_kaggle` is an English model originally trained by badalsahani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oneapi_qa_model_kaggle_en_5.2.0_3.0_1701087515362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oneapi_qa_model_kaggle_en_5.2.0_3.0_1701087515362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("oneapi_qa_model_kaggle","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("oneapi_qa_model_kaggle", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|oneapi_qa_model_kaggle| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/badalsahani/oneAPI_QA_Model_kaggle \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-pruebamodelotfm_distilbert_in_en.md b/docs/_posts/ahmedlone127/2023-11-27-pruebamodelotfm_distilbert_in_en.md new file mode 100644 index 000000000000..0c2fdfc3b1f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-pruebamodelotfm_distilbert_in_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pruebamodelotfm_distilbert_in DistilBertForQuestionAnswering from pamelapaolacb +author: John Snow Labs +name: pruebamodelotfm_distilbert_in +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruebamodelotfm_distilbert_in` is an English model originally trained by pamelapaolacb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruebamodelotfm_distilbert_in_en_5.2.0_3.0_1701075902214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruebamodelotfm_distilbert_in_en_5.2.0_3.0_1701075902214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("pruebamodelotfm_distilbert_in","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("pruebamodelotfm_distilbert_in", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruebamodelotfm_distilbert_in| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/pamelapaolacb/pruebaModeloTFM_DistilBert_in \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-pruned_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-pruned_model_en.md new file mode 100644 index 000000000000..294a932937bf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-pruned_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pruned_model DistilBertForQuestionAnswering from vxbrandon +author: John Snow Labs +name: pruned_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruned_model` is an English model originally trained by vxbrandon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruned_model_en_5.2.0_3.0_1701075955229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruned_model_en_5.2.0_3.0_1701075955229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("pruned_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("pruned_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruned_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.0 MB| + +## References + +https://huggingface.co/vxbrandon/pruned_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-pruned_model_iterative_en.md b/docs/_posts/ahmedlone127/2023-11-27-pruned_model_iterative_en.md new file mode 100644 index 000000000000..7fa9c2bdb655 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-pruned_model_iterative_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English pruned_model_iterative DistilBertForQuestionAnswering from vxbrandon +author: John Snow Labs +name: pruned_model_iterative +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pruned_model_iterative` is an English model originally trained by vxbrandon. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pruned_model_iterative_en_5.2.0_3.0_1701053341271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pruned_model_iterative_en_5.2.0_3.0_1701053341271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("pruned_model_iterative","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("pruned_model_iterative", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pruned_model_iterative| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/vxbrandon/pruned_model_iterative \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-psy_q_a_test_with_starts_en.md b/docs/_posts/ahmedlone127/2023-11-27-psy_q_a_test_with_starts_en.md new file mode 100644 index 000000000000..c861cf67d2a6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-psy_q_a_test_with_starts_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English psy_q_a_test_with_starts DistilBertForQuestionAnswering from plgrm720 +author: John Snow Labs +name: psy_q_a_test_with_starts +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `psy_q_a_test_with_starts` is an English model originally trained by plgrm720. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/psy_q_a_test_with_starts_en_5.2.0_3.0_1701051177912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/psy_q_a_test_with_starts_en_5.2.0_3.0_1701051177912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("psy_q_a_test_with_starts","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("psy_q_a_test_with_starts", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|psy_q_a_test_with_starts| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/plgrm720/psy_q_a_test_with_starts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_azbukau_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_azbukau_en.md new file mode 100644 index 000000000000..3c6fd3e59a2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_azbukau_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_azbukau DistilBertForQuestionAnswering from Azbukau +author: John Snow Labs +name: qa_azbukau +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_azbukau` is an English model originally trained by Azbukau. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_azbukau_en_5.2.0_3.0_1701092765295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_azbukau_en_5.2.0_3.0_1701092765295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_azbukau","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_azbukau", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_azbukau| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Azbukau/QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_model_angelogonzales_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_model_angelogonzales_en.md new file mode 100644 index 000000000000..b80a6e0f1e7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_model_angelogonzales_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_model_angelogonzales DistilBertForQuestionAnswering from angelogonzales +author: John Snow Labs +name: qa_model_angelogonzales +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_angelogonzales` is an English model originally trained by angelogonzales. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_angelogonzales_en_5.2.0_3.0_1701073906977.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_angelogonzales_en_5.2.0_3.0_1701073906977.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_angelogonzales","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_model_angelogonzales", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_angelogonzales| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/angelogonzales/qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_model_inner_exper_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_model_inner_exper_en.md new file mode 100644 index 000000000000..56dabca9cbc3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_model_inner_exper_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_model_inner_exper DistilBertForQuestionAnswering from AtomGradient +author: John Snow Labs +name: qa_model_inner_exper +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_inner_exper` is an English model originally trained by AtomGradient. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_inner_exper_en_5.2.0_3.0_1701048663921.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_inner_exper_en_5.2.0_3.0_1701048663921.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_inner_exper","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_model_inner_exper", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_inner_exper| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/AtomGradient/qa_model_inner_exper \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_model_netgvarun2005_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_model_netgvarun2005_en.md new file mode 100644 index 000000000000..26dcce283c77 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_model_netgvarun2005_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_model_netgvarun2005 DistilBertForQuestionAnswering from netgvarun2005 +author: John Snow Labs +name: qa_model_netgvarun2005 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_netgvarun2005` is an English model originally trained by netgvarun2005. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_netgvarun2005_en_5.2.0_3.0_1701092695726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_netgvarun2005_en_5.2.0_3.0_1701092695726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_netgvarun2005","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_model_netgvarun2005", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_netgvarun2005| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/netgvarun2005/qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_model_test_jmslord_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_model_test_jmslord_en.md new file mode 100644 index 000000000000..02b3946fb0c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_model_test_jmslord_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_model_test_jmslord DistilBertForQuestionAnswering from jmslord +author: John Snow Labs +name: qa_model_test_jmslord +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_model_test_jmslord` is an English model originally trained by jmslord. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_model_test_jmslord_en_5.2.0_3.0_1701088845263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_model_test_jmslord_en_5.2.0_3.0_1701088845263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_model_test_jmslord","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_model_test_jmslord", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_model_test_jmslord| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jmslord/qa_model_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_muhammadabrar9999_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_muhammadabrar9999_en.md new file mode 100644 index 000000000000..b70a3b0fbea2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_muhammadabrar9999_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_muhammadabrar9999 DistilBertForQuestionAnswering from muhammadabrar9999 +author: John Snow Labs +name: qa_muhammadabrar9999 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_muhammadabrar9999` is an English model originally trained by muhammadabrar9999. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_muhammadabrar9999_en_5.2.0_3.0_1701053249138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_muhammadabrar9999_en_5.2.0_3.0_1701053249138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_muhammadabrar9999","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_muhammadabrar9999", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_muhammadabrar9999| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/muhammadabrar9999/QA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_numberthree_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_numberthree_en.md new file mode 100644 index 000000000000..cbf4e691eae9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_numberthree_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_numberthree DistilBertForQuestionAnswering from SharKRippeR +author: John Snow Labs +name: qa_numberthree +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_numberthree` is an English model originally trained by SharKRippeR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_numberthree_en_5.2.0_3.0_1701069600450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_numberthree_en_5.2.0_3.0_1701069600450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_numberthree","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_numberthree", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_numberthree| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/SharKRippeR/QA_NumberThree \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28_en.md new file mode 100644 index 000000000000..c40f9307f147 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28 DistilBertForQuestionAnswering from anuragsingh28 +author: John Snow Labs +name: qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28` is an English model originally trained by anuragsingh28.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28_en_5.2.0_3.0_1701043600319.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28_en_5.2.0_3.0_1701043600319.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_synthetic_data_only_18_aug_distilbert_base_anuragsingh28| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/anuragsingh28/QA_SYNTHETIC_DATA_ONLY_18_AUG_distilbert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_test_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_test_en.md new file mode 100644 index 000000000000..9f3e91d33e15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_test_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_test DistilBertForQuestionAnswering from philipp-zettl +author: John Snow Labs +name: qa_test +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_test` is an English model originally trained by philipp-zettl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_test_en_5.2.0_3.0_1701080698398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_test_en_5.2.0_3.0_1701080698398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_test","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_test", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/philipp-zettl/qa-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_tutorial_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_tutorial_model_en.md new file mode 100644 index 000000000000..8203a65bf4ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_tutorial_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_tutorial_model DistilBertForQuestionAnswering from Jevgenija +author: John Snow Labs +name: qa_tutorial_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_tutorial_model` is an English model originally trained by Jevgenija. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_tutorial_model_en_5.2.0_3.0_1701089362864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_tutorial_model_en_5.2.0_3.0_1701089362864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_tutorial_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_tutorial_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_tutorial_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Jevgenija/qa_tutorial_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-qa_v3_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-27-qa_v3_finetuned_squad_en.md new file mode 100644 index 000000000000..9f3445eb1098 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-qa_v3_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English qa_v3_finetuned_squad DistilBertForQuestionAnswering from jmparejaz +author: John Snow Labs +name: qa_v3_finetuned_squad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_v3_finetuned_squad` is an English model originally trained by jmparejaz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_v3_finetuned_squad_en_5.2.0_3.0_1701043870441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_v3_finetuned_squad_en_5.2.0_3.0_1701043870441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("qa_v3_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("qa_v3_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_v3_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.7 MB| + +## References + +https://huggingface.co/jmparejaz/QA-v3-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-question_answering_nikhilwani_en.md b/docs/_posts/ahmedlone127/2023-11-27-question_answering_nikhilwani_en.md new file mode 100644 index 000000000000..95b3e678c73c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-question_answering_nikhilwani_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English question_answering_nikhilwani DistilBertForQuestionAnswering from nikhilwani +author: John Snow Labs +name: question_answering_nikhilwani +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question_answering_nikhilwani` is an English model originally trained by nikhilwani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/question_answering_nikhilwani_en_5.2.0_3.0_1701061676360.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/question_answering_nikhilwani_en_5.2.0_3.0_1701061676360.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("question_answering_nikhilwani","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("question_answering_nikhilwani", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|question_answering_nikhilwani| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/nikhilwani/question_answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-results_en.md b/docs/_posts/ahmedlone127/2023-11-27-results_en.md new file mode 100644 index 000000000000..362fa7c763bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-results_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English results DistilBertForQuestionAnswering from Souvik123 +author: John Snow Labs +name: results +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results` is an English model originally trained by Souvik123. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_en_5.2.0_3.0_1701049761301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_en_5.2.0_3.0_1701049761301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("results","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("results", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.7 MB| + +## References + +https://huggingface.co/Souvik123/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-roberta_question_answering_en.md b/docs/_posts/ahmedlone127/2023-11-27-roberta_question_answering_en.md new file mode 100644 index 000000000000..c210dce72e9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-roberta_question_answering_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English roberta_question_answering DistilBertForQuestionAnswering from safinpeal +author: John Snow Labs +name: roberta_question_answering +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_question_answering` is an English model originally trained by safinpeal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_question_answering_en_5.2.0_3.0_1701080042594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_question_answering_en_5.2.0_3.0_1701080042594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("roberta_question_answering","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("roberta_question_answering", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_question_answering| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/safinpeal/roberta-question-answering \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-rudistilbert_base_sberquad_en.md b/docs/_posts/ahmedlone127/2023-11-27-rudistilbert_base_sberquad_en.md new file mode 100644 index 000000000000..b64f35112542 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-rudistilbert_base_sberquad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English rudistilbert_base_sberquad DistilBertForQuestionAnswering from Mathnub +author: John Snow Labs +name: rudistilbert_base_sberquad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rudistilbert_base_sberquad` is an English model originally trained by Mathnub. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rudistilbert_base_sberquad_en_5.2.0_3.0_1701068237695.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rudistilbert_base_sberquad_en_5.2.0_3.0_1701068237695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("rudistilbert_base_sberquad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("rudistilbert_base_sberquad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rudistilbert_base_sberquad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|201.1 MB| + +## References + +https://huggingface.co/Mathnub/ruDistilBert-base-sberquad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-sample_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-27-sample_finetuned_en.md new file mode 100644 index 000000000000..c52aa1ce6039 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-sample_finetuned_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English sample_finetuned DistilBertForQuestionAnswering from Logeswaransr +author: John Snow Labs +name: sample_finetuned +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sample_finetuned` is an English model originally trained by Logeswaransr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sample_finetuned_en_5.2.0_3.0_1701064986588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sample_finetuned_en_5.2.0_3.0_1701064986588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("sample_finetuned","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("sample_finetuned", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sample_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/Logeswaransr/sample_finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_en.md b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_en.md new file mode 100644 index 000000000000..908536c3c87a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English saved_distilbert_squad DistilBertForQuestionAnswering from umarzein +author: John Snow Labs +name: saved_distilbert_squad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `saved_distilbert_squad` is an English model originally trained by umarzein. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_en_5.2.0_3.0_1701044799872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_en_5.2.0_3.0_1701044799872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("saved_distilbert_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("saved_distilbert_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|saved_distilbert_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umarzein/saved-distilbert-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_en.md b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_en.md new file mode 100644 index 000000000000..539a0d4d86f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English saved_distilbert_squad_k5fold_3epoch_newer DistilBertForQuestionAnswering from umarzein +author: John Snow Labs +name: saved_distilbert_squad_k5fold_3epoch_newer +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `saved_distilbert_squad_k5fold_3epoch_newer` is an English model originally trained by umarzein. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_k5fold_3epoch_newer_en_5.2.0_3.0_1701046283222.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_k5fold_3epoch_newer_en_5.2.0_3.0_1701046283222.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("saved_distilbert_squad_k5fold_3epoch_newer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("saved_distilbert_squad_k5fold_3epoch_newer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|saved_distilbert_squad_k5fold_3epoch_newer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umarzein/saved-distilbert-squad-k5fold-3epoch-newer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_full_finetune_en.md b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_full_finetune_en.md new file mode 100644 index 000000000000..6b7139cd8ff7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_k5fold_3epoch_newer_full_finetune_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English saved_distilbert_squad_k5fold_3epoch_newer_full_finetune DistilBertForQuestionAnswering from umarzein +author: John Snow Labs +name: saved_distilbert_squad_k5fold_3epoch_newer_full_finetune +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `saved_distilbert_squad_k5fold_3epoch_newer_full_finetune` is an English model originally trained by umarzein. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_k5fold_3epoch_newer_full_finetune_en_5.2.0_3.0_1701063872182.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_k5fold_3epoch_newer_full_finetune_en_5.2.0_3.0_1701063872182.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("saved_distilbert_squad_k5fold_3epoch_newer_full_finetune","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("saved_distilbert_squad_k5fold_3epoch_newer_full_finetune", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|saved_distilbert_squad_k5fold_3epoch_newer_full_finetune| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umarzein/saved-distilbert-squad-k5fold-3epoch-newer-full-finetune \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_nepal_bhasa_en.md b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_nepal_bhasa_en.md new file mode 100644 index 000000000000..436465db18c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_nepal_bhasa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English saved_distilbert_squad_nepal_bhasa DistilBertForQuestionAnswering from umarzein +author: John Snow Labs +name: saved_distilbert_squad_nepal_bhasa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `saved_distilbert_squad_nepal_bhasa` is an English model originally trained by umarzein. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_nepal_bhasa_en_5.2.0_3.0_1701054213920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_nepal_bhasa_en_5.2.0_3.0_1701054213920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("saved_distilbert_squad_nepal_bhasa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("saved_distilbert_squad_nepal_bhasa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|saved_distilbert_squad_nepal_bhasa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umarzein/saved-distilbert-squad-new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_newer_en.md b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_newer_en.md new file mode 100644 index 000000000000..6c07057b2b23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-saved_distilbert_squad_newer_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English saved_distilbert_squad_newer DistilBertForQuestionAnswering from umarzein +author: John Snow Labs +name: saved_distilbert_squad_newer +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `saved_distilbert_squad_newer` is an English model originally trained by umarzein. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_newer_en_5.2.0_3.0_1701062001611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/saved_distilbert_squad_newer_en_5.2.0_3.0_1701062001611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("saved_distilbert_squad_newer","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("saved_distilbert_squad_newer", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|saved_distilbert_squad_newer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/umarzein/saved-distilbert-squad-newer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-spec_seq_lab_indonesian_en.md b/docs/_posts/ahmedlone127/2023-11-27-spec_seq_lab_indonesian_en.md new file mode 100644 index 000000000000..936d0a10a95a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-spec_seq_lab_indonesian_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English spec_seq_lab_indonesian DistilBertForQuestionAnswering from mathildeparlo +author: John Snow Labs +name: spec_seq_lab_indonesian +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `spec_seq_lab_indonesian` is an English model originally trained by mathildeparlo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/spec_seq_lab_indonesian_en_5.2.0_3.0_1701074776201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/spec_seq_lab_indonesian_en_5.2.0_3.0_1701074776201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("spec_seq_lab_indonesian","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("spec_seq_lab_indonesian", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|spec_seq_lab_indonesian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/mathildeparlo/spec_seq_lab_indonesian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-squad_q_a_en.md b/docs/_posts/ahmedlone127/2023-11-27-squad_q_a_en.md new file mode 100644 index 000000000000..6ed5640f8399 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-squad_q_a_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English squad_q_a DistilBertForQuestionAnswering from Stratcher +author: John Snow Labs +name: squad_q_a +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_q_a` is an English model originally trained by Stratcher. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_q_a_en_5.2.0_3.0_1701070122761.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_q_a_en_5.2.0_3.0_1701070122761.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("squad_q_a","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("squad_q_a", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_q_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/Stratcher/squad_q_a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-squad_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-squad_qa_model_en.md new file mode 100644 index 000000000000..a378fe43668c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-squad_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English squad_qa_model DistilBertForQuestionAnswering from abhiShek1061 +author: John Snow Labs +name: squad_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_qa_model` is an English model originally trained by abhiShek1061. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_qa_model_en_5.2.0_3.0_1701085422107.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_qa_model_en_5.2.0_3.0_1701085422107.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("squad_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("squad_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/abhiShek1061/squad_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-squad_v2_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-27-squad_v2_finetuned_squad_en.md new file mode 100644 index 000000000000..e33c20fdb444 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-squad_v2_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English squad_v2_finetuned_squad DistilBertForQuestionAnswering from jonathanagustin +author: John Snow Labs +name: squad_v2_finetuned_squad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `squad_v2_finetuned_squad` is an English model originally trained by jonathanagustin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/squad_v2_finetuned_squad_en_5.2.0_3.0_1701072057906.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/squad_v2_finetuned_squad_en_5.2.0_3.0_1701072057906.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("squad_v2_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("squad_v2_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|squad_v2_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/jonathanagustin/squad_v2-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-starter_qa_model_en.md b/docs/_posts/ahmedlone127/2023-11-27-starter_qa_model_en.md new file mode 100644 index 000000000000..17db4bcbc753 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-starter_qa_model_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English starter_qa_model DistilBertForQuestionAnswering from aronhawkins +author: John Snow Labs +name: starter_qa_model +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `starter_qa_model` is an English model originally trained by aronhawkins. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/starter_qa_model_en_5.2.0_3.0_1701067798833.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/starter_qa_model_en_5.2.0_3.0_1701067798833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("starter_qa_model","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("starter_qa_model", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|starter_qa_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/aronhawkins/starter_qa_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-test_headline_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-test_headline_qa_en.md new file mode 100644 index 000000000000..b0e8fe94879d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-test_headline_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_headline_qa DistilBertForQuestionAnswering from chriskim2273 +author: John Snow Labs +name: test_headline_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_headline_qa` is an English model originally trained by chriskim2273. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_headline_qa_en_5.2.0_3.0_1701044346627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_headline_qa_en_5.2.0_3.0_1701044346627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("test_headline_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("test_headline_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_headline_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/chriskim2273/test_headline_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-test_qa_en.md b/docs/_posts/ahmedlone127/2023-11-27-test_qa_en.md new file mode 100644 index 000000000000..ea3fa08a8935 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-test_qa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_qa DistilBertForQuestionAnswering from hoang14 +author: John Snow Labs +name: test_qa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_qa` is an English model originally trained by hoang14. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_qa_en_5.2.0_3.0_1701047222930.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_qa_en_5.2.0_3.0_1701047222930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("test_qa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("test_qa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_qa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/hoang14/test_qa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-test_qabert_small_en.md b/docs/_posts/ahmedlone127/2023-11-27-test_qabert_small_en.md new file mode 100644 index 000000000000..cbb0d183a370 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-test_qabert_small_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_qabert_small DistilBertForQuestionAnswering from kkkzzzkkk +author: John Snow Labs +name: test_qabert_small +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_qabert_small` is an English model originally trained by kkkzzzkkk. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_qabert_small_en_5.2.0_3.0_1701083168342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_qabert_small_en_5.2.0_3.0_1701083168342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("test_qabert_small","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("test_qabert_small", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_qabert_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/kkkzzzkkk/test_QABERT-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-test_squad_trained_finetuned_squad_en.md b/docs/_posts/ahmedlone127/2023-11-27-test_squad_trained_finetuned_squad_en.md new file mode 100644 index 000000000000..e8a5a43e4273 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-test_squad_trained_finetuned_squad_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English test_squad_trained_finetuned_squad DistilBertForQuestionAnswering from EricPeter +author: John Snow Labs +name: test_squad_trained_finetuned_squad +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_squad_trained_finetuned_squad` is an English model originally trained by EricPeter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_squad_trained_finetuned_squad_en_5.2.0_3.0_1701076849459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_squad_trained_finetuned_squad_en_5.2.0_3.0_1701076849459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("test_squad_trained_finetuned_squad","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("test_squad_trained_finetuned_squad", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_squad_trained_finetuned_squad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/EricPeter/test-squad-trained-finetuned-squad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-training_df_fullctxt_filtered_0_15_bertqa_en.md b/docs/_posts/ahmedlone127/2023-11-27-training_df_fullctxt_filtered_0_15_bertqa_en.md new file mode 100644 index 000000000000..4e8e46fc2344 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-training_df_fullctxt_filtered_0_15_bertqa_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English training_df_fullctxt_filtered_0_15_bertqa DistilBertForQuestionAnswering from LeWince +author: John Snow Labs +name: training_df_fullctxt_filtered_0_15_bertqa +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `training_df_fullctxt_filtered_0_15_bertqa` is an English model originally trained by LeWince. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/training_df_fullctxt_filtered_0_15_bertqa_en_5.2.0_3.0_1701082067293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/training_df_fullctxt_filtered_0_15_bertqa_en_5.2.0_3.0_1701082067293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("training_df_fullctxt_filtered_0_15_bertqa","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("training_df_fullctxt_filtered_0_15_bertqa", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|training_df_fullctxt_filtered_0_15_bertqa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|243.8 MB| + +## References + +https://huggingface.co/LeWince/training_df_fullctxt_filtered_0_15_BertQA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-27-tu_nlpweb_w22_g18_e6_en.md b/docs/_posts/ahmedlone127/2023-11-27-tu_nlpweb_w22_g18_e6_en.md new file mode 100644 index 000000000000..03b949ef1ffb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-27-tu_nlpweb_w22_g18_e6_en.md @@ -0,0 +1,93 @@ +--- +layout: model +title: English tu_nlpweb_w22_g18_e6 DistilBertForQuestionAnswering from adiharush +author: John Snow Labs +name: tu_nlpweb_w22_g18_e6 +date: 2023-11-27 +tags: [distilbert, en, open_source, question_answering, onnx] +task: Question Answering +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertForQuestionAnswering +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tu_nlpweb_w22_g18_e6` is an English model originally trained by adiharush. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tu_nlpweb_w22_g18_e6_en_5.2.0_3.0_1701043204512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tu_nlpweb_w22_g18_e6_en_5.2.0_3.0_1701043204512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + + +document_assembler = MultiDocumentAssembler() \ + .setInputCols(["question", "context"]) \ + .setOutputCols(["document_question", "document_context"]) + + +spanClassifier = DistilBertForQuestionAnswering.pretrained("tu_nlpweb_w22_g18_e6","en") \ + .setInputCols(["document_question","document_context"]) \ + .setOutputCol("answer") + +pipeline = Pipeline().setStages([document_assembler, spanClassifier]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) + +``` +```scala + + +val document_assembler = new MultiDocumentAssembler() + .setInputCols(Array("question", "context")) + .setOutputCols(Array("document_question", "document_context")) + +val spanClassifier = DistilBertForQuestionAnswering + .pretrained("tu_nlpweb_w22_g18_e6", "en") + .setInputCols(Array("document_question","document_context")) + .setOutputCol("answer") + +val pipeline = new Pipeline().setStages(Array(document_assembler, spanClassifier)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tu_nlpweb_w22_g18_e6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document_question, document_context]| +|Output Labels:|[answer]| +|Language:|en| +|Size:|247.2 MB| + +## References + +https://huggingface.co/adiharush/tu-nlpweb-w22-g18-e6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-adding_on_en.md b/docs/_posts/ahmedlone127/2023-11-29-adding_on_en.md new file mode 100644 index 000000000000..7ffeded04ab9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-adding_on_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English adding_on RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: adding_on +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adding_on` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adding_on_en_5.2.0_3.0_1701237798155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adding_on_en_5.2.0_3.0_1701237798155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("adding_on","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("adding_on","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adding_on| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.4 MB| + +## References + +https://huggingface.co/aekupor/adding_on \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-ag_nli_dets_sentence_similarity_v1_xx.md b/docs/_posts/ahmedlone127/2023-11-29-ag_nli_dets_sentence_similarity_v1_xx.md new file mode 100644 index 000000000000..9f7dc66ddd90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-ag_nli_dets_sentence_similarity_v1_xx.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Multilingual ag_nli_dets_sentence_similarity_v1 RoBertaForSequenceClassification from abbasgolestani +author: John Snow Labs +name: ag_nli_dets_sentence_similarity_v1 +date: 2023-11-29 +tags: [roberta, xx, open_source, sequence_classification, onnx] +task: Text Classification +language: xx +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ag_nli_dets_sentence_similarity_v1` is a multilingual model originally trained by abbasgolestani. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ag_nli_dets_sentence_similarity_v1_xx_5.2.0_3.0_1701257516688.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ag_nli_dets_sentence_similarity_v1_xx_5.2.0_3.0_1701257516688.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ag_nli_dets_sentence_similarity_v1","xx")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ag_nli_dets_sentence_similarity_v1","xx") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ag_nli_dets_sentence_similarity_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|xx| +|Size:|1.3 GB| + +## References + +https://huggingface.co/abbasgolestani/ag-nli-DeTS-sentence-similarity-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-ai_human_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-ai_human_detector_en.md new file mode 100644 index 000000000000..5a9b60d3f423 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-ai_human_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ai_human_detector RoBertaForSequenceClassification from idajikuu +author: John Snow Labs +name: ai_human_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ai_human_detector` is an English model originally trained by idajikuu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_human_detector_en_5.2.0_3.0_1701282196935.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_human_detector_en_5.2.0_3.0_1701282196935.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ai_human_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ai_human_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_human_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/idajikuu/AI-HUMAN-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-argumentmining_en_ari_aif_roberta_l_en.md b/docs/_posts/ahmedlone127/2023-11-29-argumentmining_en_ari_aif_roberta_l_en.md new file mode 100644 index 000000000000..22d1ad9e7405 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-argumentmining_en_ari_aif_roberta_l_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English argumentmining_en_ari_aif_roberta_l RoBertaForSequenceClassification from raruidol +author: John Snow Labs +name: argumentmining_en_ari_aif_roberta_l +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `argumentmining_en_ari_aif_roberta_l` is an English model originally trained by raruidol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/argumentmining_en_ari_aif_roberta_l_en_5.2.0_3.0_1701273839513.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/argumentmining_en_ari_aif_roberta_l_en_5.2.0_3.0_1701273839513.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("argumentmining_en_ari_aif_roberta_l","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("argumentmining_en_ari_aif_roberta_l","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|argumentmining_en_ari_aif_roberta_l| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/raruidol/ArgumentMining-EN-ARI-AIF-RoBERTa_L \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-babe_v3_roberta_fully_trained_en.md b/docs/_posts/ahmedlone127/2023-11-29-babe_v3_roberta_fully_trained_en.md new file mode 100644 index 000000000000..1e4e46430336 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-babe_v3_roberta_fully_trained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English babe_v3_roberta_fully_trained RoBertaForSequenceClassification from mediabiasgroup +author: John Snow Labs +name: babe_v3_roberta_fully_trained +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `babe_v3_roberta_fully_trained` is an English model originally trained by mediabiasgroup. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/babe_v3_roberta_fully_trained_en_5.2.0_3.0_1701264726344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/babe_v3_roberta_fully_trained_en_5.2.0_3.0_1701264726344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("babe_v3_roberta_fully_trained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("babe_v3_roberta_fully_trained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|babe_v3_roberta_fully_trained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.0 MB| + +## References + +https://huggingface.co/mediabiasgroup/babe-v3-roberta-fully-trained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-centralbankroberta_sentiment_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-centralbankroberta_sentiment_classifier_en.md new file mode 100644 index 000000000000..311c3d1c73ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-centralbankroberta_sentiment_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English centralbankroberta_sentiment_classifier RoBertaForSequenceClassification from Moritz-Pfeifer +author: John Snow Labs +name: centralbankroberta_sentiment_classifier +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `centralbankroberta_sentiment_classifier` is an English model originally trained by Moritz-Pfeifer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/centralbankroberta_sentiment_classifier_en_5.2.0_3.0_1701293997244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/centralbankroberta_sentiment_classifier_en_5.2.0_3.0_1701293997244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("centralbankroberta_sentiment_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("centralbankroberta_sentiment_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|centralbankroberta_sentiment_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.1 MB| + +## References + +https://huggingface.co/Moritz-Pfeifer/CentralBankRoBERTa-sentiment-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-chatgpt_detector_roberta_hello_simpleai_en.md b/docs/_posts/ahmedlone127/2023-11-29-chatgpt_detector_roberta_hello_simpleai_en.md new file mode 100644 index 000000000000..33532c425801 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-chatgpt_detector_roberta_hello_simpleai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chatgpt_detector_roberta_hello_simpleai RoBertaForSequenceClassification from Hello-SimpleAI +author: John Snow Labs +name: chatgpt_detector_roberta_hello_simpleai +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_detector_roberta_hello_simpleai` is an English model originally trained by Hello-SimpleAI.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_detector_roberta_hello_simpleai_en_5.2.0_3.0_1701237137172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_detector_roberta_hello_simpleai_en_5.2.0_3.0_1701237137172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_detector_roberta_hello_simpleai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_detector_roberta_hello_simpleai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt_detector_roberta_hello_simpleai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.4 MB| + +## References + +https://huggingface.co/Hello-SimpleAI/chatgpt-detector-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-chatgpt_qa_detector_roberta_hello_simpleai_en.md b/docs/_posts/ahmedlone127/2023-11-29-chatgpt_qa_detector_roberta_hello_simpleai_en.md new file mode 100644 index 000000000000..5ed35b52e003 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-chatgpt_qa_detector_roberta_hello_simpleai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chatgpt_qa_detector_roberta_hello_simpleai RoBertaForSequenceClassification from Hello-SimpleAI +author: John Snow Labs +name: chatgpt_qa_detector_roberta_hello_simpleai +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_qa_detector_roberta_hello_simpleai` is an English model originally trained by Hello-SimpleAI.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_qa_detector_roberta_hello_simpleai_en_5.2.0_3.0_1701259188594.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_qa_detector_roberta_hello_simpleai_en_5.2.0_3.0_1701259188594.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_qa_detector_roberta_hello_simpleai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_qa_detector_roberta_hello_simpleai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt_qa_detector_roberta_hello_simpleai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.8 MB| + +## References + +https://huggingface.co/Hello-SimpleAI/chatgpt-qa-detector-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-clickbait_binary_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-clickbait_binary_detection_en.md new file mode 100644 index 000000000000..ac3a132c294c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-clickbait_binary_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English clickbait_binary_detection RoBertaForSequenceClassification from christinacdl +author: John Snow Labs +name: clickbait_binary_detection +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `clickbait_binary_detection` is an English model originally trained by christinacdl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/clickbait_binary_detection_en_5.2.0_3.0_1701291757213.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/clickbait_binary_detection_en_5.2.0_3.0_1701291757213.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("clickbait_binary_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("clickbait_binary_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|clickbait_binary_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/christinacdl/clickbait_binary_detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-cn_roberta_dig_kghate_en.md b/docs/_posts/ahmedlone127/2023-11-29-cn_roberta_dig_kghate_en.md new file mode 100644 index 000000000000..233412aac2ad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-cn_roberta_dig_kghate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cn_roberta_dig_kghate RoBertaForSequenceClassification from Kghate +author: John Snow Labs +name: cn_roberta_dig_kghate +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cn_roberta_dig_kghate` is an English model originally trained by Kghate. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cn_roberta_dig_kghate_en_5.2.0_3.0_1701275532048.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cn_roberta_dig_kghate_en_5.2.0_3.0_1701275532048.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cn_roberta_dig_kghate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cn_roberta_dig_kghate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cn_roberta_dig_kghate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|415.4 MB| + +## References + +https://huggingface.co/Kghate/CN_RoBERTa_Dig \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-codebert_base_malicious_urls_en.md b/docs/_posts/ahmedlone127/2023-11-29-codebert_base_malicious_urls_en.md new file mode 100644 index 000000000000..3a2a54558705 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-codebert_base_malicious_urls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codebert_base_malicious_urls RoBertaForSequenceClassification from DunnBC22 +author: John Snow Labs +name: codebert_base_malicious_urls +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert_base_malicious_urls` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codebert_base_malicious_urls_en_5.2.0_3.0_1701278893768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codebert_base_malicious_urls_en_5.2.0_3.0_1701278893768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_malicious_urls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_malicious_urls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codebert_base_malicious_urls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/DunnBC22/codebert-base-Malicious_URLs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-codebert_base_password_strength_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-codebert_base_password_strength_classifier_en.md new file mode 100644 index 000000000000..47f067ede124 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-codebert_base_password_strength_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codebert_base_password_strength_classifier RoBertaForSequenceClassification from DunnBC22 +author: John Snow Labs +name: codebert_base_password_strength_classifier +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert_base_password_strength_classifier` is an English model originally trained by DunnBC22.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codebert_base_password_strength_classifier_en_5.2.0_3.0_1701287859165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codebert_base_password_strength_classifier_en_5.2.0_3.0_1701287859165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_password_strength_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_password_strength_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codebert_base_password_strength_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/DunnBC22/codebert-base-Password_Strength_Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-connecting_en.md b/docs/_posts/ahmedlone127/2023-11-29-connecting_en.md new file mode 100644 index 000000000000..4ccd9f8899c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-connecting_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English connecting RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: connecting +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `connecting` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/connecting_en_5.2.0_3.0_1701267511201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/connecting_en_5.2.0_3.0_1701267511201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("connecting","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline +import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("connecting","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|connecting| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.4 MB| + +## References + +https://huggingface.co/aekupor/connecting \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-covid_19_sentiment_analysis_roberta_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-covid_19_sentiment_analysis_roberta_model_en.md new file mode 100644 index 000000000000..7f5103868f93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-covid_19_sentiment_analysis_roberta_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_19_sentiment_analysis_roberta_model RoBertaForSequenceClassification from rasmodev +author: John Snow Labs +name: covid_19_sentiment_analysis_roberta_model +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_19_sentiment_analysis_roberta_model` is an English model originally trained by rasmodev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_19_sentiment_analysis_roberta_model_en_5.2.0_3.0_1701282224116.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_19_sentiment_analysis_roberta_model_en_5.2.0_3.0_1701282224116.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_19_sentiment_analysis_roberta_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_19_sentiment_analysis_roberta_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_19_sentiment_analysis_roberta_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/rasmodev/Covid-19_Sentiment_Analysis_RoBERTa_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-covid_vaccine_tweet_sentiment_analysis_roberta_bambadij_en.md b/docs/_posts/ahmedlone127/2023-11-29-covid_vaccine_tweet_sentiment_analysis_roberta_bambadij_en.md new file mode 100644 index 000000000000..ae2da1b08ac8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-covid_vaccine_tweet_sentiment_analysis_roberta_bambadij_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_vaccine_tweet_sentiment_analysis_roberta_bambadij RoBertaForSequenceClassification from bambadij +author: John Snow Labs +name: covid_vaccine_tweet_sentiment_analysis_roberta_bambadij +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_vaccine_tweet_sentiment_analysis_roberta_bambadij` is an English model originally trained by bambadij.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_vaccine_tweet_sentiment_analysis_roberta_bambadij_en_5.2.0_3.0_1701286504427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_vaccine_tweet_sentiment_analysis_roberta_bambadij_en_5.2.0_3.0_1701286504427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_vaccine_tweet_sentiment_analysis_roberta_bambadij","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_vaccine_tweet_sentiment_analysis_roberta_bambadij","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_vaccine_tweet_sentiment_analysis_roberta_bambadij| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/bambadij/COVID_Vaccine_Tweet_sentiment_analysis_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-cybert_en.md b/docs/_posts/ahmedlone127/2023-11-29-cybert_en.md new file mode 100644 index 000000000000..d0093c4e7a37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-cybert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cybert RoBertaForSequenceClassification from SynamicTechnologies +author: John Snow Labs +name: cybert +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cybert` is an English model originally trained by SynamicTechnologies. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cybert_en_5.2.0_3.0_1701290721372.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cybert_en_5.2.0_3.0_1701290721372.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cybert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cybert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cybert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|313.4 MB| + +## References + +https://huggingface.co/SynamicTechnologies/CYBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-detexd_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-detexd_roberta_base_en.md new file mode 100644 index 000000000000..67a4b4c7afd7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-detexd_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English detexd_roberta_base RoBertaForSequenceClassification from grammarly +author: John Snow Labs +name: detexd_roberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `detexd_roberta_base` is an English model originally trained by grammarly. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detexd_roberta_base_en_5.2.0_3.0_1701244212587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/detexd_roberta_base_en_5.2.0_3.0_1701244212587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("detexd_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("detexd_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|detexd_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.2 MB| + +## References + +https://huggingface.co/grammarly/detexd-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_commitment_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_commitment_en.md new file mode 100644 index 000000000000..5569ceba8360 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_commitment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_climate_commitment RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: distilroberta_base_climate_commitment +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_climate_commitment` is an English model originally trained by climatebert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_commitment_en_5.2.0_3.0_1701248872140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_commitment_en_5.2.0_3.0_1701248872140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_commitment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_commitment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_climate_commitment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/distilroberta-base-climate-commitment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_detector_en.md new file mode 100644 index 000000000000..c7d6eaf3cd05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_climate_detector RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: distilroberta_base_climate_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_climate_detector` is an English model originally trained by climatebert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_detector_en_5.2.0_3.0_1701259548838.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_detector_en_5.2.0_3.0_1701259548838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_climate_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/distilroberta-base-climate-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_sentiment_en.md new file mode 100644 index 000000000000..437ab853eec5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_climate_sentiment RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: distilroberta_base_climate_sentiment +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_climate_sentiment` is an English model originally trained by climatebert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_sentiment_en_5.2.0_3.0_1701261669458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_sentiment_en_5.2.0_3.0_1701261669458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_climate_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/distilroberta-base-climate-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_specificity_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_specificity_en.md new file mode 100644 index 000000000000..a5442cbb61a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_specificity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_climate_specificity RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: distilroberta_base_climate_specificity +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_climate_specificity` is an English model originally trained by climatebert.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_specificity_en_5.2.0_3.0_1701247818614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_specificity_en_5.2.0_3.0_1701247818614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_specificity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_specificity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_climate_specificity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/distilroberta-base-climate-specificity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_tcfd_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_tcfd_en.md new file mode 100644 index 000000000000..76195a3a7069 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_climate_tcfd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_climate_tcfd RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: distilroberta_base_climate_tcfd +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_climate_tcfd` is an English model originally trained by climatebert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_tcfd_en_5.2.0_3.0_1701269916243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_climate_tcfd_en_5.2.0_3.0_1701269916243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_tcfd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_climate_tcfd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_climate_tcfd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/distilroberta-base-climate-tcfd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_emolit_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_emolit_en.md new file mode 100644 index 000000000000..e78db03f33db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_emolit_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_emolit RoBertaForSequenceClassification from lrei +author: John Snow Labs +name: distilroberta_base_emolit +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_emolit` is an English model originally trained by lrei. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_emolit_en_5.2.0_3.0_1701297775726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_emolit_en_5.2.0_3.0_1701297775726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_emolit","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_emolit","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_emolit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.3 MB| + +## References + +https://huggingface.co/lrei/distilroberta-base-emolit \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_etc_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_etc_en.md new file mode 100644 index 000000000000..d4697e099747 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_etc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_etc RoBertaForSequenceClassification from agi-css +author: John Snow Labs +name: distilroberta_base_etc +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_etc` is an English model originally trained by agi-css. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_etc_en_5.2.0_3.0_1701287343535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_etc_en_5.2.0_3.0_1701287343535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_etc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_etc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_etc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/agi-css/distilroberta-base-etc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_finetuned_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_finetuned_fake_news_detection_en.md new file mode 100644 index 000000000000..a07561bcaa89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_finetuned_fake_news_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_finetuned_fake_news_detection RoBertaForSequenceClassification from vikram71198 +author: John Snow Labs +name: distilroberta_base_finetuned_fake_news_detection +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_finetuned_fake_news_detection` is an English model originally trained by vikram71198.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_fake_news_detection_en_5.2.0_3.0_1701285901356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_fake_news_detection_en_5.2.0_3.0_1701285901356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_finetuned_fake_news_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_finetuned_fake_news_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_finetuned_fake_news_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/vikram71198/distilroberta-base-finetuned-fake-news-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_movie_genre_prediction_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_movie_genre_prediction_en.md new file mode 100644 index 000000000000..9cfe25ef382c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_movie_genre_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_movie_genre_prediction RoBertaForSequenceClassification from nickmuchi +author: John Snow Labs +name: distilroberta_base_movie_genre_prediction +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_movie_genre_prediction` is an English model originally trained by nickmuchi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_movie_genre_prediction_en_5.2.0_3.0_1701290720458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_movie_genre_prediction_en_5.2.0_3.0_1701290720458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_movie_genre_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_movie_genre_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_movie_genre_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/nickmuchi/distilroberta-base-movie-genre-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_offensive_hateful_speech_text_multiclassification_en.md b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_offensive_hateful_speech_text_multiclassification_en.md new file mode 100644 index 000000000000..97cf6cd8531c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-distilroberta_base_offensive_hateful_speech_text_multiclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_offensive_hateful_speech_text_multiclassification RoBertaForSequenceClassification from badmatr11x +author: John Snow Labs +name: distilroberta_base_offensive_hateful_speech_text_multiclassification +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_offensive_hateful_speech_text_multiclassification` is an English model originally trained by badmatr11x.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_offensive_hateful_speech_text_multiclassification_en_5.2.0_3.0_1701245428388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_offensive_hateful_speech_text_multiclassification_en_5.2.0_3.0_1701245428388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_offensive_hateful_speech_text_multiclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_offensive_hateful_speech_text_multiclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_offensive_hateful_speech_text_multiclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/badmatr11x/distilroberta-base-offensive-hateful-speech-text-multiclassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-eliciting_en.md b/docs/_posts/ahmedlone127/2023-11-29-eliciting_en.md new file mode 100644 index 000000000000..7472d8f2cb56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-eliciting_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English eliciting RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: eliciting +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eliciting` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eliciting_en_5.2.0_3.0_1701244381585.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eliciting_en_5.2.0_3.0_1701244381585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("eliciting","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("eliciting","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eliciting| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.4 MB| + +## References + +https://huggingface.co/aekupor/eliciting \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-emoberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-emoberta_base_en.md new file mode 100644 index 000000000000..ab7ac88e2bb8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-emoberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emoberta_base RoBertaForSequenceClassification from tae898 +author: John Snow Labs +name: emoberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emoberta_base` is an English model originally trained by tae898. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emoberta_base_en_5.2.0_3.0_1701293375609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emoberta_base_en_5.2.0_3.0_1701293375609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emoberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emoberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emoberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.1 MB| + +## References + +https://huggingface.co/tae898/emoberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-emoberta_large_en.md b/docs/_posts/ahmedlone127/2023-11-29-emoberta_large_en.md new file mode 100644 index 000000000000..8af5cb8b1ebc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-emoberta_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emoberta_large RoBertaForSequenceClassification from tae898 +author: John Snow Labs +name: emoberta_large +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emoberta_large` is an English model originally trained by tae898. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emoberta_large_en_5.2.0_3.0_1701248332117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emoberta_large_en_5.2.0_3.0_1701248332117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emoberta_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emoberta_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emoberta_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tae898/emoberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-emotion_text_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-emotion_text_classifier_en.md new file mode 100644 index 000000000000..fdd137035b34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-emotion_text_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_text_classifier RoBertaForSequenceClassification from michellejieli +author: John Snow Labs +name: emotion_text_classifier +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_text_classifier` is an English model originally trained by michellejieli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_text_classifier_en_5.2.0_3.0_1701236883042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_text_classifier_en_5.2.0_3.0_1701236883042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_text_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_text_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_text_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/michellejieli/emotion_text_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-environmentalbert_environmental_en.md b/docs/_posts/ahmedlone127/2023-11-29-environmentalbert_environmental_en.md new file mode 100644 index 000000000000..5164204e5e01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-environmentalbert_environmental_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English environmentalbert_environmental RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: environmentalbert_environmental +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `environmentalbert_environmental` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/environmentalbert_environmental_en_5.2.0_3.0_1701249329598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/environmentalbert_environmental_en_5.2.0_3.0_1701249329598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("environmentalbert_environmental","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("environmentalbert_environmental","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|environmentalbert_environmental| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-envroberta_environmental_en.md b/docs/_posts/ahmedlone127/2023-11-29-envroberta_environmental_en.md new file mode 100644 index 000000000000..f69aa6fef922 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-envroberta_environmental_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English envroberta_environmental RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: envroberta_environmental +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `envroberta_environmental` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/envroberta_environmental_en_5.2.0_3.0_1701272166717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/envroberta_environmental_en_5.2.0_3.0_1701272166717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("envroberta_environmental","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("envroberta_environmental","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|envroberta_environmental| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/ESGBERT/EnvRoBERTa-environmental \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-eventclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-eventclassifier_en.md new file mode 100644 index 000000000000..1e955ca6cecf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-eventclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English eventclassifier RoBertaForSequenceClassification from hadifar +author: John Snow Labs +name: eventclassifier +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `eventclassifier` is an English model originally trained by hadifar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/eventclassifier_en_5.2.0_3.0_1701258135126.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/eventclassifier_en_5.2.0_3.0_1701258135126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("eventclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("eventclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|eventclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.4 MB| + +## References + +https://huggingface.co/hadifar/eventclassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-fake_news_bert_detect_en.md b/docs/_posts/ahmedlone127/2023-11-29-fake_news_bert_detect_en.md new file mode 100644 index 000000000000..7d643d4a78b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-fake_news_bert_detect_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fake_news_bert_detect RoBertaForSequenceClassification from jy46604790 +author: John Snow Labs +name: fake_news_bert_detect +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake_news_bert_detect` is an English model originally trained by jy46604790. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fake_news_bert_detect_en_5.2.0_3.0_1701264506858.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fake_news_bert_detect_en_5.2.0_3.0_1701264506858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fake_news_bert_detect","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fake_news_bert_detect","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fake_news_bert_detect| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.7 MB| + +## References + +https://huggingface.co/jy46604790/Fake-News-Bert-Detect \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-finetuned_fake_news_roberta_ikoghoemmanuell_en.md b/docs/_posts/ahmedlone127/2023-11-29-finetuned_fake_news_roberta_ikoghoemmanuell_en.md new file mode 100644 index 000000000000..cc2481d7f5d2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-finetuned_fake_news_roberta_ikoghoemmanuell_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_fake_news_roberta_ikoghoemmanuell RoBertaForSequenceClassification from ikoghoemmanuell +author: John Snow Labs +name: finetuned_fake_news_roberta_ikoghoemmanuell +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_fake_news_roberta_ikoghoemmanuell` is an English model originally trained by ikoghoemmanuell.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_fake_news_roberta_ikoghoemmanuell_en_5.2.0_3.0_1701272049376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_fake_news_roberta_ikoghoemmanuell_en_5.2.0_3.0_1701272049376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_fake_news_roberta_ikoghoemmanuell","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_fake_news_roberta_ikoghoemmanuell","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_fake_news_roberta_ikoghoemmanuell| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.0 MB| + +## References + +https://huggingface.co/ikoghoemmanuell/finetuned_fake_news_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-finetuned_roberta_text_emotion_recognition_en.md b/docs/_posts/ahmedlone127/2023-11-29-finetuned_roberta_text_emotion_recognition_en.md new file mode 100644 index 000000000000..cfcc11abd4c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-finetuned_roberta_text_emotion_recognition_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_roberta_text_emotion_recognition RoBertaForSequenceClassification from eric0708 +author: John Snow Labs +name: finetuned_roberta_text_emotion_recognition +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_text_emotion_recognition` is an English model originally trained by eric0708.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_text_emotion_recognition_en_5.2.0_3.0_1701287184893.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_text_emotion_recognition_en_5.2.0_3.0_1701287184893.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_text_emotion_recognition","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_text_emotion_recognition","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_roberta_text_emotion_recognition| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|421.5 MB| + +## References + +https://huggingface.co/eric0708/finetuned_roberta_text_emotion_recognition \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_model_en.md new file mode 100644 index 000000000000..852737af4134 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_model RoBertaForSequenceClassification from KABANDA18 +author: John Snow Labs +name: finetuning_roberta_base_model +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_model` is an English model originally trained by KABANDA18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_model_en_5.2.0_3.0_1701298548094.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_model_en_5.2.0_3.0_1701298548094.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.9 MB| + +## References + +https://huggingface.co/KABANDA18/FineTuning-Roberta-base_Model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_on_sst2_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_on_sst2_7000_samples_en.md new file mode 100644 index 000000000000..0f7d076b059d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-finetuning_roberta_base_on_sst2_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_on_sst2_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_base_on_sst2_7000_samples +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_on_sst2_7000_samples` is an English model originally trained by Ibrahim-Alam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_sst2_7000_samples_en_5.2.0_3.0_1701239359451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_sst2_7000_samples_en_5.2.0_3.0_1701239359451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_sst2_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_sst2_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_on_sst2_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.2 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-roberta-base-on-sst2_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-gezelle_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-gezelle_sentiment_en.md new file mode 100644 index 000000000000..25287bd1d109 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-gezelle_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gezelle_sentiment RoBertaForSequenceClassification from lunadebruyne +author: John Snow Labs +name: gezelle_sentiment +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gezelle_sentiment` is an English model originally trained by lunadebruyne. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gezelle_sentiment_en_5.2.0_3.0_1701250162255.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gezelle_sentiment_en_5.2.0_3.0_1701250162255.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("gezelle_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("gezelle_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gezelle_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|433.2 MB| + +## References + +https://huggingface.co/lunadebruyne/Gezelle-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-indonesian_roberta_base_indolem_sentiment_classifier_fold_0_id.md b/docs/_posts/ahmedlone127/2023-11-29-indonesian_roberta_base_indolem_sentiment_classifier_fold_0_id.md new file mode 100644 index 000000000000..a6e747bb2056 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-indonesian_roberta_base_indolem_sentiment_classifier_fold_0_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian indonesian_roberta_base_indolem_sentiment_classifier_fold_0 RoBertaForSequenceClassification from w11wo +author: John Snow Labs +name: indonesian_roberta_base_indolem_sentiment_classifier_fold_0 +date: 2023-11-29 +tags: [roberta, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian_roberta_base_indolem_sentiment_classifier_fold_0` is an Indonesian model originally trained by w11wo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indonesian_roberta_base_indolem_sentiment_classifier_fold_0_id_5.2.0_3.0_1701278893799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indonesian_roberta_base_indolem_sentiment_classifier_fold_0_id_5.2.0_3.0_1701278893799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("indonesian_roberta_base_indolem_sentiment_classifier_fold_0","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("indonesian_roberta_base_indolem_sentiment_classifier_fold_0","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indonesian_roberta_base_indolem_sentiment_classifier_fold_0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| + +## References + +https://huggingface.co/w11wo/indonesian-roberta-base-indolem-sentiment-classifier-fold-0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-javanese_roberta_small_imdb_classifier_jv.md b/docs/_posts/ahmedlone127/2023-11-29-javanese_roberta_small_imdb_classifier_jv.md new file mode 100644 index 000000000000..21a7e7a415c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-javanese_roberta_small_imdb_classifier_jv.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Javanese javanese_roberta_small_imdb_classifier RoBertaForSequenceClassification from w11wo +author: John Snow Labs +name: javanese_roberta_small_imdb_classifier +date: 2023-11-29 +tags: [roberta, jv, open_source, sequence_classification, onnx] +task: Text Classification +language: jv +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `javanese_roberta_small_imdb_classifier` is a Javanese model originally trained by w11wo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/javanese_roberta_small_imdb_classifier_jv_5.2.0_3.0_1701275502198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/javanese_roberta_small_imdb_classifier_jv_5.2.0_3.0_1701275502198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("javanese_roberta_small_imdb_classifier","jv")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("javanese_roberta_small_imdb_classifier","jv") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|javanese_roberta_small_imdb_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|jv| +|Size:|468.4 MB| + +## References + +https://huggingface.co/w11wo/javanese-roberta-small-imdb-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-misogynistic_statements_classification_model_es.md b/docs/_posts/ahmedlone127/2023-11-29-misogynistic_statements_classification_model_es.md new file mode 100644 index 000000000000..3f1497123dbb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-misogynistic_statements_classification_model_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish misogynistic_statements_classification_model RoBertaForSequenceClassification from glombardo +author: John Snow Labs +name: misogynistic_statements_classification_model +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `misogynistic_statements_classification_model` is a Castilian, Spanish model originally trained by glombardo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/misogynistic_statements_classification_model_es_5.2.0_3.0_1701287999344.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/misogynistic_statements_classification_model_es_5.2.0_3.0_1701287999344.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("misogynistic_statements_classification_model","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("misogynistic_statements_classification_model","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|misogynistic_statements_classification_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|308.6 MB| + +## References + +https://huggingface.co/glombardo/misogynistic-statements-classification-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-model_utterance_en.md b/docs/_posts/ahmedlone127/2023-11-29-model_utterance_en.md new file mode 100644 index 000000000000..ae157d2be398 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-model_utterance_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English model_utterance RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: model_utterance +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `model_utterance` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/model_utterance_en_5.2.0_3.0_1701268463101.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/model_utterance_en_5.2.0_3.0_1701268463101.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("model_utterance","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("model_utterance","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|model_utterance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.0 MB| + +## References + +https://huggingface.co/aekupor/model_utterance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-movie_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-movie_sentiment_en.md new file mode 100644 index 000000000000..8061d9781826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-movie_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English movie_sentiment RoBertaForSequenceClassification from gr8testgad-1 +author: John Snow Labs +name: movie_sentiment +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_sentiment` is an English model originally trained by gr8testgad-1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_sentiment_en_5.2.0_3.0_1701264506844.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_sentiment_en_5.2.0_3.0_1701264506844.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("movie_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("movie_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|movie_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.7 MB| + +## References + +https://huggingface.co/gr8testgad-1/movie_sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-multi_hate_en.md b/docs/_posts/ahmedlone127/2023-11-29-multi_hate_en.md new file mode 100644 index 000000000000..99abebb20fa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-multi_hate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English multi_hate RoBertaForSequenceClassification from SotirisLegkas +author: John Snow Labs +name: multi_hate +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `multi_hate` is an English model originally trained by SotirisLegkas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/multi_hate_en_5.2.0_3.0_1701281113150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/multi_hate_en_5.2.0_3.0_1701281113150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("multi_hate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("multi_hate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|multi_hate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.0 MB| + +## References + +https://huggingface.co/SotirisLegkas/multi_hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-muppet_roberta_base_joke_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-muppet_roberta_base_joke_detector_en.md new file mode 100644 index 000000000000..1f75731a67d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-muppet_roberta_base_joke_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English muppet_roberta_base_joke_detector RoBertaForSequenceClassification from Reggie +author: John Snow Labs +name: muppet_roberta_base_joke_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `muppet_roberta_base_joke_detector` is an English model originally trained by Reggie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/muppet_roberta_base_joke_detector_en_5.2.0_3.0_1701280228201.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/muppet_roberta_base_joke_detector_en_5.2.0_3.0_1701280228201.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("muppet_roberta_base_joke_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("muppet_roberta_base_joke_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|muppet_roberta_base_joke_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.0 MB| + +## References + +https://huggingface.co/Reggie/muppet-roberta-base-joke_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-mytest_trainer_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-mytest_trainer_roberta_base_en.md new file mode 100644 index 000000000000..9cae2abfa136 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-mytest_trainer_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mytest_trainer_roberta_base RoBertaForSequenceClassification from DeeeTeeee01 +author: John Snow Labs +name: mytest_trainer_roberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mytest_trainer_roberta_base` is an English model originally trained by DeeeTeeee01. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mytest_trainer_roberta_base_en_5.2.0_3.0_1701264184863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mytest_trainer_roberta_base_en_5.2.0_3.0_1701264184863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mytest_trainer_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mytest_trainer_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mytest_trainer_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.5 MB| + +## References + +https://huggingface.co/DeeeTeeee01/mytest_trainer_roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-netzero_reduction_en.md b/docs/_posts/ahmedlone127/2023-11-29-netzero_reduction_en.md new file mode 100644 index 000000000000..47ddd93ff5fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-netzero_reduction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English netzero_reduction RoBertaForSequenceClassification from climatebert +author: John Snow Labs +name: netzero_reduction +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `netzero_reduction` is an English model originally trained by climatebert. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/netzero_reduction_en_5.2.0_3.0_1701278990946.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/netzero_reduction_en_5.2.0_3.0_1701278990946.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("netzero_reduction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("netzero_reduction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|netzero_reduction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/climatebert/netzero-reduction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-nlp_capstone_en.md b/docs/_posts/ahmedlone127/2023-11-29-nlp_capstone_en.md new file mode 100644 index 000000000000..7d84c428582e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-nlp_capstone_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_capstone RoBertaForSequenceClassification from petermutwiri +author: John Snow Labs +name: nlp_capstone +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_capstone` is an English model originally trained by petermutwiri. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_capstone_en_5.2.0_3.0_1701261048099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_capstone_en_5.2.0_3.0_1701261048099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nlp_capstone","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nlp_capstone","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_capstone| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.5 MB| + +## References + +https://huggingface.co/petermutwiri/NLP_Capstone \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-paraphrase_classification_onestop_and_adversarial_en.md b/docs/_posts/ahmedlone127/2023-11-29-paraphrase_classification_onestop_and_adversarial_en.md new file mode 100644 index 000000000000..afb5beb7a272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-paraphrase_classification_onestop_and_adversarial_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English paraphrase_classification_onestop_and_adversarial RoBertaForSequenceClassification from Andrianos +author: John Snow Labs +name: paraphrase_classification_onestop_and_adversarial +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paraphrase_classification_onestop_and_adversarial` is an English model originally trained by Andrianos.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_classification_onestop_and_adversarial_en_5.2.0_3.0_1701292366693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_classification_onestop_and_adversarial_en_5.2.0_3.0_1701292366693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("paraphrase_classification_onestop_and_adversarial","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("paraphrase_classification_onestop_and_adversarial","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|paraphrase_classification_onestop_and_adversarial| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.6 MB| + +## References + +https://huggingface.co/Andrianos/paraphrase_classification_onestop_and_adversarial \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-partypress_monolingual_netherlands_nl.md b/docs/_posts/ahmedlone127/2023-11-29-partypress_monolingual_netherlands_nl.md new file mode 100644 index 000000000000..3d1abe3cffb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-partypress_monolingual_netherlands_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish partypress_monolingual_netherlands RoBertaForSequenceClassification from partypress +author: John Snow Labs +name: partypress_monolingual_netherlands +date: 2023-11-29 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `partypress_monolingual_netherlands` is a Dutch, Flemish model originally trained by partypress.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/partypress_monolingual_netherlands_nl_5.2.0_3.0_1701278006855.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/partypress_monolingual_netherlands_nl_5.2.0_3.0_1701278006855.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("partypress_monolingual_netherlands","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("partypress_monolingual_netherlands","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|partypress_monolingual_netherlands| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|438.0 MB| + +## References + +https://huggingface.co/partypress/partypress-monolingual-netherlands \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-phosformer_en.md b/docs/_posts/ahmedlone127/2023-11-29-phosformer_en.md new file mode 100644 index 000000000000..f6fcd4b4a9a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-phosformer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English phosformer RoBertaForSequenceClassification from waylandy +author: John Snow Labs +name: phosformer +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phosformer` is an English model originally trained by waylandy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phosformer_en_5.2.0_3.0_1701237505009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phosformer_en_5.2.0_3.0_1701237505009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("phosformer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("phosformer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|phosformer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|173.4 MB| + +## References + +https://huggingface.co/waylandy/phosformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-probing_en.md b/docs/_posts/ahmedlone127/2023-11-29-probing_en.md new file mode 100644 index 000000000000..c1347fc22ee0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-probing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English probing RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: probing +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `probing` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/probing_en_5.2.0_3.0_1701243118148.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/probing_en_5.2.0_3.0_1701243118148.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("probing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("probing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|probing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.0 MB| + +## References + +https://huggingface.co/aekupor/probing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-qnli_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-qnli_distilroberta_base_en.md new file mode 100644 index 000000000000..e3ec3a9c1803 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-qnli_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English qnli_distilroberta_base RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: qnli_distilroberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qnli_distilroberta_base` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qnli_distilroberta_base_en_5.2.0_3.0_1701246579035.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qnli_distilroberta_base_en_5.2.0_3.0_1701246579035.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("qnli_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("qnli_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qnli_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/cross-encoder/qnli-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-query_wellformedness_score_en.md b/docs/_posts/ahmedlone127/2023-11-29-query_wellformedness_score_en.md new file mode 100644 index 000000000000..a1b47c6b490d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-query_wellformedness_score_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English query_wellformedness_score RoBertaForSequenceClassification from Ashishkr +author: John Snow Labs +name: query_wellformedness_score +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `query_wellformedness_score` is an English model originally trained by Ashishkr. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/query_wellformedness_score_en_5.2.0_3.0_1701252816065.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/query_wellformedness_score_en_5.2.0_3.0_1701252816065.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("query_wellformedness_score","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("query_wellformedness_score","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|query_wellformedness_score| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.0 MB| + +## References + +https://huggingface.co/Ashishkr/query_wellformedness_score \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-quip_512_mocha_en.md b/docs/_posts/ahmedlone127/2023-11-29-quip_512_mocha_en.md new file mode 100644 index 000000000000..4d8dbacc59fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-quip_512_mocha_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quip_512_mocha RoBertaForSequenceClassification from alirezamsh +author: John Snow Labs +name: quip_512_mocha +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quip_512_mocha` is an English model originally trained by alirezamsh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quip_512_mocha_en_5.2.0_3.0_1701239483621.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quip_512_mocha_en_5.2.0_3.0_1701239483621.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quip_512_mocha","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quip_512_mocha","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quip_512_mocha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/alirezamsh/quip-512-mocha \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-quora_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-quora_distilroberta_base_en.md new file mode 100644 index 000000000000..3b91228170a0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-quora_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_distilroberta_base RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: quora_distilroberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_distilroberta_base` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_distilroberta_base_en_5.2.0_3.0_1701258665214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_distilroberta_base_en_5.2.0_3.0_1701258665214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/cross-encoder/quora-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_base_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_base_cross_encoder_en.md new file mode 100644 index 000000000000..fe4ba938cdae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_base_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_roberta_base_cross_encoder RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: quora_roberta_base_cross_encoder +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_roberta_base_cross_encoder` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_roberta_base_cross_encoder_en_5.2.0_3.0_1701284342256.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_roberta_base_cross_encoder_en_5.2.0_3.0_1701284342256.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_base_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_base_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_roberta_base_cross_encoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.4 MB| + +## References + +https://huggingface.co/cross-encoder/quora-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_large_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_large_cross_encoder_en.md new file mode 100644 index 000000000000..4f518e9cd9a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-quora_roberta_large_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_roberta_large_cross_encoder RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: quora_roberta_large_cross_encoder +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_roberta_large_cross_encoder` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_roberta_large_cross_encoder_en_5.2.0_3.0_1701237564176.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_roberta_large_cross_encoder_en_5.2.0_3.0_1701237564176.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_large_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_large_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_roberta_large_cross_encoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/cross-encoder/quora-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-radar_vicuna_7b_en.md b/docs/_posts/ahmedlone127/2023-11-29-radar_vicuna_7b_en.md new file mode 100644 index 000000000000..fd73dcd85c71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-radar_vicuna_7b_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English radar_vicuna_7b RoBertaForSequenceClassification from TrustSafeAI +author: John Snow Labs +name: radar_vicuna_7b +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `radar_vicuna_7b` is an English model originally trained by TrustSafeAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/radar_vicuna_7b_en_5.2.0_3.0_1701251307719.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/radar_vicuna_7b_en_5.2.0_3.0_1701251307719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("radar_vicuna_7b","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("radar_vicuna_7b","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|radar_vicuna_7b| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-revoicing_en.md b/docs/_posts/ahmedlone127/2023-11-29-revoicing_en.md new file mode 100644 index 000000000000..49929dd401e5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-revoicing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English revoicing RoBertaForSequenceClassification from aekupor +author: John Snow Labs +name: revoicing +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `revoicing` is an English model originally trained by aekupor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/revoicing_en_5.2.0_3.0_1701269921526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/revoicing_en_5.2.0_3.0_1701269921526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("revoicing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("revoicing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|revoicing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.0 MB| + +## References + +https://huggingface.co/aekupor/revoicing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee_en.md b/docs/_posts/ahmedlone127/2023-11-29-rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee_en.md new file mode 100644 index 000000000000..3f9c5b9ff2ab --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee RoBertaForSequenceClassification from RajuEEE +author: John Snow Labs +name: rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee` is an English model originally trained by RajuEEE. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee_en_5.2.0_3.0_1701290721348.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee_en_5.2.0_3.0_1701290721348.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rewardmodelsmallerquestionwithtwolabelslengthjustified_rajueee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|433.3 MB| + +## References + +https://huggingface.co/RajuEEE/RewardModelSmallerQuestionWithTwoLabelsLengthJustified \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_5emotions_cls_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_5emotions_cls_en.md new file mode 100644 index 000000000000..a1e121cbc410 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_5emotions_cls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_5emotions_cls RoBertaForSequenceClassification from mountinyy +author: John Snow Labs +name: roberta_5emotions_cls +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_5emotions_cls` is an English model originally trained by mountinyy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_5emotions_cls_en_5.2.0_3.0_1701273565147.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_5emotions_cls_en_5.2.0_3.0_1701273565147.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_5emotions_cls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_5emotions_cls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_5emotions_cls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.9 MB| + +## References + +https://huggingface.co/mountinyy/roberta-5emotions-cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_academic_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_academic_detector_en.md new file mode 100644 index 000000000000..79276af335b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_academic_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_academic_detector RoBertaForSequenceClassification from andreas122001 +author: John Snow Labs +name: roberta_academic_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_academic_detector` is an English model originally trained by andreas122001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_academic_detector_en_5.2.0_3.0_1701256411349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_academic_detector_en_5.2.0_3.0_1701256411349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_academic_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_academic_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_academic_detector|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|450.7 MB|
+
+## References
+
+https://huggingface.co/andreas122001/roberta-academic-detector
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_achimoraites_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_achimoraites_en.md
new file mode 100644
index 000000000000..28e043d9f125
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_achimoraites_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_ag_news_achimoraites RoBertaForSequenceClassification from achimoraites
+author: John Snow Labs
+name: roberta_base_ag_news_achimoraites
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ag_news_achimoraites` is an English model originally trained by achimoraites.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_achimoraites_en_5.2.0_3.0_1701262683718.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_achimoraites_en_5.2.0_3.0_1701262683718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_achimoraites","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_achimoraites","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_ag_news_achimoraites|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|464.0 MB|
+
+## References
+
+https://huggingface.co/achimoraites/roberta-base_ag_news
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_textattack_en.md
new file mode 100644
index 000000000000..609275daeaab
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_ag_news_textattack_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_ag_news_textattack RoBertaForSequenceClassification from textattack
+author: John Snow Labs
+name: roberta_base_ag_news_textattack
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ag_news_textattack` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_textattack_en_5.2.0_3.0_1701260450354.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_textattack_en_5.2.0_3.0_1701260450354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_textattack","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_textattack","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_ag_news_textattack|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|465.3 MB|
+
+## References
+
+https://huggingface.co/textattack/roberta-base-ag-news
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_bne_ranker_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_bne_ranker_es.md
new file mode 100644
index 000000000000..ddea2e1ff49c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_bne_ranker_es.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: Castilian, Spanish roberta_base_bne_ranker RoBertaForSequenceClassification from IIC
+author: John Snow Labs
+name: roberta_base_bne_ranker
+date: 2023-11-29
+tags: [roberta, es, open_source, sequence_classification, onnx]
+task: Text Classification
+language: es
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_ranker` is a Castilian, Spanish model originally trained by IIC.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_ranker_es_5.2.0_3.0_1701259336192.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_ranker_es_5.2.0_3.0_1701259336192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_ranker","es")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_ranker","es")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_bne_ranker|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|es|
+|Size:|464.8 MB|
+
+## References
+
+https://huggingface.co/IIC/roberta-base-bne-ranker
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_boolq_shahrukhx01_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_boolq_shahrukhx01_en.md
new file mode 100644
index 000000000000..855e67dcbc02
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_boolq_shahrukhx01_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_boolq_shahrukhx01 RoBertaForSequenceClassification from shahrukhx01
+author: John Snow Labs
+name: roberta_base_boolq_shahrukhx01
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_boolq_shahrukhx01` is an English model originally trained by shahrukhx01.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_boolq_shahrukhx01_en_5.2.0_3.0_1701268217502.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_boolq_shahrukhx01_en_5.2.0_3.0_1701268217502.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_boolq_shahrukhx01","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_boolq_shahrukhx01","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_boolq_shahrukhx01|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|457.2 MB|
+
+## References
+
+https://huggingface.co/shahrukhx01/roberta-base-boolq
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_cola_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_cola_en.md
new file mode 100644
index 000000000000..821093fa2c6b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_cola_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_cola RoBertaForSequenceClassification from textattack
+author: John Snow Labs
+name: roberta_base_cola
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_cola` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_cola_en_5.2.0_3.0_1701237344333.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_cola_en_5.2.0_3.0_1701237344333.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cola","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cola","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_cola|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|434.5 MB|
+
+## References
+
+https://huggingface.co/textattack/roberta-base-CoLA
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_crest_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_crest_en.md
new file mode 100644
index 000000000000..a4fbad3738a0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_crest_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_crest RoBertaForSequenceClassification from gargam
+author: John Snow Labs
+name: roberta_base_crest
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_crest` is an English model originally trained by gargam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_crest_en_5.2.0_3.0_1701261048162.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_crest_en_5.2.0_3.0_1701261048162.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_crest","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_crest","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_crest|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|446.2 MB|
+
+## References
+
+https://huggingface.co/gargam/roberta-base-crest
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotion_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotion_classifier_en.md
new file mode 100644
index 000000000000..78061f879b52
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotion_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_emotion_classifier RoBertaForSequenceClassification from Azma-AI
+author: John Snow Labs
+name: roberta_base_emotion_classifier
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_emotion_classifier` is an English model originally trained by Azma-AI.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_classifier_en_5.2.0_3.0_1701296630343.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_classifier_en_5.2.0_3.0_1701296630343.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_emotion_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|455.5 MB|
+
+## References
+
+https://huggingface.co/Azma-AI/roberta-base-emotion-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotions_detection_from_text_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotions_detection_from_text_en.md
new file mode 100644
index 000000000000..961b5f8af666
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_emotions_detection_from_text_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_emotions_detection_from_text RoBertaForSequenceClassification from badmatr11x
+author: John Snow Labs
+name: roberta_base_emotions_detection_from_text
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_emotions_detection_from_text` is an English model originally trained by badmatr11x.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_emotions_detection_from_text_en_5.2.0_3.0_1701280228206.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_emotions_detection_from_text_en_5.2.0_3.0_1701280228206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotions_detection_from_text","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotions_detection_from_text","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_emotions_detection_from_text|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.8 MB|
+
+## References
+
+https://huggingface.co/badmatr11x/roberta-base-emotions-detection-from-text
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_imdb_wrmurray_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_imdb_wrmurray_en.md
new file mode 100644
index 000000000000..fdc7bab563bc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_imdb_wrmurray_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_finetuned_imdb_wrmurray RoBertaForSequenceClassification from wrmurray
+author: John Snow Labs
+name: roberta_base_finetuned_imdb_wrmurray
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_imdb_wrmurray` is an English model originally trained by wrmurray.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_imdb_wrmurray_en_5.2.0_3.0_1701283579603.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_imdb_wrmurray_en_5.2.0_3.0_1701283579603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_imdb_wrmurray","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_imdb_wrmurray","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_finetuned_imdb_wrmurray|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|465.0 MB|
+
+## References
+
+https://huggingface.co/wrmurray/roberta-base-finetuned-imdb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_sms_spam_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_sms_spam_detection_en.md
new file mode 100644
index 000000000000..00a760044de2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_sms_spam_detection_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_finetuned_sms_spam_detection RoBertaForSequenceClassification from mariagrandury
+author: John Snow Labs
+name: roberta_base_finetuned_sms_spam_detection
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_sms_spam_detection` is an English model originally trained by mariagrandury.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sms_spam_detection_en_5.2.0_3.0_1701245801948.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sms_spam_detection_en_5.2.0_3.0_1701245801948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sms_spam_detection","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sms_spam_detection","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_finetuned_sms_spam_detection|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|436.8 MB|
+
+## References
+
+https://huggingface.co/mariagrandury/roberta-base-finetuned-sms-spam-detection
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_toxic_comment_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_toxic_comment_detection_en.md
new file mode 100644
index 000000000000..d81a4d656025
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_toxic_comment_detection_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_finetuned_toxic_comment_detection RoBertaForSequenceClassification from tillschwoerer
+author: John Snow Labs
+name: roberta_base_finetuned_toxic_comment_detection
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_toxic_comment_detection` is an English model originally trained by tillschwoerer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_toxic_comment_detection_en_5.2.0_3.0_1701281723380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_toxic_comment_detection_en_5.2.0_3.0_1701281723380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_toxic_comment_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_toxic_comment_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_toxic_comment_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.8 MB| + +## References + +https://huggingface.co/tillschwoerer/roberta-base-finetuned-toxic-comment-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_yelp_polarity_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_yelp_polarity_en.md new file mode 100644 index 000000000000..61014c92d06a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_finetuned_yelp_polarity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_yelp_polarity RoBertaForSequenceClassification from VictorSanh +author: John Snow Labs +name: roberta_base_finetuned_yelp_polarity +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_yelp_polarity` is an English model originally trained by VictorSanh. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_yelp_polarity_en_5.2.0_3.0_1701237936867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_yelp_polarity_en_5.2.0_3.0_1701237936867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_yelp_polarity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_yelp_polarity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_yelp_polarity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.6 MB| + +## References + +https://huggingface.co/VictorSanh/roberta-base-finetuned-yelp-polarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_en.md new file mode 100644 index 000000000000..7797a583643c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_formality RoBertaForSequenceClassification from cointegrated +author: John Snow Labs +name: roberta_base_formality +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_formality` is an English model originally trained by cointegrated. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_formality_en_5.2.0_3.0_1701271656433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_formality_en_5.2.0_3.0_1701271656433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_formality","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_formality","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_formality| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.4 MB| + +## References + +https://huggingface.co/cointegrated/roberta-base-formality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_ranker_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_ranker_en.md new file mode 100644 index 000000000000..4cd05bb40745 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_formality_ranker_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_formality_ranker RoBertaForSequenceClassification from s-nlp +author: John Snow Labs +name: roberta_base_formality_ranker +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_formality_ranker` is an English model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_formality_ranker_en_5.2.0_3.0_1701265636121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_formality_ranker_en_5.2.0_3.0_1701265636121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_formality_ranker","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_formality_ranker","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_formality_ranker| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|454.3 MB| + +## References + +https://huggingface.co/s-nlp/roberta-base-formality-ranker \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_go_emotions_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_go_emotions_en.md new file mode 100644 index 000000000000..ef71a8106766 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_go_emotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_go_emotions RoBertaForSequenceClassification from SamLowe +author: John Snow Labs +name: roberta_base_go_emotions +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_go_emotions` is an English model originally trained by SamLowe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_go_emotions_en_5.2.0_3.0_1701251252226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_go_emotions_en_5.2.0_3.0_1701251252226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_go_emotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_go_emotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_go_emotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.5 MB| + +## References + +https://huggingface.co/SamLowe/roberta-base-go_emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_imdb_textattack_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_imdb_textattack_en.md new file mode 100644 index 000000000000..4300101b381b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_imdb_textattack_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_imdb_textattack RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_imdb_textattack +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_imdb_textattack` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_imdb_textattack_en_5.2.0_3.0_1701275343918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_imdb_textattack_en_5.2.0_3.0_1701275343918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_imdb_textattack","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_imdb_textattack","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_imdb_textattack| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.7 MB| + +## References + +https://huggingface.co/textattack/roberta-base-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mnli_en.md new file mode 100644 index 000000000000..82dae801d7c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_mnli RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_mnli +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_mnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_en_5.2.0_3.0_1701263180810.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_en_5.2.0_3.0_1701263180810.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_mnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.7 MB| + +## References + +https://huggingface.co/textattack/roberta-base-MNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mrpc_en.md new file mode 100644 index 000000000000..77113e025f1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_mrpc_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_mrpc RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_mrpc +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_mrpc` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_mrpc_en_5.2.0_3.0_1701256795418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_mrpc_en_5.2.0_3.0_1701256795418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mrpc","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mrpc","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_mrpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.9 MB| + +## References + +https://huggingface.co/textattack/roberta-base-MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_openai_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_openai_detector_en.md new file mode 100644 index 000000000000..2ef43e844df9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_openai_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_openai_detector RoBertaForSequenceClassification from huggingface +author: John Snow Labs +name: roberta_base_openai_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_openai_detector` is an English model originally trained by huggingface. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_openai_detector_en_5.2.0_3.0_1701256795407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_openai_detector_en_5.2.0_3.0_1701256795407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_openai_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_openai_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_openai_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/roberta-base-openai-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_pakornor_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_pakornor_en.md new file mode 100644 index 000000000000..2e363ff98889 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_pakornor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_pakornor RoBertaForSequenceClassification from pakornor +author: John Snow Labs +name: roberta_base_pakornor +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_pakornor` is an English model originally trained by pakornor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_pakornor_en_5.2.0_3.0_1701279154067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_pakornor_en_5.2.0_3.0_1701279154067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_pakornor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_pakornor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_pakornor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.3 MB| + +## References + +https://huggingface.co/pakornor/roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rotten_tomatoes_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rotten_tomatoes_en.md new file mode 100644 index 000000000000..11f83f308521 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rotten_tomatoes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_rotten_tomatoes RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_rotten_tomatoes +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_rotten_tomatoes` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_rotten_tomatoes_en_5.2.0_3.0_1701289804351.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_rotten_tomatoes_en_5.2.0_3.0_1701289804351.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rotten_tomatoes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rotten_tomatoes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_rotten_tomatoes|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|442.7 MB|
+
+## References
+
+https://huggingface.co/textattack/roberta-base-rotten-tomatoes
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_en.md
new file mode 100644
index 000000000000..f13903d3c102
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_rte RoBertaForSequenceClassification from textattack
+author: John Snow Labs
+name: roberta_base_rte
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_rte` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_rte_en_5.2.0_3.0_1701273565017.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_rte_en_5.2.0_3.0_1701273565017.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rte","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rte","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_rte|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|445.7 MB|
+
+## References
+
+https://huggingface.co/textattack/roberta-base-RTE
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_willheld_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_willheld_en.md
new file mode 100644
index 000000000000..91f6b415d09e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_rte_willheld_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_rte_willheld RoBertaForSequenceClassification from WillHeld
+author: John Snow Labs
+name: roberta_base_rte_willheld
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_rte_willheld` is an English model originally trained by WillHeld.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_rte_willheld_en_5.2.0_3.0_1701237998953.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_rte_willheld_en_5.2.0_3.0_1701237998953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rte_willheld","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_rte_willheld","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_rte_willheld|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|447.1 MB|
+
+## References
+
+https://huggingface.co/WillHeld/roberta-base-rte
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_custeau_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_custeau_en.md
new file mode 100644
index 000000000000..557b17871eac
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_custeau_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_snli_custeau RoBertaForSequenceClassification from custeau
+author: John Snow Labs
+name: roberta_base_snli_custeau
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_snli_custeau` is an English model originally trained by custeau.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_snli_custeau_en_5.2.0_3.0_1701284830540.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_snli_custeau_en_5.2.0_3.0_1701284830540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_snli_custeau","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_snli_custeau","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_snli_custeau|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|445.2 MB|
+
+## References
+
+https://huggingface.co/custeau/roberta-base-snli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_pepa_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_pepa_en.md
new file mode 100644
index 000000000000..9b4c570318a2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_snli_pepa_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_snli_pepa RoBertaForSequenceClassification from pepa
+author: John Snow Labs
+name: roberta_base_snli_pepa
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_snli_pepa` is an English model originally trained by pepa.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_snli_pepa_en_5.2.0_3.0_1701255448598.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_snli_pepa_en_5.2.0_3.0_1701255448598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_snli_pepa","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_snli_pepa","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_snli_pepa|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|452.3 MB|
+
+## References
+
+https://huggingface.co/pepa/roberta-base-snli
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_sst_2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_sst_2_en.md
new file mode 100644
index 000000000000..2362767ec877
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_sst_2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_sst_2 RoBertaForSequenceClassification from textattack
+author: John Snow Labs
+name: roberta_base_sst_2
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_sst_2` is an English model originally trained by textattack.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_en_5.2.0_3.0_1701263180806.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_en_5.2.0_3.0_1701263180806.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_sst_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|433.2 MB|
+
+## References
+
+https://huggingface.co/textattack/roberta-base-SST-2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stocktwits_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stocktwits_finetuned_en.md
new file mode 100644
index 000000000000..77e27e57201e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stocktwits_finetuned_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_stocktwits_finetuned RoBertaForSequenceClassification from zhayunduo
+author: John Snow Labs
+name: roberta_base_stocktwits_finetuned
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_stocktwits_finetuned` is an English model originally trained by zhayunduo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_stocktwits_finetuned_en_5.2.0_3.0_1701252096276.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_stocktwits_finetuned_en_5.2.0_3.0_1701252096276.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stocktwits_finetuned","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stocktwits_finetuned","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_stocktwits_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|465.0 MB|
+
+## References
+
+https://huggingface.co/zhayunduo/roberta-base-stocktwits-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stsb_willheld_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stsb_willheld_en.md
new file mode 100644
index 000000000000..598188994c21
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_stsb_willheld_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_stsb_willheld RoBertaForSequenceClassification from WillHeld
+author: John Snow Labs
+name: roberta_base_stsb_willheld
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_stsb_willheld` is an English model originally trained by WillHeld.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_stsb_willheld_en_5.2.0_3.0_1701287298687.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_stsb_willheld_en_5.2.0_3.0_1701287298687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stsb_willheld","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stsb_willheld","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_stsb_willheld|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|444.5 MB|
+
+## References
+
+https://huggingface.co/WillHeld/roberta-base-stsb
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_base_tweet_sentiment_english_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_tweet_sentiment_english_en.md
new file mode 100644
index 000000000000..c8b750f0f7d2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_base_tweet_sentiment_english_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_base_tweet_sentiment_english RoBertaForSequenceClassification from cardiffnlp
+author: John Snow Labs
+name: roberta_base_tweet_sentiment_english
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_tweet_sentiment_english` is an English model originally trained by cardiffnlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_sentiment_english_en_5.2.0_3.0_1701273565133.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_sentiment_english_en_5.2.0_3.0_1701273565133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_sentiment_english","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_sentiment_english","en")
+  .setInputCols(Array("document","token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+// toDS needs `import spark.implicits._` (already in scope in spark-shell)
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+val result = pipeline.fit(data).transform(data)
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_base_tweet_sentiment_english|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|428.3 MB|
+
+## References
+
+https://huggingface.co/cardiffnlp/roberta-base-tweet-sentiment-en
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_acts_feedback1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_acts_feedback1_en.md
new file mode 100644
index 000000000000..f520c89019ce
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_acts_feedback1_en.md
@@ -0,0 +1,106 @@
+---
+layout: model
+title: English RoBertaForSequenceClassification Cased model (from mp6kv)
+author: John Snow Labs
+name: roberta_classifier_acts_feedback1
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ACTS_feedback1` is an English model originally trained by `mp6kv`.
+
+## Predicted Entities
+
+`negative`, `neutral`, `positive`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_acts_feedback1_en_5.2.0_3.0_1701217822695.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_acts_feedback1_en_5.2.0_3.0_1701217822695.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_acts_feedback1","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_acts_feedback1","en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs `import spark.implicits._`
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.acts_feedback.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""")
+```
+
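Each card lists the same archive twice, once as an HTTPS Download link and once as an S3 URI, and in the links shown on these cards the two differ only in their prefix. A minimal sketch of converting between the two forms follows. It assumes the path-style `https://s3.amazonaws.com/<bucket>/<key>` addressing seen in these links; the function names are hypothetical and not part of any Spark NLP or AWS API.

```python
from urllib.parse import urlparse

# Path-style S3 endpoint as it appears in the Download links
# (an observation from these cards, not an official rule).
S3_HTTPS_HOST = "s3.amazonaws.com"


def s3_uri_to_https(uri: str) -> str:
    """Turn s3://bucket/key into https://s3.amazonaws.com/bucket/key."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not an s3://bucket/key URI: {uri!r}")
    return f"https://{S3_HTTPS_HOST}/{parsed.netloc}{parsed.path}"


def https_to_s3_uri(url: str) -> str:
    """Turn https://s3.amazonaws.com/bucket/key back into s3://bucket/key."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.netloc != S3_HTTPS_HOST:
        raise ValueError(f"not a path-style S3 URL: {url!r}")
    # The first path segment is the bucket, the rest is the object key.
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    if not bucket or not key:
        raise ValueError(f"missing bucket or key in: {url!r}")
    return f"s3://{bucket}/{key}"


s3_uri = (
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "roberta_classifier_acts_feedback1_en_5.2.0_3.0_1701217822695.zip"
)
https_url = s3_uri_to_https(s3_uri)
print(https_url)
print(https_to_s3_uri(https_url) == s3_uri)
```

The S3 form is the one to use with S3-aware tooling, while the HTTPS form can be fetched with any HTTP client.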
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_acts_feedback1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|424.8 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/mp6kv/ACTS_feedback1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_argument_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_argument_en.md
new file mode 100644
index 000000000000..999e4ae156cb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_argument_en.md
@@ -0,0 +1,107 @@
+---
+layout: model
+title: English RoBertaForSequenceClassification Cased model (from chkla)
+author: John Snow Labs
+name: roberta_classifier_argument
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-argument` is an English model originally trained by `chkla`.
+
+## Predicted Entities
+
+`ARGUMENT`, `NON-ARGUMENT`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_argument_en_5.2.0_3.0_1701217845110.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_argument_en_5.2.0_3.0_1701217845110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+from sparknlp.base import DocumentAssembler
+from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
+from pyspark.ml import Pipeline
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_argument","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # `spark`: an active session, e.g. from sparknlp.start()
+result = pipeline.fit(data).transform(data)
+```
+```scala
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+val tokenizer = new Tokenizer()
+  .setInputCols("document")
+  .setOutputCol("token")
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_argument","en")
+  .setInputCols(Array("document", "token"))
+  .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs `import spark.implicits._`
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.roberta.by_chkla").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_argument|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|445.3 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/chkla/roberta-argument
+- https://www.aclweb.org/anthology/D18-1402/
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_aristotletan_base_finetuned_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_aristotletan_base_finetuned_sst2_en.md
new file mode 100644
index 000000000000..3b8f877deb48
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_aristotletan_base_finetuned_sst2_en.md
@@ -0,0 +1,106 @@
+---
+layout: model
+title: English RoBertaForSequenceClassification Base Cased model (from aristotletan)
+author: John Snow Labs
+name: roberta_classifier_aristotletan_base_finetuned_sst2
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-finetuned-sst2` is an English model originally trained by `aristotletan`.
+ +## Predicted Entities + +`analogous event`, `disposal`, `appointment of receiver`, `event or events`, `repudiation`, `non payment`, `assets`, `others`, `breach of obligations`, `cross default`, `winding up`, `nationalisation`, `judgement`, `composition and arrangement`, `jeopardy`, `insolvency`, `revocation of license`, `legal proceedings`, `cessation of business`, `invalidity`, `misrepresentation`, `creditor control` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_aristotletan_base_finetuned_sst2_en_5.2.0_3.0_1701218114785.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_aristotletan_base_finetuned_sst2_en_5.2.0_3.0_1701218114785.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_aristotletan_base_finetuned_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_aristotletan_base_finetuned_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base_finetuned.by_aristotletan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_aristotletan_base_finetuned_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aristotletan/roberta-base-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_attribute_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_attribute_classification_en.md new file mode 100644 index 000000000000..7e407a3c4807 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_attribute_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from AaronCU) +author: John Snow Labs +name: roberta_classifier_attribute_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `attribute-classification` is an English model originally trained by `AaronCU`. 
+ +## Predicted Entities + +`description`, `measurement` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_attribute_classification_en_5.2.0_3.0_1701217787712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_attribute_classification_en_5.2.0_3.0_1701217787712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_attribute_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_attribute_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_aaroncu").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_attribute_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/AaronCU/attribute-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_123_478412765_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_123_478412765_en.md new file mode 100644 index 000000000000..7c5993c486eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_123_478412765_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from XYHY) +author: John Snow Labs +name: roberta_classifier_autonlp_123_478412765 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-123-478412765` is an English model originally trained by `XYHY`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_123_478412765_en_5.2.0_3.0_1701218044275.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_123_478412765_en_5.2.0_3.0_1701218044275.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_123_478412765","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_123_478412765","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_xyhy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_123_478412765| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/XYHY/autonlp-123-478412765 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bbc_37249301_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bbc_37249301_en.md new file mode 100644 index 000000000000..69093cc4aacf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bbc_37249301_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: roberta_classifier_autonlp_bbc_37249301 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-bbc-roberta-37249301` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`tech`, `business`, `politics`, `sport`, `entertainment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_bbc_37249301_en_5.2.0_3.0_1701218135506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_bbc_37249301_en_5.2.0_3.0_1701218135506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_bbc_37249301","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_bbc_37249301","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.bbc.roberta.by_abhishek").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_bbc_37249301| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-bbc-roberta-37249301 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bert_covid_407910467_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bert_covid_407910467_en.md new file mode 100644 index 000000000000..9e4de740ff94 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_bert_covid_407910467_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from nurkayevaa) +author: John Snow Labs +name: roberta_classifier_autonlp_bert_covid_407910467 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-bert-covid-407910467` is an English model originally trained by `nurkayevaa`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_bert_covid_407910467_en_5.2.0_3.0_1701218009071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_bert_covid_407910467_en_5.2.0_3.0_1701218009071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_bert_covid_407910467","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_bert_covid_407910467","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.covid.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_bert_covid_407910467| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nurkayevaa/autonlp-bert-covid-407910467 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817_en.md new file mode 100644 index 000000000000..0df667c790ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817 RoBertaForSequenceClassification from pierric +author: John Snow Labs +name: roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817` is an English model originally trained by pierric. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817_en_5.2.0_3.0_1701220564414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817_en_5.2.0_3.0_1701220564414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_burmese_own_imdb_sentiment_analysis_2131817| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.0 MB| + +## References + +https://huggingface.co/pierric/autonlp-my-own-imdb-sentiment-analysis-2131817 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_covid_432211280_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_covid_432211280_en.md new file mode 100644 index 000000000000..ae805e7adfff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_covid_432211280_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from anelnurkayeva) +author: John Snow Labs +name: roberta_classifier_autonlp_covid_432211280 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-covid-432211280` is an English model originally trained by `anelnurkayeva`. 
+ +## Predicted Entities + +`news`, `misleading` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_covid_432211280_en_5.2.0_3.0_1701217784582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_covid_432211280_en_5.2.0_3.0_1701217784582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_covid_432211280","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_covid_432211280","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.covid.by_anelnurkayeva").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_covid_432211280| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/anelnurkayeva/autonlp-covid-432211280 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_doctor_german_24595548_de.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_doctor_german_24595548_de.md new file mode 100644 index 000000000000..920c4e887b8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_doctor_german_24595548_de.md @@ -0,0 +1,97 @@ +--- +layout: model +title: German roberta_classifier_autonlp_doctor_german_24595548 RoBertaForSequenceClassification from muhtasham +author: John Snow Labs +name: roberta_classifier_autonlp_doctor_german_24595548 +date: 2023-11-29 +tags: [roberta, de, open_source, sequence_classification, onnx] +task: Text Classification +language: de +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_autonlp_doctor_german_24595548` is a German model originally trained by muhtasham. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_doctor_german_24595548_de_5.2.0_3.0_1701218235169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_doctor_german_24595548_de_5.2.0_3.0_1701218235169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_doctor_german_24595548","de")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_doctor_german_24595548","de") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_doctor_german_24595548| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|de| +|Size:|472.8 MB| + +## References + +https://huggingface.co/muhtasham/autonlp-Doctor_DE-24595548 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_formality_scoring_2_32597818_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_formality_scoring_2_32597818_en.md new file mode 100644 index 000000000000..8f939487bbda --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_formality_scoring_2_32597818_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from Harshveer) +author: John Snow Labs +name: roberta_classifier_autonlp_formality_scoring_2_32597818 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-formality_scoring_2-32597818` is an English model originally trained by `Harshveer`. 
+ +## Predicted Entities + +`target` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_formality_scoring_2_32597818_en_5.2.0_3.0_1701218390572.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_formality_scoring_2_32597818_en_5.2.0_3.0_1701218390572.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_formality_scoring_2_32597818","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_formality_scoring_2_32597818","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.32d").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_formality_scoring_2_32597818| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Harshveer/autonlp-formality_scoring_2-32597818 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_fred2_2682064_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_fred2_2682064_en.md new file mode 100644 index 000000000000..d1a511697cf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_fred2_2682064_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: roberta_classifier_autonlp_fred2_2682064 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-fred2-2682064` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_fred2_2682064_en_5.2.0_3.0_1701218613551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_fred2_2682064_en_5.2.0_3.0_1701218613551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_fred2_2682064","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_fred2_2682064","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.fred2.roberta.by_abhishek").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_fred2_2682064| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-fred2-2682064 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_group_classification_441411446_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_group_classification_441411446_en.md new file mode 100644 index 000000000000..d18f583fa7cd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_group_classification_441411446_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from alecmullen) +author: John Snow Labs +name: roberta_classifier_autonlp_group_classification_441411446 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-group-classification-441411446` is an English model originally trained by `alecmullen`. 
+ +## Predicted Entities + +`Travel`, `Local`, `Faith`, `Fitness`, `TV/Movies`, `Memes`, `Food`, `Social`, `Beauty`, `Marketplace`, `Sports`, `Business/Finance`, `None`, `Gaming`, `Music` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_group_classification_441411446_en_5.2.0_3.0_1701218420130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_group_classification_441411446_en_5.2.0_3.0_1701218420130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_group_classification_441411446","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_group_classification_441411446","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_alecmullen").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_group_classification_441411446| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/alecmullen/autonlp-group-classification-441411446 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_base_3662644_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_base_3662644_en.md new file mode 100644 index 000000000000..7e7be779f3f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_base_3662644_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from abhishek) +author: John Snow Labs +name: roberta_classifier_autonlp_imdb_base_3662644 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-roberta-base-3662644` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_base_3662644_en_5.2.0_3.0_1701218342472.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_base_3662644_en_5.2.0_3.0_1701218342472.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_base_3662644","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_base_3662644","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base.by_abhishek").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_imdb_base_3662644| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-imdb-roberta-base-3662644 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_reviews_sentiment_329982_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_reviews_sentiment_329982_en.md new file mode 100644 index 000000000000..708ecfdc29db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_reviews_sentiment_329982_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from victor) +author: John Snow Labs +name: roberta_classifier_autonlp_imdb_reviews_sentiment_329982 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-reviews-sentiment-329982` is an English model originally trained by `victor`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_reviews_sentiment_329982_en_5.2.0_3.0_1701218958635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_reviews_sentiment_329982_en_5.2.0_3.0_1701218958635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_reviews_sentiment_329982","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_reviews_sentiment_329982","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb_sentiment.32d").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_imdb_reviews_sentiment_329982| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|454.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/victor/autonlp-imdb-reviews-sentiment-329982 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_sentiment_classification_31154_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_sentiment_classification_31154_en.md new file mode 100644 index 000000000000..cadaf71c3de3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_sentiment_classification_31154_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from abhishek) +author: John Snow Labs +name: roberta_classifier_autonlp_imdb_sentiment_classification_31154 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb_sentiment_classification-31154` is an English model originally trained by `abhishek`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_sentiment_classification_31154_en_5.2.0_3.0_1701219242968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_sentiment_classification_31154_en_5.2.0_3.0_1701219242968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_sentiment_classification_31154","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_sentiment_classification_31154","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb_sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_imdb_sentiment_classification_31154| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/abhishek/autonlp-imdb_sentiment_classification-31154 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_test_21134453_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_test_21134453_en.md new file mode 100644 index 000000000000..0d83cd891809 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_imdb_test_21134453_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mmcquade11) +author: John Snow Labs +name: roberta_classifier_autonlp_imdb_test_21134453 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-imdb-test-21134453` is an English model originally trained by `mmcquade11`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_test_21134453_en_5.2.0_3.0_1701218008350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_imdb_test_21134453_en_5.2.0_3.0_1701218008350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_test_21134453","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_imdb_test_21134453","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.by_mmcquade11").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_imdb_test_21134453| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mmcquade11/autonlp-imdb-test-21134453 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_large_finetuned_467612250_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_large_finetuned_467612250_en.md new file mode 100644 index 000000000000..2edcbcec2800 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_large_finetuned_467612250_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from rexxar96) +author: John Snow Labs +name: roberta_classifier_autonlp_large_finetuned_467612250 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-roberta-large-finetuned-467612250` is an English model originally trained by `rexxar96`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_large_finetuned_467612250_en_5.2.0_3.0_1701219741836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_large_finetuned_467612250_en_5.2.0_3.0_1701219741836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_large_finetuned_467612250","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_large_finetuned_467612250","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_large_finetuned_467612250| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/rexxar96/autonlp-roberta-large-finetuned-467612250 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_lessons_tagging_606217261_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_lessons_tagging_606217261_en.md new file mode 100644 index 000000000000..9abeddcc9466 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_lessons_tagging_606217261_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Kamuuung) +author: John Snow Labs +name: roberta_classifier_autonlp_lessons_tagging_606217261 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-lessons_tagging-606217261` is an English model originally trained by `Kamuuung`. 
+ +## Predicted Entities + +`project result`, `disbursement`, `social`, `financial management`, `procurement`, `environmental`, `policy`, `technical`, `institutional` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_lessons_tagging_606217261_en_5.2.0_3.0_1701220344120.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_lessons_tagging_606217261_en_5.2.0_3.0_1701220344120.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_lessons_tagging_606217261","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_lessons_tagging_606217261","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_kamuuung").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_lessons_tagging_606217261| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Kamuuung/autonlp-lessons_tagging-606217261 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_reading_prediction_172506_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_reading_prediction_172506_en.md new file mode 100644 index 000000000000..cf5a62d632e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_reading_prediction_172506_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from idjotherwise) +author: John Snow Labs +name: roberta_classifier_autonlp_reading_prediction_172506 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-reading_prediction-172506` is an English model originally trained by `idjotherwise`. 
+ +## Predicted Entities + +`target` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_reading_prediction_172506_en_5.2.0_3.0_1701218948920.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_reading_prediction_172506_en_5.2.0_3.0_1701218948920.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_reading_prediction_172506","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_reading_prediction_172506","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_idjotherwise").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_reading_prediction_172506| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/idjotherwise/autonlp-reading_prediction-172506 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_savesome_631818261_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_savesome_631818261_nl.md new file mode 100644 index 000000000000..7499b2d440d3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_savesome_631818261_nl.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Dutch RoBertaForSequenceClassification Cased model (from test1345) +author: John Snow Labs +name: roberta_classifier_autonlp_savesome_631818261 +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-savesome-631818261` is a Dutch model originally trained by `test1345`. 
+ +## Predicted Entities + +`Koeken of Chocolade of Snoep`, `Niet-voeding`, `Dranken`, `Bereidingen of Charcuterie of Vis of Veggie`, `Wijn`, `Onderhoud of Huishouden`, `Zuivel`, `Dieetvoeding of Voedingssupplementen`, `Baby`, `Diepvries`, `Groenten en fruit`, `Lichaamsverzorging of Parfumerie`, `Conserven`, `Huisdieren`, `Kruidenierswaren of Droge voeding`, `Colruyt-beenhouwerij`, `Chips of Borrelhapjes`, `Brood of Ontbijt` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_savesome_631818261_nl_5.2.0_3.0_1701220852473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_savesome_631818261_nl_5.2.0_3.0_1701220852473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_savesome_631818261","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_savesome_631818261","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_savesome_631818261| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|438.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/test1345/autonlp-savesome-631818261 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_traffic_nlp_binary_537215209_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_traffic_nlp_binary_537215209_en.md new file mode 100644 index 000000000000..5aa181c2d908 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autonlp_traffic_nlp_binary_537215209_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from zwang199) +author: John Snow Labs +name: roberta_classifier_autonlp_traffic_nlp_binary_537215209 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp-traffic_nlp_binary-537215209` is an English model originally trained by `zwang199`.
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_traffic_nlp_binary_537215209_en_5.2.0_3.0_1701218334793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autonlp_traffic_nlp_binary_537215209_en_5.2.0_3.0_1701218334793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_traffic_nlp_binary_537215209","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autonlp_traffic_nlp_binary_537215209","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_zwang199").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autonlp_traffic_nlp_binary_537215209| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/zwang199/autonlp-traffic_nlp_binary-537215209 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_app_review_train_1314150168_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_app_review_train_1314150168_en.md new file mode 100644 index 000000000000..2a11e6660594 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_app_review_train_1314150168_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from noob123) +author: John Snow Labs +name: roberta_classifier_autotrain_app_review_train_1314150168 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-app_review_train_roberta-1314150168` is an English model originally trained by `noob123`.
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_app_review_train_1314150168_en_5.2.0_3.0_1701218434516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_app_review_train_1314150168_en_5.2.0_3.0_1701218434516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_app_review_train_1314150168","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_app_review_train_1314150168","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_noob123").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_app_review_train_1314150168| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/noob123/autotrain-app_review_train_roberta-1314150168 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_article_pred_1142742075_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_article_pred_1142742075_en.md new file mode 100644 index 000000000000..843cdcbd07df --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_article_pred_1142742075_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from Johny201) +author: John Snow Labs +name: roberta_classifier_autotrain_article_pred_1142742075 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-article_pred-1142742075` is an English model originally trained by `Johny201`.
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_article_pred_1142742075_en_5.2.0_3.0_1701219084498.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_article_pred_1142742075_en_5.2.0_3.0_1701219084498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_article_pred_1142742075","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_article_pred_1142742075","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_johny201").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_article_pred_1142742075| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Johny201/autotrain-article_pred-1142742075 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_atc2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_atc2_en.md new file mode 100644 index 000000000000..2cc89bbccc7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_atc2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from cjbarrie) +author: John Snow Labs +name: roberta_classifier_autotrain_atc2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-atc2` is an English model originally trained by `cjbarrie`.
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_atc2_en_5.2.0_3.0_1701218588700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_atc2_en_5.2.0_3.0_1701218588700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_atc2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_atc2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_cjbarrie").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_atc2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cjbarrie/autotrain-atc2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248775_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248775_en.md new file mode 100644 index 000000000000..ff88eb236ca2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248775_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_base_imdb_1275248775 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-roberta-base-imdb-1275248775` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248775_en_5.2.0_3.0_1701218921526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248775_en_5.2.0_3.0_1701218921526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248775","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248775","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base_v1.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_base_imdb_1275248775| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-roberta-base-imdb-1275248775 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248776_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248776_en.md new file mode 100644 index 000000000000..a905ba75415f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248776_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_base_imdb_1275248776 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-roberta-base-imdb-1275248776` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248776_en_5.2.0_3.0_1701219397044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248776_en_5.2.0_3.0_1701219397044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248776","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248776","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base_v2.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_base_imdb_1275248776| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-roberta-base-imdb-1275248776 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248777_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248777_en.md new file mode 100644 index 000000000000..02e0c32f93d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248777_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_base_imdb_1275248777 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-roberta-base-imdb-1275248777` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248777_en_5.2.0_3.0_1701221147140.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248777_en_5.2.0_3.0_1701221147140.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248777","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248777","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base_v3.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_base_imdb_1275248777| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-roberta-base-imdb-1275248777 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248778_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248778_en.md new file mode 100644 index 000000000000..1362f2c6fa7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248778_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_base_imdb_1275248778 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-roberta-base-imdb-1275248778` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248778_en_5.2.0_3.0_1701221407342.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248778_en_5.2.0_3.0_1701221407342.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248778","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248778","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base_v4.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_base_imdb_1275248778| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-roberta-base-imdb-1275248778 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248779_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248779_en.md new file mode 100644 index 000000000000..51c091d72937 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_base_imdb_1275248779_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_base_imdb_1275248779 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-roberta-base-imdb-1275248779` is an English model originally trained by `sasha`.
+ +## Predicted Entities + +`pos`, `neg` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248779_en_5.2.0_3.0_1701219270941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_base_imdb_1275248779_en_5.2.0_3.0_1701219270941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248779","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_base_imdb_1275248779","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base_v5.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_base_imdb_1275248779| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-roberta-base-imdb-1275248779 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048986_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048986_en.md new file mode 100644 index 000000000000..c5a77868ebca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048986_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_basetweeteval_1281048986 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-RobertaBaseTweetEval-1281048986` is an English model originally trained by `sasha`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048986_en_5.2.0_3.0_1701221711496.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048986_en_5.2.0_3.0_1701221711496.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048986","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048986","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.base_128d").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_basetweeteval_1281048986| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-RobertaBaseTweetEval-1281048986 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048987_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048987_en.md new file mode 100644 index 000000000000..a2796eed38f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048987_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_basetweeteval_1281048987 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-RobertaBaseTweetEval-1281048987` is an English model originally trained by `sasha`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048987_en_5.2.0_3.0_1701222018874.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048987_en_5.2.0_3.0_1701222018874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048987","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048987","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.base_128d_1281048987.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_basetweeteval_1281048987| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-RobertaBaseTweetEval-1281048987 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048988_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048988_en.md new file mode 100644 index 000000000000..cee7597fc982 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048988_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_basetweeteval_1281048988 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-RobertaBaseTweetEval-1281048988` is an English model originally trained by `sasha`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048988_en_5.2.0_3.0_1701218723955.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048988_en_5.2.0_3.0_1701218723955.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048988","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048988","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.base_128d_1281048988.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_basetweeteval_1281048988| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-RobertaBaseTweetEval-1281048988 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048989_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048989_en.md new file mode 100644 index 000000000000..40d3106d4b78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048989_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_basetweeteval_1281048989 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-RobertaBaseTweetEval-1281048989` is an English model originally trained by `sasha`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048989_en_5.2.0_3.0_1701219276979.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048989_en_5.2.0_3.0_1701219276979.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048989","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048989","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.base_128d_1281048989.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_basetweeteval_1281048989| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-RobertaBaseTweetEval-1281048989 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048990_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048990_en.md new file mode 100644 index 000000000000..ec759ed0631b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_basetweeteval_1281048990_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from sasha) +author: John Snow Labs +name: roberta_classifier_autotrain_basetweeteval_1281048990 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-RobertaBaseTweetEval-1281048990` is an English model originally trained by `sasha`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048990_en_5.2.0_3.0_1701219013235.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_basetweeteval_1281048990_en_5.2.0_3.0_1701219013235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048990","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_basetweeteval_1281048990","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.base_128d_1281048990.by_sasha").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_basetweeteval_1281048990| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sasha/autotrain-RobertaBaseTweetEval-1281048990 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432120_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432120_en.md new file mode 100644 index 000000000000..cbc70dbcabdf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432120_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from qualitydatalab) +author: John Snow Labs +name: roberta_classifier_autotrain_car_review_project_966432120 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-car-review-project-966432120` is an English model originally trained by `qualitydatalab`. 
+ +## Predicted Entities + +`ok`, `great`, `poor` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_car_review_project_966432120_en_5.2.0_3.0_1701222295791.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_car_review_project_966432120_en_5.2.0_3.0_1701222295791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_car_review_project_966432120","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_car_review_project_966432120","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.car_review.v1.by_qualitydatalab").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_car_review_project_966432120| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/qualitydatalab/autotrain-car-review-project-966432120 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432121_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432121_en.md new file mode 100644 index 000000000000..a37f2a73c1d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_car_review_project_966432121_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from qualitydatalab) +author: John Snow Labs +name: roberta_classifier_autotrain_car_review_project_966432121 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-car-review-project-966432121` is an English model originally trained by `qualitydatalab`. 
+ +## Predicted Entities + +`ok`, `great`, `poor` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_car_review_project_966432121_en_5.2.0_3.0_1701219612589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_car_review_project_966432121_en_5.2.0_3.0_1701219612589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_car_review_project_966432121","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_car_review_project_966432121","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.car_review.v2.by_qualitydatalab").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_car_review_project_966432121| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/qualitydatalab/autotrain-car-review-project-966432121 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_commonsense_1_696121179_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_commonsense_1_696121179_en.md new file mode 100644 index 000000000000..b5a7ca7e28e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_commonsense_1_696121179_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from unjustify) +author: John Snow Labs +name: roberta_classifier_autotrain_commonsense_1_696121179 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-commonsense_1-696121179` is an English model originally trained by `unjustify`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_commonsense_1_696121179_en_5.2.0_3.0_1701219626449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_commonsense_1_696121179_en_5.2.0_3.0_1701219626449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_commonsense_1_696121179","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_commonsense_1_696121179","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_unjustify").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_commonsense_1_696121179| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|425.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/unjustify/autotrain-commonsense_1-696121179 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_intentclassificationfilipino_715021714_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_intentclassificationfilipino_715021714_en.md new file mode 100644 index 000000000000..ef3743fa69fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_intentclassificationfilipino_715021714_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from shubh024) +author: John Snow Labs +name: roberta_classifier_autotrain_intentclassificationfilipino_715021714 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-intentclassificationfilipino-715021714` is an English model originally trained by `shubh024`. 
+ +## Predicted Entities + +`Greeting`, `domain_based`, `GoodBye`, `general_questions`, `comparison`, `GreetingResponse`, `CourtesyGoodBye`, `CourtesyGreeting`, `Thanks`, `no_suggestion` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_intentclassificationfilipino_715021714_en_5.2.0_3.0_1701219950154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_intentclassificationfilipino_715021714_en_5.2.0_3.0_1701219950154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_intentclassificationfilipino_715021714","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_intentclassificationfilipino_715021714","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_shubh024").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_intentclassificationfilipino_715021714| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shubh024/autotrain-intentclassificationfilipino-715021714 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mental_health_analysis_752423172_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mental_health_analysis_752423172_en.md new file mode 100644 index 000000000000..50301355e7e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mental_health_analysis_752423172_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from rabiaqayyum) +author: John Snow Labs +name: roberta_classifier_autotrain_mental_health_analysis_752423172 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-mental-health-analysis-752423172` is an English model originally trained by `rabiaqayyum`. 
+ +## Predicted Entities + +`autism`, `BPD`, `bipolar`, `mentalhealth`, `schizophrenia`, `depression`, `Anxiety` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_mental_health_analysis_752423172_en_5.2.0_3.0_1701219617629.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_mental_health_analysis_752423172_en_5.2.0_3.0_1701219617629.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_mental_health_analysis_752423172","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_mental_health_analysis_752423172","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_rabiaqayyum").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_mental_health_analysis_752423172| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/rabiaqayyum/autotrain-mental-health-analysis-752423172 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mlsec_1013333734_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mlsec_1013333734_en.md new file mode 100644 index 000000000000..6c633dc585ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_mlsec_1013333734_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from deepesh0x) +author: John Snow Labs +name: roberta_classifier_autotrain_mlsec_1013333734 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-mlsec-1013333734` is an English model originally trained by `deepesh0x`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_mlsec_1013333734_en_5.2.0_3.0_1701222829283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_mlsec_1013333734_en_5.2.0_3.0_1701222829283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_mlsec_1013333734","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_mlsec_1013333734","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_deepesh0x").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_mlsec_1013333734| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/deepesh0x/autotrain-mlsec-1013333734 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133_en.md new file mode 100644 index 000000000000..781094adb359 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from zainalq7) +author: John Snow Labs +name: roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-NLU_crypto_sentiment_analysis-754123133` is an English model originally trained by `zainalq7`. 
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133_en_5.2.0_3.0_1701220293330.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133_en_5.2.0_3.0_1701220293330.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.crypto_sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_nlu_crypto_sentiment_analysis_754123133| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/zainalq7/autotrain-NLU_crypto_sentiment_analysis-754123133 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_not_interested_1_1213145894_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_not_interested_1_1213145894_en.md new file mode 100644 index 000000000000..5d61c7ed2c70 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_not_interested_1_1213145894_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from aujer) +author: John Snow Labs +name: roberta_classifier_autotrain_not_interested_1_1213145894 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-not_interested_1-1213145894` is an English model originally trained by `aujer`. 
+ +## Predicted Entities + +`COMPANY_FIT`, `REMOTE_POLICY`, `TIMING`, `COMPENSATION`, `ROLE_FIT`, `SENIORITY`, `OTHER`, `VISA` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_not_interested_1_1213145894_en_5.2.0_3.0_1701220607745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_not_interested_1_1213145894_en_5.2.0_3.0_1701220607745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_not_interested_1_1213145894","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_not_interested_1_1213145894","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.not_interested.by_aujer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_not_interested_1_1213145894| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aujer/autotrain-not_interested_1-1213145894 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_pan_977432399_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_pan_977432399_en.md new file mode 100644 index 000000000000..8a0af4589f2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_pan_977432399_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from tayyaba) +author: John Snow Labs +name: roberta_classifier_autotrain_pan_977432399 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-pan-977432399` is an English model originally trained by `tayyaba`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_pan_977432399_en_5.2.0_3.0_1701219956848.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_pan_977432399_en_5.2.0_3.0_1701219956848.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_pan_977432399","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_pan_977432399","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_tayyaba").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_pan_977432399| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/tayyaba/autotrain-pan-977432399 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_qn_classification_1015534072_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_qn_classification_1015534072_en.md new file mode 100644 index 000000000000..d27bbd5814e2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_qn_classification_1015534072_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from lucianpopa) +author: John Snow Labs +name: roberta_classifier_autotrain_qn_classification_1015534072 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-qn-classification-1015534072` is an English model originally trained by `lucianpopa`. 
+ +## Predicted Entities + +`atis_airline`, `atis_city`, `atis_ground_fare`, `atis_flight_time`, `atis_flight`, `atis_capacity`, `atis_quantity`, `atis_flight_no`, `atis_airport`, `atis_distance`, `atis_flight#atis_airfare`, `atis_aircraft`, `atis_airfare`, `atis_abbreviation`, `atis_ground_service` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_qn_classification_1015534072_en_5.2.0_3.0_1701220223445.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_qn_classification_1015534072_en_5.2.0_3.0_1701220223445.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_qn_classification_1015534072","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_qn_classification_1015534072","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.qn.roberta.by_lucianpopa").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_qn_classification_1015534072| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lucianpopa/autotrain-qn-classification-1015534072 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_688020754_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_688020754_en.md new file mode 100644 index 000000000000..3cf236a49671 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_688020754_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from vlsb) +author: John Snow Labs +name: roberta_classifier_autotrain_security_texts_classification_688020754 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-security-texts-classification-roberta-688020754` is an English model originally trained by `vlsb`. 
+ +## Predicted Entities + +`irrelevant`, `relevant` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_security_texts_classification_688020754_en_5.2.0_3.0_1701220929000.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_security_texts_classification_688020754_en_5.2.0_3.0_1701220929000.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_security_texts_classification_688020754","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_security_texts_classification_688020754","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.security_text.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_security_texts_classification_688020754| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/vlsb/autotrain-security-texts-classification-roberta-688020754 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_distil_688220764_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_distil_688220764_en.md new file mode 100644 index 000000000000..2061aa0d97cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_security_texts_classification_distil_688220764_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from vlsb) +author: John Snow Labs +name: roberta_classifier_autotrain_security_texts_classification_distil_688220764 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-security-texts-classification-distilroberta-688220764` is an English model originally trained by `vlsb`. 
+ +## Predicted Entities + +`irrelevant`, `relevant` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_security_texts_classification_distil_688220764_en_5.2.0_3.0_1701220222735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_security_texts_classification_distil_688220764_en_5.2.0_3.0_1701220222735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_security_texts_classification_distil_688220764","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_security_texts_classification_distil_688220764","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.security_text.distilled").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_security_texts_classification_distil_688220764| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/vlsb/autotrain-security-texts-classification-distilroberta-688220764 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_smm4h_large_clean_874027878_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_smm4h_large_clean_874027878_en.md new file mode 100644 index 000000000000..555232f2155e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_smm4h_large_clean_874027878_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from Amalq) +author: John Snow Labs +name: roberta_classifier_autotrain_smm4h_large_clean_874027878 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-smm4h_large_roberta_clean-874027878` is an English model originally trained by `Amalq`.
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_smm4h_large_clean_874027878_en_5.2.0_3.0_1701220882179.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_smm4h_large_clean_874027878_en_5.2.0_3.0_1701220882179.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_smm4h_large_clean_874027878","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_smm4h_large_clean_874027878","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_4h").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_smm4h_large_clean_874027878| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Amalq/autotrain-smm4h_large_roberta_clean-874027878 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_test_project_879428192_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_test_project_879428192_en.md new file mode 100644 index 000000000000..c61b0ff3b05d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_test_project_879428192_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from nthanhha26) +author: John Snow Labs +name: roberta_classifier_autotrain_test_project_879428192 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain-test-project-879428192` is an English model originally trained by `nthanhha26`.
+ +## Predicted Entities + +`hate-speech`, `no-hate-speech` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_test_project_879428192_en_5.2.0_3.0_1701223121947.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_test_project_879428192_en_5.2.0_3.0_1701223121947.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_test_project_879428192","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_test_project_879428192","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_nthanhha26").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_test_project_879428192| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nthanhha26/autotrain-test-project-879428192 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469_en.md new file mode 100644 index 000000000000..81d198774636 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469 RoBertaForSequenceClassification from Siddish +author: John Snow Labs +name: roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469` is an English model originally trained by Siddish.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469_en_5.2.0_3.0_1701221321407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469_en_5.2.0_3.0_1701221321407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_autotrain_yes_oriya_norwegian_on_circa_1009033469| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Siddish/autotrain-yes-or-no-classifier-on-circa-1009033469 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_banking77_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_banking77_en.md new file mode 100644 index 000000000000..71a557df6299 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_banking77_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_banking77 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `RoBERTa-Banking77` is an English model originally trained by `philschmid`.
+ +## Predicted Entities + +`card_payment_not_recognised`, `activate_my_card`, `exchange_charge`, `getting_virtual_card`, `wrong_amount_of_cash_received`, `card_delivery_estimate`, `unable_to_verify_identity`, `cash_withdrawal_charge`, `get_physical_card`, `wrong_exchange_rate_for_cash_withdrawal`, `declined_cash_withdrawal`, `top_up_by_card_charge`, `card_not_working`, `card_swallowed`, `card_payment_wrong_exchange_rate`, `atm_support`, `getting_spare_card`, `card_acceptance`, `card_linking`, `request_refund`, `reverted_card_payment?`, `top_up_failed`, `verify_my_identity`, `exchange_rate`, `virtual_card_not_working`, `country_support`, `disposable_card_limits`, `card_arrival`, `supported_cards_and_currencies`, `top_up_reverted`, `apple_pay_or_google_pay`, `transaction_charged_twice`, `Refund_not_showing_up`, `balance_not_updated_after_cheque_or_cash_deposit`, `lost_or_stolen_phone`, `order_physical_card`, `declined_card_payment`, `cash_withdrawal_not_recognised`, `edit_personal_details`, `contactless_not_working`, `change_pin`, `cancel_transfer`, `extra_charge_on_statement`, `balance_not_updated_after_bank_transfer`, `lost_or_stolen_card`, `failed_transfer`, `verify_source_of_funds`, `verify_top_up`, `pending_card_payment`, `transfer_timing`, `why_verify_identity`, `card_about_to_expire`, `compromised_card`, `direct_debit_payment_not_recognised`, `transfer_into_account`, `pending_top_up`, `top_up_limits`, `top_up_by_cash_or_cheque`, `pin_blocked`, `visa_or_mastercard`, `declined_transfer`, `get_disposable_virtual_card`, `automatic_top_up`, `top_up_by_bank_transfer_charge`, `terminate_account`, `passcode_forgotten`, `beneficiary_not_allowed`, `receiving_money`, `fiat_currency_support`, `topping_up_by_card`, `pending_transfer`, `exchange_via_app`, `transfer_fee_charged`, `pending_cash_withdrawal`, `transfer_not_received_by_recipient`, `age_limit`, `card_payment_fee_charged` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_banking77_en_5.2.0_3.0_1701223341053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_banking77_en_5.2.0_3.0_1701223341053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_banking77","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_banking77","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.banking.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_banking77| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/RoBERTa-Banking77 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=BANKING77 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_adr_smm4h2022_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_adr_smm4h2022_en.md new file mode 100644 index 000000000000..d340cb0e34bd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_adr_smm4h2022_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from orestxherija) +author: John Snow Labs +name: roberta_classifier_base_adr_smm4h2022 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-adr-smm4h2022` is an English model originally trained by `orestxherija`.
+ +## Predicted Entities + +`noADE`, `ADE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_adr_smm4h2022_en_5.2.0_3.0_1701220551970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_adr_smm4h2022_en_5.2.0_3.0_1701220551970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_adr_smm4h2022","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_adr_smm4h2022","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base_4h").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_adr_smm4h2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/orestxherija/roberta-base-adr-smm4h2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi_es.md new file mode 100644 index 000000000000..f7379954029c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RoBertaForSequenceClassification Base Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-bne-finetuned-amazon_reviews_multi-finetuned-amazon_reviews_multi` is a Spanish model originally trained by `lewtun`. 
+ +## Predicted Entities + +`NEGATIVO`, `POSITIVO` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi_es_5.2.0_3.0_1701220877532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi_es_5.2.0_3.0_1701220877532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.amazon.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_bne_finetuned_amazon_reviews_multi_finetuned_amazon_reviews_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|446.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-base-bne-finetuned-amazon_reviews_multi-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish_es.md new file mode 100644 index 000000000000..3437cff3a39f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RoBertaForSequenceClassification Base Cased model (from JonatanGk) +author: John Snow Labs +name: roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-bne-finetuned-hate-speech-offensive-spanish` is a Spanish model originally trained by `JonatanGk`. 
+ +## Predicted Entities + +`OFFENSIVE`, `HATE_SPEECH`, `NEITHER` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish_es_5.2.0_3.0_1701223925003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish_es_5.2.0_3.0_1701223925003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.hate.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_bne_finetuned_hate_speech_offensive_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|444.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JonatanGk/roberta-base-bne-finetuned-hate-speech-offensive-spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_catalonia_independence_detector_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_catalonia_independence_detector_ca.md new file mode 100644 index 000000000000..8911874a52c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_catalonia_independence_detector_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_finetuned_catalonia_independence_detector RoBertaForSequenceClassification from JonatanGk +author: John Snow Labs +name: roberta_classifier_base_catalan_finetuned_catalonia_independence_detector +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_finetuned_catalonia_independence_detector` is a Catalan, Valencian model originally trained by JonatanGk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_catalonia_independence_detector_ca_5.2.0_3.0_1701221111211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_catalonia_independence_detector_ca_5.2.0_3.0_1701221111211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_catalonia_independence_detector","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_catalonia_independence_detector","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_finetuned_catalonia_independence_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|448.9 MB| + +## References + +https://huggingface.co/JonatanGk/roberta-base-ca-finetuned-catalonia-independence-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_cyberbullying_catalan_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_cyberbullying_catalan_ca.md new file mode 100644 index 000000000000..198c359eca14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_cyberbullying_catalan_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_finetuned_cyberbullying_catalan RoBertaForSequenceClassification from JonatanGk +author: John Snow Labs +name: roberta_classifier_base_catalan_finetuned_cyberbullying_catalan +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_finetuned_cyberbullying_catalan` is a Catalan, Valencian model originally trained by JonatanGk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_cyberbullying_catalan_ca_5.2.0_3.0_1701219686890.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_cyberbullying_catalan_ca_5.2.0_3.0_1701219686890.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_cyberbullying_catalan","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_cyberbullying_catalan","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_finetuned_cyberbullying_catalan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|469.0 MB| + +## References + +https://huggingface.co/JonatanGk/roberta-base-ca-finetuned-cyberbullying-catalan \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_tecla_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_tecla_ca.md new file mode 100644 index 000000000000..bc6f82cb88b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_finetuned_tecla_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_finetuned_tecla RoBertaForSequenceClassification from JonatanGk +author: John Snow Labs +name: roberta_classifier_base_catalan_finetuned_tecla +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_finetuned_tecla` is a Catalan, Valencian model originally trained by JonatanGk. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_tecla_ca_5.2.0_3.0_1701221542964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_finetuned_tecla_ca_5.2.0_3.0_1701221542964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_tecla","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_finetuned_tecla","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_finetuned_tecla| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|470.7 MB| + +## References + +https://huggingface.co/JonatanGk/roberta-base-ca-finetuned-tecla \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_sts_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_sts_cased_ca.md new file mode 100644 index 000000000000..7c987438ac62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_sts_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_sts_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_sts_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_sts_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_sts_cased_ca_5.2.0_3.0_1701219923828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_sts_cased_ca_5.2.0_3.0_1701219923828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_sts_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_sts_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_sts_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|433.4 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-cased-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_tc_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_tc_cased_ca.md new file mode 100644 index 000000000000..46d2831964c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_tc_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_tc_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_tc_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_tc_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_tc_cased_ca_5.2.0_3.0_1701220134008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_tc_cased_ca_5.2.0_3.0_1701220134008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_tc_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_tc_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_tc_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|469.8 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-cased-tc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_telugu_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_telugu_cased_ca.md new file mode 100644 index 000000000000..0da03a3edc62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_telugu_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_telugu_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_telugu_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_telugu_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_telugu_cased_ca_5.2.0_3.0_1701220328516.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_telugu_cased_ca_5.2.0_3.0_1701220328516.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_telugu_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_telugu_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_telugu_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|449.2 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-cased-te \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_sts_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_sts_cased_ca.md new file mode 100644 index 000000000000..c01421dbd6ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_sts_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_v2_sts_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_v2_sts_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_v2_sts_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_sts_cased_ca_5.2.0_3.0_1701220553254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_sts_cased_ca_5.2.0_3.0_1701220553254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_sts_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_sts_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_v2_sts_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|430.4 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-v2-cased-sts \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_tc_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_tc_cased_ca.md new file mode 100644 index 000000000000..0f6f8470388b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_tc_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_v2_tc_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_v2_tc_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_v2_tc_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_tc_cased_ca_5.2.0_3.0_1701220815607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_tc_cased_ca_5.2.0_3.0_1701220815607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_tc_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_tc_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_v2_tc_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|464.9 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-v2-cased-tc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_telugu_cased_ca.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_telugu_cased_ca.md new file mode 100644 index 000000000000..28582130ad41 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_catalan_v2_telugu_cased_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_classifier_base_catalan_v2_telugu_cased RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_classifier_base_catalan_v2_telugu_cased +date: 2023-11-29 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_base_catalan_v2_telugu_cased` is a Catalan, Valencian model originally trained by projecte-aina. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_telugu_cased_ca_5.2.0_3.0_1701221318835.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_catalan_v2_telugu_cased_ca_5.2.0_3.0_1701221318835.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_telugu_cased","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_catalan_v2_telugu_cased","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_catalan_v2_telugu_cased| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|445.9 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-v2-cased-te \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_clickbait_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_clickbait_en.md new file mode 100644 index 000000000000..a9a69a09bf5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_clickbait_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from Stremie) +author: John Snow Labs +name: roberta_classifier_base_clickbait +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-clickbait` is an English model originally trained by `Stremie`. 
+ +## Predicted Entities + +`Not Clickbait`, `Clickbait` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_clickbait_en_5.2.0_3.0_1701221824408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_clickbait_en_5.2.0_3.0_1701221824408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_clickbait","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_clickbait","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base.by_stremie").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_clickbait| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Stremie/roberta-base-clickbait +- https://webis.de/data/webis-clickbait-17.html \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_cola_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_cola_en.md new file mode 100644 index 000000000000..b295902916aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_cola_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from JeremiahZ) +author: John Snow Labs +name: roberta_classifier_base_cola +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-cola` is an English model originally trained by `JeremiahZ`. 
+ +## Predicted Entities + +`acceptable`, `unacceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_cola_en_5.2.0_3.0_1701224215832.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_cola_en_5.2.0_3.0_1701224215832.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_cola","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_cola","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue_cola1.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JeremiahZ/roberta-base-cola +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+COLA \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_emotion_en.md new file mode 100644 index 000000000000..44225bea9a19 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_emotion_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from bhadresh-savani) +author: John Snow Labs +name: roberta_classifier_base_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-emotion` is an English model originally trained by `bhadresh-savani`.
+ +## Predicted Entities + +`surprise`, `joy`, `anger`, `fear`, `love`, `sadness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_emotion_en_5.2.0_3.0_1701221107370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_emotion_en_5.2.0_3.0_1701221107370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.emotion.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bhadresh-savani/roberta-base-emotion +- https://arxiv.org/abs/1907.11692 +- https://github.com/bhadreshpsavani/ExploringSentimentalAnalysis/blob/main/SentimentalAnalysisWithDistilbert.ipynb +- https://learning.oreilly.com/library/view/natural-language-processing/9781098103231/ +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_formality_ranker_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_formality_ranker_en.md new file mode 100644 index 000000000000..84500b3ddda8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_formality_ranker_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from SkolkovoInstitute) +author: John Snow Labs +name: roberta_classifier_base_formality_ranker +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-formality-ranker` is an English model originally trained by `SkolkovoInstitute`.
+ +## Predicted Entities + +`formal`, `informal` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_formality_ranker_en_5.2.0_3.0_1701221635361.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_formality_ranker_en_5.2.0_3.0_1701221635361.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_formality_ranker","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_formality_ranker","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_formality_ranker| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|454.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/SkolkovoInstitute/roberta-base-formality-ranker +- https://github.com/raosudha89/GYAFC-corpus +- https://aclanthology.org/N18-1012 +- http://www.seas.upenn.edu/~nlp/resources/formality-corpus.tgz +- https://aclanthology.org/Q16-1005 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_frenk_hate_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_frenk_hate_en.md new file mode 100644 index 000000000000..a1abd8b33f18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_frenk_hate_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from classla) +author: John Snow Labs +name: roberta_classifier_base_frenk_hate +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-frenk-hate` is an English model originally trained by `classla`.
+ +## Predicted Entities + +`Offensive`, `Acceptable` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_frenk_hate_en_5.2.0_3.0_1701224487948.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_frenk_hate_en_5.2.0_3.0_1701224487948.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_frenk_hate","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_frenk_hate","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_frenk_hate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/classla/roberta-base-frenk-hate +- https://www.clarin.si/repository/xmlui/handle/11356/1433 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_imdb_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_imdb_en.md new file mode 100644 index 000000000000..11d8907b15bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_imdb_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from aychang) +author: John Snow Labs +name: roberta_classifier_base_imdb +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-imdb` is an English model originally trained by `aychang`.
+ +## Predicted Entities + +`neg`, `pos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_imdb_en_5.2.0_3.0_1701221399321.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_imdb_en_5.2.0_3.0_1701221399321.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_imdb","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_imdb","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.imdb.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_imdb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aychang/roberta-base-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_en.md new file mode 100644 index 000000000000..11eea556f553 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from ayameRushia) +author: John Snow Labs +name: roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-indonesian-1.5G-sentiment-analysis-smsa` is an English model originally trained by `ayameRushia`.
+ +## Predicted Entities + +`POSITIVE`, `NEUTRAL`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_en_5.2.0_3.0_1701221698932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa_en_5.2.0_3.0_1701221698932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_indonesian_1.5g_sentiment_analysis_smsa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|473.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ayameRushia/roberta-base-indonesian-1.5G-sentiment-analysis-smsa +- https://paperswithcode.com/sota?task=Text+Classification&dataset=indonlu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_sentiment_analysis_smsa_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_sentiment_analysis_smsa_id.md new file mode 100644 index 000000000000..30ef1b073218 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_indonesian_sentiment_analysis_smsa_id.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Indonesian RoBertaForSequenceClassification Base Cased model (from ayameRushia) +author: John Snow Labs +name: roberta_classifier_base_indonesian_sentiment_analysis_smsa +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-indonesian-sentiment-analysis-smsa` is an Indonesian model originally trained by `ayameRushia`.
+ +## Predicted Entities + +`POSITIVE`, `NEUTRAL`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_indonesian_sentiment_analysis_smsa_id_5.2.0_3.0_1701222097269.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_indonesian_sentiment_analysis_smsa_id_5.2.0_3.0_1701222097269.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_indonesian_sentiment_analysis_smsa","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_indonesian_sentiment_analysis_smsa","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.sentiment.base.by_ayameRushia").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_indonesian_sentiment_analysis_smsa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ayameRushia/roberta-base-indonesian-sentiment-analysis-smsa +- https://paperswithcode.com/sota?task=Text+Classification&dataset=indonlu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_stars_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_stars_en.md new file mode 100644 index 000000000000..eda388d38a7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_stars_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from onewithnickelcoins) +author: John Snow Labs +name: roberta_classifier_base_stars +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-stars` is an English model originally trained by `onewithnickelcoins`.
+ +## Predicted Entities + +`4 stars`, `3 stars`, `1 star`, `2 stars` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_stars_en_5.2.0_3.0_1701224765003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_stars_en_5.2.0_3.0_1701224765003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_stars","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_stars","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base.by_onewithnickelcoins").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_stars| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/onewithnickelcoins/roberta-base-stars \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_toxicity_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_toxicity_en.md new file mode 100644 index 000000000000..bfaaa0635a8a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_base_toxicity_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from mohsenfayyaz) +author: John Snow Labs +name: roberta_classifier_base_toxicity +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-toxicity` is an English model originally trained by `mohsenfayyaz`.
+ +## Predicted Entities + +`Toxic`, `Non-Toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_toxicity_en_5.2.0_3.0_1701222396963.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_base_toxicity_en_5.2.0_3.0_1701222396963.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_toxicity","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_base_toxicity","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base.by_mohsenfayyaz").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_base_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mohsenfayyaz/roberta-base-toxicity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_sentiment_analysis_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_sentiment_analysis_es.md new file mode 100644 index 000000000000..f2f0893e99e0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_sentiment_analysis_es.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Base Cased model (from edumunozsala) +author: John Snow Labs +name: roberta_classifier_bertin_base_sentiment_analysis +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin_base_sentiment_analysis_es` is a Spanish model originally trained by `edumunozsala`. 
+ +## Predicted Entities + +`Negativo`, `Positivo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_sentiment_analysis_es_5.2.0_3.0_1701222001626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_sentiment_analysis_es_5.2.0_3.0_1701222001626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_sentiment_analysis","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_sentiment_analysis","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.sentiment.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_bertin_base_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|455.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/edumunozsala/bertin_base_sentiment_analysis_es +- http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6403 +- https://github.com/edumunozsala +- https://paperswithcode.com/sota?task=Sentiment+Analysis&dataset=IMDb+Reviews+in+Spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1_es.md new file mode 100644 index 000000000000..4458aa02070b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1 RoBertaForSequenceClassification from maxpe +author: John Snow Labs +name: roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1 +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1` is 
a Castilian, Spanish model originally trained by maxpe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1_es_5.2.0_3.0_1701222594964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1_es_5.2.0_3.0_1701222594964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_bertin_base_spanish_semitic_languages_eval_2018_task_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|464.5 MB| + +## References + +https://huggingface.co/maxpe/bertin-roberta-base-spanish_sem_eval_2018_task_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_xnli_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_xnli_es.md new file mode 100644 index 000000000000..30c02d06f807 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_base_xnli_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Base Cased model (from bertin-project) +author: John Snow Labs +name: roberta_classifier_bertin_base_xnli +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin-base-xnli-es` is a Spanish model originally trained by `bertin-project`.
+ +## Predicted Entities + +`neutral`, `contradiction`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_xnli_es_5.2.0_3.0_1701221939355.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_base_xnli_es_5.2.0_3.0_1701221939355.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_xnli","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_base_xnli","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.xnli.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_bertin_base_xnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|452.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bertin-project/bertin-base-xnli-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_exist22_task1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_exist22_task1_en.md new file mode 100644 index 000000000000..d664ec465ae8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bertin_exist22_task1_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from avacaondata) +author: John Snow Labs +name: roberta_classifier_bertin_exist22_task1 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin-exist22-task1` is an English model originally trained by `avacaondata`.
+ +## Predicted Entities + +`non-sexist`, `sexist` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_exist22_task1_en_5.2.0_3.0_1701222246919.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_bertin_exist22_task1_en_5.2.0_3.0_1701222246919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_exist22_task1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bertin_exist22_task1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_avacaondata").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_bertin_exist22_task1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/avacaondata/bertin-exist22-task1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bne_sentiment_analysis_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bne_sentiment_analysis_es.md new file mode 100644 index 000000000000..5aac1fd7a09c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_bne_sentiment_analysis_es.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Cased model (from edumunozsala) +author: John Snow Labs +name: roberta_classifier_bne_sentiment_analysis +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_bne_sentiment_analysis_es` is a Spanish model originally trained by `edumunozsala`. 
+ +## Predicted Entities + +`Positivo`, `Negativo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_bne_sentiment_analysis_es_5.2.0_3.0_1701222892744.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_bne_sentiment_analysis_es_5.2.0_3.0_1701222892744.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bne_sentiment_analysis","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_bne_sentiment_analysis","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_bne_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|459.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/edumunozsala/roberta_bne_sentiment_analysis_es +- http://www.bne.es/en/Inicio/index.html +- https://arxiv.org/abs/2107.07253 +- https://github.com/edumunozsala +- https://paperswithcode.com/sota?task=Sentiment+Analysis&dataset=IMDb+Reviews+in+Spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_burmese_awesome_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_burmese_awesome_model_en.md new file mode 100644 index 000000000000..daf220848907 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_burmese_awesome_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_burmese_awesome_model RoBertaForSequenceClassification from Anthos23 +author: John Snow Labs +name: roberta_classifier_burmese_awesome_model +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_burmese_awesome_model` is an English model originally trained by Anthos23.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_burmese_awesome_model_en_5.2.0_3.0_1701236346778.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_burmese_awesome_model_en_5.2.0_3.0_1701236346778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_burmese_awesome_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_burmese_awesome_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_burmese_awesome_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/Anthos23/my-awesome-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_2_en.md new file mode 100644 index 000000000000..496468bac7a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from crcb) +author: John Snow Labs +name: roberta_classifier_carer_2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `carer_2` is an English model originally trained by `crcb`. + +## Predicted Entities + +`sadness`, `fear`, `surprise`, `anger` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_carer_2_en_5.2.0_3.0_1701225064031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_carer_2_en_5.2.0_3.0_1701225064031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_carer_2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_carer_2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.carer.roberta.v2.by_crcb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_carer_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/crcb/carer_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_nepal_bhasa_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_nepal_bhasa_en.md new file mode 100644 index 000000000000..efaea3b88215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_carer_nepal_bhasa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_carer_nepal_bhasa RoBertaForSequenceClassification from crcb +author: John Snow Labs +name: roberta_classifier_carer_nepal_bhasa +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_carer_nepal_bhasa` is an English model originally trained by crcb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_carer_nepal_bhasa_en_5.2.0_3.0_1701225328374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_carer_nepal_bhasa_en_5.2.0_3.0_1701225328374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_carer_nepal_bhasa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_carer_nepal_bhasa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_carer_nepal_bhasa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.2 MB| + +## References + +https://huggingface.co/crcb/carer_new \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_clasificacion_sentimientos_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_clasificacion_sentimientos_es.md new file mode 100644 index 000000000000..62cfeca9d350 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_clasificacion_sentimientos_es.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Spanish RoBertaForSequenceClassification Cased model (from alexhf90) +author: John Snow Labs +name: roberta_classifier_clasificacion_sentimientos +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Clasificacion_sentimientos` is a Spanish model originally trained by `alexhf90`.
+ +## Predicted Entities + +`Comentario_Positivo`, `Comentario_Negativo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_clasificacion_sentimientos_es_5.2.0_3.0_1701221450603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_clasificacion_sentimientos_es_5.2.0_3.0_1701221450603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_clasificacion_sentimientos","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_clasificacion_sentimientos","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.by_alexhf90").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_clasificacion_sentimientos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|454.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/alexhf90/Clasificacion_sentimientos +- https://www.filmaffinity.com/es/main.html +- https://www.kaggle.com/ricardomoya/criticas-peliculas-filmaffinity-en-espaniol/code \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_concreteness_english_distil_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_concreteness_english_distil_base_en.md new file mode 100644 index 000000000000..21b53c92f82a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_concreteness_english_distil_base_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from j-hartmann) +author: John Snow Labs +name: roberta_classifier_concreteness_english_distil_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `concreteness-english-distilroberta-base` is an English model originally trained by `j-hartmann`.
+ +## Predicted Entities + +`concrete`, `abstract` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_concreteness_english_distil_base_en_5.2.0_3.0_1701222527163.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_concreteness_english_distil_base_en_5.2.0_3.0_1701222527163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_concreteness_english_distil_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_concreteness_english_distil_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_base.by_j_hartmann").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_concreteness_english_distil_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/j-hartmann/concreteness-english-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_covid_policy_21_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_covid_policy_21_en.md new file mode 100644 index 000000000000..e74f1887b39c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_covid_policy_21_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from MoritzLaurer) +author: John Snow Labs +name: roberta_classifier_covid_policy_21 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid-policy-roberta-21` is an English model originally trained by `MoritzLaurer`.
+ +## Predicted Entities + +`Quarantine`, `Health Monitoring`, `Lockdown`, `Restrictions of Mass Gatherings`, `Health Testing`, `Public Awareness Measures`, `Closure and Regulation of Schools`, `Restriction and Regulation of Businesses`, `COVID-19 Vaccines`, `Other Policy Not Listed Above`, `Internal Border Restrictions`, `Restriction and Regulation of Government Services`, `Curfew`, `Social Distancing`, `Health Resources`, `External Border Restrictions`, `Anti-Disinformation Measures`, `Hygiene`, `Other`, `Declaration of Emergency` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_covid_policy_21_en_5.2.0_3.0_1701222254481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_covid_policy_21_en_5.2.0_3.0_1701222254481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_covid_policy_21","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_covid_policy_21","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.covid.by_moritzlaurer").predict("""PUT YOUR STRING HERE""") +``` +
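The Download and Copy S3 URI links above appear to follow one naming pattern built from the front-matter fields: `name`, `language`, the Spark NLP version from `edition`, `spark_version`, and a trailing number. The sketch below reassembles both links for this model from those fields. The helper names `archive_name` and `upload_date` are hypothetical, and reading the trailing number as an epoch timestamp in milliseconds is an assumption, inferred only from the fact that it decodes to the same day as the card's `date` field.

```python
from datetime import datetime, timezone

# Prefixes copied from the Download / Copy S3 URI links on this card.
S3_HTTP_PREFIX = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
S3_URI_PREFIX = "s3://auxdata.johnsnowlabs.com/public/models/"


def archive_name(name, lang, edition, spark_version, timestamp_ms):
    """Join the front-matter fields into the archive file name (hypothetical helper)."""
    return f"{name}_{lang}_{edition}_{spark_version}_{timestamp_ms}.zip"


def upload_date(timestamp_ms):
    """Decode the trailing number as epoch milliseconds in UTC (an assumption)."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc).date().isoformat()


# Values taken from the roberta_classifier_covid_policy_21 card.
zip_name = archive_name("roberta_classifier_covid_policy_21", "en", "5.2.0", "3.0", 1701222254481)
download_url = S3_HTTP_PREFIX + zip_name
s3_uri = S3_URI_PREFIX + zip_name

print(download_url)
print(s3_uri)
print(upload_date(1701222254481))
```

If the pattern holds, the same helper can be pointed at any other card in this batch by swapping in that card's `name` and trailing number. It is only a convenience for checking links and is not part of the Spark NLP API.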
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_covid_policy_21| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/MoritzLaurer/covid-policy-roberta-21 +- https://www.ceps.eu/ceps-staff/moritz-laurer/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cryptobert_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cryptobert_en.md new file mode 100644 index 000000000000..6f8faf69f527 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cryptobert_en.md @@ -0,0 +1,113 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from ElKulako) +author: John Snow Labs +name: roberta_classifier_cryptobert +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cryptobert` is an English model originally trained by `ElKulako`.
+ +## Predicted Entities + +`Bearish`, `Neutral`, `Bullish` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_cryptobert_en_5.2.0_3.0_1701221748872.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_cryptobert_en_5.2.0_3.0_1701221748872.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_cryptobert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_cryptobert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.crypto.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_cryptobert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ElKulako/cryptobert +- https://t.me/binanceexchange +- https://t.me/BittrexGlobalEnglish +- https://t.me/huobiglobalofficial +- https://t.me/Kucoin_Exchange +- https://t.me/OKExOfficial_English +- https://www.kaggle.com/datasets/aagghh/crypto-telegram-groups +- https://www.kaggle.com/datasets/paul92s/bitcoin-tweets-14m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ctrl44_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ctrl44_en.md new file mode 100644 index 000000000000..01321120a0ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ctrl44_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from liamcripwell) +author: John Snow Labs +name: roberta_classifier_ctrl44 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ctrl44-clf` is an English model originally trained by `liamcripwell`.
+ +## Predicted Entities + +`syntax-split`, `discourse-split`, `rephrase`, `ignore` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_ctrl44_en_5.2.0_3.0_1701222013011.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_ctrl44_en_5.2.0_3.0_1701222013011.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_ctrl44","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_ctrl44","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.ctrl44.roberta.by_liamcripwell").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_ctrl44| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/liamcripwell/ctrl44-clf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cuad_contract_type_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cuad_contract_type_en.md new file mode 100644 index 000000000000..f87f4b6a6f9b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_cuad_contract_type_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from agnihotri) +author: John Snow Labs +name: roberta_classifier_cuad_contract_type +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cuad_contract_type` is an English model originally trained by `agnihotri`.
+ +## Predicted Entities + +`Endorsement Agreement`, `Affiliate Agreement`, `Joint Venture`, `Supply`, `Development`, `Non_Compete_Non_Solicit`, `Reseller`, `Distributor`, `Promotion`, `Franchise`, `Co_Branding`, `Collaboration`, `Hosting`, `Sponsorship`, `Service`, `License_Agreements`, `Outsourcing`, `Transportation`, `Marketing`, `Manufacturing`, `Maintenance`, `IP`, `Consulting Agreements`, `Agency Agreements`, `Strategic Alliance` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_cuad_contract_type_en_5.2.0_3.0_1701223463215.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_cuad_contract_type_en_5.2.0_3.0_1701223463215.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_cuad_contract_type","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_cuad_contract_type","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.cuad.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_cuad_contract_type| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/agnihotri/cuad_contract_type \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_dbounds_large_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_dbounds_large_finetuned_clinc_en.md new file mode 100644 index 000000000000..b056292ce350 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_dbounds_large_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from dbounds) +author: John Snow Labs +name: roberta_classifier_dbounds_large_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc` is an English model originally trained by `dbounds`.
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_dbounds_large_finetuned_clinc_en_5.2.0_3.0_1701222618408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_dbounds_large_finetuned_clinc_en_5.2.0_3.0_1701222618408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_dbounds_large_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_dbounds_large_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_finetuned.by_dbounds").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_dbounds_large_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/dbounds/roberta-large-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_depression_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_depression_detection_en.md new file mode 100644 index 000000000000..ad886b1ecc86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_depression_detection_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from paulagarciaserrano) +author: John Snow Labs +name: roberta_classifier_depression_detection +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-depression-detection` is an English model originally trained by `paulagarciaserrano`.
+ +## Predicted Entities + +`moderate`, `severe`, `not depression` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_depression_detection_en_5.2.0_3.0_1701222902734.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_depression_detection_en_5.2.0_3.0_1701222902734.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_depression_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_depression_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_paulagarciaserrano").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_depression_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/paulagarciaserrano/roberta-depression-detection +- https://competitions.codalab.org/competitions/36410#learn_the_details \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_detect_acoso_twitter_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_detect_acoso_twitter_es.md new file mode 100644 index 000000000000..7e461470cad9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_detect_acoso_twitter_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Cased model (from hackathon-pln-es) +author: John Snow Labs +name: roberta_classifier_detect_acoso_twitter +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Detect-Acoso-Twitter-Es` is a Spanish model originally trained by `hackathon-pln-es`. 
+ +## Predicted Entities + +`acoso`, `No acoso` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_detect_acoso_twitter_es_5.2.0_3.0_1701223129053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_detect_acoso_twitter_es_5.2.0_3.0_1701223129053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_detect_acoso_twitter","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, roberta_classifier]) + +data = spark.createDataFrame([["I love you!"], ["I feel lucky to be here."]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_detect_acoso_twitter","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, roberta_classifier)) + +val data = Seq("I love you!").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.twitter.").predict("""I feel lucky to be here.""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_detect_acoso_twitter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|308.9 MB| +|Case sensitive:|true| +|Max sentence length:|128| + +## References + +References + +- https://huggingface.co/hackathon-pln-es/Detect-Acoso-Twitter-Es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_discord_nft_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_discord_nft_sentiment_en.md new file mode 100644 index 000000000000..c59880fbafa4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_discord_nft_sentiment_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from BVK97) +author: John Snow Labs +name: roberta_classifier_discord_nft_sentiment +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Discord-NFT-Sentiment` is an English model originally trained by `BVK97`.
+ +## Predicted Entities + +`Neutral`, `Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_discord_nft_sentiment_en_5.2.0_3.0_1701222539113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_discord_nft_sentiment_en_5.2.0_3.0_1701222539113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_discord_nft_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_discord_nft_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.cord19_sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_discord_nft_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BVK97/Discord-NFT-Sentiment +- https://github.com/BVK23/Discord-NLP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_finetuned_fake_news_english_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_finetuned_fake_news_english_en.md new file mode 100644 index 000000000000..cebc6e3330d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_finetuned_fake_news_english_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from jaygala24) +author: John Snow Labs +name: roberta_classifier_distil_base_finetuned_fake_news_english +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-base-finetuned-fake-news-english` is an English model originally trained by `jaygala24`.
+ +## Predicted Entities + +`real`, `fake` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_base_finetuned_fake_news_english_en_5.2.0_3.0_1701225651657.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_base_finetuned_fake_news_english_en_5.2.0_3.0_1701225651657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_base_finetuned_fake_news_english","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_base_finetuned_fake_news_english","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news.distilled_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_base_finetuned_fake_news_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jaygala24/distilroberta-base-finetuned-fake-news-english +- https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset +- https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_sst2_distilled_en.md new file mode 100644 index 000000000000..00e507f7f7a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_base_sst2_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from azizbarank) +author: John Snow Labs +name: roberta_classifier_distil_base_sst2_distilled +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-base-sst2-distilled` is an English model originally trained by `azizbarank`.
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_base_sst2_distilled_en_5.2.0_3.0_1701222810133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_base_sst2_distilled_en_5.2.0_3.0_1701222810133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_base_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_base_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_base.by_azizbarank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_base_sst2_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/azizbarank/distilroberta-base-sst2-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_bias_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_bias_en.md new file mode 100644 index 000000000000..6352b9e2ddb7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_bias_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_bias +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-bias` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`BIASED`, `NEUTRAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_bias_en_5.2.0_3.0_1701223370864.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_bias_en_5.2.0_3.0_1701223370864.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_bias","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_bias","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_bias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-bias +- https://github.com/rpryzant/neutralizing-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_clickbait_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_clickbait_en.md new file mode 100644 index 000000000000..5ec2ba4d3dd2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_clickbait_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_clickbait +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-clickbait` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`CLICKBAIT`, `NOT_CLICKBAIT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_clickbait_en_5.2.0_3.0_1701225910301.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_clickbait_en_5.2.0_3.0_1701225910301.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_clickbait","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_clickbait","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clickbait.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_clickbait| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-clickbait +- https://www.kaggle.com/amananandrai/clickbait-dataset +- https://github.com/MotiBaadror/Clickbait-Detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_current_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_current_en.md new file mode 100644 index 000000000000..30a374b57afc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_current_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_current +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-current` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`Current`, `Not_current` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_current_en_5.2.0_3.0_1701222809532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_current_en_5.2.0_3.0_1701222809532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_current","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_current","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.current.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_current| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-current \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_banking77_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_banking77_en.md new file mode 100644 index 000000000000..32c73c6c46d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_banking77_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mrm8488) +author: John Snow Labs +name: roberta_classifier_distil_finetuned_banking77 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-banking77` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + +`verify_top_up`, `visa_or_mastercard`, `cash_withdrawal_not_recognised`, `card_swallowed`, `exchange_rate`, `fiat_currency_support`, `automatic_top_up`, `unable_to_verify_identity`, `disposable_card_limits`, `declined_transfer`, `activate_my_card`, `pending_top_up`, `balance_not_updated_after_bank_transfer`, `top_up_limits`, `age_limit`, `get_disposable_virtual_card`, `lost_or_stolen_phone`, `card_payment_fee_charged`, `request_refund`, `passcode_forgotten`, `atm_support`, `cancel_transfer`, `transaction_charged_twice`, `card_about_to_expire`, `transfer_into_account`, `change_pin`, `card_payment_not_recognised`, `exchange_via_app`, `get_physical_card`, `terminate_account`, `transfer_timing`, `order_physical_card`, `verify_my_identity`, `card_linking`, `apple_pay_or_google_pay`, `verify_source_of_funds`, `wrong_exchange_rate_for_cash_withdrawal`, `wrong_amount_of_cash_received`, `virtual_card_not_working`, `pin_blocked`, `card_acceptance`, `card_arrival`, `pending_transfer`, `country_support`, `why_verify_identity`, `edit_personal_details`, `card_payment_wrong_exchange_rate`, `pending_cash_withdrawal`, `failed_transfer`, `getting_spare_card`, `balance_not_updated_after_cheque_or_cash_deposit`, `top_up_by_bank_transfer_charge`, `topping_up_by_card`, `reverted_card_payment?`, `exchange_charge`, `transfer_not_received_by_recipient`, `top_up_reverted`, `pending_card_payment`, `top_up_by_card_charge`, `supported_cards_and_currencies`, `getting_virtual_card`, `Refund_not_showing_up`, `top_up_by_cash_or_cheque`, `transfer_fee_charged`, `beneficiary_not_allowed`, `card_not_working`, `lost_or_stolen_card`, `declined_cash_withdrawal`, `card_delivery_estimate`, `contactless_not_working`, `direct_debit_payment_not_recognised`, `cash_withdrawal_charge`, `declined_card_payment`, `extra_charge_on_statement`, `receiving_money`, `compromised_card`, `top_up_failed` + +{:.btn-box} + + 
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_banking77_en_5.2.0_3.0_1701223602150.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_banking77_en_5.2.0_3.0_1701223602150.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_banking77","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_banking77","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.banking.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_finetuned_banking77| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/distilroberta-finetuned-banking77 +- https://twitter.com/mrm8488 +- https://www.linkedin.com/in/manuel-romero-cs/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_financial_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_financial_text_classification_en.md new file mode 100644 index 000000000000..e9486cd12679 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_financial_text_classification_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from nickmuchi) +author: John Snow Labs +name: roberta_classifier_distil_finetuned_financial_text_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-financial-text-classification` is an English model originally trained by `nickmuchi`.
+ +## Predicted Entities + +`bearish`, `neutral`, `bullish` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_financial_text_classification_en_5.2.0_3.0_1701223081699.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_financial_text_classification_en_5.2.0_3.0_1701223081699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_financial_text_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_financial_text_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_finetuned_financial_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/nickmuchi/distilroberta-finetuned-financial-text-classification +- https://www.kaggle.com/percyzheng/sentiment-classification-selflabel-dataset +- https://paperswithcode.com/sota?task=Text+Classification&dataset=financial_phrasebank \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_stereotype_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_stereotype_detection_en.md new file mode 100644 index 000000000000..e7854cb8138f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_finetuned_stereotype_detection_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Narrativa) +author: John Snow Labs +name: roberta_classifier_distil_finetuned_stereotype_detection +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-stereotype-detection` is an English model originally trained by `Narrativa`.
+ +## Predicted Entities + +`anti-stereotype`, `neutral`, `stereotype` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_stereotype_detection_en_5.2.0_3.0_1701226172962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_finetuned_stereotype_detection_en_5.2.0_3.0_1701226172962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_stereotype_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_finetuned_stereotype_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_finetuned.by_narrativa").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_finetuned_stereotype_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Narrativa/distilroberta-finetuned-stereotype-detection +- https://www.narrativa.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_hatespeech_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_hatespeech_en.md new file mode 100644 index 000000000000..8091fa05d0a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_hatespeech_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_hatespeech +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-hatespeech` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`HATE`, `NOT_HATE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_hatespeech_en_5.2.0_3.0_1701223084350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_hatespeech_en_5.2.0_3.0_1701223084350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_hatespeech","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_hatespeech","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.distilled").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_hatespeech| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-hatespeech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_4class_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_4class_en.md new file mode 100644 index 000000000000..620b1ba37301 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_4class_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_mbfc_bias_4class +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-mbfc-bias-4class` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`left`, `right`, `extremeright`, `leastbiased` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_mbfc_bias_4class_en_5.2.0_3.0_1701223336173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_mbfc_bias_4class_en_5.2.0_3.0_1701223336173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_mbfc_bias_4class","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_mbfc_bias_4class","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.mbfc_bias_4class.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_mbfc_bias_4class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-mbfc-bias-4class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_en.md new file mode 100644 index 000000000000..09b4c3108fcc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_mbfc_bias_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_mbfc_bias +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-mbfc-bias` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`left`, `right`, `leastbiased`, `rightcenter`, `extremeright`, `leftcenter`, `unknown` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_mbfc_bias_en_5.2.0_3.0_1701223332181.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_mbfc_bias_en_5.2.0_3.0_1701223332181.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_mbfc_bias","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_mbfc_bias","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.mbfc_bias.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_mbfc_bias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-mbfc-bias +- https://zenodo.org/record/3271522 +- https://propaganda.qcri.org/papers/elsarticle-template.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_news_small_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_news_small_en.md new file mode 100644 index 000000000000..c7049dfda776 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_news_small_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Small Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_news_small +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-news-small` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`bad`, `medium`, `good` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_news_small_en_5.2.0_3.0_1701223604758.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_news_small_en_5.2.0_3.0_1701223604758.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_news_small","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_news_small","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news.distilled_small").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_news_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-news-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_offensive_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_offensive_en.md new file mode 100644 index 000000000000..235a101d3469 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_offensive_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_offensive +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-offensive` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`OFFENSIVE`, `NOT_OFFENSIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_offensive_en_5.2.0_3.0_1701223608022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_offensive_en_5.2.0_3.0_1701223608022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_offensive","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_offensive","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.offensive.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_offensive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_propaganda_2class_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_propaganda_2class_en.md new file mode 100644 index 000000000000..533d5375dc12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_propaganda_2class_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_propaganda_2class +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-propaganda-2class` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`No_Prop`, `Prop` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_propaganda_2class_en_5.2.0_3.0_1701226449125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_propaganda_2class_en_5.2.0_3.0_1701226449125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_propaganda_2class","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_propaganda_2class","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.propaganda_2class.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_propaganda_2class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-propaganda-2class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_proppy_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_proppy_en.md new file mode 100644 index 000000000000..0170c09a4d11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_distil_proppy_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from valurank) +author: John Snow Labs +name: roberta_classifier_distil_proppy +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-proppy` is an English model originally trained by `valurank`.
+ +## Predicted Entities + +`no_prop`, `prop` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_proppy_en_5.2.0_3.0_1701223854103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_distil_proppy_en_5.2.0_3.0_1701223854103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_proppy","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_distil_proppy","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.proppy.distilled.by_valurank").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_distil_proppy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/valurank/distilroberta-proppy +- https://zenodo.org/record/3271522 +- https://propaganda.qcri.org/papers/elsarticle-template.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_earning_call_transcript_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_earning_call_transcript_classification_en.md new file mode 100644 index 000000000000..77c16bd6d572 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_earning_call_transcript_classification_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from NLPScholars) +author: John Snow Labs +name: roberta_classifier_earning_call_transcript_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Roberta-Earning-Call-Transcript-Classification` is an English model originally trained by `NLPScholars`.
+ +## Predicted Entities + +`Negative`, `Positive`, `Uncertainty`, `Litigious`, `Constraining` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_earning_call_transcript_classification_en_5.2.0_3.0_1701224167539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_earning_call_transcript_classification_en_5.2.0_3.0_1701224167539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_earning_call_transcript_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_earning_call_transcript_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_nlpscholars").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_earning_call_transcript_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/NLPScholars/Roberta-Earning-Call-Transcript-Classification +- https://www.fool.com/earnings/call-transcripts/2022/04/29/apple-aapl-q2-2022-earnings-call-transcript \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_election_relevancy_best_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_election_relevancy_best_en.md new file mode 100644 index 000000000000..a0032dca8ae8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_election_relevancy_best_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from sefaozalpadl) +author: John Snow Labs +name: roberta_classifier_election_relevancy_best +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `election_relevancy_best` is an English model originally trained by `sefaozalpadl`.
+ +## Predicted Entities + +`Yes`, `No` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_election_relevancy_best_en_5.2.0_3.0_1701226784173.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_election_relevancy_best_en_5.2.0_3.0_1701226784173.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_election_relevancy_best","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_election_relevancy_best","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.election_relevancy_best.by_sefaozalpadl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_election_relevancy_best| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sefaozalpadl/election_relevancy_best \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emo_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emo_en.md new file mode 100644 index 000000000000..472b686b14fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emo_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Sindhu) +author: John Snow Labs +name: roberta_classifier_emo +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emo_roberta` is an English model originally trained by `Sindhu`.
+ +## Predicted Entities + +`surprise`, `love`, `curiosity`, `remorse`, `approval`, `fear`, `anger`, `realization`, `nervousness`, `pride`, `grief`, `joy`, `amusement`, `neutral`, `caring`, `disappointment`, `disgust`, `desire`, `gratitude`, `excitement`, `admiration`, `annoyance`, `sadness`, `embarrassment`, `optimism`, `confusion`, `relief`, `disapproval` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_emo_en_5.2.0_3.0_1701223935367.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_emo_en_5.2.0_3.0_1701223935367.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emo","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emo","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_sindhu").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_emo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Sindhu/emo_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_en.md new file mode 100644 index 000000000000..caa095f4ae91 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Osiris) +author: John Snow Labs +name: roberta_classifier_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_classifier` is an English model originally trained by `Osiris`. + +## Predicted Entities + +`Positive`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_emotion_en_5.2.0_3.0_1701224456233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_emotion_en_5.2.0_3.0_1701224456233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.emotion.by_osiris").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Osiris/emotion_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_english_large_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_english_large_en.md new file mode 100644 index 000000000000..18c9f7fea3d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_emotion_english_large_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Large Cased model (from j-hartmann) +author: John Snow Labs +name: roberta_classifier_emotion_english_large +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion-english-roberta-large` is an English model originally trained by `j-hartmann`. 
+ +## Predicted Entities + +`disgust`, `joy`, `anger`, `fear`, `surprise`, `sadness`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_emotion_english_large_en_5.2.0_3.0_1701224164149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_emotion_english_large_en_5.2.0_3.0_1701224164149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emotion_english_large","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_emotion_english_large","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_emotion_english_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/j-hartmann/emotion-english-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_environmental_claims_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_environmental_claims_en.md new file mode 100644 index 000000000000..b44983963afd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_environmental_claims_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from climatebert) +author: John Snow Labs +name: roberta_classifier_environmental_claims +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `environmental-claims` is an English model originally trained by `climatebert`. 
+ +## Predicted Entities + +`no`, `yes` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_environmental_claims_en_5.2.0_3.0_1701224247618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_environmental_claims_en_5.2.0_3.0_1701224247618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_environmental_claims","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_environmental_claims","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.environment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_environmental_claims| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/climatebert/environmental-claims +- https://arxiv.org/abs/2209.00507 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_classification_en.md new file mode 100644 index 000000000000..5a031749afc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_classification_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from hamzab) +author: John Snow Labs +name: roberta_classifier_fake_news_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-fake-news-classification` is an English model originally trained by `hamzab`. 
+ +## Predicted Entities + +`TRUE`, `FAKE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_classification_en_5.2.0_3.0_1701224492492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_classification_en_5.2.0_3.0_1701224492492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news.by_hamzab").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_fake_news_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/hamzab/roberta-fake-news-classification +- https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_debunker_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_debunker_en.md new file mode 100644 index 000000000000..48556fa48247 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_debunker_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from Nithiwat) +author: John Snow Labs +name: roberta_classifier_fake_news_debunker +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake-news-debunker` is an English model originally trained by `Nithiwat`. 
+ +## Predicted Entities + +`1`, `0` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_debunker_en_5.2.0_3.0_1701224426071.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_debunker_en_5.2.0_3.0_1701224426071.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_debunker","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_debunker","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news.by_nithiwat").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_fake_news_debunker| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Nithiwat/fake-news-debunker \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_detection_spanish_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_detection_spanish_es.md new file mode 100644 index 000000000000..b76908835927 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fake_news_detection_spanish_es.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Cased model (from Narrativaai) +author: John Snow Labs +name: roberta_classifier_fake_news_detection_spanish +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fake-news-detection-spanish` is a Spanish model originally trained by `Narrativaai`. 
+ +## Predicted Entities + +`FAKE`, `REAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_detection_spanish_es_5.2.0_3.0_1701225097817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_fake_news_detection_spanish_es_5.2.0_3.0_1701225097817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_detection_spanish","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fake_news_detection_spanish","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.news.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_fake_news_detection_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Narrativaai/fake-news-detection-spanish +- https://sites.google.com/view/iberlef2020/#h.p_w0c31bn0r-SW +- https://sites.google.com/view/fakedes/results?authuser=0 +- https://sites.google.com/view/fakedes/home +- https://www.narrativa.com/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fakeddit_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fakeddit_en.md new file mode 100644 index 000000000000..9619eaea1b5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fakeddit_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from yaoyinnan) +author: John Snow Labs +name: roberta_classifier_fakeddit +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-fakeddit` is an English model originally trained by `yaoyinnan`. 
+ +## Predicted Entities + +`Real`, `Fake` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_fakeddit_en_5.2.0_3.0_1701224770103.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_fakeddit_en_5.2.0_3.0_1701224770103.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fakeddit","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fakeddit","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.fakeddit.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_fakeddit| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/yaoyinnan/roberta-fakeddit \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_feedback_intent_test_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_feedback_intent_test_en.md new file mode 100644 index 000000000000..9bd3d29fd90c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_feedback_intent_test_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from mp6kv) +author: John Snow Labs +name: roberta_classifier_feedback_intent_test +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `feedback_intent_test` is an English model originally trained by `mp6kv`. 
+ +## Predicted Entities + +`neutral_feedback`, `positive_feedback`, `negative_feedback` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_feedback_intent_test_en_5.2.0_3.0_1701225110598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_feedback_intent_test_en_5.2.0_3.0_1701225110598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_feedback_intent_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_feedback_intent_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.feedback_intent_test.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_feedback_intent_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mp6kv/feedback_intent_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetune_emotion_distilroberta_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetune_emotion_distilroberta_en.md new file mode 100644 index 000000000000..f785283b8589 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetune_emotion_distilroberta_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from aarnphm) +author: John Snow Labs +name: roberta_classifier_finetune_emotion_distilroberta +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune_emotion_distilroberta` is an English model originally trained by `aarnphm`. 
+ +## Predicted Entities + +`surprise`, `love`, `joy`, `fear`, `sadness`, `anger` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetune_emotion_distilroberta_en_5.2.0_3.0_1701225373338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetune_emotion_distilroberta_en_5.2.0_3.0_1701225373338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetune_emotion_distilroberta","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetune_emotion_distilroberta","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.distilled").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_finetune_emotion_distilroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aarnphm/finetune_emotion_distilroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51_en.md new file mode 100644 index 000000000000..e634404d7130 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from ali2066) +author: John Snow Labs +name: roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentence_itr0_2e-05_all_01_03_2022-02_53_51` is an English model originally trained by `ali2066`. 
+ +## Predicted Entities + +`POSITIVE`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51_en_5.2.0_3.0_1701227663012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51_en_5.2.0_3.0_1701227663012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.finetuned.by_ali2066").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_finetuned_sentence_itr0_2e_05_all_01_03_2022_02_53_51| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ali2066/finetuned_sentence_itr0_2e-05_all_01_03_2022-02_53_51 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuning_cardiffnlp_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuning_cardiffnlp_sentiment_model_en.md new file mode 100644 index 000000000000..09c26470aea3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finetuning_cardiffnlp_sentiment_model_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from anvay) +author: John Snow Labs +name: roberta_classifier_finetuning_cardiffnlp_sentiment_model +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning-cardiffnlp-sentiment-model` is an English model originally trained by `anvay`.
+ +## Predicted Entities + +`Positive`, `Neutral`, `Negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetuning_cardiffnlp_sentiment_model_en_5.2.0_3.0_1701225658434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_finetuning_cardiffnlp_sentiment_model_en_5.2.0_3.0_1701225658434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetuning_cardiffnlp_sentiment_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finetuning_cardiffnlp_sentiment_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment.finetuning_").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_finetuning_cardiffnlp_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/anvay/finetuning-cardiffnlp-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finsent_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finsent_en.md new file mode 100644 index 000000000000..5de9c80e2313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_finsent_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from blinjrm) +author: John Snow Labs +name: roberta_classifier_finsent +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finsent` is an English model originally trained by `blinjrm`.
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_finsent_en_5.2.0_3.0_1701225949079.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_finsent_en_5.2.0_3.0_1701225949079.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finsent","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_finsent","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_blinjrm").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_finsent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/blinjrm/finsent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_for_multilabel_sentence_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_for_multilabel_sentence_classification_en.md new file mode 100644 index 000000000000..dd9aa445b769 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_for_multilabel_sentence_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Zamachi) +author: John Snow Labs +name: roberta_classifier_for_multilabel_sentence_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `RoBERTa-for-multilabel-sentence-classification` is an English model originally trained by `Zamachi`.
+ +## Predicted Entities + +`joy`, `optimism`, `sadness`, `anger` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_for_multilabel_sentence_classification_en_5.2.0_3.0_1701226238484.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_for_multilabel_sentence_classification_en_5.2.0_3.0_1701226238484.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_for_multilabel_sentence_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_for_multilabel_sentence_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_zamachi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_for_multilabel_sentence_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Zamachi/RoBERTa-for-multilabel-sentence-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fs_distil_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fs_distil_fine_tuned_en.md new file mode 100644 index 000000000000..220058c6e56b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_fs_distil_fine_tuned_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Anthos23) +author: John Snow Labs +name: roberta_classifier_fs_distil_fine_tuned +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `FS-distilroberta-fine-tuned` is an English model originally trained by `Anthos23`.
+ +## Predicted Entities + +`negative`, `neutral`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_fs_distil_fine_tuned_en_5.2.0_3.0_1701223665158.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_fs_distil_fine_tuned_en_5.2.0_3.0_1701223665158.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fs_distil_fine_tuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_fs_distil_fine_tuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled.by_anthos23").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_fs_distil_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Anthos23/FS-distilroberta-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r1_target_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r1_target_en.md new file mode 100644 index 000000000000..4c69d5eae143 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r1_target_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from facebook) +author: John Snow Labs +name: roberta_classifier_hate_speech_dynabench_r1_target +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-hate-speech-dynabench-r1-target` is an English model originally trained by `facebook`.
+ +## Predicted Entities + +`nothate`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r1_target_en_5.2.0_3.0_1701225386520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r1_target_en_5.2.0_3.0_1701225386520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r1_target","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r1_target","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_hate_speech_dynabench_r1_target| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|456.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/facebook/roberta-hate-speech-dynabench-r1-target +- https://arxiv.org/abs/2012.15761 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r2_target_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r2_target_en.md new file mode 100644 index 000000000000..8ba0df5f2bc9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r2_target_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from facebook) +author: John Snow Labs +name: roberta_classifier_hate_speech_dynabench_r2_target +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-hate-speech-dynabench-r2-target` is an English model originally trained by `facebook`.
+ +## Predicted Entities + +`nothate`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r2_target_en_5.2.0_3.0_1701226642077.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r2_target_en_5.2.0_3.0_1701226642077.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r2_target","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r2_target","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.dynabench_r2_target.by_facebook").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_hate_speech_dynabench_r2_target| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/facebook/roberta-hate-speech-dynabench-r2-target +- https://arxiv.org/abs/2012.15761 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r3_target_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r3_target_en.md new file mode 100644 index 000000000000..62882ae8d42a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r3_target_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from facebook) +author: John Snow Labs +name: roberta_classifier_hate_speech_dynabench_r3_target +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-hate-speech-dynabench-r3-target` is an English model originally trained by `facebook`.
+ +## Predicted Entities + +`nothate`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r3_target_en_5.2.0_3.0_1701226984161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r3_target_en_5.2.0_3.0_1701226984161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r3_target","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r3_target","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.dynabench_r3_target.by_facebook").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_hate_speech_dynabench_r3_target| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/facebook/roberta-hate-speech-dynabench-r3-target +- https://arxiv.org/abs/2012.15761 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r4_target_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r4_target_en.md new file mode 100644 index 000000000000..bb006bd281c6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_hate_speech_dynabench_r4_target_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from facebook) +author: John Snow Labs +name: roberta_classifier_hate_speech_dynabench_r4_target +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-hate-speech-dynabench-r4-target` is an English model originally trained by `facebook`.
+ +## Predicted Entities + +`nothate`, `hate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r4_target_en_5.2.0_3.0_1701224113608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_hate_speech_dynabench_r4_target_en_5.2.0_3.0_1701224113608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r4_target","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_hate_speech_dynabench_r4_target","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.hate.dynabench_r4_target.by_facebook").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_hate_speech_dynabench_r4_target| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/facebook/roberta-hate-speech-dynabench-r4-target +- https://arxiv.org/abs/2012.15761 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indo_indonli_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indo_indonli_id.md new file mode 100644 index 000000000000..9529c644638f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indo_indonli_id.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Cased model (from StevenLimcorn) +author: John Snow Labs +name: roberta_classifier_indo_indonli +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indo-roberta-indonli` is an Indonesian model originally trained by `StevenLimcorn`.
+ +## Predicted Entities + +`c`, `n`, `e` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indo_indonli_id_5.2.0_3.0_1701224415831.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indo_indonli_id_5.2.0_3.0_1701224415831.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indo_indonli","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indo_indonli","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indo_indonli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/StevenLimcorn/indo-roberta-indonli +- https://github.com/ir-nlp-csui/indonli/tree/main/data/indonli +- https://github.com/stevenlimcorn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_id.md new file mode 100644 index 000000000000..a04c24db7368 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_id.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Cased model (from akahana) +author: John Snow Labs +name: roberta_classifier_indonesia_emotion +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesia-emotion-roberta` is an Indonesian model originally trained by `akahana`.
+ +## Predicted Entities + +`TAKUT`, `SEDIH`, `MARAH`, `BAHAGIA`, `CINTA` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_emotion_id_5.2.0_3.0_1701225390970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_emotion_id_5.2.0_3.0_1701225390970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_emotion","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_emotion","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.by_akahana").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesia_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/akahana/indonesia-emotion-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_small_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_small_en.md new file mode 100644 index 000000000000..989484a68b83 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_emotion_small_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Small Cased model (from akahana) +author: John Snow Labs +name: roberta_classifier_indonesia_emotion_small +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesia-emotion-roberta-small` is an English model originally trained by `akahana`.
+ +## Predicted Entities + +`SEDIH`, `BAHAGIA`, `TAKUT`, `MARAH`, `CINTA` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_emotion_small_en_5.2.0_3.0_1701224564250.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_emotion_small_en_5.2.0_3.0_1701224564250.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_emotion_small","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_emotion_small","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.small").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesia_emotion_small| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|85.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/akahana/indonesia-emotion-roberta-small \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_sentiment_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_sentiment_id.md new file mode 100644 index 000000000000..95ba4c55627f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesia_sentiment_id.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Cased model (from akahana) +author: John Snow Labs +name: roberta_classifier_indonesia_sentiment +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesia-sentiment-roberta` is an Indonesian model originally trained by `akahana`.
+ +## Predicted Entities + +`NETRAL`, `NEGATIF`, `POSITIF` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_sentiment_id_5.2.0_3.0_1701227329370.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesia_sentiment_id_5.2.0_3.0_1701227329370.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_sentiment","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesia_sentiment","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesia_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/akahana/indonesia-sentiment-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_emotion_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_emotion_id.md new file mode 100644 index 000000000000..d281c877c484 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_emotion_id.md @@ -0,0 +1,108 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Base Cased model (from StevenLimcorn) +author: John Snow Labs +name: roberta_classifier_indonesian_base_emotion +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian-roberta-base-emotion-classifier` is an Indonesian model originally trained by `StevenLimcorn`.
+ +## Predicted Entities + +`sadness`, `fear`, `happy`, `anger`, `love` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_emotion_id_5.2.0_3.0_1701227662605.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_emotion_id_5.2.0_3.0_1701227662605.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_emotion","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_emotion","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesian_base_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/StevenLimcorn/indonesian-roberta-base-emotion-classifier +- https://www.indobenchmark.com/ +- https://github.com/stevenlimcorn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_indonli_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_indonli_id.md new file mode 100644 index 000000000000..28ab13fc111d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_indonli_id.md @@ -0,0 +1,111 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Base Cased model (from w11wo) +author: John Snow Labs +name: roberta_classifier_indonesian_base_indonli +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian-roberta-base-indonli` is an Indonesian model originally trained by `w11wo`.
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_indonli_id_5.2.0_3.0_1701228002403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_indonli_id_5.2.0_3.0_1701228002403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_indonli","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_indonli","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.base.by_w11wo").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesian_base_indonli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/w11wo/indonesian-roberta-base-indonli +- https://arxiv.org/abs/1907.11692 +- https://hf.co/flax-community/indonesian-roberta-base +- https://github.com/ir-nlp-csui/indonli +- https://arxiv.org/abs/2110.14566 +- https://w11wo.github.io/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_sentiment_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_sentiment_id.md new file mode 100644 index 000000000000..883282e0033d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesian_base_sentiment_id.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Base Cased model (from w11wo) +author: John Snow Labs +name: roberta_classifier_indonesian_base_sentiment +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian-roberta-base-sentiment-classifier` is an Indonesian model originally trained by `w11wo`.
+ +## Predicted Entities + +`positive`, `neutral`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_sentiment_id_5.2.0_3.0_1701225674964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesian_base_sentiment_id_5.2.0_3.0_1701225674964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_sentiment","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesian_base_sentiment","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.sentiment.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesian_base_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/w11wo/indonesian-roberta-base-sentiment-classifier +- https://arxiv.org/abs/1907.11692 +- https://hf.co/flax-community/indonesian-roberta-base +- https://hf.co/datasets/indonlu +- https://w11wo.github.io/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesiasentiment_id.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesiasentiment_id.md new file mode 100644 index 000000000000..3807eb33f7a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_indonesiasentiment_id.md @@ -0,0 +1,109 @@ +--- +layout: model +title: Indonesian RobertaForSequenceClassification Cased model (from sahri) +author: John Snow Labs +name: roberta_classifier_indonesiasentiment +date: 2023-11-29 +tags: [id, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesiasentiment` is an Indonesian model originally trained by `sahri`.
+ +## Predicted Entities + +`positive`, `neutral`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesiasentiment_id_5.2.0_3.0_1701226009113.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_indonesiasentiment_id_5.2.0_3.0_1701226009113.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesiasentiment","id") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_indonesiasentiment","id") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("id.classify.roberta.sentiment.by_sahri").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_indonesiasentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sahri/indonesiasentiment +- https://arxiv.org/abs/1907.11692 +- https://hf.co/flax-community/indonesian-roberta-base +- https://hf.co/datasets/indonlu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_intel_base_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_intel_base_mrpc_en.md new file mode 100644 index 000000000000..d07d98122900 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_intel_base_mrpc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from Intel) +author: John Snow Labs +name: roberta_classifier_intel_base_mrpc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-mrpc` is an English model originally trained by `Intel`.
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_intel_base_mrpc_en_5.2.0_3.0_1701228016379.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_intel_base_mrpc_en_5.2.0_3.0_1701228016379.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_intel_base_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_intel_base_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_intel_base_mrpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Intel/roberta-base-mrpc +- https://paperswithcode.com/sota?task=Natural+Language+Inference&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iqa_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iqa_classification_en.md new file mode 100644 index 000000000000..5af472e88a01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iqa_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from mp6kv) +author: John Snow Labs +name: roberta_classifier_iqa_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `IQA_classification` is an English model originally trained by `mp6kv`.
+ +## Predicted Entities + +`procedural`, `probing_exploring`, `expository`, `other_math`, `non_math` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_iqa_classification_en_5.2.0_3.0_1701228397380.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_iqa_classification_en_5.2.0_3.0_1701228397380.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_iqa_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_iqa_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.iqa_classification.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_iqa_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|425.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mp6kv/IQA_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_isear_bert_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_isear_bert_en.md new file mode 100644 index 000000000000..5d5cc46342e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_isear_bert_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from crcb) +author: John Snow Labs +name: roberta_classifier_isear_bert +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `isear_bert` is an English model originally trained by `crcb`. 
+ +## Predicted Entities + +`joy`, `shame`, `sadness`, `anger`, `guilt`, `fear`, `disgust` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_isear_bert_en_5.2.0_3.0_1701224898523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_isear_bert_en_5.2.0_3.0_1701224898523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_isear_bert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_isear_bert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.isear_bert.roberta.by_crcb").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_isear_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/crcb/isear_bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iterater_intention_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iterater_intention_en.md new file mode 100644 index 000000000000..c94e783b26ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_iterater_intention_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from wanyu) +author: John Snow Labs +name: roberta_classifier_iterater_intention +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `IteraTeR-ROBERTA-Intention-Classifier` is an English model originally trained by `wanyu`. 
+ +## Predicted Entities + +`clarity`, `coherence`, `meaning-changed`, `style`, `fluency` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_iterater_intention_en_5.2.0_3.0_1701226695532.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_iterater_intention_en_5.2.0_3.0_1701226695532.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_iterater_intention","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_iterater_intention","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_wanyu").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_iterater_intention| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/wanyu/IteraTeR-ROBERTA-Intention-Classifier +- https://arxiv.org/abs/2203.03802 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_jeremiahz_base_mrpc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_jeremiahz_base_mrpc_en.md new file mode 100644 index 000000000000..7315948baa80 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_jeremiahz_base_mrpc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from JeremiahZ) +author: John Snow Labs +name: roberta_classifier_jeremiahz_base_mrpc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-mrpc` is an English model originally trained by `JeremiahZ`. 
+ +## Predicted Entities + +`not_equivalent`, `equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_jeremiahz_base_mrpc_en_5.2.0_3.0_1701228410522.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_jeremiahz_base_mrpc_en_5.2.0_3.0_1701228410522.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_jeremiahz_base_mrpc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_jeremiahz_base_mrpc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.mrpc.glue.base.by_JeremiahZ").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_jeremiahz_base_mrpc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JeremiahZ/roberta-base-mrpc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+MRPC \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_en.md new file mode 100644 index 000000000000..82987a2bd49e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from cestwc) +author: John Snow Labs +name: roberta_classifier_large +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large` is an English model originally trained by `cestwc`. 
+ +## Predicted Entities + +`entailment`, `neutral`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_en_5.2.0_3.0_1701227351005.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_en_5.2.0_3.0_1701227351005.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large.by_cestwc").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.2 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cestwc/roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_faithcritic_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_faithcritic_en.md new file mode 100644 index 000000000000..21c3281e8ad7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_faithcritic_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from McGill-NLP) +author: John Snow Labs +name: roberta_classifier_large_faithcritic +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-faithcritic` is an English model originally trained by `McGill-NLP`. 
+ +## Predicted Entities + +`Hallucination`, `Entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_faithcritic_en_5.2.0_3.0_1701227984018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_faithcritic_en_5.2.0_3.0_1701227984018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_faithcritic","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_faithcritic","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large.by_mcgill_nlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_faithcritic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/McGill-NLP/roberta-large-faithcritic +- https://github.com/McGill-NLP/FaithDial +- https://paperswithcode.com/sota?task=Faithfulness+Critic&dataset=FaithCritic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234567_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234567_en.md new file mode 100644 index 000000000000..a0f57cc11d76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234567_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_1234567 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-1234567` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_1234567_en_5.2.0_3.0_1701231145427.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_1234567_en_5.2.0_3.0_1701231145427.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_1234567","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_1234567","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v2.large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_1234567| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-1234567 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123456_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123456_en.md new file mode 100644 index 000000000000..284de8d5cf86 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123456_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_123456 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-123456` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_123456_en_5.2.0_3.0_1701225959860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_123456_en_5.2.0_3.0_1701225959860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_123456","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_123456","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v7large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_123456| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-123456 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12345_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12345_en.md new file mode 100644 index 000000000000..270f4492396a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12345_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_12345 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-12345` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_12345_en_5.2.0_3.0_1701230490328.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_12345_en_5.2.0_3.0_1701230490328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_12345","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_12345","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v6large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
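The Predicted Entities list for this model includes `oos`, which in the CLINC intent data stands for out-of-scope queries. Code that acts on the predicted intent will usually want to treat that label separately. A minimal sketch, assuming a hypothetical helper named `route_intent` (not part of Spark NLP) that receives the predicted label as a plain string:

```python
from typing import Optional

OUT_OF_SCOPE = "oos"  # label name taken from the Predicted Entities list


def route_intent(label: str) -> Optional[str]:
    """Return the intent to act on, or None when the query is out of scope.

    `route_intent` is a hypothetical helper for illustration only.
    """
    normalized = label.strip().lower()
    return None if normalized == OUT_OF_SCOPE else normalized


print(route_intent("book_flight"))
print(route_intent("oos"))
```

Returning `None` for `oos` lets the caller fall back to a default response instead of dispatching on a label that names no real intent.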
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_12345| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-12345 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234_en.md new file mode 100644 index 000000000000..630e6d9094fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_1234_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_1234 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-1234` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_1234_en_5.2.0_3.0_1701228583431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_1234_en_5.2.0_3.0_1701228583431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_1234","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_1234","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v5large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_1234| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-1234 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123_en.md new file mode 100644 index 000000000000..3b84700c15b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_123_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_123 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-123` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_123_en_5.2.0_3.0_1701229818912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_123_en_5.2.0_3.0_1701229818912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_123","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_123","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v4large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_123| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-123 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12_en.md new file mode 100644 index 000000000000..5a6600af5df1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_12_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_12 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-12` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_12_en_5.2.0_3.0_1701229163184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_12_en_5.2.0_3.0_1701229163184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_12","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_12","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v3large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_12| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-12 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_3141_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_3141_en.md new file mode 100644 index 000000000000..379f1b09a2bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_3141_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_3141 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-3141` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_3141_en_5.2.0_3.0_1701226710066.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_3141_en_5.2.0_3.0_1701226710066.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_3141","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_3141","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v9large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_3141| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-3141 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_314_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_314_en.md new file mode 100644 index 000000000000..be8b3c599998 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_finetuned_clinc_314_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_large_finetuned_clinc_314 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc-314` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_314_en_5.2.0_3.0_1701231991390.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_finetuned_clinc_314_en_5.2.0_3.0_1701231991390.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_314","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_finetuned_clinc_314","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.v1.large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
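The archive name in the Download link of this card appears to follow the pattern `<model name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip`. The sketch below splits such a name into its parts. The pattern itself, the helper name `parse_archive_name`, and the reading of the trailing 13 digits as a Unix timestamp in milliseconds are assumptions inferred from the file names in these cards, not a documented Spark NLP convention.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<spark nlp version>_<spark version>_<epoch millis>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(filename):
    """Split a model archive name into its (assumed) components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    # Assumption: the last field is milliseconds since the Unix epoch.
    seconds = int(match["ts"]) // 1000
    return {
        "name": match["name"],
        "language": match["lang"],
        "spark_nlp_version": match["nlp"],
        "spark_version": match["spark"],
        "uploaded_utc": datetime.fromtimestamp(seconds, tz=timezone.utc),
    }


if __name__ == "__main__":
    info = parse_archive_name(
        "roberta_classifier_large_finetuned_clinc_314_en_5.2.0_3.0_1701231991390.zip"
    )
    for key, value in info.items():
        print(f"{key}: {value}")
```

Under that reading, the decoded date falls on the same day as the `date` field in the card's front matter, which is what suggests the last field is an upload timestamp.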
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_finetuned_clinc_314| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc-314 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_en.md new file mode 100644 index 000000000000..ed092c2c47f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_en.md @@ -0,0 +1,132 @@ +--- +layout: model +title: English RobertaForSequenceClassification Large Cased model +author: John Snow Labs +name: roberta_classifier_large_mnli +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-mnli` is an English model originally trained by HuggingFace. 
+ +## Predicted Entities + +`ENTAILMENT`, `NEUTRAL`, `CONTRADICTION` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_mnli_en_5.2.0_3.0_1701229376338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_mnli_en_5.2.0_3.0_1701229376338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large.by_uploaded by huggingface").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_mnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|845.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/roberta-large-mnli +- https://github.com/facebookresearch/fairseq/tree/main/examples/roberta +- https://arxiv.org/abs/1907.11692 +- https://github.com/facebookresearch/fairseq/tree/main/examples/roberta +- https://github.com/facebookresearch/fairseq/tree/main/examples/roberta +- https://aclanthology.org/2021.acl-long.330.pdf +- https://dl.acm.org/doi/pdf/10.1145/3442188.3445922 +- https://cims.nyu.edu/~sbowman/multinli/ +- https://yknzhu.wixsite.com/mbweb +- https://en.wikipedia.org/wiki/English_Wikipedia +- https://commoncrawl.org/2016/10/news-dataset-available/ +- https://github.com/jcpeterson/openwebtext +- https://arxiv.org/abs/1806.02847 +- https://github.com/facebookresearch/fairseq/tree/main/examples/roberta +- https://arxiv.org/pdf/1804.07461.pdf +- https://cims.nyu.edu/~sbowman/multinli/ +- https://arxiv.org/pdf/1804.07461.pdf +- https://arxiv.org/pdf/1804.07461.pdf +- https://arxiv.org/abs/1704.05426 +- https://arxiv.org/abs/1508.05326 +- https://arxiv.org/pdf/1809.05053.pdf +- https://cims.nyu.edu/~sbowman/multinli/ +- https://arxiv.org/pdf/1809.05053.pdf +- https://mlco2.github.io/impact#compute +- https://arxiv.org/abs/1910.09700 +- https://arxiv.org/pdf/1907.11692.pdf +- https://arxiv.org/pdf/1907.11692.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_finetuned_header_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_finetuned_header_en.md new file mode 100644 index 000000000000..7a75b7de6a44 --- /dev/null +++ 
b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_mnli_finetuned_header_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from alk) +author: John Snow Labs +name: roberta_classifier_large_mnli_finetuned_header +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-mnli-finetuned-header-classifier` is an English model originally trained by `alk`. + +## Predicted Entities + +`ENTAILMENT`, `CONTRADICTION`, `NEUTRAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_mnli_finetuned_header_en_5.2.0_3.0_1701232650607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_mnli_finetuned_header_en_5.2.0_3.0_1701232650607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_mnli_finetuned_header","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_mnli_finetuned_header","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_finetuned_adverse_drug_event").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_mnli_finetuned_header| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/alk/roberta-large-mnli-finetuned-header-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_pyrxsum_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_pyrxsum_en.md new file mode 100644 index 000000000000..8fc24607273b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_pyrxsum_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_pyrxsum +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-pyrxsum` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_pyrxsum_en_5.2.0_3.0_1701225476441.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_pyrxsum_en_5.2.0_3.0_1701225476441.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_pyrxsum","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_pyrxsum","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.pyrxsum.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_pyrxsum| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-pyrxsum \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold1_en.md new file mode 100644 index 000000000000..ef9b979ecd82 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold1_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_examples_fold1 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-examples-fold1` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold1_en_5.2.0_3.0_1701226082423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold1_en_5.2.0_3.0_1701226082423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_examples_fold1.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_examples_fold1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-examples-fold1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold3_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold3_en.md new file mode 100644 index 000000000000..48f709d28a6d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold3_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_examples_fold3 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-examples-fold3` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold3_en_5.2.0_3.0_1701230071654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold3_en_5.2.0_3.0_1701230071654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold3","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold3","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_examples_fold3.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_examples_fold3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-examples-fold3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold4_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold4_en.md new file mode 100644 index 000000000000..0017d2e106b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold4_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_examples_fold4 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-examples-fold4` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold4_en_5.2.0_3.0_1701229639177.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold4_en_5.2.0_3.0_1701229639177.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold4","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold4","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_examples_fold4.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_examples_fold4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-examples-fold4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold5_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold5_en.md new file mode 100644 index 000000000000..652dbe947ddd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_examples_fold5_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_examples_fold5 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-examples-fold5` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold5_en_5.2.0_3.0_1701233394692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_examples_fold5_en_5.2.0_3.0_1701233394692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold5","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_examples_fold5","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_examples_fold5.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_examples_fold5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-examples-fold5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold1_en.md new file mode 100644 index 000000000000..0d346dc835f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold1_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_systems_fold1 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-systems-fold1` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold1_en_5.2.0_3.0_1701227204618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold1_en_5.2.0_3.0_1701227204618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold1","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold1","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_systems_fold1.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_systems_fold1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-systems-fold1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold2_en.md new file mode 100644 index 000000000000..94d049df9215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold2_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_systems_fold2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-systems-fold2` is an English model originally trained by `shiyue`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold2_en_5.2.0_3.0_1701233949061.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold2_en_5.2.0_3.0_1701233949061.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_systems_fold2.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_systems_fold2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-systems-fold2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold3_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold3_en.md new file mode 100644 index 000000000000..2d892cea9d2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold3_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_systems_fold3 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-systems-fold3` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold3_en_5.2.0_3.0_1701230854374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold3_en_5.2.0_3.0_1701230854374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold3","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold3","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_systems_fold3.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_systems_fold3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-systems-fold3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold4_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold4_en.md new file mode 100644 index 000000000000..697237da6cad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold4_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_systems_fold4 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-systems-fold4` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold4_en_5.2.0_3.0_1701227902558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold4_en_5.2.0_3.0_1701227902558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold4","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold4","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_systems_fold4.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_systems_fold4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-systems-fold4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold5_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold5_en.md new file mode 100644 index 000000000000..51c25a78db48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_by_systems_fold5_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm_by_systems_fold5 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm-by-systems-fold5` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold5_en_5.2.0_3.0_1701230384038.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_by_systems_fold5_en_5.2.0_3.0_1701230384038.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold5","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm_by_systems_fold5","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm_by_systems_fold5.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm_by_systems_fold5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm-by-systems-fold5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_en.md new file mode 100644 index 000000000000..b4d1bfbcdb17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_realsumm_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_realsumm +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-realsumm` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_en_5.2.0_3.0_1701227458984.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_realsumm_en_5.2.0_3.0_1701227458984.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_realsumm","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.realsumm.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_realsumm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-realsumm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli_en.md new file mode 100644 index 000000000000..914d9e688c50 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli_en.md @@ -0,0 +1,111 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from ynie) +author: John Snow Labs +name: roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli` is an English model originally trained by `ynie`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli_en_5.2.0_3.0_1701228653323.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli_en_5.2.0_3.0_1701228653323.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.fever.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_snli_mnli_fever_anli_r1_r2_r3_nli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|845.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli +- https://nlp.stanford.edu/projects/snli/ +- https://cims.nyu.edu/~sbowman/multinli/ +- https://github.com/easonnie/combine-FEVER-NSMN/blob/master/other_resources/nli_fever.md +- https://github.com/facebookresearch/anli +- https://easonnie.github.io \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_en.md new file mode 100644 index 000000000000..9cfaf883b112 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_tac08 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-tac08` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac08_en_5.2.0_3.0_1701231101512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac08_en_5.2.0_3.0_1701231101512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac08","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac08","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tac08.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_tac08| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-tac08 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_tac09_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_tac09_en.md new file mode 100644 index 000000000000..9801047f0463 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac08_tac09_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_tac08_tac09 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-tac08-tac09` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac08_tac09_en_5.2.0_3.0_1701231541554.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac08_tac09_en_5.2.0_3.0_1701231541554.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac08_tac09","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac08_tac09","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tac08_tac09.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_tac08_tac09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-tac08-tac09 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac09_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac09_en.md new file mode 100644 index 000000000000..41b7b7856832 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_tac09_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from shiyue) +author: John Snow Labs +name: roberta_classifier_large_tac09 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-tac09` is an English model originally trained by `shiyue`.
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac09_en_5.2.0_3.0_1701228177755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_tac09_en_5.2.0_3.0_1701228177755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac09","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_tac09","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tac09.large.by_shiyue").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_tac09| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shiyue/roberta-large-tac09 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_winogrande_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_winogrande_en.md new file mode 100644 index 000000000000..86b690af2862 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_large_winogrande_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Large Cased model (from DeepPavlov) +author: John Snow Labs +name: roberta_classifier_large_winogrande +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-winogrande` is an English model originally trained by `DeepPavlov`.
+ +## Predicted Entities + +`False`, `True` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_winogrande_en_5.2.0_3.0_1701234479875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_large_winogrande_en_5.2.0_3.0_1701234479875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_winogrande","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_large_winogrande","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large.by_deeppavlov").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_large_winogrande| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/DeepPavlov/roberta-large-winogrande \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_latam_question_quality_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_latam_question_quality_es.md new file mode 100644 index 000000000000..3f1f707b41fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_latam_question_quality_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Cased model (from gagandeepkundi) +author: John Snow Labs +name: roberta_classifier_latam_question_quality +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `latam-question-quality` is a Spanish model originally trained by `gagandeepkundi`. 
+ +## Predicted Entities + +`Low Quality`, `High Quality` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_latam_question_quality_es_5.2.0_3.0_1701234813133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_latam_question_quality_es_5.2.0_3.0_1701234813133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_latam_question_quality","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_latam_question_quality","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.by_gagandeepkundi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_latam_question_quality| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|472.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/gagandeepkundi/latam-question-quality \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_large_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_large_finetuned_clinc_en.md new file mode 100644 index 000000000000..f006ee947589 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_large_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_lewtun_large_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_lewtun_large_finetuned_clinc_en_5.2.0_3.0_1701235422798.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_lewtun_large_finetuned_clinc_en_5.2.0_3.0_1701235422798.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lewtun_large_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lewtun_large_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.clinc.large_finetuned.by_lewtun").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_lewtun_large_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/roberta-large-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc_en.md new file mode 100644 index 000000000000..c08e4a797936 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from lewtun) +author: John Snow Labs +name: roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-distilled-finetuned-clinc` is an English model originally trained by `lewtun`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701231998619.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701231998619.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_v2_mini_lm_mini_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_lewtun_minilmv2_l12_h384_distilled_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|132.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lewtun/MiniLMv2-L12-H384-distilled-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_live_demo_question_intimacy_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_live_demo_question_intimacy_en.md new file mode 100644 index 000000000000..f62c4f22b683 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_live_demo_question_intimacy_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from pedropei) +author: John Snow Labs +name: roberta_classifier_live_demo_question_intimacy +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `live-demo-question-intimacy` is an English model originally trained by `pedropei`. 
+ +## Predicted Entities + +`intimacy` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_live_demo_question_intimacy_en_5.2.0_3.0_1701231955075.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_live_demo_question_intimacy_en_5.2.0_3.0_1701231955075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_live_demo_question_intimacy","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_live_demo_question_intimacy","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.live_demo_question_intimacy.by_pedropei").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_live_demo_question_intimacy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|469.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pedropei/live-demo-question-intimacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lro_v1.0.2a_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lro_v1.0.2a_en.md new file mode 100644 index 000000000000..79209fd944b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_lro_v1.0.2a_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from PhucLe) +author: John Snow Labs +name: roberta_classifier_lro_v1.0.2a +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `LRO_v1.0.2a` is an English model originally trained by `PhucLe`. 
+ +## Predicted Entities + +`lead`, `resident`, `other` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_lro_v1.0.2a_en_5.2.0_3.0_1701232292841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_lro_v1.0.2a_en_5.2.0_3.0_1701232292841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lro_v1.0.2a","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_lro_v1.0.2a","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_phucle").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_lro_v1.0.2a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|419.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/PhucLe/LRO_v1.0.2a \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_main_intent_test_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_main_intent_test_en.md new file mode 100644 index 000000000000..ed760b182244 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_main_intent_test_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from mp6kv) +author: John Snow Labs +name: roberta_classifier_main_intent_test +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `main_intent_test` is an English model originally trained by `mp6kv`. 
+ +## Predicted Entities + +`connect`, `feedback`, `inform`, `none`, `pump` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_main_intent_test_en_5.2.0_3.0_1701229081377.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_main_intent_test_en_5.2.0_3.0_1701229081377.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_main_intent_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_main_intent_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.main_intent_test.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_main_intent_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|425.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mp6kv/main_intent_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_malayalam_news_ml.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_malayalam_news_ml.md new file mode 100644 index 000000000000..50b628edd6b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_malayalam_news_ml.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Malayalam RoBertaForSequenceClassification Cased model (from bipin) +author: John Snow Labs +name: roberta_classifier_malayalam_news +date: 2023-11-29 +tags: [ml, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: ml +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `malayalam-news-classifier` is a Malayalam model originally trained by `bipin`. 
+ +## Predicted Entities + +`business`, `sports`, `entertainment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_malayalam_news_ml_5.2.0_3.0_1701232550133.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_malayalam_news_ml_5.2.0_3.0_1701232550133.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_malayalam_news","ml") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_malayalam_news","ml") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ml.classify.roberta.news.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_malayalam_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ml| +|Size:|314.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/bipin/malayalam-news-classifier +- https://www.kaggle.com/disisbig/malyalam-news-dataset \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_manibert_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_manibert_en.md new file mode 100644 index 000000000000..0971e9ddcadc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_manibert_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from niksmer) +author: John Snow Labs +name: roberta_classifier_manibert +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ManiBERT` is an English model originally trained by `niksmer`. 
+ +## Predicted Entities + +`Multiculturalism: Negative`, `Labour Groups: Positive`, `Nationalisation`, `European Community/Union or Latin America Integration: Positive`, `Economic Growth: Positive`, `Anti-Growth Economy and Sustainability`, `Education Limitation`, `Agriculture and Farmers`, `Technology and Infrastructure: Positive`, `Incentives: Positive`, `Governmental and Administrative Efficiency`, `Anti-Imperialism`, `Free Market Economy`, `Traditional Morality: Positive`, `Labour Groups: Negative`, `Constitutionalism: Negative`, `Peace`, `Welfare State Limitation`, `Traditional Morality: Negative`, `Centralisation: Positive`, `Environmental Protection`, `Economic Goals`, `Internationalism: Negative`, `Protectionism: Negative`, `Foreign Special Relationships: Negative`, `Welfare State Expansion`, `Controlled Economy`, `Market Regulation`, `Education Expansion`, `Culture: Positive`, `Law and Order`, `Protectionism: Positive`, `Corporatism/ Mixed Economy`, `Non-economic Demographic Groups`, `Constitutionalism: Positive`, `National Way of Life: Negative`, `Military: Positive`, `Freedom and Human Rights`, `European Community/Union or Latin America Integration: Negative`, `Decentralisation: Positive`, `Multiculturalism: Positive`, `Democracy`, `Economic Planning`, `Equality: Positive`, `Underprivileged Minority Groups`, `Foreign Special Relationships: Positive`, `Political Authority`, `Economic Orthodoxy`, `Military: Negative`, `Political Corruption`, `Keynesian Demand Management`, `Marxist Analysis: Positive`, `Civic Mindedness: Positive`, `Internationalism: Positive`, `Middle Class and Professional Groups`, `National Way of Life: Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_manibert_en_5.2.0_3.0_1701228479635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 
URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_manibert_en_5.2.0_3.0_1701228479635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_manibert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_manibert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.manibert.by_niksmer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_manibert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/niksmer/ManiBERT +- https://manifesto-project.wzb.eu/ +- https://manifesto-project.wzb.eu/datasets +- https://manifesto-project.wzb.eu/down/tutorials/main-dataset.html#measuring-parties-left-right-positions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_clinc_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_clinc_distilled_en.md new file mode 100644 index 000000000000..00cbb43d3c8b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_clinc_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from moshew) +author: John Snow Labs +name: roberta_classifier_minilm_l12_clinc_distilled +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L12-clinc-distilled` is an English model originally trained by `moshew`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l12_clinc_distilled_en_5.2.0_3.0_1701235670331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l12_clinc_distilled_en_5.2.0_3.0_1701235670331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l12_clinc_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l12_clinc_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilm_l12_clinc_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|137.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/moshew/MiniLM-L12-clinc-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_h384_sst2_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_h384_sst2_distilled_en.md new file mode 100644 index 000000000000..a31dc661be97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l12_h384_sst2_distilled_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilm_l12_h384_sst2_distilled +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilm-l12-h384-sst2-distilled` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l12_h384_sst2_distilled_en_5.2.0_3.0_1701235884365.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l12_h384_sst2_distilled_en_5.2.0_3.0_1701235884365.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l12_h384_sst2_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l12_h384_sst2_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.distilled_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilm_l12_h384_sst2_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|138.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/minilm-l12-h384-sst2-distilled +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l6_clinc_distilled_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l6_clinc_distilled_en.md new file mode 100644 index 000000000000..f7ea6cdf158c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilm_l6_clinc_distilled_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from moshew) +author: John Snow Labs +name: roberta_classifier_minilm_l6_clinc_distilled +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLM-L6-clinc-distilled` is an English model originally trained by `moshew`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l6_clinc_distilled_en_5.2.0_3.0_1701232177132.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilm_l6_clinc_distilled_en_5.2.0_3.0_1701232177132.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l6_clinc_distilled","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilm_l6_clinc_distilled","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_mini_lm_mini.by_moshew").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilm_l6_clinc_distilled| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|97.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/moshew/MiniLM-L6-clinc-distilled \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection_en.md new file mode 100644 index 000000000000..790558c0063e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from Rhuax) +author: John Snow Labs +name: roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-distilled-finetuned-spam-detection` is an English model originally trained by `Rhuax`. 
+ +## Predicted Entities + +`ham`, `spam` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection_en_5.2.0_3.0_1701229287124.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection_en_5.2.0_3.0_1701229287124.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.email_spam.distilled_v2_mini_lm_mini_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l12_h384_distilled_finetuned_spam_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|134.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Rhuax/MiniLMv2-L12-H384-distilled-finetuned-spam-detection +- https://paperswithcode.com/sota?task=Text+Classification&dataset=sms_spam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_emotion_en.md new file mode 100644 index 000000000000..dc904dbdf0e1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilmv2_l12_h384_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-emotion` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`fear`, `anger`, `love`, `surprise`, `joy`, `sadness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_emotion_en_5.2.0_3.0_1701232355216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_emotion_en_5.2.0_3.0_1701232355216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.emotion.v2_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l12_h384_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|137.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L12-H384-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_finetuned_clinc_en.md new file mode 100644 index 000000000000..9dc770c6e8bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from optimum) +author: John Snow Labs +name: roberta_classifier_minilmv2_l12_h384_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-finetuned-clinc` is an English model originally trained by `optimum`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_finetuned_clinc_en_5.2.0_3.0_1701228665655.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_finetuned_clinc_en_5.2.0_3.0_1701228665655.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.v2_mini_lm_mini_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l12_h384_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|138.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/optimum/MiniLMv2-L12-H384-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_sst2_en.md new file mode 100644 index 000000000000..7b7d40c72f20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l12_h384_sst2_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilmv2_l12_h384_sst2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-sst2` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_sst2_en_5.2.0_3.0_1701232732925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l12_h384_sst2_en_5.2.0_3.0_1701232732925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l12_h384_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.v2_mini_lm_mini").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l12_h384_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|138.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L12-H384-sst2 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_emotion_en.md new file mode 100644 index 000000000000..ab23b5e423b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilmv2_l6_h384_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L6-H384-emotion` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`fear`, `anger`, `love`, `surprise`, `joy`, `sadness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h384_emotion_en_5.2.0_3.0_1701232892837.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h384_emotion_en_5.2.0_3.0_1701232892837.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h384_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h384_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.emotion.v2_mini_lm_mini.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l6_h384_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|97.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L6-H384-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_sst2_en.md new file mode 100644 index 000000000000..b64d41df972a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h384_sst2_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilmv2_l6_h384_sst2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L6-H384-sst2` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h384_sst2_en_5.2.0_3.0_1701233063408.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h384_sst2_en_5.2.0_3.0_1701233063408.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h384_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h384_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.v2_mini_lm_mini_l6_h384_sst2.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l6_h384_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|98.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L6-H384-sst2 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h768_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h768_sst2_en.md new file mode 100644 index 000000000000..e75591a62dd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_minilmv2_l6_h768_sst2_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_minilmv2_l6_h768_sst2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L6-H768-sst2` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h768_sst2_en_5.2.0_3.0_1701229025971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_minilmv2_l6_h768_sst2_en_5.2.0_3.0_1701229025971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h768_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_minilmv2_l6_h768_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.v2_mini_lm_mini_l6_h768_sst2.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_minilmv2_l6_h768_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|276.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L6-H768-sst2 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_base_en.md new file mode 100644 index 000000000000..b671ddf55ec7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_base_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from boychaboy) +author: John Snow Labs +name: roberta_classifier_mnli_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MNLI_roberta-base` is an English model originally trained by `boychaboy`. 
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_mnli_base_en_5.2.0_3.0_1701229297616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_mnli_base_en_5.2.0_3.0_1701229297616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_mnli_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_mnli_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.mnli.roberta.base.by_boychaboy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_mnli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/boychaboy/MNLI_roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_distil_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_distil_base_en.md new file mode 100644 index 000000000000..8d10131ead2f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_mnli_distil_base_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from boychaboy) +author: John Snow Labs +name: roberta_classifier_mnli_distil_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MNLI_distilroberta-base` is an English model originally trained by `boychaboy`. 
+ +## Predicted Entities + +`neutral`, `entailment`, `contradiction` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_mnli_distil_base_en_5.2.0_3.0_1701236182713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_mnli_distil_base_en_5.2.0_3.0_1701236182713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_mnli_distil_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_mnli_distil_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.mnli.distilled_base.by_boychaboy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_mnli_distil_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/boychaboy/MNLI_distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_multiclass_textclassification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_multiclass_textclassification_en.md new file mode 100644 index 000000000000..f1aa506beaa5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_multiclass_textclassification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from palakagl) +author: John Snow Labs +name: roberta_classifier_multiclass_textclassification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `Roberta_Multiclass_TextClassification` is an English model originally trained by `palakagl`. 
+ +## Predicted Entities + +`transport_ticket`, `general_commandstop`, `iot_cleaning`, `general_praise`, `music_settings`, `general_quirky`, `recommendation_locations`, `iot_hue_lightoff`, `audio_volume_mute`, `calendar_set`, `iot_coffee`, `datetime_convert`, `general_explain`, `cooking_recipe`, `qa_definition`, `news_query`, `music_likeness`, `recommendation_movies`, `general_dontcare`, `general_affirm`, `recommendation_events`, `alarm_set`, `qa_maths`, `qa_factoid`, `play_podcasts`, `takeaway_query`, `email_sendemail`, `email_addcontact`, `transport_traffic`, `iot_wemo_off`, `general_negate`, `iot_hue_lightdim`, `audio_volume_up`, `general_repeat`, `iot_wemo_on`, `alarm_query`, `lists_createoradd`, `music_query`, `weather_query`, `transport_query`, `alarm_remove`, `takeaway_order`, `social_post`, `general_confirm`, `calendar_query`, `iot_hue_lightup`, `general_joke`, `calendar_remove`, `email_querycontact`, `iot_hue_lightchange`, `iot_hue_lighton`, `play_radio`, `social_query`, `lists_query`, `transport_taxi`, `lists_remove`, `email_query`, `datetime_query`, `play_music`, `qa_stock`, `audio_volume_down`, `qa_currency`, `play_game`, `play_audiobook` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_multiclass_textclassification_en_5.2.0_3.0_1701229643454.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_multiclass_textclassification_en_5.2.0_3.0_1701229643454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_multiclass_textclassification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_multiclass_textclassification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_palakagl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_multiclass_textclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|421.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/palakagl/Roberta_Multiclass_TextClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_neutral_non_neutral_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_neutral_non_neutral_en.md new file mode 100644 index 000000000000..1d932f5a313a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_neutral_non_neutral_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Osiris) +author: John Snow Labs +name: roberta_classifier_neutral_non_neutral +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `neutral_non_neutral_classifier` is an English model originally trained by `Osiris`. 
+ +## Predicted Entities + +`Non-Neutral`, `Neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_neutral_non_neutral_en_5.2.0_3.0_1701232639039.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_neutral_non_neutral_en_5.2.0_3.0_1701232639039.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_neutral_non_neutral","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_neutral_non_neutral","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.neutral_non_neutral.by_osiris").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_neutral_non_neutral| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Osiris/neutral_non_neutral_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_news_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_news_sentiment_analysis_en.md new file mode 100644 index 000000000000..77ef171e03d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_news_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from shashanksrinath) +author: John Snow Labs +name: roberta_classifier_news_sentiment_analysis +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `News_Sentiment_Analysis` is an English model originally trained by `shashanksrinath`. 
+ +## Predicted Entities + +`Negative`, `Neutral`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_news_sentiment_analysis_en_5.2.0_3.0_1701236628084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_news_sentiment_analysis_en_5.2.0_3.0_1701236628084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_news_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_news_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news_sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_news_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shashanksrinath/News_Sentiment_Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_not_interested_v0_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_not_interested_v0_en.md new file mode 100644 index 000000000000..addcbac543ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_not_interested_v0_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from aujer) +author: John Snow Labs +name: roberta_classifier_not_interested_v0 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `not_interested_v0` is an English model originally trained by `aujer`. 
+ +## Predicted Entities + +`OTHER`, `TIMING`, `COMPANY_FIT`, `SENIORITY`, `ROLE_FIT`, `COMPENSATION`, `VISA`, `REMOTE_POLICY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_not_interested_v0_en_5.2.0_3.0_1701230071677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_not_interested_v0_en_5.2.0_3.0_1701230071677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_not_interested_v0","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_not_interested_v0","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.not_interested.v2.by_aujer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_not_interested_v0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aujer/not_interested_v0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_large_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_large_finetuned_clinc_en.md new file mode 100644 index 000000000000..f55af4f61281 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_large_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from optimum) +author: John Snow Labs +name: roberta_classifier_optimum_large_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc` is an English model originally trained by `optimum`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_optimum_large_finetuned_clinc_en_5.2.0_3.0_1701230767388.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_optimum_large_finetuned_clinc_en_5.2.0_3.0_1701230767388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_optimum_large_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_optimum_large_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_finetuned.by_optimum").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_optimum_large_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/optimum/roberta-large-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc_en.md new file mode 100644 index 000000000000..3123278c8c6c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from optimum) +author: John Snow Labs +name: roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-distilled-finetuned-clinc` is an English model originally trained by `optimum`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701233377875.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701233377875.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_v2_mini_lm_mini_finetuned.by_optimum").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_optimum_minilmv2_l12_h384_distilled_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|138.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/optimum/MiniLMv2-L12-H384-distilled-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paper_feedback_intent_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paper_feedback_intent_en.md new file mode 100644 index 000000000000..61e30b6ed0ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paper_feedback_intent_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from mp6kv) +author: John Snow Labs +name: roberta_classifier_paper_feedback_intent +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paper_feedback_intent` is an English model originally trained by `mp6kv`. 
+ +## Predicted Entities + +`neutral`, `positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_paper_feedback_intent_en_5.2.0_3.0_1701232890080.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_paper_feedback_intent_en_5.2.0_3.0_1701232890080.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_paper_feedback_intent","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_paper_feedback_intent","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.paper_feedback_intent.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_paper_feedback_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mp6kv/paper_feedback_intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paraphrase_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paraphrase_es.md new file mode 100644 index 000000000000..48de955ceca8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_paraphrase_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Cased model (from Prompsit) +author: John Snow Labs +name: roberta_classifier_paraphrase +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paraphrase-roberta-es` is a Spanish model originally trained by `Prompsit`. 
+ +## Predicted Entities + +`Paraphrase`, `Not Paraphrase` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_paraphrase_es_5.2.0_3.0_1701231125686.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_paraphrase_es_5.2.0_3.0_1701231125686.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_paraphrase","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_paraphrase","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.by_prompsit").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_paraphrase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|457.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Prompsit/paraphrase-roberta-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_parrot_adequacy_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_parrot_adequacy_model_en.md new file mode 100644 index 000000000000..d18032bb46f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_parrot_adequacy_model_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from prithivida) +author: John Snow Labs +name: roberta_classifier_parrot_adequacy_model +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parrot_adequacy_model` is an English model originally trained by `prithivida`. 
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_parrot_adequacy_model_en_5.2.0_3.0_1701237220108.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_parrot_adequacy_model_en_5.2.0_3.0_1701237220108.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_parrot_adequacy_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_parrot_adequacy_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.adverse_drug_event").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_parrot_adequacy_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/prithivida/parrot_adequacy_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_finetuned_clinc_en.md new file mode 100644 index 000000000000..3d922706a381 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_philschmid_large_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-finetuned-clinc` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`todo_list`, `card_declined`, `cook_time`, `pto_request_status`, `calendar`, `spending_history`, `next_holiday`, `tell_joke`, `ingredients_list`, `change_language`, `restaurant_suggestion`, `min_payment`, `pin_change`, `whisper_mode`, `date`, `international_visa`, `plug_type`, `w2`, `translate`, `pto_used`, `thank_you`, `alarm`, `shopping_list_update`, `flight_status`, `change_volume`, `bill_due`, `find_phone`, `carry_on`, `reminder_update`, `apr`, `user_name`, `uber`, `calories`, `report_lost_card`, `change_accent`, `payday`, `timezone`, `reminder`, `roll_dice`, `text`, `current_location`, `cancel`, `change_ai_name`, `weather`, `directions`, `jump_start`, `recipe`, `timer`, `what_song`, `income`, `change_user_name`, `tire_change`, `sync_device`, `application_status`, `lost_luggage`, `meeting_schedule`, `what_is_your_name`, `credit_score`, `gas_type`, `maybe`, `order_checks`, `do_you_have_pets`, `oil_change_when`, `schedule_meeting`, `interest_rate`, `rollover_401k`, `how_old_are_you`, `last_maintenance`, `smart_home`, `book_hotel`, `freeze_account`, `nutrition_info`, `bill_balance`, `improve_credit_score`, `pto_balance`, `replacement_card_duration`, `travel_suggestion`, `calendar_update`, `transfer`, `vaccines`, `update_playlist`, `mpg`, `schedule_maintenance`, `confirm_reservation`, `repeat`, `restaurant_reservation`, `meaning_of_life`, `gas`, `cancel_reservation`, `international_fees`, `routing`, `meal_suggestion`, `time`, `change_speed`, `new_card`, `redeem_rewards`, `insurance_change`, `insurance`, `play_music`, `credit_limit`, `balance`, `goodbye`, `are_you_a_bot`, `restaurant_reviews`, `todo_list_update`, `rewards_balance`, `no`, `spelling`, `what_can_i_ask_you`, `order`, `reset_settings`, `shopping_list`, `order_status`, `ingredient_substitution`, `food_last`, `transactions`, `make_call`, `travel_notification`, `who_made_you`, `share_location`, `damaged_card`, `next_song`, `oil_change_how`, `taxes`, `direct_deposit`, 
`who_do_you_work_for`, `yes`, `exchange_rate`, `definition`, `what_are_your_hobbies`, `expiration_date`, `car_rental`, `tire_pressure`, `accept_reservations`, `calculator`, `account_blocked`, `how_busy`, `distance`, `book_flight`, `credit_limit_change`, `report_fraud`, `pay_bill`, `measurement_conversion`, `where_are_you_from`, `pto_request`, `travel_alert`, `flip_coin`, `fun_fact`, `traffic`, `greeting`, `oos` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_large_finetuned_clinc_en_5.2.0_3.0_1701231990765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_large_finetuned_clinc_en_5.2.0_3.0_1701231990765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_large_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_large_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large_finetuned.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_philschmid_large_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/roberta-large-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_sst2_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_sst2_en.md new file mode 100644 index 000000000000..c8cdc8a2f728 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_large_sst2_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_philschmid_large_sst2 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-large-sst2` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_large_sst2_en_5.2.0_3.0_1701232668677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_large_sst2_en_5.2.0_3.0_1701232668677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_large_sst2","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_large_sst2","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.large").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_philschmid_large_sst2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/roberta-large-sst2 +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+SST2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc_en.md new file mode 100644 index 000000000000..0fba9dc8d55b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Mini Cased model (from philschmid) +author: John Snow Labs +name: roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `MiniLMv2-L12-H384-distilled-finetuned-clinc` is an English model originally trained by `philschmid`. 
+ +## Predicted Entities + +`international_visa`, `distance`, `gas`, `what_are_your_hobbies`, `whisper_mode`, `travel_notification`, `pay_bill`, `alarm`, `ingredient_substitution`, `order`, `greeting`, `directions`, `tire_pressure`, `nutrition_info`, `bill_balance`, `change_ai_name`, `weather`, `update_playlist`, `payday`, `restaurant_reservation`, `transactions`, `translate`, `carry_on`, `find_phone`, `oos`, `fun_fact`, `rewards_balance`, `measurement_conversion`, `what_song`, `flip_coin`, `cancel_reservation`, `what_is_your_name`, `todo_list`, `who_made_you`, `transfer`, `w2`, `sync_device`, `yes`, `where_are_you_from`, `reminder_update`, `calculator`, `credit_score`, `who_do_you_work_for`, `travel_suggestion`, `international_fees`, `repeat`, `calories`, `credit_limit_change`, `are_you_a_bot`, `redeem_rewards`, `book_hotel`, `how_old_are_you`, `interest_rate`, `reminder`, `timezone`, `user_name`, `card_declined`, `routing`, `make_call`, `income`, `book_flight`, `what_can_i_ask_you`, `change_speed`, `pto_request`, `application_status`, `change_accent`, `freeze_account`, `change_language`, `todo_list_update`, `calendar_update`, `timer`, `pto_balance`, `oil_change_when`, `gas_type`, `accept_reservations`, `pto_request_status`, `damaged_card`, `schedule_meeting`, `report_lost_card`, `car_rental`, `improve_credit_score`, `do_you_have_pets`, `expiration_date`, `food_last`, `insurance_change`, `shopping_list_update`, `pin_change`, `order_status`, `schedule_maintenance`, `account_blocked`, `min_payment`, `apr`, `plug_type`, `tire_change`, `spending_history`, `direct_deposit`, `balance`, `reset_settings`, `insurance`, `spelling`, `report_fraud`, `last_maintenance`, `no`, `vaccines`, `cook_time`, `next_song`, `bill_due`, `restaurant_suggestion`, `text`, `smart_home`, `ingredients_list`, `recipe`, `replacement_card_duration`, `date`, `play_music`, `flight_status`, `roll_dice`, `current_location`, `restaurant_reviews`, `shopping_list`, `change_volume`, `new_card`, 
`travel_alert`, `cancel`, `tell_joke`, `order_checks`, `uber`, `next_holiday`, `meaning_of_life`, `calendar`, `rollover_401k`, `oil_change_how`, `confirm_reservation`, `how_busy`, `credit_limit`, `maybe`, `meal_suggestion`, `thank_you`, `exchange_rate`, `goodbye`, `definition`, `pto_used`, `mpg`, `time`, `lost_luggage`, `change_user_name`, `taxes`, `traffic`, `share_location`, `jump_start`, `meeting_schedule` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701232871651.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc_en_5.2.0_3.0_1701232871651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.distilled_v2_mini_lm_mini_finetuned.by_philschmid").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_philschmid_minilmv2_l12_h384_distilled_finetuned_clinc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|132.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/philschmid/MiniLMv2-L12-H384-distilled-finetuned-clinc +- https://paperswithcode.com/sota?task=Text+Classification&dataset=clinc_oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel_en.md new file mode 100644 index 000000000000..0ef66d7b7a7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from platzi) +author: John Snow Labs +name: roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi-distilroberta-base-mrpc-glue-omar-espejel` is an English model originally trained by `platzi`. 
+ +## Predicted Entities + +`equivalent`, `not_equivalent` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel_en_5.2.0_3.0_1701230089500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel_en_5.2.0_3.0_1701230089500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.glue.distilled_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_platzi_distil_base_mrpc_glue_omar_espejel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-omar-espejel +- https://paperswithcode.com/sota?task=Text+Classification&dataset=glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_policyberta_7d_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_policyberta_7d_en.md new file mode 100644 index 000000000000..5cc91151761f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_policyberta_7d_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from niksmer) +author: John Snow Labs +name: roberta_classifier_policyberta_7d +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `PolicyBERTa-7d` is an English model originally trained by `niksmer`. 
+ +## Predicted Entities + +`welfare and quality of life`, `fabric of society`, `external relations`, `freedom and democracy`, `economy`, `political system`, `social groups` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_policyberta_7d_en_5.2.0_3.0_1701233188536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_policyberta_7d_en_5.2.0_3.0_1701233188536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_policyberta_7d","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_policyberta_7d","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.policyberta.by_niksmer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_policyberta_7d| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/niksmer/PolicyBERTa-7d +- https://manifesto-project.wzb.eu/ +- https://manifesto-project.wzb.eu/datasets +- https://manifesto-project.wzb.eu/down/papers/handbook_2021_version_5.pdf +- https://manifesto-project.wzb.eu/down/tutorials/main-dataset.html#measuring-parties-left-right-positions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_programming_lang_identifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_programming_lang_identifier_en.md new file mode 100644 index 000000000000..272ffb218d4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_programming_lang_identifier_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from addy88) +author: John Snow Labs +name: roberta_classifier_programming_lang_identifier +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `programming-lang-identifier` is an English model originally trained by `addy88`. 
+ +## Predicted Entities + +`ruby`, `javascript`, `python`, `java`, `go`, `php` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_programming_lang_identifier_en_5.2.0_3.0_1701233186432.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_programming_lang_identifier_en_5.2.0_3.0_1701233186432.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_programming_lang_identifier","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_programming_lang_identifier","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.lang.by_addy88").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_programming_lang_identifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|314.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/addy88/programming-lang-identifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_pump_intent_test_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_pump_intent_test_en.md new file mode 100644 index 000000000000..8e7338e9568b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_pump_intent_test_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from mp6kv) +author: John Snow Labs +name: roberta_classifier_pump_intent_test +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pump_intent_test` is an English model originally trained by `mp6kv`. 
+ +## Predicted Entities + +`value`, `clarification`, `testing` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_pump_intent_test_en_5.2.0_3.0_1701233470267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_pump_intent_test_en_5.2.0_3.0_1701233470267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_pump_intent_test","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_pump_intent_test","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.pump_intent_test.roberta.by_mp6kv").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_pump_intent_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mp6kv/pump_intent_test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_purchase_intention_english_large_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_purchase_intention_english_large_en.md new file mode 100644 index 000000000000..eb116800f6b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_purchase_intention_english_large_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Large Cased model (from j-hartmann) +author: John Snow Labs +name: roberta_classifier_purchase_intention_english_large +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `purchase-intention-english-roberta-large` is an English model originally trained by `j-hartmann`. 
+ +## Predicted Entities + +`yes`, `no` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_purchase_intention_english_large_en_5.2.0_3.0_1701233742486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_purchase_intention_english_large_en_5.2.0_3.0_1701233742486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_purchase_intention_english_large","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_purchase_intention_english_large","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.large.by_j_hartmann").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_purchase_intention_english_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/j-hartmann/purchase-intention-english-roberta-large +- https://journals.sagepub.com/doi/full/10.1177/00222437211037258 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qandaclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qandaclassifier_en.md new file mode 100644 index 000000000000..1328124f6d32 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qandaclassifier_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from p-christ) +author: John Snow Labs +name: roberta_classifier_qandaclassifier +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `QandAClassifier` is an English model originally trained by `p-christ`. 
+ +## Predicted Entities + +`ACCEPTED`, `REJECTED` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_qandaclassifier_en_5.2.0_3.0_1701233737590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_qandaclassifier_en_5.2.0_3.0_1701233737590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qandaclassifier","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qandaclassifier","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_p_christ").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_qandaclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/p-christ/QandAClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qnli_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qnli_base_en.md new file mode 100644 index 000000000000..2155d00ccbef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qnli_base_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from JeremiahZ) +author: John Snow Labs +name: roberta_classifier_qnli_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-qnli` is an English model originally trained by `JeremiahZ`. 
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_qnli_base_en_5.2.0_3.0_1701234012970.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_qnli_base_en_5.2.0_3.0_1701234012970.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qnli_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qnli_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.qnli.glue.base.by_JeremiahZ").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_qnli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JeremiahZ/roberta-base-qnli +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+QNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qqp_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qqp_base_en.md new file mode 100644 index 000000000000..f8269a748152 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_qqp_base_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from JeremiahZ) +author: John Snow Labs +name: roberta_classifier_qqp_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-qqp` is an English model originally trained by `JeremiahZ`.
+ +## Predicted Entities + +`not_duplicate`, `duplicate` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_qqp_base_en_5.2.0_3.0_1701233702712.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_qqp_base_en_5.2.0_3.0_1701233702712.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qqp_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_qqp_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.qqp.glue.base.by_JeremiahZ").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_qqp_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JeremiahZ/roberta-base-qqp +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+QQP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_question_intimacy_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_question_intimacy_en.md new file mode 100644 index 000000000000..7258fa30377b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_question_intimacy_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from pedropei) +author: John Snow Labs +name: roberta_classifier_question_intimacy +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `question-intimacy` is an English model originally trained by `pedropei`.
+ +## Predicted Entities + +`intimacy` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_question_intimacy_en_5.2.0_3.0_1701234013401.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_question_intimacy_en_5.2.0_3.0_1701234013401.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_question_intimacy","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_question_intimacy","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.question_intimacy.by_pedropei").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_question_intimacy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|469.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/pedropei/question-intimacy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reactiongif_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reactiongif_en.md new file mode 100644 index 000000000000..77f71171e761 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reactiongif_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from julien-c) +author: John Snow Labs +name: roberta_classifier_reactiongif +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reactiongif-roberta` is an English model originally trained by `julien-c`.
+ +## Predicted Entities + +`awww`, `agree`, `scared`, `oh_snap`, `hug`, `happy_dance`, `sorry`, `oops`, `yes`, `slow_clap`, `want`, `please`, `good_luck`, `sigh`, `thank_you`, `dance`, `high_five`, `you_got_this`, `yawn`, `thumbs_up`, `fist_bump`, `idk`, `shrug`, `ok`, `yolo`, `smh`, `shocked`, `facepalm`, `kiss`, `no`, `deal_with_it`, `hearts`, `applause`, `popcorn`, `thumbs_down`, `do_not_want`, `seriously`, `omg`, `eye_roll`, `wink`, `win`, `mic_drop`, `eww` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_reactiongif_en_5.2.0_3.0_1701230396448.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_reactiongif_en_5.2.0_3.0_1701230396448.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_reactiongif","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_reactiongif","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_julien_c").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_reactiongif| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/julien-c/reactiongif-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_paragraphs_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_paragraphs_es.md new file mode 100644 index 000000000000..34c05ff487fd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_paragraphs_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_classifier_readability_spanish_3class_paragraphs RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: roberta_classifier_readability_spanish_3class_paragraphs +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_readability_spanish_3class_paragraphs` is a Castilian, Spanish model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_3class_paragraphs_es_5.2.0_3.0_1701237459588.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_3class_paragraphs_es_5.2.0_3.0_1701237459588.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_3class_paragraphs","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_3class_paragraphs","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_readability_spanish_3class_paragraphs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|441.9 MB| + +## References + +https://huggingface.co/hackathon-pln-es/readability-es-3class-paragraphs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_sentences_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_sentences_es.md new file mode 100644 index 000000000000..57af57e4c8e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_3class_sentences_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_classifier_readability_spanish_3class_sentences RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: roberta_classifier_readability_spanish_3class_sentences +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_readability_spanish_3class_sentences` is a Castilian, Spanish model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_3class_sentences_es_5.2.0_3.0_1701230699901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_3class_sentences_es_5.2.0_3.0_1701230699901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_3class_sentences","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_3class_sentences","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_readability_spanish_3class_sentences| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|442.1 MB| + +## References + +https://huggingface.co/hackathon-pln-es/readability-es-3class-sentences \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_paragraphs_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_paragraphs_es.md new file mode 100644 index 000000000000..ca52911a9d43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_paragraphs_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_classifier_readability_spanish_paragraphs RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: roberta_classifier_readability_spanish_paragraphs +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_readability_spanish_paragraphs` is a Castilian, Spanish model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_paragraphs_es_5.2.0_3.0_1701234229649.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_paragraphs_es_5.2.0_3.0_1701234229649.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_paragraphs","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_paragraphs","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_readability_spanish_paragraphs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|444.4 MB| + +## References + +https://huggingface.co/hackathon-pln-es/readability-es-paragraphs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_sentences_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_sentences_es.md new file mode 100644 index 000000000000..9213af6250c9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_readability_spanish_sentences_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_classifier_readability_spanish_sentences RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: roberta_classifier_readability_spanish_sentences +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_readability_spanish_sentences` is a Castilian, Spanish model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_sentences_es_5.2.0_3.0_1701233946951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_readability_spanish_sentences_es_5.2.0_3.0_1701233946951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_sentences","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_readability_spanish_sentences","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_readability_spanish_sentences| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|444.7 MB| + +## References + +https://huggingface.co/hackathon-pln-es/readability-es-sentences \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reranking_model_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reranking_model_en.md new file mode 100644 index 000000000000..a41916f72d5a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_reranking_model_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from Cathy) +author: John Snow Labs +name: roberta_classifier_reranking_model +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reranking_model` is an English model originally trained by `Cathy`.
+ +## Predicted Entities + +`entailment`, `contradiction`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_reranking_model_en_5.2.0_3.0_1701238318014.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_reranking_model_en_5.2.0_3.0_1701238318014.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_reranking_model","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_reranking_model","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_cathy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_reranking_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Cathy/reranking_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rile_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rile_en.md new file mode 100644 index 000000000000..0b1aac40ca7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rile_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from niksmer) +author: John Snow Labs +name: roberta_classifier_rile +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `RoBERTa-RILE` is an English model originally trained by `niksmer`. + +## Predicted Entities + +`Neutral`, `Left`, `Right` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_rile_en_5.2.0_3.0_1701231044788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_rile_en_5.2.0_3.0_1701231044788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rile","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rile","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.rile.roberta.by_niksmer").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_rile| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.5 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/niksmer/RoBERTa-RILE +- https://manifesto-project.wzb.eu/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_dutch_base_toxic_comments_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_dutch_base_toxic_comments_nl.md new file mode 100644 index 000000000000..e4be1c80e6b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_dutch_base_toxic_comments_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch RobertaForSequenceClassification Base Cased model (from ml6team) +author: John Snow Labs +name: roberta_classifier_robbert_dutch_base_toxic_comments +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert-dutch-base-toxic-comments` is a Dutch model originally trained by `ml6team`. 
+ +## Predicted Entities + +`non-toxic`, `toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_dutch_base_toxic_comments_nl_5.2.0_3.0_1701231389425.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_dutch_base_toxic_comments_nl_5.2.0_3.0_1701231389425.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_dutch_base_toxic_comments","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_dutch_base_toxic_comments","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbert_dutch_base_toxic_comments| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/ml6team/robbert-dutch-base-toxic-comments +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_custom_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_custom_nl.md new file mode 100644 index 000000000000..eab7f3aef135 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_custom_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch RoBertaForSequenceClassification Cased model (from btjiong) +author: John Snow Labs +name: roberta_classifier_robbert_twitter_sentiment_custom +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert-twitter-sentiment-custom` is a Dutch model originally trained by `btjiong`. 
+ +## Predicted Entities + +`POSITIEF`, `NEGATIEF`, `NEUTRAAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_twitter_sentiment_custom_nl_5.2.0_3.0_1701231839627.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_twitter_sentiment_custom_nl_5.2.0_3.0_1701231839627.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_twitter_sentiment_custom","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_twitter_sentiment_custom","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta.sentiment_twitter.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbert_twitter_sentiment_custom| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/btjiong/robbert-twitter-sentiment-custom +- https://paperswithcode.com/sota?task=Text+Classification&dataset=dutch_social \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_en.md new file mode 100644 index 000000000000..0f4b4c268b2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_twitter_sentiment_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from btjiong) +author: John Snow Labs +name: roberta_classifier_robbert_twitter_sentiment +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert-twitter-sentiment` is an English model originally trained by `btjiong`.
+ +## Predicted Entities + +`POSITIEF`, `NEGATIEF`, `NEUTRAAL` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_twitter_sentiment_en_5.2.0_3.0_1701234316426.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_twitter_sentiment_en_5.2.0_3.0_1701234316426.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_twitter_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_twitter_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment_twitter.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbert_twitter_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/btjiong/robbert-twitter-sentiment +- https://paperswithcode.com/sota?task=Text+Classification&dataset=dutch_social \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_base_hebban_reviews_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_base_hebban_reviews_nl.md new file mode 100644 index 000000000000..60ef1262186d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_base_hebban_reviews_nl.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Dutch RobertaForSequenceClassification Base Cased model (from BramVanroy) +author: John Snow Labs +name: roberta_classifier_robbert_v2_dutch_base_hebban_reviews +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert-v2-dutch-base-hebban-reviews` is a Dutch model originally trained by `BramVanroy`. 
+ +## Predicted Entities + +`neutral`, `positive`, `negative` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_v2_dutch_base_hebban_reviews_nl_5.2.0_3.0_1701234573220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_v2_dutch_base_hebban_reviews_nl_5.2.0_3.0_1701234573220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_v2_dutch_base_hebban_reviews","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_v2_dutch_base_hebban_reviews","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta.v2_base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbert_v2_dutch_base_hebban_reviews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/BramVanroy/robbert-v2-dutch-base-hebban-reviews +- https://paperswithcode.com/sota?task=sentiment+analysis&dataset=BramVanroy%2Fhebban-reviews+-+filtered_sentiment+-+2.0.0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_sentiment_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_sentiment_nl.md new file mode 100644 index 000000000000..c26e233ea7f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbert_v2_dutch_sentiment_nl.md @@ -0,0 +1,112 @@ +--- +layout: model +title: Dutch RobertaForSequenceClassification Cased model (from DTAI-KULeuven) +author: John Snow Labs +name: roberta_classifier_robbert_v2_dutch_sentiment +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert-v2-dutch-sentiment` is a Dutch model originally trained by `DTAI-KULeuven`. 
+ +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_v2_dutch_sentiment_nl_5.2.0_3.0_1701234251873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbert_v2_dutch_sentiment_nl_5.2.0_3.0_1701234251873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_v2_dutch_sentiment","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbert_v2_dutch_sentiment","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta.sentiment.v2").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbert_v2_dutch_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/DTAI-KULeuven/robbert-v2-dutch-sentiment +- https://hebban.nl +- https://www.aclweb.org/anthology/2020.findings-emnlp.292 +- https://people.cs.kuleuven.be/~pieter.delobelle +- https://thomaswinters.be +- https://people.cs.kuleuven.be/~bettina.berendt/ +- https://paperswithcode.com/sota?task=Text+Classification&dataset=dbrd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbertje_merged_dutch_sentiment_nl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbertje_merged_dutch_sentiment_nl.md new file mode 100644 index 000000000000..5d9d007c85e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_robbertje_merged_dutch_sentiment_nl.md @@ -0,0 +1,111 @@ +--- +layout: model +title: Dutch RobertaForSequenceClassification Cased model (from DTAI-KULeuven) +author: John Snow Labs +name: roberta_classifier_robbertje_merged_dutch_sentiment +date: 2023-11-29 +tags: [nl, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbertje-merged-dutch-sentiment` is a Dutch model originally trained by `DTAI-KULeuven`. 
+ +## Predicted Entities + +`Negative`, `Positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbertje_merged_dutch_sentiment_nl_5.2.0_3.0_1701234494622.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_robbertje_merged_dutch_sentiment_nl_5.2.0_3.0_1701234494622.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbertje_merged_dutch_sentiment","nl") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_robbertje_merged_dutch_sentiment","nl") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("nl.classify.roberta.sentiment.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_robbertje_merged_dutch_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|278.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/DTAI-KULeuven/robbertje-merged-dutch-sentiment +- https://www.aclweb.org/anthology/2020.findings-emnlp.292 +- https://people.cs.kuleuven.be/~pieter.delobelle +- https://thomaswinters.be +- https://people.cs.kuleuven.be/~bettina.berendt/ +- https://paperswithcode.com/sota?task=Text+Classification&dataset=dbrd \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rota_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rota_en.md new file mode 100644 index 000000000000..97765049a4d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rota_en.md @@ -0,0 +1,110 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from rti-international) +author: John Snow Labs +name: roberta_classifier_rota +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rota` is an English model originally trained by `rti-international`.
+ +## Predicted Entities + +`TRAFFICKING - OTHER CONTROLLED SUBSTANCES`, `INVASION OF PRIVACY`, `HABITUAL OFFENDER`, `LARCENY/THEFT - VALUE UNKNOWN`, `TAX LAW (FEDERAL ONLY)`, `MANSLAUGHTER - VEHICULAR`, `CONTROLLED SUBSTANCE - OFFENSE UNSPECIFIED`, `WEAPON OFFENSE`, `RAPE - FORCE`, `DRUG OFFENSES - VIOLATION/DRUG UNSPECIFIED`, `TRAFFIC OFFENSES - MINOR`, `FLIGHT TO AVOID PROSECUTION`, `BRIBERY AND CONFLICT OF INTEREST`, `KIDNAPPING`, `AUTO THEFT`, `RIOTING`, `PROPERTY OFFENSES - OTHER`, `EMBEZZLEMENT (FEDERAL ONLY)`, `CHILD ABUSE`, `HEROIN VIOLATION - OFFENSE UNSPECIFIED`, `BLACKMAIL/EXTORTION/INTIMIDATION`, `GRAND LARCENY - THEFT OVER $200`, `DRIVING UNDER INFLUENCE - DRUGS`, `EMBEZZLEMENT`, `FORGERY/FRAUD`, `POSSESSION/USE - MARIJUANA/HASHISH`, `STOLEN PROPERTY - TRAFFICKING`, `FORGERY (FEDERAL ONLY)`, `PROBATION VIOLATION`, `FRAUD (FEDERAL ONLY)`, `UNARMED ROBBERY`, `ARSON`, `COCAINE OR CRACK VIOLATION OFFENSE UNSPECIFIED`, `SIMPLE ASSAULT`, `DESTRUCTION OF PROPERTY`, `POSSESSION/USE - DRUG UNSPECIFIED`, `COUNTERFEITING (FEDERAL ONLY)`, `FORCIBLE SODOMY`, `RAPE - STATUTORY - NO FORCE`, `UNAUTHORIZED USE OF VEHICLE`, `POSSESSION/USE - OTHER CONTROLLED SUBSTANCES`, `TRAFFICKING - DRUG UNSPECIFIED`, `IMMIGRATION VIOLATIONS`, `VOLUNTARY/NONNEGLIGENT MANSLAUGHTER`, `DRIVING WHILE INTOXICATED`, `PETTY LARCENY - THEFT UNDER $200`, `HIT/RUN DRIVING - PROPERTY DAMAGE`, `MURDER`, `REGULATORY OFFENSES (FEDERAL ONLY)`, `FAMILY RELATED OFFENSES`, `POSSESSION/USE - HEROIN`, `PUBLIC ORDER OFFENSES - OTHER`, `DRIVING UNDER THE INFLUENCE`, `TRESPASSING`, `CONTRIBUTING TO DELINQUENCY OF A MINOR`, `ARMED ROBBERY`, `FELONY - UNSPECIFIED`, `UNSPECIFIED HOMICIDE`, `MARIJUANA/HASHISH VIOLATION - OFFENSE UNSPECIFIED`, `TRAFFICKING - COCAINE OR CRACK`, `COMMERCIALIZED VICE`, `TRAFFICKING - HEROIN`, `LIQUOR LAW VIOLATIONS`, `ASSAULTING PUBLIC OFFICER`, `JUVENILE OFFENSES`, `VIOLENT OFFENSES - OTHER`, `MISDEMEANOR UNSPECIFIED`, `HIT AND RUN DRIVING`, `CONTEMPT OF COURT`, `BURGLARY`, 
`MANSLAUGHTER - NON-VEHICULAR`, `PAROLE VIOLATION`, `DRUNKENNESS/VAGRANCY/DISORDERLY CONDUCT`, `STOLEN PROPERTY - RECEIVING`, `TRAFFICKING MARIJUANA/HASHISH`, `SEXUAL ASSAULT - OTHER`, `LEWD ACT WITH CHILDREN`, `POSSESSION/USE - COCAINE OR CRACK`, `OBSTRUCTION - LAW ENFORCEMENT`, `RACKETEERING/EXTORTION (FEDERAL ONLY)`, `AGGRAVATED ASSAULT`, `MORALS/DECENCY - OFFENSE`, `ESCAPE FROM CUSTODY` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_rota_en_5.2.0_3.0_1701234468665.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_rota_en_5.2.0_3.0_1701234468665.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rota","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rota","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_rti_international").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_rota| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/rti-international/rota +- https://github.com/RTIInternational/rota +- https://doi.org/10.5281/zenodo.4770492 +- https://www.icpsr.umich.edu/web/NACJD/studies/30799/datadocumentation# +- https://web.archive.org/web/20201021001250/https://www.icpsr.umich.edu/web/pages/NACJD/guides/ncrp.html \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rte_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rte_base_en.md new file mode 100644 index 000000000000..9a117fceb5e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_rte_base_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from JeremiahZ) +author: John Snow Labs +name: roberta_classifier_rte_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta-base-rte` is an English model originally trained by `JeremiahZ`.
+ +## Predicted Entities + +`not_entailment`, `entailment` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_rte_base_en_5.2.0_3.0_1701234812320.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_rte_base_en_5.2.0_3.0_1701234812320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rte_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_rte_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.rte.glue.base.by_JeremiahZ").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_rte_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/JeremiahZ/roberta-base-rte +- https://paperswithcode.com/sota?task=Text+Classification&dataset=GLUE+RTE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ruperta_base_sentiment_analysis_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ruperta_base_sentiment_analysis_es.md new file mode 100644 index 000000000000..ef80eaf49b29 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_ruperta_base_sentiment_analysis_es.md @@ -0,0 +1,107 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Base Cased model (from edumunozsala) +author: John Snow Labs +name: roberta_classifier_ruperta_base_sentiment_analysis +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `RuPERTa_base_sentiment_analysis_es` is a Spanish model originally trained by `edumunozsala`. 
+ +## Predicted Entities + +`Positivo`, `Negativo` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_ruperta_base_sentiment_analysis_es_5.2.0_3.0_1701232163281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_ruperta_base_sentiment_analysis_es_5.2.0_3.0_1701232163281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_ruperta_base_sentiment_analysis","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_ruperta_base_sentiment_analysis","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.sentiment.base.by_edumunozsala").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_ruperta_base_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|472.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/edumunozsala/RuPERTa_base_sentiment_analysis_es +- https://github.com/edumunozsala \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sagemaker_base_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sagemaker_base_emotion_en.md new file mode 100644 index 000000000000..ec8663e5d4f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sagemaker_base_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from Jorgeutd) +author: John Snow Labs +name: roberta_classifier_sagemaker_base_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sagemaker-roberta-base-emotion` is an English model originally trained by `Jorgeutd`. 
+ +## Predicted Entities + +`joy`, `anger`, `love`, `fear`, `surprise`, `sadness` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_sagemaker_base_emotion_en_5.2.0_3.0_1701234828197.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_sagemaker_base_emotion_en_5.2.0_3.0_1701234828197.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sagemaker_base_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sagemaker_base_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.emotion.base.by_Jorgeutd").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_sagemaker_base_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|434.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Jorgeutd/sagemaker-roberta-base-emotion +- https://paperswithcode.com/sota?task=Multi+Class+Text+Classification&dataset=emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_scim_distilroberta_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_scim_distilroberta_en.md new file mode 100644 index 000000000000..afc3e80ca478 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_scim_distilroberta_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from aristotletan) +author: John Snow Labs +name: roberta_classifier_scim_distilroberta +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scim-distilroberta` is an English model originally trained by `aristotletan`. 
+ +## Predicted Entities + +`Utilisation of Proceeds`, `Conditions Precedent`, `Negative Covenant`, `Information Covenant`, `Rating`, `Designated Accounts`, `Events of Default`, `Positive Covenant`, `Conditions Subsequent`, `Conflict of Interest`, `Financial Covenant` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_scim_distilroberta_en_5.2.0_3.0_1701235143602.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_scim_distilroberta_en_5.2.0_3.0_1701235143602.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_scim_distilroberta","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_scim_distilroberta","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.distilled.by_aristotletan").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_scim_distilroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/aristotletan/scim-distilroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sentiment_large_english_3_classes_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sentiment_large_english_3_classes_en.md new file mode 100644 index 000000000000..cd7efa3b1129 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sentiment_large_english_3_classes_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Large Cased model (from j-hartmann) +author: John Snow Labs +name: roberta_classifier_sentiment_large_english_3_classes +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment-roberta-large-english-3-classes` is an English model originally trained by `j-hartmann`. 
+ +## Predicted Entities + +`negative`, `positive`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_sentiment_large_english_3_classes_en_5.2.0_3.0_1701235721867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_sentiment_large_english_3_classes_en_5.2.0_3.0_1701235721867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sentiment_large_english_3_classes","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sentiment_large_english_3_classes","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment.large.by_j_hartmann").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_sentiment_large_english_3_classes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/j-hartmann/sentiment-roberta-large-english-3-classes +- https://journals.sagepub.com/doi/full/10.1177/00222437211037258 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_snli_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_snli_base_en.md new file mode 100644 index 000000000000..d1458d2d7a39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_snli_base_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from boychaboy) +author: John Snow Labs +name: roberta_classifier_snli_base +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SNLI_roberta-base` is an English model originally trained by `boychaboy`. 
+ +## Predicted Entities + +`contradiction`, `entailment`, `neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_snli_base_en_5.2.0_3.0_1701232537989.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_snli_base_en_5.2.0_3.0_1701232537989.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_snli_base","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_snli_base","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.snli.roberta.base.by_boychaboy").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_snli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/boychaboy/SNLI_roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_base_apeach_ko.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_base_apeach_ko.md new file mode 100644 index 000000000000..1bedcfac9d65 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_base_apeach_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean RobertaForSequenceClassification Base Cased model (from jason9693) +author: John Snow Labs +name: roberta_classifier_soongsil_bert_base_apeach +date: 2023-11-29 +tags: [ko, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `soongsil-bert-base-apeach` is a Korean model originally trained by `jason9693`. 
+ +## Predicted Entities + +`Default`, `Spoiled` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_base_apeach_ko_5.2.0_3.0_1701235422045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_base_apeach_ko_5.2.0_3.0_1701235422045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_base_apeach","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_base_apeach","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.roberta.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_soongsil_bert_base_apeach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|368.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jason9693/soongsil-bert-base-apeach \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_small_apeach_ko.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_small_apeach_ko.md new file mode 100644 index 000000000000..2275afae0672 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_small_apeach_ko.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Korean RobertaForSequenceClassification Small Cased model (from jason9693) +author: John Snow Labs +name: roberta_classifier_soongsil_bert_small_apeach +date: 2023-11-29 +tags: [ko, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `soongsil-bert-small-apeach` is a Korean model originally trained by `jason9693`. 
+ +## Predicted Entities + +`Default`, `Spoiled` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_small_apeach_ko_5.2.0_3.0_1701232798287.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_small_apeach_ko_5.2.0_3.0_1701232798287.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_small_apeach","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_small_apeach","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.roberta.small").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_soongsil_bert_small_apeach| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|209.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jason9693/soongsil-bert-small-apeach \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_wellness_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_wellness_en.md new file mode 100644 index 000000000000..816d96dd05c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsil_bert_wellness_en.md @@ -0,0 +1,101 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from kco4776) +author: John Snow Labs +name: roberta_classifier_soongsil_bert_wellness +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `soongsil-bert-wellness` is an English model originally trained by `kco4776`. 
+ +## Predicted Entities + +`일반대화`, `부가설명`, `상태`, `원인`, `자가치료`, `내원이유`, `모호함`, `배경`, `감정`, `증상`, `현재상태`, `치료이력` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_wellness_en_5.2.0_3.0_1701236076612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsil_bert_wellness_en_5.2.0_3.0_1701236076612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_wellness","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsil_bert_wellness","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_soongsil_bert_wellness| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|368.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/kco4776/soongsil-bert-wellness +- https://github.com/jason9693/Soongsil-BERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsilbert_base_beep_ko.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsilbert_base_beep_ko.md new file mode 100644 index 000000000000..b129f3aefc5b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_soongsilbert_base_beep_ko.md @@ -0,0 +1,117 @@ +--- +layout: model +title: Korean RobertaForSequenceClassification Base Cased model (from jason9693) +author: John Snow Labs +name: roberta_classifier_soongsilbert_base_beep +date: 2023-11-29 +tags: [ko, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: ko +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SoongsilBERT-base-beep` is a Korean model originally trained by `jason9693`. 
+ +## Predicted Entities + +`hate`, `offensive`, `none` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsilbert_base_beep_ko_5.2.0_3.0_1701235671824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_soongsilbert_base_beep_ko_5.2.0_3.0_1701235671824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsilbert_base_beep","ko") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_soongsilbert_base_beep","ko") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("ko.classify.roberta.base.by_jason9693").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_soongsilbert_base_beep| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ko| +|Size:|368.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jason9693/SoongsilBERT-base-beep +- https://github.com/e9t/nsmc +- https://github.com/naver/nlp-challenge +- https://github.com/google-research-datasets/paws +- https://github.com/kakaobrain/KorNLUDatasets +- https://github.com/songys/Question_pair +- https://korquad.github.io/category/1.0_KOR.html +- https://github.com/kocohub/korean-hate-speech +- https://github.com/monologg/KoELECTRA +- https://github.com/SKTBrain/KoBERT +- https://github.com/tbai2019/HanBert-54k-N +- https://github.com/monologg/HanBert-Transformers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_all_mnli_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_all_mnli_en.md new file mode 100644 index 000000000000..e4b8c6927845 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_all_mnli_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from veronica320) +author: John Snow Labs +name: roberta_classifier_spte_large_all_mnli +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using 
Spark NLP. `SPTE_roberta-large-mnli_all` is an English model originally trained by `veronica320`. + +## Predicted Entities + +`NEUTRAL`, `CONTRADICTION`, `ENTAILMENT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_spte_large_all_mnli_en_5.2.0_3.0_1701233405042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_spte_large_all_mnli_en_5.2.0_3.0_1701233405042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_spte_large_all_mnli","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_spte_large_all_mnli","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.all_mnli.large.by_veronica320").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_spte_large_all_mnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/veronica320/SPTE_roberta-large-mnli_all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_mnli_200_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_mnli_200_en.md new file mode 100644 index 000000000000..5c3f4a6c6089 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_spte_large_mnli_200_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Large Cased model (from veronica320) +author: John Snow Labs +name: roberta_classifier_spte_large_mnli_200 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `SPTE_roberta-large-mnli_200` is an English model originally trained by `veronica320`. 
+ +## Predicted Entities + +`NEUTRAL`, `CONTRADICTION`, `ENTAILMENT` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_spte_large_mnli_200_en_5.2.0_3.0_1701236614569.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_spte_large_mnli_200_en_5.2.0_3.0_1701236614569.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_spte_large_mnli_200","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_spte_large_mnli_200","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.mnli_200.large.by_veronica320").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_spte_large_mnli_200| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/veronica320/SPTE_roberta-large-mnli_200 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stop_the_steal_relevancy_analysis_binary_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stop_the_steal_relevancy_analysis_binary_en.md new file mode 100644 index 000000000000..85615cc150b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stop_the_steal_relevancy_analysis_binary_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from sefaozalpadl) +author: John Snow Labs +name: roberta_classifier_stop_the_steal_relevancy_analysis_binary +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stop_the_steal_relevancy_analysis-binary` is an English model originally trained by `sefaozalpadl`. 
+ +## Predicted Entities + +`No`, `Yes` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_stop_the_steal_relevancy_analysis_binary_en_5.2.0_3.0_1701240674536.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_stop_the_steal_relevancy_analysis_binary_en_5.2.0_3.0_1701240674536.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_stop_the_steal_relevancy_analysis_binary","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_stop_the_steal_relevancy_analysis_binary","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.stop_the_steal_relevancy_analysis_binary.by_sefaozalpadl").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_stop_the_steal_relevancy_analysis_binary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/sefaozalpadl/stop_the_steal_relevancy_analysis-binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stress_twitter_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stress_twitter_en.md new file mode 100644 index 000000000000..2916d0fe83a5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_stress_twitter_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from hsaglamlar) +author: John Snow Labs +name: roberta_classifier_stress_twitter +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stress_twitter` is an English model originally trained by `hsaglamlar`. 
+ +## Predicted Entities + +`0`, `1` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_stress_twitter_en_5.2.0_3.0_1701237279518.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_stress_twitter_en_5.2.0_3.0_1701237279518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_stress_twitter","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_stress_twitter","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twitter.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_stress_twitter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/hsaglamlar/stress_twitter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sundanese_base_emotion_su.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sundanese_base_emotion_su.md new file mode 100644 index 000000000000..8bb8e3bd45d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_sundanese_base_emotion_su.md @@ -0,0 +1,110 @@ +--- +layout: model +title: Sundanese RobertaForSequenceClassification Base Cased model (from w11wo) +author: John Snow Labs +name: roberta_classifier_sundanese_base_emotion +date: 2023-11-29 +tags: [su, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: su +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sundanese-roberta-base-emotion-classifier` is a Sundanese model originally trained by `w11wo`. 
+ +## Predicted Entities + +`sadness`, `anger`, `joy`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_sundanese_base_emotion_su_5.2.0_3.0_1701235152839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_sundanese_base_emotion_su_5.2.0_3.0_1701235152839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sundanese_base_emotion","su") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_sundanese_base_emotion","su") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("su.classify.roberta.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_sundanese_base_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|su| +|Size:|467.8 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/w11wo/sundanese-roberta-base-emotion-classifier +- https://arxiv.org/abs/1907.11692 +- https://hf.co/w11wo/sundanese-roberta-base +- https://github.com/virgantara/sundanese-twitter-dataset +- https://w11wo.github.io/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_superpal_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_superpal_en.md new file mode 100644 index 000000000000..ee1869357dd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_superpal_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from biu-nlp) +author: John Snow Labs +name: roberta_classifier_superpal +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `superpal` is an English model originally trained by `biu-nlp`. 
+ +## Predicted Entities + +`not_aligned`, `aligned` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_superpal_en_5.2.0_3.0_1701236230104.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_superpal_en_5.2.0_3.0_1701236230104.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_superpal","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_superpal","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_biu_nlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_superpal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/biu-nlp/superpal +- https://arxiv.org/pdf/2009.00590 +- https://github.com/oriern/SuperPAL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_geezswitch_ti.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_geezswitch_ti.md new file mode 100644 index 000000000000..4f59b2f22cb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_geezswitch_ti.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Tigrinya roberta_classifier_tigrinya_geezswitch RoBertaForSequenceClassification from fgaim +author: John Snow Labs +name: roberta_classifier_tigrinya_geezswitch +date: 2023-11-29 +tags: [roberta, ti, open_source, sequence_classification, onnx] +task: Text Classification +language: ti +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_tigrinya_geezswitch` is a Tigrinya model originally trained by fgaim. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tigrinya_geezswitch_ti_5.2.0_3.0_1701237546285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tigrinya_geezswitch_ti_5.2.0_3.0_1701237546285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tigrinya_geezswitch","ti")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tigrinya_geezswitch","ti") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tigrinya_geezswitch| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ti| +|Size:|467.1 MB| + +## References + +https://huggingface.co/fgaim/tiroberta-geezswitch \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_sentiment_ti.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_sentiment_ti.md new file mode 100644 index 000000000000..eef77276ce89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tigrinya_sentiment_ti.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Tigrinya roberta_classifier_tigrinya_sentiment RoBertaForSequenceClassification from fgaim +author: John Snow Labs +name: roberta_classifier_tigrinya_sentiment +date: 2023-11-29 +tags: [roberta, ti, open_source, sequence_classification, onnx] +task: Text Classification +language: ti +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_tigrinya_sentiment` is a Tigrinya model originally trained by fgaim. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tigrinya_sentiment_ti_5.2.0_3.0_1701236469260.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tigrinya_sentiment_ti_5.2.0_3.0_1701236469260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tigrinya_sentiment","ti")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tigrinya_sentiment","ti") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tigrinya_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ti| +|Size:|467.1 MB| + +## References + +https://huggingface.co/fgaim/tiroberta-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tonga_tonga_islands_music_genre_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tonga_tonga_islands_music_genre_en.md new file mode 100644 index 000000000000..a2f0b6f1286f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tonga_tonga_islands_music_genre_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_tonga_tonga_islands_music_genre RoBertaForSequenceClassification from luiz826 +author: John Snow Labs +name: roberta_classifier_tonga_tonga_islands_music_genre +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_tonga_tonga_islands_music_genre` is an English model originally trained by luiz826. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tonga_tonga_islands_music_genre_en_5.2.0_3.0_1701235415748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tonga_tonga_islands_music_genre_en_5.2.0_3.0_1701235415748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tonga_tonga_islands_music_genre","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tonga_tonga_islands_music_genre","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tonga_tonga_islands_music_genre| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/luiz826/roberta-to-music-genre \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxic_detector_distilroberta_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxic_detector_distilroberta_en.md new file mode 100644 index 000000000000..6f7228b3e608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxic_detector_distilroberta_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from jpcorb20) +author: John Snow Labs +name: roberta_classifier_toxic_detector_distilroberta +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic-detector-distilroberta` is an English model originally trained by `jpcorb20`. 
+ +## Predicted Entities + +`identity_hate`, `threat`, `obscene`, `severe_toxic`, `insult`, `toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_toxic_detector_distilroberta_en_5.2.0_3.0_1701233678221.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_toxic_detector_distilroberta_en_5.2.0_3.0_1701233678221.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_toxic_detector_distilroberta","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_toxic_detector_distilroberta","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.distilled.by_jpcorb20").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_toxic_detector_distilroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/jpcorb20/toxic-detector-distilroberta +- https://github.com/jpcorb20/toxic-comment-server +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxicity_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxicity_en.md new file mode 100644 index 000000000000..2ac83534d679 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_toxicity_en.md @@ -0,0 +1,112 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from SkolkovoInstitute) +author: John Snow Labs +name: roberta_classifier_toxicity +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_toxicity_classifier` is an English model originally trained by `SkolkovoInstitute`. 
+ +## Predicted Entities + +`neutral`, `toxic` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_toxicity_en_5.2.0_3.0_1701235697267.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_toxicity_en_5.2.0_3.0_1701235697267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_toxicity","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_toxicity","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_skolkovoinstitute").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_toxicity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/SkolkovoInstitute/roberta_toxicity_classifier +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge +- https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification +- https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification +- https://arxiv.org/abs/1907.11692 +- http://creativecommons.org/licenses/by-nc-sa/4.0/ +- http://creativecommons.org/licenses/by-nc-sa/4.0/ \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_offensive_eval_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_offensive_eval_en.md new file mode 100644 index 000000000000..ce833d10f3ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_offensive_eval_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from elozano) +author: John Snow Labs +name: roberta_classifier_tweet_offensive_eval +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_offensive_eval` is an English model originally trained by `elozano`. 
+ +## Predicted Entities + +`Non-Offensive`, `Offensive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_offensive_eval_en_5.2.0_3.0_1701236720446.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_offensive_eval_en_5.2.0_3.0_1701236720446.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_offensive_eval","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_offensive_eval","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet.").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tweet_offensive_eval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/elozano/tweet_offensive_eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_multi_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_multi_en.md new file mode 100644 index 000000000000..6c8c1d72e1d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_multi_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_tweet_topic_19_multi +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet-topic-19-multi` is an English model originally trained by `cardiffnlp`. 
+ +## Predicted Entities + +`film_tv_&_video`, `diaries_&_daily_life`, `other_hobbies`, `music`, `business_&_entrepreneurs`, `relationships`, `gaming`, `youth_&_student_life`, `food_&_dining`, `fitness_&_health`, `news_&_social_concern`, `fashion_&_style`, `family`, `travel_&_adventure`, `science_&_technology`, `learning_&_educational`, `arts_&_culture`, `sports`, `celebrity_&_pop_culture` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_19_multi_en_5.2.0_3.0_1701244214357.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_19_multi_en_5.2.0_3.0_1701244214357.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_19_multi","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_19_multi","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet_topic_19_multi.by_cardiffnlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tweet_topic_19_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/tweet-topic-19-multi +- https://github.com/cardiffnlp/tweeteval +- https://arxiv.org/abs/2202.03829 +- https://github.com/cardiffnlp/timelms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_single_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_single_en.md new file mode 100644 index 000000000000..c51424d3a400 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_19_single_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_tweet_topic_19_single +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet-topic-19-single` is an English model originally trained by `cardiffnlp`. 
+ +## Predicted Entities + +`daily_life`, `arts_&_culture`, `business_&_entrepreneurs`, `pop_culture`, `sports_&_gaming`, `science_&_technology` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_19_single_en_5.2.0_3.0_1701237868397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_19_single_en_5.2.0_3.0_1701237868397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_19_single","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_19_single","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet_topic_19_single.by_cardiffnlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tweet_topic_19_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/tweet-topic-19-single +- https://github.com/cardiffnlp/tweeteval +- https://arxiv.org/abs/2202.03829 +- https://github.com/cardiffnlp/timelms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_multi_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_multi_en.md new file mode 100644 index 000000000000..1a2e31eb1757 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_multi_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_tweet_topic_21_multi +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet-topic-21-multi` is an English model originally trained by `cardiffnlp`. 
+ +## Predicted Entities + +`film_tv_&_video`, `diaries_&_daily_life`, `other_hobbies`, `music`, `business_&_entrepreneurs`, `relationships`, `gaming`, `youth_&_student_life`, `food_&_dining`, `fitness_&_health`, `news_&_social_concern`, `fashion_&_style`, `family`, `travel_&_adventure`, `science_&_technology`, `learning_&_educational`, `arts_&_culture`, `sports`, `celebrity_&_pop_culture` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_21_multi_en_5.2.0_3.0_1701239359717.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_21_multi_en_5.2.0_3.0_1701239359717.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_21_multi","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_21_multi","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet_topic_21_multi.by_cardiffnlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tweet_topic_21_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/tweet-topic-21-multi +- https://github.com/cardiffnlp/tweeteval +- https://arxiv.org/abs/2202.03829 +- https://github.com/cardiffnlp/timelms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_single_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_single_en.md new file mode 100644 index 000000000000..6d393e8fa90a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_tweet_topic_21_single_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_tweet_topic_21_single +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet-topic-21-single` is an English model originally trained by `cardiffnlp`. 
+ +## Predicted Entities + +`daily_life`, `arts_&_culture`, `business_&_entrepreneurs`, `pop_culture`, `sports_&_gaming`, `science_&_technology` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_21_single_en_5.2.0_3.0_1701245992510.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_tweet_topic_21_single_en_5.2.0_3.0_1701245992510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_21_single","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_tweet_topic_21_single","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.tweet_topic_21_single.by_cardiffnlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_tweet_topic_21_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/tweet-topic-21-single +- https://github.com/cardiffnlp/tweeteval +- https://arxiv.org/abs/2202.03829 +- https://github.com/cardiffnlp/timelms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_siebert_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_siebert_en.md new file mode 100644 index 000000000000..f9acec1c6c24 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_siebert_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from shatabdi) +author: John Snow Labs +name: roberta_classifier_twisent_siebert +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twisent_sieBert` is an English model originally trained by `shatabdi`. 
+ +## Predicted Entities + +`POSITIVE`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twisent_siebert_en_5.2.0_3.0_1701237364226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twisent_siebert_en_5.2.0_3.0_1701237364226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twisent_siebert","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twisent_siebert","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twisent_siebert.by_shatabdi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twisent_siebert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shatabdi/twisent_sieBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_twisent_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_twisent_en.md new file mode 100644 index 000000000000..a21cd6a0fea5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twisent_twisent_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from shatabdi) +author: John Snow Labs +name: roberta_classifier_twisent_twisent +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twisent_twisent` is an English model originally trained by `shatabdi`. 
+ +## Predicted Entities + +`POSITIVE`, `NEGATIVE` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twisent_twisent_en_5.2.0_3.0_1701237918836.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twisent_twisent_en_5.2.0_3.0_1701237918836.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twisent_twisent","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twisent_twisent","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twisent_twisent.by_shatabdi").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twisent_twisent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/shatabdi/twisent_twisent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_dec2021_rbam_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_dec2021_rbam_fine_tuned_en.md new file mode 100644 index 000000000000..90134258725d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_dec2021_rbam_fine_tuned_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from MohammadABH) +author: John Snow Labs +name: roberta_classifier_twitter_base_dec2021_rbam_fine_tuned +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-dec2021_rbam_fine_tuned` is an English model originally trained by `MohammadABH`. 
+ +## Predicted Entities + +`neutral`, `attack`, `support` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_dec2021_rbam_fine_tuned_en_5.2.0_3.0_1701243444767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_dec2021_rbam_fine_tuned_en_5.2.0_3.0_1701243444767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_dec2021_rbam_fine_tuned","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_dec2021_rbam_fine_tuned","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twitter.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_dec2021_rbam_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/MohammadABH/twitter-roberta-base-dec2021_rbam_fine_tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_emotion_en.md new file mode 100644 index 000000000000..6c67e8c846cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_emotion_en.md @@ -0,0 +1,108 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_twitter_base_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-emotion` is an English model originally trained by `cardiffnlp`.
+ +## Predicted Entities + +`sadness`, `anger`, `joy`, `optimism` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_emotion_en_5.2.0_3.0_1701239709601.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_emotion_en_5.2.0_3.0_1701239709601.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twitter.base.by_cardiffnlp").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/twitter-roberta-base-emotion +- https://arxiv.org/pdf/2010.12421.pdf +- https://github.com/cardiffnlp/tweeteval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1_en.md new file mode 100644 index 000000000000..116db0dc4703 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1 RoBertaForSequenceClassification from maxpe +author: John Snow Labs +name: roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1` is an English model originally trained by maxpe.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1_en_5.2.0_3.0_1701247516647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1_en_5.2.0_3.0_1701247516647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_jun2022_semitic_languages_eval_2018_task_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/maxpe/twitter-roberta-base-jun2022_sem_eval_2018_task_1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_emotion_en.md new file mode 100644 index 000000000000..998073d5510b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_emotion_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from Tomas23) +author: John Snow Labs +name: roberta_classifier_twitter_base_mar2022_finetuned_emotion +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-mar2022-finetuned-emotion` is an English model originally trained by `Tomas23`.
+ +## Predicted Entities + +`anger`, `sadness`, `joy`, `optimism` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_mar2022_finetuned_emotion_en_5.2.0_3.0_1701233994018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_mar2022_finetuned_emotion_en_5.2.0_3.0_1701233994018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_mar2022_finetuned_emotion","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_mar2022_finetuned_emotion","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.twitter.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_mar2022_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Tomas23/twitter-roberta-base-mar2022-finetuned-emotion +- https://paperswithcode.com/sota?task=Text+Classification&dataset=tweet_eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_sentiment_en.md new file mode 100644 index 000000000000..581089112578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_mar2022_finetuned_sentiment_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from Tomas23) +author: John Snow Labs +name: roberta_classifier_twitter_base_mar2022_finetuned_sentiment +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-mar2022-finetuned-sentiment` is an English model originally trained by `Tomas23`.
+ +## Predicted Entities + +`neutral`, `negative`, `positive` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_mar2022_finetuned_sentiment_en_5.2.0_3.0_1701236095891.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_mar2022_finetuned_sentiment_en_5.2.0_3.0_1701236095891.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_mar2022_finetuned_sentiment","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_mar2022_finetuned_sentiment","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment_twitter.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_mar2022_finetuned_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/Tomas23/twitter-roberta-base-mar2022-finetuned-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_en.md new file mode 100644 index 000000000000..c5584dbb3ff5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from cardiffnlp) +author: John Snow Labs +name: roberta_classifier_twitter_base_sentiment_latest +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-sentiment-latest` is an English model originally trained by `cardiffnlp`.
+ +## Predicted Entities + +`Negative`, `Positive`, `Neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_sentiment_latest_en_5.2.0_3.0_1701249249765.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_sentiment_latest_en_5.2.0_3.0_1701249249765.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_sentiment_latest","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_sentiment_latest","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.sentiment_twitter.base").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_sentiment_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest +- https://github.com/cardiffnlp/tweeteval +- https://arxiv.org/abs/2202.03829 +- https://github.com/cardiffnlp/timelms \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news_en.md new file mode 100644 index 000000000000..5d32d13ab8db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Base Cased model (from lucaordronneau) +author: John Snow Labs +name: roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter-roberta-base-sentiment-latest-finetuned-FG-CONCAT_SENTENCE-H-NEWS` is an English model originally trained by `lucaordronneau`.
+ +## Predicted Entities + +`neutral`, `greed`, `fear` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news_en_5.2.0_3.0_1701245332052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news_en_5.2.0_3.0_1701245332052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.news_sentiment_twitter.d_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_twitter_base_sentiment_latest_finetuned_fg_concat_sentence_h_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/lucaordronneau/twitter-roberta-base-sentiment-latest-finetuned-FG-CONCAT_SENTENCE-H-NEWS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_uganda_labor_market_interview_text_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_uganda_labor_market_interview_text_classification_en.md new file mode 100644 index 000000000000..37fe4980151f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_uganda_labor_market_interview_text_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from wanghao2023) +author: John Snow Labs +name: roberta_classifier_uganda_labor_market_interview_text_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uganda-labor-market-interview-text-classification` is an English model originally trained by `wanghao2023`.
+ +## Predicted Entities + +`is_info`, `is_motivation`, `is_referral`, `is_strategy`, `is_tip`, `is_neutral` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_uganda_labor_market_interview_text_classification_en_5.2.0_3.0_1701247017084.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_uganda_labor_market_interview_text_classification_en_5.2.0_3.0_1701247017084.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_uganda_labor_market_interview_text_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_uganda_labor_market_interview_text_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_wanghao2023").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_uganda_labor_market_interview_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|434.4 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/wanghao2023/uganda-labor-market-interview-text-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unbiased_toxic_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unbiased_toxic_en.md new file mode 100644 index 000000000000..796ef2f22b1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unbiased_toxic_en.md @@ -0,0 +1,113 @@ +--- +layout: model +title: English RoBertaForSequenceClassification Cased model (from unitary) +author: John Snow Labs +name: roberta_classifier_unbiased_toxic +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unbiased-toxic-roberta` is an English model originally trained by `unitary`.
+ +## Predicted Entities + +`christian`, `jewish`, `homosexual_gay_or_lesbian`, `black`, `threat`, `female`, `toxicity`, `white`, `muslim`, `identity_attack`, `severe_toxicity`, `psychiatric_or_mental_illness`, `sexual_explicit`, `insult`, `male`, `obscene` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_unbiased_toxic_en_5.2.0_3.0_1701234285210.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_unbiased_toxic_en_5.2.0_3.0_1701234285210.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_unbiased_toxic","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_unbiased_toxic","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.by_unitary").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_classifier_unbiased_toxic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|472.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/unitary/unbiased-toxic-roberta +- https://github.com/unitaryai/detoxify +- https://laurahanu.github.io/ +- https://www.unitary.ai/ +- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge +- https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf +- https://arxiv.org/pdf/1703.04009.pdf%201.pdf +- https://arxiv.org/pdf/1905.12516.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unhappyzebra100_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unhappyzebra100_en.md new file mode 100644 index 000000000000..fd7e9ce39c90 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_unhappyzebra100_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from FuriouslyAsleep) +author: John Snow Labs +name: roberta_classifier_unhappyzebra100 +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unhappyZebra100` is an English model originally trained by `FuriouslyAsleep`.
+ +## Predicted Entities + +`False`, `True` + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_unhappyzebra100_en_5.2.0_3.0_1701243564826.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_unhappyzebra100_en_5.2.0_3.0_1701243564826.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_unhappyzebra100","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_unhappyzebra100","en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.roberta.by_furiouslyasleep").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_unhappyzebra100|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.7 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/FuriouslyAsleep/unhappyZebra100
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_vent_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_vent_emotion_en.md
new file mode 100644
index 000000000000..10d647276ef5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_vent_emotion_en.md
@@ -0,0 +1,107 @@
+---
+layout: model
+title: English RoBertaForSequenceClassification Cased model (from lumalik)
+author: John Snow Labs
+name: roberta_classifier_vent_emotion
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `vent-roberta-emotion` is an English model originally trained by `lumalik`.
+
+## Predicted Entities
+
+`Anger`, `Affection`, `Fear`, `Happiness`, `Sadness`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_vent_emotion_en_5.2.0_3.0_1701245326674.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_vent_emotion_en_5.2.0_3.0_1701245326674.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_vent_emotion","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_vent_emotion","en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.roberta.by_lumalik").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_vent_emotion|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|465.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/lumalik/vent-roberta-emotion
+- https://arxiv.org/abs/1901.04856
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_verdict_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_verdict_en.md
new file mode 100644
index 000000000000..73f5b1c92d4c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_verdict_en.md
@@ -0,0 +1,108 @@
+---
+layout: model
+title: English RobertaForSequenceClassification Cased model (from saattrupdan)
+author: John Snow Labs
+name: roberta_classifier_verdict
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `verdict-classifier-en` is an English model originally trained by `saattrupdan`.
+
+## Predicted Entities
+
+`misinformation`, `other`, `factual`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_verdict_en_5.2.0_3.0_1701234577652.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_verdict_en_5.2.0_3.0_1701234577652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_verdict","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_verdict","en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.roberta.by_saattrupdan").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_verdict|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|437.3 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/saattrupdan/verdict-classifier-en
+- https://developers.google.com/fact-check/tools/api/reference/rest/v1alpha1/claims/search
+- https://cloud.google.com/translate/docs/reference/rest/
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_yelp_rating_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_yelp_rating_classification_en.md
new file mode 100644
index 000000000000..a0d6d97d959b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_yelp_rating_classification_en.md
@@ -0,0 +1,106 @@
+---
+layout: model
+title: English RobertaForSequenceClassification Cased model (from nihaldsouza1)
+author: John Snow Labs
+name: roberta_classifier_yelp_rating_classification
+date: 2023-11-29
+tags: [en, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `yelp-rating-classification` is an English model originally trained by `nihaldsouza1`.
+
+## Predicted Entities
+
+`2star`, `4star`, `3star`, `5star`, `1star`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_yelp_rating_classification_en_5.2.0_3.0_1701250982518.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_yelp_rating_classification_en_5.2.0_3.0_1701250982518.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_yelp_rating_classification","en") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_yelp_rating_classification","en")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("en.classify.roberta.by_nihaldsouza1").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_yelp_rating_classification|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/nihaldsouza1/yelp-rating-classification
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_zabanshenas_base_mix_xx.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_zabanshenas_base_mix_xx.md
new file mode 100644
index 000000000000..65c4ba4d8fa1
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_classifier_zabanshenas_base_mix_xx.md
@@ -0,0 +1,107 @@
+---
+layout: model
+title: Multilingual RoBertaForSequenceClassification Base Cased model (from m3hrdadfi)
+author: John Snow Labs
+name: roberta_classifier_zabanshenas_base_mix
+date: 2023-11-29
+tags: [xx, open_source, roberta, sequence_classification, classification, onnx]
+task: Text Classification
+language: xx
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `zabanshenas-roberta-base-mix` is a multilingual model originally trained by `m3hrdadfi`.
+
+## Predicted Entities
+
+`mon`, `mdf`, `sun`, `bho`, `bxr`, `kaz`, `mrj`, `nld`, `dty`, `ben`, `mlt`, `arz`, `fur`, `pan`, `rup`, `ilo`, `srp`, `mwl`, `tat`, `mhr`, `som`, `vie`, `bjn`, `krc`, `mzn`, `nno`, `tur`, `bel`, `olo`, `mya`, `tam`, `pus`, `roh`, `ido`, `pdc`, `nds`, `ltg`, `lit`, `fas`, `kin`, `lao`, `lav`, `egl`, `lzh`, `afr`, `bod`, `map-bms`, `ina`, `pfl`, `wln`, `war`, `mri`, `ton`, `nap`, `hye`, `oci`, `new`, `gle`, `kbd`, `eng`, `nav`, `que`, `lug`, `cym`, `pol`, `sah`, `nds-nl`, `tuk`, `bul`, `chr`, `isl`, `ava`, `orm`, `scn`, `nan`, `azb`, `aym`, `slk`, `szl`, `wuu`, `sco`, `sgs`, `srd`, `mai`, `lad`, `amh`, `cdo`, `urd`, `nrm`, `por`, `cbk`, `san`, `sin`, `lrc`, `ukr`, `lez`, `vec`, `uig`, `ceb`, `tgl`, `glg`, `cat`, `pam`, `eus`, `chv`, `kir`, `nep`, `vol`, `est`, `dan`, `hsb`, `kor`, `nob`, `ara`, `ile`, `jam`, `srn`, `lat`, `zho`, `snd`, `epo`, `fry`, `swe`, `xmf`, `cos`, `bak`, `vls`, `ces`, `tel`, `ckb`, `zea`, `lim`, `nci`, `ron`, `lin`, `uzb`, `kat`, `aze`, `frp`, `hau`, `hbs`, `ibo`, `bpy`, `glv`, `heb`, `rus`, `kan`, `che`, `tsn`, `bcl`, `min`, `hat`, `fra`, `yid`, `kom`, `ast`, `ita`, `be-tarask`, `myv`, `tcy`, `lij`, `hak`, `sqi`, `gla`, `glk`, `sme`, `pap`, `mlg`, `ell`, `tha`, `hrv`, `tet`, `asm`, `als`, `crh`, `vep`, `pcd`, `sna`, `slv`, `diq`, `kur`, `dsb`, `jbo`, `ext`, `ind`, `yor`, `ori`, `mal`, `guj`, `grn`, `vro`, `spa`, `fin`, `cor`, `bre`, `nso`, `roa-tara`, `udm`, `tgk`, `jpn`, `hun`, `csb`, `bos`, `jav`, `bar`, `fao`, `ang`, `pag`, `hin`, `arg`, `stq`, `gag`, `hif`, `zh-yue`, `msa`, `kok`, `xho`, `koi`, `ltz`, `rue`, `wol`, `ace`, `kaa`, `lmo`, `swa`, `oss`, `kab`, `ksh`, `mkd`, `pnb`, `khm`, `deu`, `tyv`, `div`, `mar`
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_classifier_zabanshenas_base_mix_xx_5.2.0_3.0_1701248704476.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_classifier_zabanshenas_base_mix_xx_5.2.0_3.0_1701248704476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+    .setInputCol("text") \
+    .setOutputCol("document")
+
+tokenizer = Tokenizer() \
+    .setInputCols("document") \
+    .setOutputCol("token")
+
+seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_zabanshenas_base_mix","xx") \
+    .setInputCols(["document", "token"]) \
+    .setOutputCol("class")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, seq_classifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val seq_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_zabanshenas_base_mix","xx")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, seq_classifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("xx.classify.roberta.base").predict("""PUT YOUR STRING HERE""")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_classifier_zabanshenas_base_mix|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|xx|
+|Size:|415.9 MB|
+|Case sensitive:|true|
+|Max sentence length:|256|
+
+## References
+
+References
+
+- https://huggingface.co/m3hrdadfi/zabanshenas-roberta-base-mix
+- https://github.com/m3hrdadfi/zabanshenas
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_cls_consec_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_cls_consec_en.md
new file mode 100644
index 000000000000..9e963686814b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_cls_consec_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_cls_consec RoBertaForSequenceClassification from dennlinger
+author: John Snow Labs
+name: roberta_cls_consec
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_cls_consec` is an English model originally trained by dennlinger.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_cls_consec_en_5.2.0_3.0_1701288643742.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_cls_consec_en_5.2.0_3.0_1701288643742.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_cls_consec","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_cls_consec","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_cls_consec|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|458.7 MB|
+
+## References
+
+https://huggingface.co/dennlinger/roberta-cls-consec
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_daily_dialog_intent_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_daily_dialog_intent_classifier_en.md
new file mode 100644
index 000000000000..175ab71a7503
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_daily_dialog_intent_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_daily_dialog_intent_classifier RoBertaForSequenceClassification from rajkumarrrk
+author: John Snow Labs
+name: roberta_daily_dialog_intent_classifier
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_daily_dialog_intent_classifier` is an English model originally trained by rajkumarrrk.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_daily_dialog_intent_classifier_en_5.2.0_3.0_1701243118123.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_daily_dialog_intent_classifier_en_5.2.0_3.0_1701243118123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_daily_dialog_intent_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_daily_dialog_intent_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_daily_dialog_intent_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/rajkumarrrk/roberta-daily-dialog-intent-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_fact_check_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_fact_check_en.md
new file mode 100644
index 000000000000..ff494c6f5c4d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_fact_check_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_fact_check RoBertaForSequenceClassification from Dzeniks
+author: John Snow Labs
+name: roberta_fact_check
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_fact_check` is an English model originally trained by Dzeniks.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_fact_check_en_5.2.0_3.0_1701259188514.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_fact_check_en_5.2.0_3.0_1701259188514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fact_check","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fact_check","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_fact_check|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|467.3 MB|
+
+## References
+
+https://huggingface.co/Dzeniks/roberta-fact-check
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_fine_tuned_sentiment_newsmtsc_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_fine_tuned_sentiment_newsmtsc_en.md
new file mode 100644
index 000000000000..dbdb208d81da
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_fine_tuned_sentiment_newsmtsc_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_fine_tuned_sentiment_newsmtsc RoBertaForSequenceClassification from RogerKam
+author: John Snow Labs
+name: roberta_fine_tuned_sentiment_newsmtsc
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_fine_tuned_sentiment_newsmtsc` is an English model originally trained by RogerKam.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_newsmtsc_en_5.2.0_3.0_1701278006919.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_newsmtsc_en_5.2.0_3.0_1701278006919.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_newsmtsc","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_newsmtsc","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_fine_tuned_sentiment_newsmtsc|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|449.0 MB|
+
+## References
+
+https://huggingface.co/RogerKam/roberta_fine_tuned_sentiment_newsmtsc
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_first_toxicity_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_first_toxicity_classifier_en.md
new file mode 100644
index 000000000000..4b24d2a2bf75
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_first_toxicity_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_first_toxicity_classifier RoBertaForSequenceClassification from s-nlp
+author: John Snow Labs
+name: roberta_first_toxicity_classifier
+date: 2023-11-29
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_first_toxicity_classifier` is an English model originally trained by s-nlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_first_toxicity_classifier_en_5.2.0_3.0_1701277283865.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_first_toxicity_classifier_en_5.2.0_3.0_1701277283865.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_first_toxicity_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_first_toxicity_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_first_toxicity_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/s-nlp/roberta_first_toxicity_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_imdb_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_imdb_sentiment_analysis_en.md new file mode 100644 index 000000000000..5c43e04b6e66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_imdb_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_imdb_sentiment_analysis RoBertaForSequenceClassification from ncduy +author: John Snow Labs +name: roberta_imdb_sentiment_analysis +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_imdb_sentiment_analysis` is an English model originally trained by ncduy. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_imdb_sentiment_analysis_en_5.2.0_3.0_1701289274356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_imdb_sentiment_analysis_en_5.2.0_3.0_1701289274356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_imdb_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_imdb_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_imdb_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.5 MB| + +## References + +https://huggingface.co/ncduy/roberta-imdb-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_jurisbert_clas_artificial_languages_convencion_americana_dh_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_jurisbert_clas_artificial_languages_convencion_americana_dh_es.md new file mode 100644 index 000000000000..c5d5e618ffbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_jurisbert_clas_artificial_languages_convencion_americana_dh_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_jurisbert_clas_artificial_languages_convencion_americana_dh RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: roberta_jurisbert_clas_artificial_languages_convencion_americana_dh +date: 2023-11-29 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_jurisbert_clas_artificial_languages_convencion_americana_dh` is a Castilian, Spanish model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_jurisbert_clas_artificial_languages_convencion_americana_dh_es_5.2.0_3.0_1701234823612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_jurisbert_clas_artificial_languages_convencion_americana_dh_es_5.2.0_3.0_1701234823612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_jurisbert_clas_artificial_languages_convencion_americana_dh","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_jurisbert_clas_artificial_languages_convencion_americana_dh","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_jurisbert_clas_artificial_languages_convencion_americana_dh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|465.9 MB| + +## References + +https://huggingface.co/hackathon-pln-es/jurisbert-clas-art-convencion-americana-dh \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_large_finetuned_sst5_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_large_finetuned_sst5_en.md new file mode 100644 index 000000000000..e5b206929bad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_large_finetuned_sst5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_finetuned_sst5 RoBertaForSequenceClassification from Unso +author: John Snow Labs +name: roberta_large_finetuned_sst5 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_finetuned_sst5` is an English model originally trained by Unso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_sst5_en_5.2.0_3.0_1701297099184.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_sst5_en_5.2.0_3.0_1701297099184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_sst5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_sst5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_finetuned_sst5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Unso/roberta-large-finetuned-sst5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_mixed_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_mixed_detector_en.md new file mode 100644 index 000000000000..298b4dee0468 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_mixed_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_mixed_detector RoBertaForSequenceClassification from andreas122001 +author: John Snow Labs +name: roberta_mixed_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_mixed_detector` is an English model originally trained by andreas122001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_mixed_detector_en_5.2.0_3.0_1701286021283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_mixed_detector_en_5.2.0_3.0_1701286021283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mixed_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mixed_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_mixed_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|463.1 MB| + +## References + +https://huggingface.co/andreas122001/roberta-mixed-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_mtl_media_bias_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_mtl_media_bias_en.md new file mode 100644 index 000000000000..fc59d5673b7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_mtl_media_bias_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_mtl_media_bias RoBertaForSequenceClassification from mediabiasgroup +author: John Snow Labs +name: roberta_mtl_media_bias +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_mtl_media_bias` is an English model originally trained by mediabiasgroup. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_mtl_media_bias_en_5.2.0_3.0_1701262488045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_mtl_media_bias_en_5.2.0_3.0_1701262488045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mtl_media_bias","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mtl_media_bias","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_mtl_media_bias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.8 MB| + +## References + +https://huggingface.co/mediabiasgroup/roberta_mtl_media_bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_rcade_fine_tuned_sentiment_covid_news_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_rcade_fine_tuned_sentiment_covid_news_en.md new file mode 100644 index 000000000000..81a8bcc368ed --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_rcade_fine_tuned_sentiment_covid_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_rcade_fine_tuned_sentiment_covid_news RoBertaForSequenceClassification from RogerKam +author: John Snow Labs +name: roberta_rcade_fine_tuned_sentiment_covid_news +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_rcade_fine_tuned_sentiment_covid_news` is an English model originally trained by RogerKam.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_rcade_fine_tuned_sentiment_covid_news_en_5.2.0_3.0_1701270826661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_rcade_fine_tuned_sentiment_covid_news_en_5.2.0_3.0_1701270826661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_rcade_fine_tuned_sentiment_covid_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_rcade_fine_tuned_sentiment_covid_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_rcade_fine_tuned_sentiment_covid_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.3 MB| + +## References + +https://huggingface.co/RogerKam/roberta_RCADE_fine_tuned_sentiment_covid_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sentiment_classifier_kodwo11_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sentiment_classifier_kodwo11_en.md new file mode 100644 index 000000000000..149b359b30db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sentiment_classifier_kodwo11_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_sentiment_classifier_kodwo11 RoBertaForSequenceClassification from Kodwo11 +author: John Snow Labs +name: roberta_sentiment_classifier_kodwo11 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiment_classifier_kodwo11` is an English model originally trained by Kodwo11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_kodwo11_en_5.2.0_3.0_1701268462246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_kodwo11_en_5.2.0_3.0_1701268462246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_kodwo11","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_kodwo11","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sentiment_classifier_kodwo11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.2 MB| + +## References + +https://huggingface.co/Kodwo11/Roberta-Sentiment-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection_en.md new file mode 100644 index 000000000000..2507a779a1f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert2codebert-finetuned-code-defect-detection` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection_en_5.2.0_3.0_1701236459352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection_en_5.2.0_3.0_1701236459352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_codebert2codebert_finetuned_code_defect_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/codebert2codebert-finetuned-code-defect-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod_en.md new file mode 100644 index 000000000000..a120dcf35e2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod_en.md @@ -0,0 +1,109 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert-base-finetuned-detect-insecure-code` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod_en_5.2.0_3.0_1701250776063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod_en_5.2.0_3.0_1701250776063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.roberta.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_codebert_base_finetuned_detect_insecure_cod| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/codebert-base-finetuned-detect-insecure-code +- https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/Defect-detection +- https://arxiv.org/pdf/1907.11692.pdf +- https://arxiv.org/pdf/2002.08155.pdf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression_en.md new file mode 100644 index 000000000000..6835f420f9da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression_en.md @@ -0,0 +1,107 @@ +--- +layout: model +title: English RobertaForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-base-finetuned-suicide-depression` is an English model originally trained by `mrm8488`.
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression_en_5.2.0_3.0_1701236711293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression_en_5.2.0_3.0_1701236711293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.distilled_base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_distilroberta_base_finetuned_suicide_depression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/distilroberta-base-finetuned-suicide-depression +- https://github.com/ayaanzhaque/SDCNL \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_age_news_classification_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_age_news_classification_en.md new file mode 100644 index 000000000000..ae8e3d03a1fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_age_news_classification_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_distilroberta_finetuned_age_news_classification +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-age_news-classification` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_age_news_classification_en_5.2.0_3.0_1701235146350.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_age_news_classification_en_5.2.0_3.0_1701235146350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_age_news_classification","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_age_news_classification","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.news.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_distilroberta_finetuned_age_news_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/distilroberta-finetuned-age_news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis_en.md new file mode 100644 index 000000000000..3b56d9edd062 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-financial-news-sentiment-analysis` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis_en_5.2.0_3.0_1701252632817.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis_en_5.2.0_3.0_1701252632817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.news_sentiment.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_distilroberta_finetuned_financial_news_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis_en.md new file mode 100644 index 000000000000..3c112fde175a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis_en.md @@ -0,0 +1,106 @@ +--- +layout: model +title: English RobertaForSequenceClassification Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis +date: 2023-11-29 +tags: [en, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta-finetuned-rotten_tomatoes-sentiment-analysis` is an English model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis_en_5.2.0_3.0_1701248755735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis_en_5.2.0_3.0_1701248755735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis","en") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis","en") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("en.classify.distil_roberta.sentiment.distilled_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_distilroberta_finetuned_rotten_tomatoes_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/distilroberta-finetuned-rotten_tomatoes-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_ruperta_base_finetuned_pawsx_es.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_ruperta_base_finetuned_pawsx_es.md new file mode 100644 index 000000000000..685d9fc163ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sequence_classifier_ruperta_base_finetuned_pawsx_es.md @@ -0,0 +1,106 @@ +--- +layout: model +title: Spanish RobertaForSequenceClassification Base Cased model (from mrm8488) +author: John Snow Labs +name: roberta_sequence_classifier_ruperta_base_finetuned_pawsx +date: 2023-11-29 +tags: [es, open_source, roberta, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RobertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `RuPERTa-base-finetuned-pawsx-es` is a Spanish model originally trained by `mrm8488`. 
+ +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_ruperta_base_finetuned_pawsx_es_5.2.0_3.0_1701235452586.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sequence_classifier_ruperta_base_finetuned_pawsx_es_5.2.0_3.0_1701235452586.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +documentAssembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("document") + +tokenizer = Tokenizer() \ + .setInputCols("document") \ + .setOutputCol("token") + +classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_ruperta_base_finetuned_pawsx","es") \ + .setInputCols(["document", "token"]) \ + .setOutputCol("class") + +pipeline = Pipeline(stages=[documentAssembler, tokenizer, classifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val classifier = RoBertaForSequenceClassification.pretrained("roberta_sequence_classifier_ruperta_base_finetuned_pawsx","es") + .setInputCols(Array("document", "token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, classifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` + +{:.nlu-block} +```python +import nlu +nlu.load("es.classify.roberta.pawsx_xtreme.base_finetuned").predict("""PUT YOUR STRING HERE""") +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sequence_classifier_ruperta_base_finetuned_pawsx| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|472.2 MB| +|Case sensitive:|true| +|Max sentence length:|256| + +## References + +References + +- https://huggingface.co/mrm8488/RuPERTa-base-finetuned-pawsx-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_sfda_sharpseed_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_sfda_sharpseed_en.md new file mode 100644 index 000000000000..fef81a301499 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_sfda_sharpseed_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_sfda_sharpseed RoBertaForSequenceClassification from tmills +author: John Snow Labs +name: roberta_sfda_sharpseed +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sfda_sharpseed` is an English model originally trained by tmills. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sfda_sharpseed_en_5.2.0_3.0_1701273342289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sfda_sharpseed_en_5.2.0_3.0_1701273342289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sfda_sharpseed","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sfda_sharpseed","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sfda_sharpseed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.2 MB| + +## References + +https://huggingface.co/tmills/roberta_sfda_sharpseed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_spam_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_spam_en.md new file mode 100644 index 000000000000..37a4b2fb3c0d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_spam_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_spam RoBertaForSequenceClassification from mshenoda +author: John Snow Labs +name: roberta_spam +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_spam` is an English model originally trained by mshenoda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_spam_en_5.2.0_3.0_1701268567560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_spam_en_5.2.0_3.0_1701268567560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_spam","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_spam","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_spam| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.4 MB| + +## References + +https://huggingface.co/mshenoda/roberta-spam \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_base_philippine_elections_2016_2022_hate_speech_tl.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_base_philippine_elections_2016_2022_hate_speech_tl.md new file mode 100644 index 000000000000..5cf0abddc88e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_base_philippine_elections_2016_2022_hate_speech_tl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Tagalog roberta_tagalog_base_philippine_elections_2016_2022_hate_speech RoBertaForSequenceClassification from mapsoriano +author: John Snow Labs +name: roberta_tagalog_base_philippine_elections_2016_2022_hate_speech +date: 2023-11-29 +tags: [roberta, tl, open_source, sequence_classification, onnx] +task: Text Classification +language: tl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_tagalog_base_philippine_elections_2016_2022_hate_speech` is a Tagalog model originally trained by mapsoriano. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_tagalog_base_philippine_elections_2016_2022_hate_speech_tl_5.2.0_3.0_1701257793397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_tagalog_base_philippine_elections_2016_2022_hate_speech_tl_5.2.0_3.0_1701257793397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_tagalog_base_philippine_elections_2016_2022_hate_speech","tl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_tagalog_base_philippine_elections_2016_2022_hate_speech","tl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_tagalog_base_philippine_elections_2016_2022_hate_speech| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|tl| +|Size:|409.3 MB| + +## References + +https://huggingface.co/mapsoriano/roberta-tagalog-base-philippine-elections-2016-2022-hate-speech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_profanity_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_profanity_classifier_en.md new file mode 100644 index 000000000000..be039c4c7719 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_tagalog_profanity_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_tagalog_profanity_classifier RoBertaForSequenceClassification from mginoben +author: John Snow Labs +name: roberta_tagalog_profanity_classifier +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_tagalog_profanity_classifier` is an English model originally trained by mginoben. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_tagalog_profanity_classifier_en_5.2.0_3.0_1701293521289.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_tagalog_profanity_classifier_en_5.2.0_3.0_1701293521289.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_tagalog_profanity_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_tagalog_profanity_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_tagalog_profanity_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|409.3 MB| + +## References + +https://huggingface.co/mginoben/roberta-tagalog-profanity-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_targeted_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_targeted_sentiment_analysis_en.md new file mode 100644 index 000000000000..0ad623522336 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_targeted_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_targeted_sentiment_analysis RoBertaForSequenceClassification from pysentimiento +author: John Snow Labs +name: roberta_targeted_sentiment_analysis +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_targeted_sentiment_analysis` is an English model originally trained by pysentimiento. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_targeted_sentiment_analysis_en_5.2.0_3.0_1701292366662.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_targeted_sentiment_analysis_en_5.2.0_3.0_1701292366662.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_targeted_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_targeted_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_targeted_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.5 MB| + +## References + +https://huggingface.co/pysentimiento/roberta-targeted-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_s_nlp_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_s_nlp_en.md new file mode 100644 index 000000000000..5947ddc25940 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_s_nlp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_toxicity_classifier_s_nlp RoBertaForSequenceClassification from s-nlp +author: John Snow Labs +name: roberta_toxicity_classifier_s_nlp +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_toxicity_classifier_s_nlp` is an English model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_s_nlp_en_5.2.0_3.0_1701261766099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_s_nlp_en_5.2.0_3.0_1701261766099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_s_nlp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_s_nlp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_toxicity_classifier_s_nlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/s-nlp/roberta_toxicity_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_v1_en.md b/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_v1_en.md new file mode 100644 index 000000000000..cb0a590b4ffb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-roberta_toxicity_classifier_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_toxicity_classifier_v1 RoBertaForSequenceClassification from s-nlp +author: John Snow Labs +name: roberta_toxicity_classifier_v1 +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_toxicity_classifier_v1` is an English model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_v1_en_5.2.0_3.0_1701241554099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_v1_en_5.2.0_3.0_1701241554099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_toxicity_classifier_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/s-nlp/roberta_toxicity_classifier_v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-sbic_roberta_demographic_text_disagreement_predictor_en.md b/docs/_posts/ahmedlone127/2023-11-29-sbic_roberta_demographic_text_disagreement_predictor_en.md new file mode 100644 index 000000000000..cbc64e348f01 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-sbic_roberta_demographic_text_disagreement_predictor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sbic_roberta_demographic_text_disagreement_predictor RoBertaForSequenceClassification from RuyuanWan +author: John Snow Labs +name: sbic_roberta_demographic_text_disagreement_predictor +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sbic_roberta_demographic_text_disagreement_predictor` is an English model originally trained by RuyuanWan. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sbic_roberta_demographic_text_disagreement_predictor_en_5.2.0_3.0_1701296266475.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sbic_roberta_demographic_text_disagreement_predictor_en_5.2.0_3.0_1701296266475.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sbic_roberta_demographic_text_disagreement_predictor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sbic_roberta_demographic_text_disagreement_predictor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sbic_roberta_demographic_text_disagreement_predictor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.7 MB| + +## References + +https://huggingface.co/RuyuanWan/SBIC_RoBERTa_Demographic-text_Disagreement_Predictor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_knkarthick_en.md b/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_knkarthick_en.md new file mode 100644 index 000000000000..98f721c7e384 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_knkarthick_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_knkarthick RoBertaForSequenceClassification from knkarthick +author: John Snow Labs +name: sentiment_analysis_knkarthick +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_knkarthick` is an English model originally trained by knkarthick. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_knkarthick_en_5.2.0_3.0_1701237727567.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_knkarthick_en_5.2.0_3.0_1701237727567.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_knkarthick","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_knkarthick","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_knkarthick| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/knkarthick/Sentiment-Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_test_trainer_en.md b/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_test_trainer_en.md new file mode 100644 index 000000000000..d504ce744b48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-sentiment_analysis_test_trainer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_test_trainer RoBertaForSequenceClassification from KAITANY +author: John Snow Labs +name: sentiment_analysis_test_trainer +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_test_trainer` is an English model originally trained by KAITANY. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_test_trainer_en_5.2.0_3.0_1701289275189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_test_trainer_en_5.2.0_3.0_1701289275189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_test_trainer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_test_trainer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_test_trainer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/KAITANY/sentiment_analysis_test_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-slovakbert_sentiment_twitter_sk.md b/docs/_posts/ahmedlone127/2023-11-29-slovakbert_sentiment_twitter_sk.md new file mode 100644 index 000000000000..a415cc5d9497 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-slovakbert_sentiment_twitter_sk.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Slovak slovakbert_sentiment_twitter RoBertaForSequenceClassification from kinit +author: John Snow Labs +name: slovakbert_sentiment_twitter +date: 2023-11-29 +tags: [roberta, sk, open_source, sequence_classification, onnx] +task: Text Classification +language: sk +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `slovakbert_sentiment_twitter` is a Slovak model originally trained by kinit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/slovakbert_sentiment_twitter_sk_5.2.0_3.0_1701269788044.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/slovakbert_sentiment_twitter_sk_5.2.0_3.0_1701269788044.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("slovakbert_sentiment_twitter","sk")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("slovakbert_sentiment_twitter","sk") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|slovakbert_sentiment_twitter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sk| +|Size:|458.8 MB| + +## References + +https://huggingface.co/kinit/slovakbert-sentiment-twitter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-socialbert_social_en.md b/docs/_posts/ahmedlone127/2023-11-29-socialbert_social_en.md new file mode 100644 index 000000000000..f80bd0bc8a5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-socialbert_social_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English socialbert_social RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: socialbert_social +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `socialbert_social` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/socialbert_social_en_5.2.0_3.0_1701289442453.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/socialbert_social_en_5.2.0_3.0_1701289442453.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("socialbert_social","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("socialbert_social","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|socialbert_social| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/ESGBERT/SocialBERT-social \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-stanceberta_en.md b/docs/_posts/ahmedlone127/2023-11-29-stanceberta_en.md new file mode 100644 index 000000000000..bd84609a5f10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-stanceberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stanceberta RoBertaForSequenceClassification from eevvgg +author: John Snow Labs +name: stanceberta +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stanceberta` is an English model originally trained by eevvgg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stanceberta_en_5.2.0_3.0_1701299437749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stanceberta_en_5.2.0_3.0_1701299437749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("stanceberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("stanceberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stanceberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/eevvgg/StanceBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-stsb_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-stsb_distilroberta_base_en.md new file mode 100644 index 000000000000..61738141e546 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-stsb_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stsb_distilroberta_base RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: stsb_distilroberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_distilroberta_base` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_distilroberta_base_en_5.2.0_3.0_1701237067636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_distilroberta_base_en_5.2.0_3.0_1701237067636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/cross-encoder/stsb-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_base_en.md new file mode 100644 index 000000000000..18dd3b93e1d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stsb_roberta_base RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: stsb_roberta_base +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_roberta_base` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_roberta_base_en_5.2.0_3.0_1701236863202.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_roberta_base_en_5.2.0_3.0_1701236863202.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.0 MB| + +## References + +https://huggingface.co/cross-encoder/stsb-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_large_en.md b/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_large_en.md new file mode 100644 index 000000000000..9aa4b0e2c602 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-stsb_roberta_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stsb_roberta_large RoBertaForSequenceClassification from cross-encoder +author: John Snow Labs +name: stsb_roberta_large +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stsb_roberta_large` is an English model originally trained by cross-encoder. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stsb_roberta_large_en_5.2.0_3.0_1701255633503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stsb_roberta_large_en_5.2.0_3.0_1701255633503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_roberta_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("stsb_roberta_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stsb_roberta_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/cross-encoder/stsb-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-textdetection_en.md b/docs/_posts/ahmedlone127/2023-11-29-textdetection_en.md new file mode 100644 index 000000000000..05b7b25618c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-textdetection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English textdetection RoBertaForSequenceClassification from rimuruu1 +author: John Snow Labs +name: textdetection +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `textdetection` is an English model originally trained by rimuruu1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/textdetection_en_5.2.0_3.0_1701296266749.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/textdetection_en_5.2.0_3.0_1701296266749.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("textdetection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("textdetection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|textdetection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.9 MB| + +## References + +https://huggingface.co/rimuruu1/TextDetection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-topic_detector_en.md b/docs/_posts/ahmedlone127/2023-11-29-topic_detector_en.md new file mode 100644 index 000000000000..cc9033b2dfa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-topic_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_detector RoBertaForSequenceClassification from ishaansharma +author: John Snow Labs +name: topic_detector +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_detector` is an English model originally trained by ishaansharma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_detector_en_5.2.0_3.0_1701294808347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_detector_en_5.2.0_3.0_1701294808347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/ishaansharma/topic-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-toxicitymodel_en.md b/docs/_posts/ahmedlone127/2023-11-29-toxicitymodel_en.md new file mode 100644 index 000000000000..4f8cd6da06ff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-toxicitymodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English toxicitymodel RoBertaForSequenceClassification from nicholasKluge +author: John Snow Labs +name: toxicitymodel +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxicitymodel` is an English model originally trained by nicholasKluge. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxicitymodel_en_5.2.0_3.0_1701258134952.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxicitymodel_en_5.2.0_3.0_1701258134952.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxicitymodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxicitymodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxicitymodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|456.9 MB| + +## References + +https://huggingface.co/nicholasKluge/ToxicityModel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-toxigen_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-29-toxigen_roberta_en.md new file mode 100644 index 000000000000..ec164fbfafa7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-toxigen_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English toxigen_roberta RoBertaForSequenceClassification from tomh +author: John Snow Labs +name: toxigen_roberta +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxigen_roberta` is an English model originally trained by tomh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxigen_roberta_en_5.2.0_3.0_1701236572169.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxigen_roberta_en_5.2.0_3.0_1701236572169.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxigen_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxigen_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxigen_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tomh/toxigen_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-tweet_topic_latest_multi_en.md b/docs/_posts/ahmedlone127/2023-11-29-tweet_topic_latest_multi_en.md new file mode 100644 index 000000000000..04725a969e8f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-tweet_topic_latest_multi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweet_topic_latest_multi RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: tweet_topic_latest_multi +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_topic_latest_multi` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_topic_latest_multi_en_5.2.0_3.0_1701297777987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_topic_latest_multi_en_5.2.0_3.0_1701297777987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_topic_latest_multi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_topic_latest_multi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweet_topic_latest_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/cardiffnlp/tweet-topic-latest-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_offensive_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_offensive_en.md new file mode 100644 index 000000000000..e93184fd1e44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_offensive_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_offensive RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_offensive +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_offensive` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_offensive_en_5.2.0_3.0_1701297778032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_offensive_en_5.2.0_3.0_1701297778032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_offensive","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_offensive","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_offensive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_tweet_topic_multi_all_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_tweet_topic_multi_all_en.md new file mode 100644 index 000000000000..67a83d170502 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_dec2021_tweet_topic_multi_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_tweet_topic_multi_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_tweet_topic_multi_all +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_tweet_topic_multi_all` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_tweet_topic_multi_all_en_5.2.0_3.0_1701236227751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_tweet_topic_multi_all_en_5.2.0_3.0_1701236227751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_tweet_topic_multi_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_tweet_topic_multi_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_tweet_topic_multi_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-tweet-topic-multi-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_emotion_multilabel_latest_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_emotion_multilabel_latest_en.md new file mode 100644 index 000000000000..6db945534b9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_emotion_multilabel_latest_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_emotion_multilabel_latest RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_emotion_multilabel_latest +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_emotion_multilabel_latest` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_emotion_multilabel_latest_en_5.2.0_3.0_1701254805537.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_emotion_multilabel_latest_en_5.2.0_3.0_1701254805537.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_emotion_multilabel_latest","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_emotion_multilabel_latest","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_emotion_multilabel_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-emotion-multilabel-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_en.md new file mode 100644 index 000000000000..78cc6a761b6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_hate RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_hate +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_hate` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_en_5.2.0_3.0_1701256394099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_en_5.2.0_3.0_1701256394099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_hate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_latest_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_latest_en.md new file mode 100644 index 000000000000..0ef0839eb2c1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_latest_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_hate_latest RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_hate_latest +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_hate_latest` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_latest_en_5.2.0_3.0_1701257793507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_latest_en_5.2.0_3.0_1701257793507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_latest","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_latest","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_hate_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-hate-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_multiclass_latest_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_multiclass_latest_en.md new file mode 100644 index 000000000000..84d4b4aaedd0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_hate_multiclass_latest_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_hate_multiclass_latest RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_hate_multiclass_latest +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_hate_multiclass_latest` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_multiclass_latest_en_5.2.0_3.0_1701276313611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_multiclass_latest_en_5.2.0_3.0_1701276313611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_multiclass_latest","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_multiclass_latest","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_hate_multiclass_latest| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-hate-multiclass-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_irony_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_irony_en.md new file mode 100644 index 000000000000..47becc263596 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_irony_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_irony RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_irony +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_irony` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_irony_en_5.2.0_3.0_1701250135640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_irony_en_5.2.0_3.0_1701250135640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_irony","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_irony","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_irony| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-irony \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_offensive_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_offensive_en.md new file mode 100644 index 000000000000..0a76a2bd2cad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_offensive_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_offensive RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_offensive +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_offensive` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_offensive_en_5.2.0_3.0_1701235882118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_offensive_en_5.2.0_3.0_1701235882118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_offensive","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_offensive","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_offensive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-offensive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_cardiffnlp_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_cardiffnlp_en.md new file mode 100644 index 000000000000..ef8e85365eac --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_cardiffnlp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_sentiment_cardiffnlp RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_sentiment_cardiffnlp +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_sentiment_cardiffnlp` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_cardiffnlp_en_5.2.0_3.0_1701235663793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_cardiffnlp_en_5.2.0_3.0_1701235663793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_cardiffnlp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_cardiffnlp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sentiment_cardiffnlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_latest_mbabazi_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_latest_mbabazi_en.md new file mode 100644 index 000000000000..60ee54aaa08f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_sentiment_latest_mbabazi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_sentiment_latest_mbabazi RoBertaForSequenceClassification from Mbabazi +author: John Snow Labs +name: twitter_roberta_base_sentiment_latest_mbabazi +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_sentiment_latest_mbabazi` is an English model originally trained by Mbabazi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_latest_mbabazi_en_5.2.0_3.0_1701284341423.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_latest_mbabazi_en_5.2.0_3.0_1701284341423.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_latest_mbabazi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_latest_mbabazi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sentiment_latest_mbabazi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Mbabazi/twitter-roberta-base-sentiment-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_stance_abortion_en.md b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_stance_abortion_en.md new file mode 100644 index 000000000000..d8c7f4db5989 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-twitter_roberta_base_stance_abortion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_stance_abortion RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_stance_abortion +date: 2023-11-29 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_stance_abortion` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_abortion_en_5.2.0_3.0_1701295200620.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_abortion_en_5.2.0_3.0_1701295200620.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_abortion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_abortion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_stance_abortion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-stance-abortion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-29-urduclassification_ur.md b/docs/_posts/ahmedlone127/2023-11-29-urduclassification_ur.md new file mode 100644 index 000000000000..826b558bfaeb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-29-urduclassification_ur.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Urdu urduclassification RoBertaForSequenceClassification from mwz +author: John Snow Labs +name: urduclassification +date: 2023-11-29 +tags: [roberta, ur, open_source, sequence_classification, onnx] +task: Text Classification +language: ur +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urduclassification` is an Urdu model originally trained by mwz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urduclassification_ur_5.2.0_3.0_1701253074220.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urduclassification_ur_5.2.0_3.0_1701253074220.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("urduclassification","ur")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("urduclassification","ur") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|urduclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|ur| +|Size:|473.2 MB| + +## References + +https://huggingface.co/mwz/UrduClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-all_distilroberta_v1_20231125_en.md b/docs/_posts/ahmedlone127/2023-11-30-all_distilroberta_v1_20231125_en.md new file mode 100644 index 000000000000..c50421dc4822 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-all_distilroberta_v1_20231125_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_distilroberta_v1_20231125 RoBertaForSequenceClassification from Kevinger +author: John Snow Labs +name: all_distilroberta_v1_20231125 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_distilroberta_v1_20231125` is an English model originally trained by Kevinger. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_distilroberta_v1_20231125_en_5.2.0_3.0_1701353749281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_distilroberta_v1_20231125_en_5.2.0_3.0_1701353749281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_distilroberta_v1_20231125","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_distilroberta_v1_20231125","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_distilroberta_v1_20231125| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/Kevinger/all-distilroberta-v1-20231125 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ambiguity_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-ambiguity_distilroberta_base_en.md new file mode 100644 index 000000000000..84b7c6d16c48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ambiguity_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ambiguity_distilroberta_base RoBertaForSequenceClassification from j-hartmann +author: John Snow Labs +name: ambiguity_distilroberta_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ambiguity_distilroberta_base` is an English model originally trained by j-hartmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ambiguity_distilroberta_base_en_5.2.0_3.0_1701349387012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ambiguity_distilroberta_base_en_5.2.0_3.0_1701349387012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ambiguity_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ambiguity_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ambiguity_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/j-hartmann/ambiguity-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-arousal_english_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-arousal_english_distilroberta_base_en.md new file mode 100644 index 000000000000..5e149da671a9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-arousal_english_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English arousal_english_distilroberta_base RoBertaForSequenceClassification from samueldomdey +author: John Snow Labs +name: arousal_english_distilroberta_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arousal_english_distilroberta_base` is an English model originally trained by samueldomdey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arousal_english_distilroberta_base_en_5.2.0_3.0_1701348091631.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arousal_english_distilroberta_base_en_5.2.0_3.0_1701348091631.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("arousal_english_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("arousal_english_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|arousal_english_distilroberta_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.6 MB|
+
+## References
+
+https://huggingface.co/samueldomdey/arousal-english-distilroberta-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-augcode_en.md b/docs/_posts/ahmedlone127/2023-11-30-augcode_en.md
new file mode 100644
index 000000000000..a51438a60c54
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-augcode_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English augcode RoBertaForSequenceClassification from Fujitsu
+author: John Snow Labs
+name: augcode
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `augcode` is an English model originally trained by Fujitsu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/augcode_en_5.2.0_3.0_1701372289838.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/augcode_en_5.2.0_3.0_1701372289838.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("augcode","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("augcode","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|augcode|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/Fujitsu/AugCode
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-autotrain_slovenian_swear_words_74310139575_en.md b/docs/_posts/ahmedlone127/2023-11-30-autotrain_slovenian_swear_words_74310139575_en.md
new file mode 100644
index 000000000000..eca0ac6fbb7f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-autotrain_slovenian_swear_words_74310139575_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English autotrain_slovenian_swear_words_74310139575 RoBertaForSequenceClassification from offlinehq
+author: John Snow Labs
+name: autotrain_slovenian_swear_words_74310139575
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_slovenian_swear_words_74310139575` is an English model originally trained by offlinehq.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_slovenian_swear_words_74310139575_en_5.2.0_3.0_1701370989672.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_slovenian_swear_words_74310139575_en_5.2.0_3.0_1701370989672.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_slovenian_swear_words_74310139575","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_slovenian_swear_words_74310139575","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|autotrain_slovenian_swear_words_74310139575|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|415.2 MB|
+
+## References
+
+https://huggingface.co/offlinehq/autotrain-slovenian-swear-words-74310139575
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-bert_empathy_en.md b/docs/_posts/ahmedlone127/2023-11-30-bert_empathy_en.md
new file mode 100644
index 000000000000..997461286200
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-bert_empathy_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bert_empathy RoBertaForSequenceClassification from paragon-analytics
+author: John Snow Labs
+name: bert_empathy
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bert_empathy` is an English model originally trained by paragon-analytics.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_empathy_en_5.2.0_3.0_1701373718614.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_empathy_en_5.2.0_3.0_1701373718614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("bert_empathy","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bert_empathy","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bert_empathy|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|418.8 MB|
+
+## References
+
+https://huggingface.co/paragon-analytics/bert_empathy
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-bertagent_best_en.md b/docs/_posts/ahmedlone127/2023-11-30-bertagent_best_en.md
new file mode 100644
index 000000000000..0db70281c66f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-bertagent_best_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bertagent_best RoBertaForSequenceClassification from EnchantedStardust
+author: John Snow Labs
+name: bertagent_best
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertagent_best` is an English model originally trained by EnchantedStardust.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertagent_best_en_5.2.0_3.0_1701349084235.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertagent_best_en_5.2.0_3.0_1701349084235.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertagent_best","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertagent_best","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bertagent_best|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|447.3 MB|
+
+## References
+
+https://huggingface.co/EnchantedStardust/bertagent-best
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-bertweet_sexism_en.md b/docs/_posts/ahmedlone127/2023-11-30-bertweet_sexism_en.md
new file mode 100644
index 000000000000..c1e0de4b8cb0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-bertweet_sexism_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English bertweet_sexism RoBertaForSequenceClassification from tum-nlp
+author: John Snow Labs
+name: bertweet_sexism
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertweet_sexism` is an English model originally trained by tum-nlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertweet_sexism_en_5.2.0_3.0_1701351570398.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertweet_sexism_en_5.2.0_3.0_1701351570398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertweet_sexism","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertweet_sexism","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|bertweet_sexism|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tum-nlp/bertweet-sexism
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-burmese_awesome_unixcoder_en.md b/docs/_posts/ahmedlone127/2023-11-30-burmese_awesome_unixcoder_en.md
new file mode 100644
index 000000000000..b5fbb23a8bcc
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-burmese_awesome_unixcoder_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English burmese_awesome_unixcoder RoBertaForSequenceClassification from buelfhood
+author: John Snow Labs
+name: burmese_awesome_unixcoder
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_unixcoder` is an English model originally trained by buelfhood.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_unixcoder_en_5.2.0_3.0_1701353008273.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_unixcoder_en_5.2.0_3.0_1701353008273.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_unixcoder","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_unixcoder","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_awesome_unixcoder|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|472.4 MB|
+
+## References
+
+https://huggingface.co/buelfhood/my_awesome_unixcoder
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-burmese_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-burmese_model_en.md
new file mode 100644
index 000000000000..ec3b56ccd2aa
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-burmese_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English burmese_model RoBertaForSequenceClassification from dadashzadeh
+author: John Snow Labs
+name: burmese_model
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_model` is an English model originally trained by dadashzadeh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_model_en_5.2.0_3.0_1701383982493.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_model_en_5.2.0_3.0_1701383982493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|burmese_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|445.1 MB|
+
+## References
+
+https://huggingface.co/dadashzadeh/my_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-c4_binary_english_grammar_checker_en.md b/docs/_posts/ahmedlone127/2023-11-30-c4_binary_english_grammar_checker_en.md
new file mode 100644
index 000000000000..82834de89e76
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-c4_binary_english_grammar_checker_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English c4_binary_english_grammar_checker RoBertaForSequenceClassification from nikolasmoya
+author: John Snow Labs
+name: c4_binary_english_grammar_checker
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `c4_binary_english_grammar_checker` is an English model originally trained by nikolasmoya.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/c4_binary_english_grammar_checker_en_5.2.0_3.0_1701352230810.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/c4_binary_english_grammar_checker_en_5.2.0_3.0_1701352230810.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("c4_binary_english_grammar_checker","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("c4_binary_english_grammar_checker","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|c4_binary_english_grammar_checker|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/nikolasmoya/c4-binary-english-grammar-checker
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_emotion_en.md
new file mode 100644
index 000000000000..a1f103d9c275
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_emotion_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English cardiffnlp_twitter_roberta_base_emotion RoBertaForSequenceClassification from Mahmoud8
+author: John Snow Labs
+name: cardiffnlp_twitter_roberta_base_emotion
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cardiffnlp_twitter_roberta_base_emotion` is an English model originally trained by Mahmoud8.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_emotion_en_5.2.0_3.0_1701352204778.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_emotion_en_5.2.0_3.0_1701352204778.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_emotion","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_emotion","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|cardiffnlp_twitter_roberta_base_emotion|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/Mahmoud8/cardiffnlp-twitter-roberta-base-emotion
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal_en.md b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal_en.md
new file mode 100644
index 000000000000..fbcf08dd6607
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal RoBertaForSequenceClassification from Feiiisal
+author: John Snow Labs
+name: cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal` is an English model originally trained by Feiiisal.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal_en_5.2.0_3.0_1701349037666.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal_en_5.2.0_3.0_1701349037666.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_feiiisal|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/Feiiisal/cardiffnlp_twitter_roberta_base_sentiment_latest_Nov2023
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi_en.md b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi_en.md
new file mode 100644
index 000000000000..473ed3aa8048
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-11-30-cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi RoBertaForSequenceClassification from Mbabazi
+author: John Snow Labs
+name: cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi
+date: 2023-11-30
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi` is an English model originally trained by Mbabazi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi_en_5.2.0_3.0_1701362281073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi_en_5.2.0_3.0_1701362281073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cardiffnlp_twitter_roberta_base_sentiment_latest_nov2023_mbabazi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Mbabazi/cardiffnlp_twitter_roberta_base_sentiment_latest_Nov2023 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-cc_narratives_robertamodel3_en.md b/docs/_posts/ahmedlone127/2023-11-30-cc_narratives_robertamodel3_en.md new file mode 100644 index 000000000000..36b610dccc11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-cc_narratives_robertamodel3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cc_narratives_robertamodel3 RoBertaForSequenceClassification from nnisbett +author: John Snow Labs +name: cc_narratives_robertamodel3 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cc_narratives_robertamodel3` is an English model originally trained by nnisbett. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cc_narratives_robertamodel3_en_5.2.0_3.0_1701354421378.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cc_narratives_robertamodel3_en_5.2.0_3.0_1701354421378.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cc_narratives_robertamodel3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cc_narratives_robertamodel3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cc_narratives_robertamodel3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.1 MB| + +## References + +https://huggingface.co/nnisbett/cc_narratives_robertamodel3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-centralbankroberta_agent_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-30-centralbankroberta_agent_classifier_en.md new file mode 100644 index 000000000000..1e347dcae786 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-centralbankroberta_agent_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English centralbankroberta_agent_classifier RoBertaForSequenceClassification from Moritz-Pfeifer +author: John Snow Labs +name: centralbankroberta_agent_classifier +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `centralbankroberta_agent_classifier` is an English model originally trained by Moritz-Pfeifer. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/centralbankroberta_agent_classifier_en_5.2.0_3.0_1701366108041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/centralbankroberta_agent_classifier_en_5.2.0_3.0_1701366108041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("centralbankroberta_agent_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("centralbankroberta_agent_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|centralbankroberta_agent_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.4 MB| + +## References + +https://huggingface.co/Moritz-Pfeifer/CentralBankRoBERTa-agent-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-chatgpt_content_detector_en.md b/docs/_posts/ahmedlone127/2023-11-30-chatgpt_content_detector_en.md new file mode 100644 index 000000000000..cf854fa78098 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-chatgpt_content_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chatgpt_content_detector RoBertaForSequenceClassification from devloverumar +author: John Snow Labs +name: chatgpt_content_detector +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chatgpt_content_detector` is an English model originally trained by devloverumar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chatgpt_content_detector_en_5.2.0_3.0_1701351033499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chatgpt_content_detector_en_5.2.0_3.0_1701351033499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_content_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("chatgpt_content_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chatgpt_content_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|463.2 MB| + +## References + +https://huggingface.co/devloverumar/chatgpt-content-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-chemberta_v2_finetuned_uspto_50k_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-chemberta_v2_finetuned_uspto_50k_classification_en.md new file mode 100644 index 000000000000..c5f22705c18e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-chemberta_v2_finetuned_uspto_50k_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English chemberta_v2_finetuned_uspto_50k_classification RoBertaForSequenceClassification from Phando +author: John Snow Labs +name: chemberta_v2_finetuned_uspto_50k_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `chemberta_v2_finetuned_uspto_50k_classification` is an English model originally trained by Phando. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/chemberta_v2_finetuned_uspto_50k_classification_en_5.2.0_3.0_1701347031707.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/chemberta_v2_finetuned_uspto_50k_classification_en_5.2.0_3.0_1701347031707.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("chemberta_v2_finetuned_uspto_50k_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("chemberta_v2_finetuned_uspto_50k_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|chemberta_v2_finetuned_uspto_50k_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|312.3 MB| + +## References + +https://huggingface.co/Phando/chemberta-v2-finetuned-uspto-50k-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-climatebert_base_f_fever_evidence_related_en.md b/docs/_posts/ahmedlone127/2023-11-30-climatebert_base_f_fever_evidence_related_en.md new file mode 100644 index 000000000000..7a7f2749e13e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-climatebert_base_f_fever_evidence_related_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English climatebert_base_f_fever_evidence_related RoBertaForSequenceClassification from mwong +author: John Snow Labs +name: climatebert_base_f_fever_evidence_related +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `climatebert_base_f_fever_evidence_related` is an English model originally trained by mwong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/climatebert_base_f_fever_evidence_related_en_5.2.0_3.0_1701384495614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/climatebert_base_f_fever_evidence_related_en_5.2.0_3.0_1701384495614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_base_f_fever_evidence_related","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_base_f_fever_evidence_related","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|climatebert_base_f_fever_evidence_related| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/mwong/climatebert-base-f-fever-evidence-related \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-climatebert_fact_checking_en.md b/docs/_posts/ahmedlone127/2023-11-30-climatebert_fact_checking_en.md new file mode 100644 index 000000000000..8b71f4d49933 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-climatebert_fact_checking_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English climatebert_fact_checking RoBertaForSequenceClassification from amandakonet +author: John Snow Labs +name: climatebert_fact_checking +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `climatebert_fact_checking` is an English model originally trained by amandakonet. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/climatebert_fact_checking_en_5.2.0_3.0_1701347648418.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/climatebert_fact_checking_en_5.2.0_3.0_1701347648418.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_fact_checking","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_fact_checking","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|climatebert_fact_checking| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/amandakonet/climatebert-fact-checking \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-cn_roberta_sci_en.md b/docs/_posts/ahmedlone127/2023-11-30-cn_roberta_sci_en.md new file mode 100644 index 000000000000..fc5eee0ea1b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-cn_roberta_sci_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cn_roberta_sci RoBertaForSequenceClassification from vishruthnath +author: John Snow Labs +name: cn_roberta_sci +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cn_roberta_sci` is an English model originally trained by vishruthnath. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cn_roberta_sci_en_5.2.0_3.0_1701352443007.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cn_roberta_sci_en_5.2.0_3.0_1701352443007.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cn_roberta_sci","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cn_roberta_sci","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cn_roberta_sci| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|417.0 MB| + +## References + +https://huggingface.co/vishruthnath/CN_RoBERTa_Sci \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-codebert_base_mlm_malicious_urls_en.md b/docs/_posts/ahmedlone127/2023-11-30-codebert_base_mlm_malicious_urls_en.md new file mode 100644 index 000000000000..e85acbbfed2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-codebert_base_mlm_malicious_urls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codebert_base_mlm_malicious_urls RoBertaForSequenceClassification from DunnBC22 +author: John Snow Labs +name: codebert_base_mlm_malicious_urls +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert_base_mlm_malicious_urls` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codebert_base_mlm_malicious_urls_en_5.2.0_3.0_1701364075111.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codebert_base_mlm_malicious_urls_en_5.2.0_3.0_1701364075111.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_mlm_malicious_urls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_base_mlm_malicious_urls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codebert_base_mlm_malicious_urls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/DunnBC22/codebert-base-mlm-Malicious_URLs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-codebertforreturntypeclassification_en.md b/docs/_posts/ahmedlone127/2023-11-30-codebertforreturntypeclassification_en.md new file mode 100644 index 000000000000..61ba8fa9eab0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-codebertforreturntypeclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codebertforreturntypeclassification RoBertaForSequenceClassification from UDE-SE +author: John Snow Labs +name: codebertforreturntypeclassification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebertforreturntypeclassification` is an English model originally trained by UDE-SE. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codebertforreturntypeclassification_en_5.2.0_3.0_1701384477373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codebertforreturntypeclassification_en_5.2.0_3.0_1701384477373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebertforreturntypeclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebertforreturntypeclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codebertforreturntypeclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/UDE-SE/CodeBERTForReturnTypeClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-cold_fusion_en.md b/docs/_posts/ahmedlone127/2023-11-30-cold_fusion_en.md new file mode 100644 index 000000000000..8ec2f16aad2b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-cold_fusion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_en_5.2.0_3.0_1701350391068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_en_5.2.0_3.0_1701350391068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-complaints_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-30-complaints_roberta_en.md new file mode 100644 index 000000000000..4d8773aa5772 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-complaints_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English complaints_roberta RoBertaForSequenceClassification from ThirdEyeData +author: John Snow Labs +name: complaints_roberta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `complaints_roberta` is an English model originally trained by ThirdEyeData. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/complaints_roberta_en_5.2.0_3.0_1701353223745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/complaints_roberta_en_5.2.0_3.0_1701353223745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("complaints_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("complaints_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|complaints_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.6 MB| + +## References + +https://huggingface.co/ThirdEyeData/Complaints_Roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-concreteness_english_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-concreteness_english_distilroberta_base_en.md new file mode 100644 index 000000000000..23084a03cffa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-concreteness_english_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English concreteness_english_distilroberta_base RoBertaForSequenceClassification from samueldomdey +author: John Snow Labs +name: concreteness_english_distilroberta_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `concreteness_english_distilroberta_base` is an English model originally trained by samueldomdey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/concreteness_english_distilroberta_base_en_5.2.0_3.0_1701348509940.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/concreteness_english_distilroberta_base_en_5.2.0_3.0_1701348509940.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("concreteness_english_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("concreteness_english_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|concreteness_english_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/samueldomdey/concreteness-english-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-controversy_prediction_es.md b/docs/_posts/ahmedlone127/2023-11-30-controversy_prediction_es.md new file mode 100644 index 000000000000..7bd440403faf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-controversy_prediction_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish controversy_prediction RoBertaForSequenceClassification from PlanTL-GOB-ES +author: John Snow Labs +name: controversy_prediction +date: 2023-11-30 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `controversy_prediction` is a Castilian, Spanish model originally trained by PlanTL-GOB-ES. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/controversy_prediction_es_5.2.0_3.0_1701366559965.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/controversy_prediction_es_5.2.0_3.0_1701366559965.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("controversy_prediction","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("controversy_prediction","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|controversy_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|463.3 MB| + +## References + +https://huggingface.co/PlanTL-GOB-ES/Controversy-Prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentiment_analyzer_roberta_latest_snyamson_en.md b/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentiment_analyzer_roberta_latest_snyamson_en.md new file mode 100644 index 000000000000..a5b75a5d9b20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentiment_analyzer_roberta_latest_snyamson_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_tweet_sentiment_analyzer_roberta_latest_snyamson RoBertaForSequenceClassification from snyamson +author: John Snow Labs +name: covid_tweet_sentiment_analyzer_roberta_latest_snyamson +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_tweet_sentiment_analyzer_roberta_latest_snyamson` is an English model originally trained by snyamson. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_roberta_latest_snyamson_en_5.2.0_3.0_1701359828592.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_roberta_latest_snyamson_en_5.2.0_3.0_1701359828592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_roberta_latest_snyamson","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_roberta_latest_snyamson","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_tweet_sentiment_analyzer_roberta_latest_snyamson| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/snyamson/covid-tweet-sentiment-analyzer-roberta-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentimental_analysis_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentimental_analysis_roberta_en.md new file mode 100644 index 000000000000..a410cf9ab6aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-covid_tweet_sentimental_analysis_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_tweet_sentimental_analysis_roberta RoBertaForSequenceClassification from gyesibiney +author: John Snow Labs +name: covid_tweet_sentimental_analysis_roberta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_tweet_sentimental_analysis_roberta` is an English model originally trained by gyesibiney. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_tweet_sentimental_analysis_roberta_en_5.2.0_3.0_1701384476307.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_tweet_sentimental_analysis_roberta_en_5.2.0_3.0_1701384476307.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentimental_analysis_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentimental_analysis_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_tweet_sentimental_analysis_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/gyesibiney/covid-tweet-sentimental-Analysis-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-covid_vaccine_tweet_sentiment_analysis_roberta_azie88_en.md b/docs/_posts/ahmedlone127/2023-11-30-covid_vaccine_tweet_sentiment_analysis_roberta_azie88_en.md new file mode 100644 index 000000000000..5d062df9b4be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-covid_vaccine_tweet_sentiment_analysis_roberta_azie88_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_vaccine_tweet_sentiment_analysis_roberta_azie88 RoBertaForSequenceClassification from Azie88 +author: John Snow Labs +name: covid_vaccine_tweet_sentiment_analysis_roberta_azie88 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_vaccine_tweet_sentiment_analysis_roberta_azie88` is an English model originally trained by Azie88. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_vaccine_tweet_sentiment_analysis_roberta_azie88_en_5.2.0_3.0_1701349613283.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_vaccine_tweet_sentiment_analysis_roberta_azie88_en_5.2.0_3.0_1701349613283.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_vaccine_tweet_sentiment_analysis_roberta_azie88","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_vaccine_tweet_sentiment_analysis_roberta_azie88","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_vaccine_tweet_sentiment_analysis_roberta_azie88| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Azie88/COVID_Vaccine_Tweet_sentiment_analysis_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-cross_encoder_binary_classification_large_en.md b/docs/_posts/ahmedlone127/2023-11-30-cross_encoder_binary_classification_large_en.md new file mode 100644 index 000000000000..384209adb954 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-cross_encoder_binary_classification_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cross_encoder_binary_classification_large RoBertaForSequenceClassification from ajax-law +author: John Snow Labs +name: cross_encoder_binary_classification_large +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cross_encoder_binary_classification_large` is an English model originally trained by ajax-law. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cross_encoder_binary_classification_large_en_5.2.0_3.0_1701347172937.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cross_encoder_binary_classification_large_en_5.2.0_3.0_1701347172937.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cross_encoder_binary_classification_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cross_encoder_binary_classification_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cross_encoder_binary_classification_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ajax-law/cross-encoder-binary-classification-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-deepscc_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-30-deepscc_roberta_en.md new file mode 100644 index 000000000000..0fb513a3b790 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-deepscc_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English deepscc_roberta RoBertaForSequenceClassification from NTUYG +author: John Snow Labs +name: deepscc_roberta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deepscc_roberta` is an English model originally trained by NTUYG. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deepscc_roberta_en_5.2.0_3.0_1701349590309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deepscc_roberta_en_5.2.0_3.0_1701349590309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("deepscc_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("deepscc_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|deepscc_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.9 MB| + +## References + +https://huggingface.co/NTUYG/DeepSCC-RoBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-disaster_tweet_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-disaster_tweet_classification_en.md new file mode 100644 index 000000000000..bd6af567c9ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-disaster_tweet_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English disaster_tweet_classification RoBertaForSequenceClassification from aellxx +author: John Snow Labs +name: disaster_tweet_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disaster_tweet_classification` is an English model originally trained by aellxx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disaster_tweet_classification_en_5.2.0_3.0_1701350810266.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disaster_tweet_classification_en_5.2.0_3.0_1701350810266.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("disaster_tweet_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("disaster_tweet_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|disaster_tweet_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/aellxx/disaster-tweet-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_mahmoud8_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_mahmoud8_en.md new file mode 100644 index 000000000000..07dbef257187 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_mahmoud8_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_mahmoud8 RoBertaForSequenceClassification from Mahmoud8 +author: John Snow Labs +name: distilroberta_base_mahmoud8 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_mahmoud8` is an English model originally trained by Mahmoud8. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mahmoud8_en_5.2.0_3.0_1701350926477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mahmoud8_en_5.2.0_3.0_1701350926477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mahmoud8","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mahmoud8","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mahmoud8| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/Mahmoud8/distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_profile_rerank_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_profile_rerank_en.md new file mode 100644 index 000000000000..09d9bf61ee49 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_base_profile_rerank_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_profile_rerank RoBertaForSequenceClassification from dijon-ai +author: John Snow Labs +name: distilroberta_base_profile_rerank +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_profile_rerank` is an English model originally trained by dijon-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_profile_rerank_en_5.2.0_3.0_1701385214546.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_profile_rerank_en_5.2.0_3.0_1701385214546.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_profile_rerank","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_profile_rerank","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_profile_rerank| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/dijon-ai/distilroberta-base-profile-rerank \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_financial_news_tweets_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_financial_news_tweets_sentiment_analysis_en.md new file mode 100644 index 000000000000..b83756f06bf6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_financial_news_tweets_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_financial_news_tweets_sentiment_analysis RoBertaForSequenceClassification from mrfakename +author: John Snow Labs +name: distilroberta_financial_news_tweets_sentiment_analysis +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_financial_news_tweets_sentiment_analysis` is an English model originally trained by mrfakename. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_financial_news_tweets_sentiment_analysis_en_5.2.0_3.0_1701387931515.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_financial_news_tweets_sentiment_analysis_en_5.2.0_3.0_1701387931515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_financial_news_tweets_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_financial_news_tweets_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_financial_news_tweets_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/mrfakename/distilroberta-financial-news-tweets-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_ep20_ag_news_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_ep20_ag_news_en.md new file mode 100644 index 000000000000..ba3d36c12a48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_ep20_ag_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_pr200k_ep20_ag_news RoBertaForSequenceClassification from judy93536 +author: John Snow Labs +name: distilroberta_pr200k_ep20_ag_news +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_pr200k_ep20_ag_news` is an English model originally trained by judy93536. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_pr200k_ep20_ag_news_en_5.2.0_3.0_1701372379099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_pr200k_ep20_ag_news_en_5.2.0_3.0_1701372379099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pr200k_ep20_ag_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pr200k_ep20_ag_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_pr200k_ep20_ag_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/judy93536/distilroberta-pr200k-ep20-ag_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_phrase_5k_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_phrase_5k_en.md new file mode 100644 index 000000000000..813babf1dd7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_pr200k_phrase_5k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_pr200k_phrase_5k RoBertaForSequenceClassification from judy93536 +author: John Snow Labs +name: distilroberta_pr200k_phrase_5k +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_pr200k_phrase_5k` is an English model originally trained by judy93536. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_pr200k_phrase_5k_en_5.2.0_3.0_1701352389031.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_pr200k_phrase_5k_en_5.2.0_3.0_1701352389031.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pr200k_phrase_5k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pr200k_phrase_5k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_pr200k_phrase_5k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/judy93536/distilroberta-pr200k-phrase-5k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-distilroberta_topic_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_topic_classification_en.md new file mode 100644 index 000000000000..7c56fa915f27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-distilroberta_topic_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_topic_classification RoBertaForSequenceClassification from abdulmatinomotoso +author: John Snow Labs +name: distilroberta_topic_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_topic_classification` is an English model originally trained by abdulmatinomotoso. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_topic_classification_en_5.2.0_3.0_1701359484899.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_topic_classification_en_5.2.0_3.0_1701359484899.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_topic_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_topic_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_topic_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.4 MB| + +## References + +https://huggingface.co/abdulmatinomotoso/distilroberta-topic-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-documentclassification_en.md b/docs/_posts/ahmedlone127/2023-11-30-documentclassification_en.md new file mode 100644 index 000000000000..c40f00cdb5be --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-documentclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English documentclassification RoBertaForSequenceClassification from cuadron11 +author: John Snow Labs +name: documentclassification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `documentclassification` is an English model originally trained by cuadron11. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/documentclassification_en_5.2.0_3.0_1701352803463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/documentclassification_en_5.2.0_3.0_1701352803463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("documentclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("documentclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|documentclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.4 MB| + +## References + +https://huggingface.co/cuadron11/documentClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-domain_adapted_negation_en.md b/docs/_posts/ahmedlone127/2023-11-30-domain_adapted_negation_en.md new file mode 100644 index 000000000000..c6900a90e32d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-domain_adapted_negation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English domain_adapted_negation RoBertaForSequenceClassification from iAmmarTahir +author: John Snow Labs +name: domain_adapted_negation +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `domain_adapted_negation` is an English model originally trained by iAmmarTahir. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/domain_adapted_negation_en_5.2.0_3.0_1701385144671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/domain_adapted_negation_en_5.2.0_3.0_1701385144671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("domain_adapted_negation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("domain_adapted_negation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|domain_adapted_negation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.1 MB| + +## References + +https://huggingface.co/iAmmarTahir/domain-adapted-negation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-drivefeelings_roberta_sentiment_analyzer_for_twitter_en.md b/docs/_posts/ahmedlone127/2023-11-30-drivefeelings_roberta_sentiment_analyzer_for_twitter_en.md new file mode 100644 index 000000000000..e0bfd7c0da3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-drivefeelings_roberta_sentiment_analyzer_for_twitter_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English drivefeelings_roberta_sentiment_analyzer_for_twitter RoBertaForSequenceClassification from bibbia +author: John Snow Labs +name: drivefeelings_roberta_sentiment_analyzer_for_twitter +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `drivefeelings_roberta_sentiment_analyzer_for_twitter` is an English model originally trained by bibbia. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/drivefeelings_roberta_sentiment_analyzer_for_twitter_en_5.2.0_3.0_1701348307443.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/drivefeelings_roberta_sentiment_analyzer_for_twitter_en_5.2.0_3.0_1701348307443.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("drivefeelings_roberta_sentiment_analyzer_for_twitter","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("drivefeelings_roberta_sentiment_analyzer_for_twitter","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|drivefeelings_roberta_sentiment_analyzer_for_twitter| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/bibbia/DriveFeelings-Roberta-sentiment-analyzer-for-twitter \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-emotion_english_distilroberta_base_harshv9_en.md b/docs/_posts/ahmedlone127/2023-11-30-emotion_english_distilroberta_base_harshv9_en.md new file mode 100644 index 000000000000..ddcd87817aa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-emotion_english_distilroberta_base_harshv9_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_english_distilroberta_base_harshv9 RoBertaForSequenceClassification from HarshV9 +author: John Snow Labs +name: emotion_english_distilroberta_base_harshv9 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_english_distilroberta_base_harshv9` is an English model originally trained by HarshV9. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_english_distilroberta_base_harshv9_en_5.2.0_3.0_1701346437660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_english_distilroberta_base_harshv9_en_5.2.0_3.0_1701346437660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_english_distilroberta_base_harshv9","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_english_distilroberta_base_harshv9","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_english_distilroberta_base_harshv9| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/HarshV9/emotion-english-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-emotion_english_en.md b/docs/_posts/ahmedlone127/2023-11-30-emotion_english_en.md new file mode 100644 index 000000000000..843e28d30bf8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-emotion_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_english RoBertaForSequenceClassification from jitesh +author: John Snow Labs +name: emotion_english +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_english` is an English model originally trained by jitesh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_english_en_5.2.0_3.0_1701346542157.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_english_en_5.2.0_3.0_1701346542157.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/jitesh/emotion-english \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-environmentalbert_action_en.md b/docs/_posts/ahmedlone127/2023-11-30-environmentalbert_action_en.md new file mode 100644 index 000000000000..ad495ce9f400 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-environmentalbert_action_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English environmentalbert_action RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: environmentalbert_action +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `environmentalbert_action` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/environmentalbert_action_en_5.2.0_3.0_1701347578698.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/environmentalbert_action_en_5.2.0_3.0_1701347578698.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("environmentalbert_action","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("environmentalbert_action","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|environmentalbert_action| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/ESGBERT/EnvironmentalBERT-action \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-financial_merchant_category_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-financial_merchant_category_classification_en.md new file mode 100644 index 000000000000..eaca77a359d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-financial_merchant_category_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English financial_merchant_category_classification RoBertaForSequenceClassification from Budget +author: John Snow Labs +name: financial_merchant_category_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `financial_merchant_category_classification` is an English model originally trained by Budget. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/financial_merchant_category_classification_en_5.2.0_3.0_1701385155123.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/financial_merchant_category_classification_en_5.2.0_3.0_1701385155123.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_merchant_category_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_merchant_category_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|financial_merchant_category_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|421.4 MB| + +## References + +https://huggingface.co/Budget/financial_merchant_category_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_base_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_base_sentiment_en.md new file mode 100644 index 000000000000..78aa475ef143 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_base_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_roberta_base_sentiment RoBertaForSequenceClassification from KAITANY +author: John Snow Labs +name: finetuned_roberta_base_sentiment +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_base_sentiment` is an English model originally trained by KAITANY. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_sentiment_en_5.2.0_3.0_1701350579551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_sentiment_en_5.2.0_3.0_1701350579551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_roberta_base_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/KAITANY/finetuned-roberta-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_depression_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_depression_en.md new file mode 100644 index 000000000000..a19071beaaad --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuned_roberta_depression_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_roberta_depression RoBertaForSequenceClassification from ShreyaR +author: John Snow Labs +name: finetuned_roberta_depression +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_depression` is an English model originally trained by ShreyaR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_depression_en_5.2.0_3.0_1701354416956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_depression_en_5.2.0_3.0_1701354416956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_depression","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_depression","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_roberta_depression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|433.0 MB| + +## References + +https://huggingface.co/ShreyaR/finetuned-roberta-depression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuned_sentiment_classfication_roberta_model_abubakari_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuned_sentiment_classfication_roberta_model_abubakari_en.md new file mode 100644 index 000000000000..5e2d9e7bf8c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuned_sentiment_classfication_roberta_model_abubakari_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_sentiment_classfication_roberta_model_abubakari RoBertaForSequenceClassification from Abubakari +author: John Snow Labs +name: finetuned_sentiment_classfication_roberta_model_abubakari +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_sentiment_classfication_roberta_model_abubakari` is an English model originally trained by Abubakari. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_roberta_model_abubakari_en_5.2.0_3.0_1701370051863.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_sentiment_classfication_roberta_model_abubakari_en_5.2.0_3.0_1701370051863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_sentiment_classfication_roberta_model_abubakari","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_sentiment_classfication_roberta_model_abubakari","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_sentiment_classfication_roberta_model_abubakari| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.6 MB| + +## References + +https://huggingface.co/Abubakari/finetuned-Sentiment-classfication-ROBERTA-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_amazon_polarity_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_amazon_polarity_7000_samples_en.md new file mode 100644 index 000000000000..b46fb0c8db5c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_amazon_polarity_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_on_amazon_polarity_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_base_on_amazon_polarity_7000_samples +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_on_amazon_polarity_7000_samples` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_amazon_polarity_7000_samples_en_5.2.0_3.0_1701366107584.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_amazon_polarity_7000_samples_en_5.2.0_3.0_1701366107584.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_amazon_polarity_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_amazon_polarity_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_on_amazon_polarity_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.4 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-roberta-base-on-amazon_polarity_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_imdb_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_imdb_7000_samples_en.md new file mode 100644 index 000000000000..124a6d43dd3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_imdb_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_on_imdb_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_base_on_imdb_7000_samples +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_on_imdb_7000_samples` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_imdb_7000_samples_en_5.2.0_3.0_1701356278064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_imdb_7000_samples_en_5.2.0_3.0_1701356278064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_imdb_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_imdb_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_on_imdb_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.6 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-roberta-base-on-imdb_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_yelp_polarity_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_yelp_polarity_7000_samples_en.md new file mode 100644 index 000000000000..a095510765c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_base_on_yelp_polarity_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_on_yelp_polarity_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_base_on_yelp_polarity_7000_samples +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_on_yelp_polarity_7000_samples` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_yelp_polarity_7000_samples_en_5.2.0_3.0_1701353274654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_yelp_polarity_7000_samples_en_5.2.0_3.0_1701353274654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_yelp_polarity_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_yelp_polarity_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_on_yelp_polarity_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.0 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-roberta-base-on-yelp_polarity_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_cornel_sentiment_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_cornel_sentiment_7000_samples_en.md new file mode 100644 index 000000000000..58ddb9b58a28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_cornel_sentiment_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_on_cornel_sentiment_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_on_cornel_sentiment_7000_samples +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_on_cornel_sentiment_7000_samples` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_on_cornel_sentiment_7000_samples_en_5.2.0_3.0_1701352324416.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_on_cornel_sentiment_7000_samples_en_5.2.0_3.0_1701352324416.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_on_cornel_sentiment_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_on_cornel_sentiment_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_on_cornel_sentiment_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.1 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning_roberta_on_cornel_sentiment_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples_en.md new file mode 100644 index 000000000000..7230bc7b92b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples` is an English model originally trained by Ibrahim-Alam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples_en_5.2.0_3.0_1701348844151.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples_en_5.2.0_3.0_1701348844151.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_on_tweet_sentiment_sayula_popoluca_neg_7000_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.8 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning_roberta_on_Tweet_Sentiment_pos_neg_7000_samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-finetuning_sentiment_model_spanish_der_emmanuel_en.md b/docs/_posts/ahmedlone127/2023-11-30-finetuning_sentiment_model_spanish_der_emmanuel_en.md new file mode 100644 index 000000000000..cf88ca7adb0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-finetuning_sentiment_model_spanish_der_emmanuel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_spanish_der_emmanuel RoBertaForSequenceClassification from der-emmanuel +author: John Snow Labs +name: finetuning_sentiment_model_spanish_der_emmanuel +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_spanish_der_emmanuel` is an English model originally trained by der-emmanuel. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_spanish_der_emmanuel_en_5.2.0_3.0_1701360148676.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_spanish_der_emmanuel_en_5.2.0_3.0_1701360148676.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_spanish_der_emmanuel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_spanish_der_emmanuel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_spanish_der_emmanuel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.9 MB| + +## References + +https://huggingface.co/der-emmanuel/finetuning-sentiment-model-spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-firsttry_en.md b/docs/_posts/ahmedlone127/2023-11-30-firsttry_en.md new file mode 100644 index 000000000000..4287c0f65023 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-firsttry_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English firsttry RoBertaForSequenceClassification from Arvnd03 +author: John Snow Labs +name: firsttry +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `firsttry` is an English model originally trained by Arvnd03. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/firsttry_en_5.2.0_3.0_1701384304241.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/firsttry_en_5.2.0_3.0_1701384304241.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("firsttry","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("firsttry","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|firsttry| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/Arvnd03/FirstTry \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-fomc_roberta_tim9510019_en.md b/docs/_posts/ahmedlone127/2023-11-30-fomc_roberta_tim9510019_en.md new file mode 100644 index 000000000000..104c2fa2688e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-fomc_roberta_tim9510019_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fomc_roberta_tim9510019 RoBertaForSequenceClassification from tim9510019 +author: John Snow Labs +name: fomc_roberta_tim9510019 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fomc_roberta_tim9510019` is an English model originally trained by tim9510019. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fomc_roberta_tim9510019_en_5.2.0_3.0_1701346981043.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fomc_roberta_tim9510019_en_5.2.0_3.0_1701346981043.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fomc_roberta_tim9510019","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fomc_roberta_tim9510019","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fomc_roberta_tim9510019| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tim9510019/FOMC-RoBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-fos_level0_en.md b/docs/_posts/ahmedlone127/2023-11-30-fos_level0_en.md new file mode 100644 index 000000000000..d59b7b6d39d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-fos_level0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fos_level0 RoBertaForSequenceClassification from intelcomp +author: John Snow Labs +name: fos_level0 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fos_level0` is an English model originally trained by intelcomp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fos_level0_en_5.2.0_3.0_1701356062828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fos_level0_en_5.2.0_3.0_1701356062828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fos_level0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fos_level0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fos_level0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/intelcomp/fos_level0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-fs_distilroberta_fine_tuned_en.md b/docs/_posts/ahmedlone127/2023-11-30-fs_distilroberta_fine_tuned_en.md new file mode 100644 index 000000000000..dc1434bcd231 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-fs_distilroberta_fine_tuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fs_distilroberta_fine_tuned RoBertaForSequenceClassification from Anthos23 +author: John Snow Labs +name: fs_distilroberta_fine_tuned +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fs_distilroberta_fine_tuned` is an English model originally trained by Anthos23. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fs_distilroberta_fine_tuned_en_5.2.0_3.0_1701370989640.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fs_distilroberta_fine_tuned_en_5.2.0_3.0_1701370989640.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fs_distilroberta_fine_tuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fs_distilroberta_fine_tuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fs_distilroberta_fine_tuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.7 MB| + +## References + +https://huggingface.co/Anthos23/FS-distilroberta-fine-tuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ft_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-30-ft_roberta_en.md new file mode 100644 index 000000000000..d931661b877a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ft_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ft_roberta RoBertaForSequenceClassification from MingDing2012 +author: John Snow Labs +name: ft_roberta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ft_roberta` is an English model originally trained by MingDing2012. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ft_roberta_en_5.2.0_3.0_1701359828587.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ft_roberta_en_5.2.0_3.0_1701359828587.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ft_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ft_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ft_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.9 MB| + +## References + +https://huggingface.co/MingDing2012/FT_RoBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-governancebert_governance_en.md b/docs/_posts/ahmedlone127/2023-11-30-governancebert_governance_en.md new file mode 100644 index 000000000000..b8a0383e9f76 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-governancebert_governance_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English governancebert_governance RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: governancebert_governance +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `governancebert_governance` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/governancebert_governance_en_5.2.0_3.0_1701346089161.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/governancebert_governance_en_5.2.0_3.0_1701346089161.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("governancebert_governance","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("governancebert_governance","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|governancebert_governance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.9 MB| + +## References + +https://huggingface.co/ESGBERT/GovernanceBERT-governance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-govroberta_governance_en.md b/docs/_posts/ahmedlone127/2023-11-30-govroberta_governance_en.md new file mode 100644 index 000000000000..b6902dadbfa3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-govroberta_governance_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English govroberta_governance RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: govroberta_governance +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `govroberta_governance` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/govroberta_governance_en_5.2.0_3.0_1701350574660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/govroberta_governance_en_5.2.0_3.0_1701350574660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("govroberta_governance","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("govroberta_governance","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|govroberta_governance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/ESGBERT/GovRoBERTa-governance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-hate_multi_roberta_hasoc_hindi_hi.md b/docs/_posts/ahmedlone127/2023-11-30-hate_multi_roberta_hasoc_hindi_hi.md new file mode 100644 index 000000000000..ac5dd85c0a9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-hate_multi_roberta_hasoc_hindi_hi.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hindi hate_multi_roberta_hasoc_hindi RoBertaForSequenceClassification from l3cube-pune +author: John Snow Labs +name: hate_multi_roberta_hasoc_hindi +date: 2023-11-30 +tags: [roberta, hi, open_source, sequence_classification, onnx] +task: Text Classification +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_multi_roberta_hasoc_hindi` is a Hindi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_multi_roberta_hasoc_hindi_hi_5.2.0_3.0_1701366421121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_multi_roberta_hasoc_hindi_hi_5.2.0_3.0_1701366421121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_multi_roberta_hasoc_hindi","hi")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_multi_roberta_hasoc_hindi","hi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_multi_roberta_hasoc_hindi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hi| +|Size:|467.1 MB| + +## References + +https://huggingface.co/l3cube-pune/hate-multi-roberta-hasoc-hindi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-hatespeech_refugees_en.md b/docs/_posts/ahmedlone127/2023-11-30-hatespeech_refugees_en.md new file mode 100644 index 000000000000..86646d939a7a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-hatespeech_refugees_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hatespeech_refugees RoBertaForSequenceClassification from henrystoll +author: John Snow Labs +name: hatespeech_refugees +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hatespeech_refugees` is an English model originally trained by henrystoll. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hatespeech_refugees_en_5.2.0_3.0_1701353475168.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hatespeech_refugees_en_5.2.0_3.0_1701353475168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +from sparknlp.base import DocumentAssembler +from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification +from pyspark.ml import Pipeline + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hatespeech_refugees","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hatespeech_refugees","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hatespeech_refugees| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|299.8 MB| + +## References + +https://huggingface.co/henrystoll/hatespeech-refugees \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-iag_class_en.md b/docs/_posts/ahmedlone127/2023-11-30-iag_class_en.md new file mode 100644 index 000000000000..ac07b5bb0ebb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-iag_class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English iag_class RoBertaForSequenceClassification from audreyvasconcelos +author: John Snow Labs +name: iag_class +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `iag_class` is an English model originally trained by audreyvasconcelos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/iag_class_en_5.2.0_3.0_1701375099439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/iag_class_en_5.2.0_3.0_1701375099439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("iag_class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("iag_class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|iag_class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.5 MB| + +## References + +https://huggingface.co/audreyvasconcelos/iag-class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_adm_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_adm_nl.md new file mode 100644 index 000000000000..bc8592611ce4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_adm_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_adm RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_adm +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_adm` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_adm_nl_5.2.0_3.0_1701372993285.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_adm_nl_5.2.0_3.0_1701372993285.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_adm","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_adm","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_adm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-adm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_att_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_att_nl.md new file mode 100644 index 000000000000..ca54a047ebe0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_att_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_att RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_att +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_att` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_att_nl_5.2.0_3.0_1701356062830.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_att_nl_5.2.0_3.0_1701356062830.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_att","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_att","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_att| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-att \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_berber_languages_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_berber_languages_nl.md new file mode 100644 index 000000000000..583faef57168 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_berber_languages_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_berber_languages RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_berber_languages +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_berber_languages` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_berber_languages_nl_5.2.0_3.0_1701363233799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_berber_languages_nl_5.2.0_3.0_1701363233799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_berber_languages","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_berber_languages","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_berber_languages| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-ber \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_enr_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_enr_nl.md new file mode 100644 index 000000000000..7f5d16f0fdc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_enr_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_enr RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_enr +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_enr` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_enr_nl_5.2.0_3.0_1701372716433.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_enr_nl_5.2.0_3.0_1701372716433.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_enr","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_enr","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_enr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-enr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_etn_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_etn_nl.md new file mode 100644 index 000000000000..5d71a02399ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_etn_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_etn RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_etn +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_etn` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_etn_nl_5.2.0_3.0_1701374098667.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_etn_nl_5.2.0_3.0_1701374098667.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_etn","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_etn","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_etn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-etn \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_fac_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_fac_nl.md new file mode 100644 index 000000000000..bd94b490c8e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_fac_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_fac RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_fac +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_fac` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_fac_nl_5.2.0_3.0_1701375877335.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_fac_nl_5.2.0_3.0_1701375877335.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_fac","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_fac","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_fac| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-fac \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_ins_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_ins_nl.md new file mode 100644 index 000000000000..25400fecda40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_ins_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_ins RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_ins +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_ins` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_ins_nl_5.2.0_3.0_1701368650523.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_ins_nl_5.2.0_3.0_1701368650523.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_ins","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_ins","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_ins| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-ins \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_mbw_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_mbw_nl.md new file mode 100644 index 000000000000..313915a5aa96 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_mbw_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_mbw RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_mbw +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_mbw` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_mbw_nl_5.2.0_3.0_1701360666757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_mbw_nl_5.2.0_3.0_1701360666757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_mbw","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_mbw","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_mbw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-mbw \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-icf_levels_stm_nl.md b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_stm_nl.md new file mode 100644 index 000000000000..197269b9a828 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-icf_levels_stm_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish icf_levels_stm RoBertaForSequenceClassification from CLTL +author: John Snow Labs +name: icf_levels_stm +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icf_levels_stm` is a Dutch, Flemish model originally trained by CLTL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icf_levels_stm_nl_5.2.0_3.0_1701364822779.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icf_levels_stm_nl_5.2.0_3.0_1701364822779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_stm","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icf_levels_stm","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icf_levels_stm| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|471.9 MB| + +## References + +https://huggingface.co/CLTL/icf-levels-stm \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-input_classifier_v3_muppet_en.md b/docs/_posts/ahmedlone127/2023-11-30-input_classifier_v3_muppet_en.md new file mode 100644 index 000000000000..e6294faca2d0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-input_classifier_v3_muppet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English input_classifier_v3_muppet RoBertaForSequenceClassification from Abris +author: John Snow Labs +name: input_classifier_v3_muppet +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `input_classifier_v3_muppet` is an English model originally trained by Abris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/input_classifier_v3_muppet_en_5.2.0_3.0_1701346848639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/input_classifier_v3_muppet_en_5.2.0_3.0_1701346848639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("input_classifier_v3_muppet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("input_classifier_v3_muppet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|input_classifier_v3_muppet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.3 MB| + +## References + +https://huggingface.co/Abris/input_classifier_v3_muppet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-isitphish_en.md b/docs/_posts/ahmedlone127/2023-11-30-isitphish_en.md new file mode 100644 index 000000000000..fd34f1af4daa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-isitphish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English isitphish RoBertaForSequenceClassification from phishbot +author: John Snow Labs +name: isitphish +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `isitphish` is an English model originally trained by phishbot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/isitphish_en_5.2.0_3.0_1701353046850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/isitphish_en_5.2.0_3.0_1701353046850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("isitphish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("isitphish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|isitphish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.4 MB| + +## References + +https://huggingface.co/phishbot/Isitphish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-issuereportclassifier_nlbse22_en.md b/docs/_posts/ahmedlone127/2023-11-30-issuereportclassifier_nlbse22_en.md new file mode 100644 index 000000000000..a9220b76d908 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-issuereportclassifier_nlbse22_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English issuereportclassifier_nlbse22 RoBertaForSequenceClassification from PeppoCola +author: John Snow Labs +name: issuereportclassifier_nlbse22 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `issuereportclassifier_nlbse22` is an English model originally trained by PeppoCola. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/issuereportclassifier_nlbse22_en_5.2.0_3.0_1701350419589.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/issuereportclassifier_nlbse22_en_5.2.0_3.0_1701350419589.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("issuereportclassifier_nlbse22","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("issuereportclassifier_nlbse22","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|issuereportclassifier_nlbse22| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.1 MB| + +## References + +https://huggingface.co/PeppoCola/IssueReportClassifier-NLBSE22 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-leia_large_en.md b/docs/_posts/ahmedlone127/2023-11-30-leia_large_en.md new file mode 100644 index 000000000000..4f281c059447 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-leia_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English leia_large RoBertaForSequenceClassification from LEIA +author: John Snow Labs +name: leia_large +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `leia_large` is an English model originally trained by LEIA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/leia_large_en_5.2.0_3.0_1701353508087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/leia_large_en_5.2.0_3.0_1701353508087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("leia_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("leia_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|leia_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/LEIA/LEIA-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-llamahoraigpt3_5v4_en.md b/docs/_posts/ahmedlone127/2023-11-30-llamahoraigpt3_5v4_en.md new file mode 100644 index 000000000000..4a523c29dabb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-llamahoraigpt3_5v4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English llamahoraigpt3_5v4 RoBertaForSequenceClassification from stealthwriter +author: John Snow Labs +name: llamahoraigpt3_5v4 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `llamahoraigpt3_5v4` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/llamahoraigpt3_5v4_en_5.2.0_3.0_1701352572434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/llamahoraigpt3_5v4_en_5.2.0_3.0_1701352572434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("llamahoraigpt3_5v4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("llamahoraigpt3_5v4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|llamahoraigpt3_5v4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/stealthwriter/llamaHorAIgpt3.5V4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-manifesto_dutch_binary_relevance_nl.md b/docs/_posts/ahmedlone127/2023-11-30-manifesto_dutch_binary_relevance_nl.md new file mode 100644 index 000000000000..50e662902784 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-manifesto_dutch_binary_relevance_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish manifesto_dutch_binary_relevance RoBertaForSequenceClassification from joris +author: John Snow Labs +name: manifesto_dutch_binary_relevance +date: 2023-11-30 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `manifesto_dutch_binary_relevance` is a Dutch, Flemish model originally trained by joris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/manifesto_dutch_binary_relevance_nl_5.2.0_3.0_1701346739961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/manifesto_dutch_binary_relevance_nl_5.2.0_3.0_1701346739961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("manifesto_dutch_binary_relevance","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("manifesto_dutch_binary_relevance","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|manifesto_dutch_binary_relevance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| + +## References + +https://huggingface.co/joris/manifesto-dutch-binary-relevance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-me_sent_roberta_mr.md b/docs/_posts/ahmedlone127/2023-11-30-me_sent_roberta_mr.md new file mode 100644 index 000000000000..18150e0f0f71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-me_sent_roberta_mr.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Marathi me_sent_roberta RoBertaForSequenceClassification from l3cube-pune +author: John Snow Labs +name: me_sent_roberta +date: 2023-11-30 +tags: [roberta, mr, open_source, sequence_classification, onnx] +task: Text Classification +language: mr +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `me_sent_roberta` is a Marathi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_sent_roberta_mr_5.2.0_3.0_1701385159332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_sent_roberta_mr_5.2.0_3.0_1701385159332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("me_sent_roberta","mr")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("me_sent_roberta","mr") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_sent_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|mr| +|Size:|1.0 GB| + +## References + +https://huggingface.co/l3cube-pune/me-sent-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-medical_infusion_pump_en.md b/docs/_posts/ahmedlone127/2023-11-30-medical_infusion_pump_en.md new file mode 100644 index 000000000000..86e0571b5509 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-medical_infusion_pump_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English medical_infusion_pump RoBertaForSequenceClassification from thearod5 +author: John Snow Labs +name: medical_infusion_pump +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `medical_infusion_pump` is an English model originally trained by thearod5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/medical_infusion_pump_en_5.2.0_3.0_1701350377462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/medical_infusion_pump_en_5.2.0_3.0_1701350377462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("medical_infusion_pump","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("medical_infusion_pump","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|medical_infusion_pump| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.5 MB| + +## References + +https://huggingface.co/thearod5/medical-infusion-pump \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-mental_health_disorder_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-mental_health_disorder_classification_en.md new file mode 100644 index 000000000000..fb0c22d61fbe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-mental_health_disorder_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mental_health_disorder_classification RoBertaForSequenceClassification from musagani05 +author: John Snow Labs +name: mental_health_disorder_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mental_health_disorder_classification` is an English model originally trained by musagani05. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mental_health_disorder_classification_en_5.2.0_3.0_1701348734052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mental_health_disorder_classification_en_5.2.0_3.0_1701348734052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mental_health_disorder_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mental_health_disorder_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mental_health_disorder_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|458.6 MB| + +## References + +https://huggingface.co/musagani05/mental-health-disorder-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-mindminer_binary_en.md b/docs/_posts/ahmedlone127/2023-11-30-mindminer_binary_en.md new file mode 100644 index 000000000000..48964cdbee7c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-mindminer_binary_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mindminer_binary RoBertaForSequenceClassification from j-hartmann +author: John Snow Labs +name: mindminer_binary +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mindminer_binary` is an English model originally trained by j-hartmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mindminer_binary_en_5.2.0_3.0_1701349013795.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mindminer_binary_en_5.2.0_3.0_1701349013795.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mindminer_binary","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mindminer_binary","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mindminer_binary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|422.1 MB| + +## References + +https://huggingface.co/j-hartmann/MindMiner-Binary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-mindminer_en.md b/docs/_posts/ahmedlone127/2023-11-30-mindminer_en.md new file mode 100644 index 000000000000..579ff77654a4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-mindminer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mindminer RoBertaForSequenceClassification from j-hartmann +author: John Snow Labs +name: mindminer +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mindminer` is an English model originally trained by j-hartmann. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mindminer_en_5.2.0_3.0_1701347392141.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mindminer_en_5.2.0_3.0_1701347392141.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mindminer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mindminer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mindminer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.5 MB| + +## References + +https://huggingface.co/j-hartmann/MindMiner \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-minilmv2_l6_h768_distilled_from_roberta_large_boolq_en.md b/docs/_posts/ahmedlone127/2023-11-30-minilmv2_l6_h768_distilled_from_roberta_large_boolq_en.md new file mode 100644 index 000000000000..a8f2247ebce3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-minilmv2_l6_h768_distilled_from_roberta_large_boolq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English minilmv2_l6_h768_distilled_from_roberta_large_boolq RoBertaForSequenceClassification from nfliu +author: John Snow Labs +name: minilmv2_l6_h768_distilled_from_roberta_large_boolq +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `minilmv2_l6_h768_distilled_from_roberta_large_boolq` is an English model originally trained by nfliu.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_distilled_from_roberta_large_boolq_en_5.2.0_3.0_1701353126716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/minilmv2_l6_h768_distilled_from_roberta_large_boolq_en_5.2.0_3.0_1701353126716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("minilmv2_l6_h768_distilled_from_roberta_large_boolq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("minilmv2_l6_h768_distilled_from_roberta_large_boolq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|minilmv2_l6_h768_distilled_from_roberta_large_boolq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|298.3 MB| + +## References + +https://huggingface.co/nfliu/MiniLMv2-L6-H768-distilled-from-RoBERTa-Large_boolq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-moralstories_roberta_action_context_cls_en.md b/docs/_posts/ahmedlone127/2023-11-30-moralstories_roberta_action_context_cls_en.md new file mode 100644 index 000000000000..b985fd423d20 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-moralstories_roberta_action_context_cls_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English moralstories_roberta_action_context_cls RoBertaForSequenceClassification from gFulvio +author: John Snow Labs +name: moralstories_roberta_action_context_cls +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `moralstories_roberta_action_context_cls` is an English model originally trained by gFulvio.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/moralstories_roberta_action_context_cls_en_5.2.0_3.0_1701384628860.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/moralstories_roberta_action_context_cls_en_5.2.0_3.0_1701384628860.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("moralstories_roberta_action_context_cls","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("moralstories_roberta_action_context_cls","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|moralstories_roberta_action_context_cls| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/gFulvio/moralstories-roberta-action.context-cls \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-movie_review_score_discriminator_eng_en.md b/docs/_posts/ahmedlone127/2023-11-30-movie_review_score_discriminator_eng_en.md new file mode 100644 index 000000000000..9d8417af6a00 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-movie_review_score_discriminator_eng_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English movie_review_score_discriminator_eng RoBertaForSequenceClassification from mdj1412 +author: John Snow Labs +name: movie_review_score_discriminator_eng +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `movie_review_score_discriminator_eng` is an English model originally trained by mdj1412. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/movie_review_score_discriminator_eng_en_5.2.0_3.0_1701388688801.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/movie_review_score_discriminator_eng_en_5.2.0_3.0_1701388688801.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("movie_review_score_discriminator_eng","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("movie_review_score_discriminator_eng","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|movie_review_score_discriminator_eng| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.8 MB| + +## References + +https://huggingface.co/mdj1412/movie_review_score_discriminator_eng \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-msmarco_roberta_for_similarity_en.md b/docs/_posts/ahmedlone127/2023-11-30-msmarco_roberta_for_similarity_en.md new file mode 100644 index 000000000000..402c8023254d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-msmarco_roberta_for_similarity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English msmarco_roberta_for_similarity RoBertaForSequenceClassification from cassiepowell +author: John Snow Labs +name: msmarco_roberta_for_similarity +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `msmarco_roberta_for_similarity` is an English model originally trained by cassiepowell. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/msmarco_roberta_for_similarity_en_5.2.0_3.0_1701359028526.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/msmarco_roberta_for_similarity_en_5.2.0_3.0_1701359028526.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("msmarco_roberta_for_similarity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("msmarco_roberta_for_similarity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|msmarco_roberta_for_similarity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.0 MB| + +## References + +https://huggingface.co/cassiepowell/msmarco-RoBERTa-for-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-nlp_sentimental_analysis_finetuned_using_roberta_base_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-nlp_sentimental_analysis_finetuned_using_roberta_base_model_en.md new file mode 100644 index 000000000000..e0dbd9aac746 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-nlp_sentimental_analysis_finetuned_using_roberta_base_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nlp_sentimental_analysis_finetuned_using_roberta_base_model RoBertaForSequenceClassification from Achar +author: John Snow Labs +name: nlp_sentimental_analysis_finetuned_using_roberta_base_model +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nlp_sentimental_analysis_finetuned_using_roberta_base_model` is an English model originally trained by Achar.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nlp_sentimental_analysis_finetuned_using_roberta_base_model_en_5.2.0_3.0_1701351883340.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nlp_sentimental_analysis_finetuned_using_roberta_base_model_en_5.2.0_3.0_1701351883340.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nlp_sentimental_analysis_finetuned_using_roberta_base_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nlp_sentimental_analysis_finetuned_using_roberta_base_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nlp_sentimental_analysis_finetuned_using_roberta_base_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.3 MB| + +## References + +https://huggingface.co/Achar/NLP-Sentimental-Analysis-Finetuned-using-Roberta-base-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ogbv_gender_twtrobertabase_english_founta_final_en.md b/docs/_posts/ahmedlone127/2023-11-30-ogbv_gender_twtrobertabase_english_founta_final_en.md new file mode 100644 index 000000000000..274f7c14f19e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ogbv_gender_twtrobertabase_english_founta_final_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ogbv_gender_twtrobertabase_english_founta_final RoBertaForSequenceClassification from Maha +author: John Snow Labs +name: ogbv_gender_twtrobertabase_english_founta_final +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ogbv_gender_twtrobertabase_english_founta_final` is an English model originally trained by Maha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_founta_final_en_5.2.0_3.0_1701369702781.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_founta_final_en_5.2.0_3.0_1701369702781.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_founta_final","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_founta_final","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ogbv_gender_twtrobertabase_english_founta_final| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Maha/OGBV-gender-twtrobertabase-en-founta_final \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-oo_method_test_model_bylibrary_en.md b/docs/_posts/ahmedlone127/2023-11-30-oo_method_test_model_bylibrary_en.md new file mode 100644 index 000000000000..db97cb9ad843 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-oo_method_test_model_bylibrary_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English oo_method_test_model_bylibrary RoBertaForSequenceClassification from ejschwartz +author: John Snow Labs +name: oo_method_test_model_bylibrary +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oo_method_test_model_bylibrary` is an English model originally trained by ejschwartz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oo_method_test_model_bylibrary_en_5.2.0_3.0_1701354098146.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oo_method_test_model_bylibrary_en_5.2.0_3.0_1701354098146.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("oo_method_test_model_bylibrary","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("oo_method_test_model_bylibrary","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|oo_method_test_model_bylibrary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|314.0 MB| + +## References + +https://huggingface.co/ejschwartz/oo-method-test-model-bylibrary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-openai_detector_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-openai_detector_base_en.md new file mode 100644 index 000000000000..70f675df3d48 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-openai_detector_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English openai_detector_base RoBertaForSequenceClassification from nbroad +author: John Snow Labs +name: openai_detector_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `openai_detector_base` is an English model originally trained by nbroad. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/openai_detector_base_en_5.2.0_3.0_1701347531485.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/openai_detector_base_en_5.2.0_3.0_1701347531485.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("openai_detector_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("openai_detector_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|openai_detector_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/nbroad/openai-detector-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-opp_115_privacy_contact_information_en.md b/docs/_posts/ahmedlone127/2023-11-30-opp_115_privacy_contact_information_en.md new file mode 100644 index 000000000000..5ffd325e3545 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-opp_115_privacy_contact_information_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English opp_115_privacy_contact_information RoBertaForSequenceClassification from jakariamd +author: John Snow Labs +name: opp_115_privacy_contact_information +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `opp_115_privacy_contact_information` is an English model originally trained by jakariamd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/opp_115_privacy_contact_information_en_5.2.0_3.0_1701346370927.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/opp_115_privacy_contact_information_en_5.2.0_3.0_1701346370927.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("opp_115_privacy_contact_information","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("opp_115_privacy_contact_information","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|opp_115_privacy_contact_information| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.5 MB| + +## References + +https://huggingface.co/jakariamd/opp_115_privacy_contact_information \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-output_en.md b/docs/_posts/ahmedlone127/2023-11-30-output_en.md new file mode 100644 index 000000000000..d70988be18a3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-output_en.md @@ -0,0 +1,92 @@ +--- +layout: model +title: English output DistilBertEmbeddings from soyisauce +author: John Snow Labs +name: output +date: 2023-11-30 +tags: [distilbert, en, open_source, fill_mask, onnx] +task: Embeddings +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: DistilBertEmbeddings +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained DistilBertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `output` is an English model originally trained by soyisauce. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/output_en_5.2.0_3.0_1701362580912.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/output_en_5.2.0_3.0_1701362580912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler() \ + .setInputCol("text") \ + .setOutputCol("documents") + + +embeddings = DistilBertEmbeddings.pretrained("output","en") \ + .setInputCols(["documents","token"]) \ + .setOutputCol("embeddings") + +pipeline = Pipeline().setStages([document_assembler, embeddings]) + +pipelineModel = pipeline.fit(data) + +pipelineDF = pipelineModel.transform(data) +``` +```scala +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("documents") + +val embeddings = DistilBertEmbeddings + .pretrained("output", "en") + .setInputCols(Array("documents","token")) + .setOutputCol("embeddings") + +val pipeline = new Pipeline().setStages(Array(document_assembler, embeddings)) + +val pipelineModel = pipeline.fit(data) + +val pipelineDF = pipelineModel.transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|output| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +References + +https://huggingface.co/soyisauce/output \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-pcl_robertabase_en.md b/docs/_posts/ahmedlone127/2023-11-30-pcl_robertabase_en.md new file mode 100644 index 000000000000..96b75e774f1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-pcl_robertabase_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pcl_robertabase RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: pcl_robertabase +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pcl_robertabase` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pcl_robertabase_en_5.2.0_3.0_1701364714493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pcl_robertabase_en_5.2.0_3.0_1701364714493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("pcl_robertabase","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("pcl_robertabase","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pcl_robertabase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.1 MB| + +## References + +https://huggingface.co/cardiffnlp/pcl_robertabase \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-platzi_distilroberta_base_mrpc_glue_edgar_elias_en.md b/docs/_posts/ahmedlone127/2023-11-30-platzi_distilroberta_base_mrpc_glue_edgar_elias_en.md new file mode 100644 index 000000000000..488e8b6071bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-platzi_distilroberta_base_mrpc_glue_edgar_elias_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_glue_edgar_elias RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_glue_edgar_elias +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue_edgar_elias` is an English model originally trained by platzi. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_edgar_elias_en_5.2.0_3.0_1701362279932.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_edgar_elias_en_5.2.0_3.0_1701362279932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_edgar_elias","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_edgar_elias","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue_edgar_elias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-edgar-elias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-polish_bert_en.md b/docs/_posts/ahmedlone127/2023-11-30-polish_bert_en.md new file mode 100644 index 000000000000..c2923a5f3f12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-polish_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English polish_bert RoBertaForSequenceClassification from thearod5 +author: John Snow Labs +name: polish_bert +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `polish_bert` is an English model originally trained by thearod5. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/polish_bert_en_5.2.0_3.0_1701363669434.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/polish_bert_en_5.2.0_3.0_1701363669434.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("polish_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("polish_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|polish_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/thearod5/pl-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-privacy_intent_en.md b/docs/_posts/ahmedlone127/2023-11-30-privacy_intent_en.md new file mode 100644 index 000000000000..bee89ff8bf3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-privacy_intent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English privacy_intent RoBertaForSequenceClassification from remzicam +author: John Snow Labs +name: privacy_intent +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `privacy_intent` is an English model originally trained by remzicam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/privacy_intent_en_5.2.0_3.0_1701351223346.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/privacy_intent_en_5.2.0_3.0_1701351223346.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("privacy_intent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("privacy_intent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|privacy_intent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.5 MB| + +## References + +https://huggingface.co/remzicam/privacy_intent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-progortu_en.md b/docs/_posts/ahmedlone127/2023-11-30-progortu_en.md new file mode 100644 index 000000000000..401f8224c4cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-progortu_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English progortu RoBertaForSequenceClassification from bibbia +author: John Snow Labs +name: progortu +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `progortu` is an English model originally trained by bibbia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/progortu_en_5.2.0_3.0_1701350459730.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/progortu_en_5.2.0_3.0_1701350459730.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("progortu","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("progortu","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|progortu| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/bibbia/progOrtu \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-prompts_reward_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-prompts_reward_model_en.md new file mode 100644 index 000000000000..dbe2821d5906 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-prompts_reward_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English prompts_reward_model RoBertaForSequenceClassification from toloka +author: John Snow Labs +name: prompts_reward_model +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `prompts_reward_model` is an English model originally trained by toloka. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/prompts_reward_model_en_5.2.0_3.0_1701367347254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/prompts_reward_model_en_5.2.0_3.0_1701367347254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("prompts_reward_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("prompts_reward_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|prompts_reward_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/toloka/prompts_reward_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-quora_roberta_base_navteca_en.md b/docs/_posts/ahmedlone127/2023-11-30-quora_roberta_base_navteca_en.md new file mode 100644 index 000000000000..81fea8683d5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-quora_roberta_base_navteca_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_roberta_base_navteca RoBertaForSequenceClassification from navteca +author: John Snow Labs +name: quora_roberta_base_navteca +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_roberta_base_navteca` is an English model originally trained by navteca. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_roberta_base_navteca_en_5.2.0_3.0_1701347917689.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_roberta_base_navteca_en_5.2.0_3.0_1701347917689.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_base_navteca","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_base_navteca","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_roberta_base_navteca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.4 MB| + +## References + +https://huggingface.co/navteca/quora-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-racism_es.md b/docs/_posts/ahmedlone127/2023-11-30-racism_es.md new file mode 100644 index 000000000000..d62dcc34838b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-racism_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish racism RoBertaForSequenceClassification from davidmasip +author: John Snow Labs +name: racism +date: 2023-11-30 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `racism` is a Castilian, Spanish model originally trained by davidmasip. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/racism_es_5.2.0_3.0_1701348536481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/racism_es_5.2.0_3.0_1701348536481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|racism| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|447.9 MB| + +## References + +https://huggingface.co/davidmasip/racism \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-racism_finetuned_detests_en.md b/docs/_posts/ahmedlone127/2023-11-30-racism_finetuned_detests_en.md new file mode 100644 index 000000000000..20800ce9ae37 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-racism_finetuned_detests_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English racism_finetuned_detests RoBertaForSequenceClassification from Pablo94 +author: John Snow Labs +name: racism_finetuned_detests +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `racism_finetuned_detests` is an English model originally trained by Pablo94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/racism_finetuned_detests_en_5.2.0_3.0_1701349583186.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/racism_finetuned_detests_en_5.2.0_3.0_1701349583186.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism_finetuned_detests","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism_finetuned_detests","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|racism_finetuned_detests| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.6 MB| + +## References + +https://huggingface.co/Pablo94/racism-finetuned-detests \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-real_and_fake_news_detection_en.md b/docs/_posts/ahmedlone127/2023-11-30-real_and_fake_news_detection_en.md new file mode 100644 index 000000000000..29ddc9f77dd4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-real_and_fake_news_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English real_and_fake_news_detection RoBertaForSequenceClassification from MYC007 +author: John Snow Labs +name: real_and_fake_news_detection +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `real_and_fake_news_detection` is an English model originally trained by MYC007. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/real_and_fake_news_detection_en_5.2.0_3.0_1701348315692.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/real_and_fake_news_detection_en_5.2.0_3.0_1701348315692.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("real_and_fake_news_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("real_and_fake_news_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|real_and_fake_news_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|459.1 MB| + +## References + +https://huggingface.co/MYC007/Real-and-Fake-News-Detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_150t_argumentative_sentence_detector_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_150t_argumentative_sentence_detector_en.md new file mode 100644 index 000000000000..767f7a0b1ca1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_150t_argumentative_sentence_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_150t_argumentative_sentence_detector RoBertaForSequenceClassification from pheinisch +author: John Snow Labs +name: roberta_base_150t_argumentative_sentence_detector +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_150t_argumentative_sentence_detector` is an English model originally trained by pheinisch. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_150t_argumentative_sentence_detector_en_5.2.0_3.0_1701384326641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_150t_argumentative_sentence_detector_en_5.2.0_3.0_1701384326641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_150t_argumentative_sentence_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_150t_argumentative_sentence_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_150t_argumentative_sentence_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.3 MB| + +## References + +https://huggingface.co/pheinisch/roberta-base-150T-argumentative-sentence-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_brachio99_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_brachio99_en.md new file mode 100644 index 000000000000..5df2793acac1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_brachio99_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_ag_news_brachio99 RoBertaForSequenceClassification from brachio99 +author: John Snow Labs +name: roberta_base_ag_news_brachio99 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ag_news_brachio99` is an English model originally trained by brachio99. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_brachio99_en_5.2.0_3.0_1701356270334.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_brachio99_en_5.2.0_3.0_1701356270334.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_brachio99","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_brachio99","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ag_news_brachio99| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.8 MB| + +## References + +https://huggingface.co/brachio99/roberta-base_ag_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_gaionl_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_gaionl_en.md new file mode 100644 index 000000000000..3ad9ab4b70f1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_ag_news_gaionl_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_ag_news_gaionl RoBertaForSequenceClassification from gaioNL +author: John Snow Labs +name: roberta_base_ag_news_gaionl +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ag_news_gaionl` is an English model originally trained by gaioNL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_gaionl_en_5.2.0_3.0_1701383995997.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_gaionl_en_5.2.0_3.0_1701383995997.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_gaionl","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_gaionl","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ag_news_gaionl| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.4 MB| + +## References + +https://huggingface.co/gaioNL/roberta-base_ag_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_aita_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_aita_en.md new file mode 100644 index 000000000000..0d9fd850b8a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_aita_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_aita RoBertaForSequenceClassification from cnamuangtoun +author: John Snow Labs +name: roberta_base_aita +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_aita` is an English model originally trained by cnamuangtoun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_aita_en_5.2.0_3.0_1701384208793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_aita_en_5.2.0_3.0_1701384208793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_aita","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_aita","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_aita| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.8 MB| + +## References + +https://huggingface.co/cnamuangtoun/roberta-base-aita \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_eugenia_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_eugenia_en.md new file mode 100644 index 000000000000..994792a80e40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_eugenia_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_eugenia RoBertaForSequenceClassification from Eugenia +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_eugenia +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_eugenia` is an English model originally trained by Eugenia. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_eugenia_en_5.2.0_3.0_1701350153604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_eugenia_en_5.2.0_3.0_1701350153604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_eugenia","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_eugenia","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_eugenia| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/Eugenia/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_hormigo_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_hormigo_en.md new file mode 100644 index 000000000000..e91a0f6e427a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_hormigo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_hormigo RoBertaForSequenceClassification from Hormigo +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_hormigo +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_hormigo` is an English model originally trained by Hormigo. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_hormigo_en_5.2.0_3.0_1701346515199.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_hormigo_en_5.2.0_3.0_1701346515199.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_hormigo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_hormigo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_hormigo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/Hormigo/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_leanai_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_leanai_en.md new file mode 100644 index 000000000000..54580a4678d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_amazon_reviews_multi_leanai_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_leanai RoBertaForSequenceClassification from LeanAI +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_leanai +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_leanai` is an English model originally trained by LeanAI. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_leanai_en_5.2.0_3.0_1701370051882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_leanai_en_5.2.0_3.0_1701370051882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_leanai","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_leanai","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_leanai| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|425.0 MB| + +## References + +https://huggingface.co/LeanAI/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_analisis_de_sentimientos_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_analisis_de_sentimientos_en.md new file mode 100644 index 000000000000..8808dce73b39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_analisis_de_sentimientos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_analisis_de_sentimientos RoBertaForSequenceClassification from DataPath +author: John Snow Labs +name: roberta_base_bne_finetuned_analisis_de_sentimientos +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_analisis_de_sentimientos` is an English model originally trained by DataPath. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_analisis_de_sentimientos_en_5.2.0_3.0_1701346132130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_analisis_de_sentimientos_en_5.2.0_3.0_1701346132130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_analisis_de_sentimientos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_analisis_de_sentimientos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_analisis_de_sentimientos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|451.0 MB| + +## References + +https://huggingface.co/DataPath/roberta-base-bne-finetuned-Analisis_De_Sentimientos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta_en.md new file mode 100644 index 000000000000..3a6e254cf888 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta RoBertaForSequenceClassification from rvrtdta +author: John Snow Labs +name: roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta` is an English model originally trained by rvrtdta. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta_en_5.2.0_3.0_1701353469492.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta_en_5.2.0_3.0_1701353469492.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_meia_analisisdesentimientos_rvrtdta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.9 MB| + +## References + +https://huggingface.co/rvrtdta/roberta-base-bne-finetuned-MeIA-AnalisisDeSentimientos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1_en.md new file mode 100644 index 000000000000..987814db0501 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1 RoBertaForSequenceClassification from vg055 +author: John Snow Labs +name: roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1` is an English model originally trained by vg055. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1_en_5.2.0_3.0_1701384213559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1_en_5.2.0_3.0_1701384213559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_tripadvisordomainadaptation_finetuned_e2_restmex2023_polaridadda_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.2 MB| + +## References + +https://huggingface.co/vg055/roberta-base-bne-finetuned-TripAdvisorDomainAdaptation-finetuned-e2-RestMex2023-polaridadDA-V1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_irony_es.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_irony_es.md new file mode 100644 index 000000000000..97146f2cd894 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_irony_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_base_bne_irony RoBertaForSequenceClassification from dtomas +author: John Snow Labs +name: roberta_base_bne_irony +date: 2023-11-30 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_irony` is a Castilian, Spanish model originally trained by dtomas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_irony_es_5.2.0_3.0_1701351027519.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_irony_es_5.2.0_3.0_1701351027519.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_irony","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_irony","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_irony| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|428.9 MB| + +## References + +https://huggingface.co/dtomas/roberta-base-bne-irony \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_mldoc_es.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_mldoc_es.md new file mode 100644 index 000000000000..5afc5022b559 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_bne_mldoc_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_base_bne_mldoc RoBertaForSequenceClassification from PlanTL-GOB-ES +author: John Snow Labs +name: roberta_base_bne_mldoc +date: 2023-11-30 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_mldoc` is a Castilian, Spanish model originally trained by PlanTL-GOB-ES. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_mldoc_es_5.2.0_3.0_1701373718638.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_mldoc_es_5.2.0_3.0_1701373718638.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_mldoc","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_mldoc","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_mldoc| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|448.4 MB| + +## References + +https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne-mldoc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_cefr_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_cefr_en.md new file mode 100644 index 000000000000..d0f17d2b46d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_cefr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_cefr RoBertaForSequenceClassification from caleb-edukita +author: John Snow Labs +name: roberta_base_cefr +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_cefr` is a English model originally trained by caleb-edukita. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_cefr_en_5.2.0_3.0_1701388465497.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_cefr_en_5.2.0_3.0_1701388465497.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cefr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cefr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_cefr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.8 MB| + +## References + +https://huggingface.co/caleb-edukita/roberta-base_cefr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_clickbait_keywords_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_clickbait_keywords_en.md new file mode 100644 index 000000000000..1288fae337ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_clickbait_keywords_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_clickbait_keywords RoBertaForSequenceClassification from Stremie +author: John Snow Labs +name: roberta_base_clickbait_keywords +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_clickbait_keywords` is a English model originally trained by Stremie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_clickbait_keywords_en_5.2.0_3.0_1701351874617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_clickbait_keywords_en_5.2.0_3.0_1701351874617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_clickbait_keywords","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_clickbait_keywords","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_clickbait_keywords| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|451.1 MB| + +## References + +https://huggingface.co/Stremie/roberta-base-clickbait-keywords \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_disaster_tweets_squall_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_disaster_tweets_squall_en.md new file mode 100644 index 000000000000..db7ae8d4d315 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_disaster_tweets_squall_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_disaster_tweets_squall RoBertaForSequenceClassification from maxschlake +author: John Snow Labs +name: roberta_base_disaster_tweets_squall +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_disaster_tweets_squall` is a English model originally trained by maxschlake. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_disaster_tweets_squall_en_5.2.0_3.0_1701346883180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_disaster_tweets_squall_en_5.2.0_3.0_1701346883180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_disaster_tweets_squall","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_disaster_tweets_squall","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_disaster_tweets_squall| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|442.3 MB| + +## References + +https://huggingface.co/maxschlake/roberta-base_disaster_tweets_squall \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_discourse_marker_prediction_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_discourse_marker_prediction_en.md new file mode 100644 index 000000000000..4aa4803f5b2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_discourse_marker_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_discourse_marker_prediction RoBertaForSequenceClassification from sileod +author: John Snow Labs +name: roberta_base_discourse_marker_prediction +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_discourse_marker_prediction` is a English model originally trained by sileod. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_discourse_marker_prediction_en_5.2.0_3.0_1701364838792.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_discourse_marker_prediction_en_5.2.0_3.0_1701364838792.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_discourse_marker_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_discourse_marker_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_discourse_marker_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.8 MB| + +## References + +https://huggingface.co/sileod/roberta-base-discourse-marker-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_e_snli_classification_nli_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_e_snli_classification_nli_base_en.md new file mode 100644 index 000000000000..12968634819e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_e_snli_classification_nli_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_e_snli_classification_nli_base RoBertaForSequenceClassification from k4black +author: John Snow Labs +name: roberta_base_e_snli_classification_nli_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_e_snli_classification_nli_base` is a English model originally trained by k4black. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_e_snli_classification_nli_base_en_5.2.0_3.0_1701347728777.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_e_snli_classification_nli_base_en_5.2.0_3.0_1701347728777.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_e_snli_classification_nli_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_e_snli_classification_nli_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_e_snli_classification_nli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.6 MB| + +## References + +https://huggingface.co/k4black/roberta-base-e-snli-classification-nli-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emoji_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emoji_en.md new file mode 100644 index 000000000000..d0ff2574ec28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emoji_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_emoji RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_emoji +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_emoji` is a English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_emoji_en_5.2.0_3.0_1701368289244.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_emoji_en_5.2.0_3.0_1701368289244.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emoji","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emoji","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_emoji| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.6 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-emoji \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_cardiffnlp_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_cardiffnlp_en.md new file mode 100644 index 000000000000..437e102d3a40 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_cardiffnlp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_emotion_cardiffnlp RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_emotion_cardiffnlp +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_emotion_cardiffnlp` is a English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_cardiffnlp_en_5.2.0_3.0_1701353814117.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_cardiffnlp_en_5.2.0_3.0_1701353814117.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_cardiffnlp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_cardiffnlp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_emotion_cardiffnlp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.5 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_honours_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_honours_en.md new file mode 100644 index 000000000000..3027ba1d3263 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_emotion_honours_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_emotion_honours RoBertaForSequenceClassification from L-40408203 +author: John Snow Labs +name: roberta_base_emotion_honours +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_emotion_honours` is a English model originally trained by L-40408203. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_honours_en_5.2.0_3.0_1701347457155.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_emotion_honours_en_5.2.0_3.0_1701347457155.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_honours","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_emotion_honours","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_emotion_honours| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.4 MB| + +## References + +https://huggingface.co/L-40408203/roberta-base-emotion-honours \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_empathy_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_empathy_en.md new file mode 100644 index 000000000000..27d224024afb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_empathy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_empathy RoBertaForSequenceClassification from bdotloh +author: John Snow Labs +name: roberta_base_empathy +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_empathy` is a English model originally trained by bdotloh. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_empathy_en_5.2.0_3.0_1701384711112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_empathy_en_5.2.0_3.0_1701384711112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_empathy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_empathy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_empathy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|434.7 MB| + +## References + +https://huggingface.co/bdotloh/roberta-base-empathy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_emotion_en.md new file mode 100644 index 000000000000..fcb12e93cd63 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_emotion RoBertaForSequenceClassification from MuntasirHossain +author: John Snow Labs +name: roberta_base_finetuned_emotion +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`roberta_base_finetuned_emotion` is a English model originally trained by MuntasirHossain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_emotion_en_5.2.0_3.0_1701347868226.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_emotion_en_5.2.0_3.0_1701347868226.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/MuntasirHossain/RoBERTa-base-finetuned-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_irony_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_irony_en.md new file mode 100644 index 000000000000..36b051ece462 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_irony_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_irony RoBertaForSequenceClassification from vikram71198 +author: John Snow Labs +name: roberta_base_finetuned_irony +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_irony` is an English model originally trained by vikram71198. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_irony_en_5.2.0_3.0_1701385244623.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_irony_en_5.2.0_3.0_1701385244623.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_irony","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_irony","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_irony| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.4 MB| + +## References + +https://huggingface.co/vikram71198/roberta-base-finetuned-irony \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sdg_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sdg_en.md new file mode 100644 index 000000000000..d53ce1ed622f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sdg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_sdg RoBertaForSequenceClassification from jonas +author: John Snow Labs +name: roberta_base_finetuned_sdg +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_sdg` is an English model originally trained by jonas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sdg_en_5.2.0_3.0_1701369210757.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sdg_en_5.2.0_3.0_1701369210757.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sdg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sdg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_sdg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.9 MB| + +## References + +https://huggingface.co/jonas/roberta-base-finetuned-sdg \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sst2_bhumika_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sst2_bhumika_en.md new file mode 100644 index 000000000000..0068d38ca6de --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_finetuned_sst2_bhumika_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_sst2_bhumika RoBertaForSequenceClassification from Bhumika +author: John Snow Labs +name: roberta_base_finetuned_sst2_bhumika +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_sst2_bhumika` is an English model originally trained by Bhumika. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sst2_bhumika_en_5.2.0_3.0_1701351567338.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_sst2_bhumika_en_5.2.0_3.0_1701351567338.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sst2_bhumika","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_sst2_bhumika","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_sst2_bhumika| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.5 MB| + +## References + +https://huggingface.co/Bhumika/roberta-base-finetuned-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_md_gender_bias_saved_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_md_gender_bias_saved_en.md new file mode 100644 index 000000000000..4ecb5c188dd9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_md_gender_bias_saved_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_md_gender_bias_saved RoBertaForSequenceClassification from thaile +author: John Snow Labs +name: roberta_base_md_gender_bias_saved +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_md_gender_bias_saved` is an English model originally trained by thaile. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_md_gender_bias_saved_en_5.2.0_3.0_1701349737493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_md_gender_bias_saved_en_5.2.0_3.0_1701349737493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_md_gender_bias_saved","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_md_gender_bias_saved","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_md_gender_bias_saved| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.3 MB| + +## References + +https://huggingface.co/thaile/roberta-base-md_gender_bias-saved \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_nordea_page_types_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_nordea_page_types_en.md new file mode 100644 index 000000000000..8a9beb83a9af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_nordea_page_types_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_nordea_page_types RoBertaForSequenceClassification from arved +author: John Snow Labs +name: roberta_base_nordea_page_types +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_nordea_page_types` is an English model originally trained by arved. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_nordea_page_types_en_5.2.0_3.0_1701384912841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_nordea_page_types_en_5.2.0_3.0_1701384912841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_nordea_page_types","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_nordea_page_types","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_nordea_page_types| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.4 MB| + +## References + +https://huggingface.co/arved/roberta-base_nordea_page_types \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_qnli_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_qnli_en.md new file mode 100644 index 000000000000..db070aa7335b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_qnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_qnli RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_qnli +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_qnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_qnli_en_5.2.0_3.0_1701349320870.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_qnli_en_5.2.0_3.0_1701349320870.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_qnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.0 MB| + +## References + +https://huggingface.co/textattack/roberta-base-QNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_reward_model_falcon_dolly_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_reward_model_falcon_dolly_en.md new file mode 100644 index 000000000000..6c2149745608 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_reward_model_falcon_dolly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_reward_model_falcon_dolly RoBertaForSequenceClassification from argilla +author: John Snow Labs +name: roberta_base_reward_model_falcon_dolly +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_reward_model_falcon_dolly` is an English model originally trained by argilla. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_reward_model_falcon_dolly_en_5.2.0_3.0_1701351780505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_reward_model_falcon_dolly_en_5.2.0_3.0_1701351780505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_reward_model_falcon_dolly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_reward_model_falcon_dolly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_reward_model_falcon_dolly| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/argilla/roberta-base-reward-model-falcon-dolly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst2_willheld_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst2_willheld_en.md new file mode 100644 index 000000000000..8434a01a991f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst2_willheld_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_sst2_willheld RoBertaForSequenceClassification from WillHeld +author: John Snow Labs +name: roberta_base_sst2_willheld +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_sst2_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_sst2_willheld_en_5.2.0_3.0_1701346241671.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_sst2_willheld_en_5.2.0_3.0_1701346241671.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst2_willheld","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst2_willheld","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_sst2_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.3 MB| + +## References + +https://huggingface.co/WillHeld/roberta-base-sst2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst_2_64_13_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst_2_64_13_en.md new file mode 100644 index 000000000000..0bff9891d6af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_sst_2_64_13_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_sst_2_64_13 RoBertaForSequenceClassification from simonycl +author: John Snow Labs +name: roberta_base_sst_2_64_13 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_sst_2_64_13` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_64_13_en_5.2.0_3.0_1701351171067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_64_13_en_5.2.0_3.0_1701351171067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2_64_13","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2_64_13","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_sst_2_64_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.7 MB| + +## References + +https://huggingface.co/simonycl/roberta-base-sst-2-64-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_suicide_prediction_phr_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_suicide_prediction_phr_en.md new file mode 100644 index 000000000000..25e944e0339e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_suicide_prediction_phr_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_suicide_prediction_phr RoBertaForSequenceClassification from vibhorag101 +author: John Snow Labs +name: roberta_base_suicide_prediction_phr +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_suicide_prediction_phr` is an English model originally trained by vibhorag101. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_suicide_prediction_phr_en_5.2.0_3.0_1701348741459.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_suicide_prediction_phr_en_5.2.0_3.0_1701348741459.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_suicide_prediction_phr","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_suicide_prediction_phr","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_suicide_prediction_phr| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.2 MB| + +## References + +https://huggingface.co/vibhorag101/roberta-base-suicide-prediction-phr \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_trigger_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_trigger_en.md new file mode 100644 index 000000000000..fbf8e8751619 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_trigger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_trigger RoBertaForSequenceClassification from ucsahin +author: John Snow Labs +name: roberta_base_trigger +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_trigger` is an English model originally trained by ucsahin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_trigger_en_5.2.0_3.0_1701346646988.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_trigger_en_5.2.0_3.0_1701346646988.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_trigger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_trigger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_trigger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.5 MB| + +## References + +https://huggingface.co/ucsahin/roberta-base-trigger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_base_wnli_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_wnli_en.md new file mode 100644 index 000000000000..d1024677c1a8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_base_wnli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_wnli RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_wnli +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_wnli` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_wnli_en_5.2.0_3.0_1701371673042.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_wnli_en_5.2.0_3.0_1701371673042.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_wnli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_wnli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_wnli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.6 MB| + +## References + +https://huggingface.co/textattack/roberta-base-WNLI \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_bias_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_bias_en.md new file mode 100644 index 000000000000..2aa160501052 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_bias_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_bias RoBertaForSequenceClassification from aosaf +author: John Snow Labs +name: roberta_bias +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_bias` is an English model originally trained by aosaf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_bias_en_5.2.0_3.0_1701350080711.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_bias_en_5.2.0_3.0_1701350080711.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_bias","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_bias","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_bias| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.7 MB| + +## References + +https://huggingface.co/aosaf/roberta-bias \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_fake_news_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_fake_news_en.md new file mode 100644 index 000000000000..a2d6b9ccf35a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_fake_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_fake_news RoBertaForSequenceClassification from ghanashyamvtatti +author: John Snow Labs +name: roberta_fake_news +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_fake_news` is an English model originally trained by ghanashyamvtatti. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_fake_news_en_5.2.0_3.0_1701346619145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_fake_news_en_5.2.0_3.0_1701346619145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fake_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fake_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_fake_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|459.1 MB| + +## References + +https://huggingface.co/ghanashyamvtatti/roberta-fake-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_fine_tuned_sentiment_sst3_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_fine_tuned_sentiment_sst3_en.md new file mode 100644 index 000000000000..2293f994efd6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_fine_tuned_sentiment_sst3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_fine_tuned_sentiment_sst3 RoBertaForSequenceClassification from RogerKam +author: John Snow Labs +name: roberta_fine_tuned_sentiment_sst3 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_fine_tuned_sentiment_sst3` is an English model originally trained by RogerKam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_sst3_en_5.2.0_3.0_1701349228823.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_sst3_en_5.2.0_3.0_1701349228823.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_sst3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_sst3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_fine_tuned_sentiment_sst3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.8 MB| + +## References + +https://huggingface.co/RogerKam/roberta_fine_tuned_sentiment_sst3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_finetuned_cpv_spanish_mnavas_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_finetuned_cpv_spanish_mnavas_en.md new file mode 100644 index 000000000000..5e64aad9fb12 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_finetuned_cpv_spanish_mnavas_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_cpv_spanish_mnavas RoBertaForSequenceClassification from mnavas +author: John Snow Labs +name: roberta_finetuned_cpv_spanish_mnavas +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_cpv_spanish_mnavas` is an English model originally trained by mnavas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_cpv_spanish_mnavas_en_5.2.0_3.0_1701384485121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_cpv_spanish_mnavas_en_5.2.0_3.0_1701384485121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_cpv_spanish_mnavas","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_cpv_spanish_mnavas","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_cpv_spanish_mnavas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.6 MB| + +## References + +https://huggingface.co/mnavas/roberta-finetuned-CPV_Spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_action_context_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_action_context_en.md new file mode 100644 index 000000000000..8ab03d61dbf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_action_context_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_action_context RoBertaForSequenceClassification from moralstories +author: John Snow Labs +name: roberta_large_action_context +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_action_context` is an English model originally trained by moralstories. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_action_context_en_5.2.0_3.0_1701347458679.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_action_context_en_5.2.0_3.0_1701347458679.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_action_context","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_action_context","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_action_context| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/moralstories/roberta-large_action-context \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_argugpt_sent_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_argugpt_sent_en.md new file mode 100644 index 000000000000..606a58719fe5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_argugpt_sent_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_argugpt_sent RoBertaForSequenceClassification from SJTU-CL +author: John Snow Labs +name: roberta_large_argugpt_sent +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_argugpt_sent` is an English model originally trained by SJTU-CL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_argugpt_sent_en_5.2.0_3.0_1701350966110.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_argugpt_sent_en_5.2.0_3.0_1701350966110.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_argugpt_sent","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_argugpt_sent","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_argugpt_sent| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/SJTU-CL/RoBERTa-large-ArguGPT-sent \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_paraphrase_ca.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_paraphrase_ca.md new file mode 100644 index 000000000000..30ec4105ad62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_paraphrase_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_large_catalan_paraphrase RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_large_catalan_paraphrase +date: 2023-11-30 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_catalan_paraphrase` is a Catalan, Valencian model originally trained by projecte-aina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_catalan_paraphrase_ca_5.2.0_3.0_1701385041236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_catalan_paraphrase_ca_5.2.0_3.0_1701385041236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_catalan_paraphrase","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_catalan_paraphrase","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_catalan_paraphrase| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|1.3 GB| + +## References + +https://huggingface.co/projecte-aina/roberta-large-ca-paraphrase \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_v2_massive_ca.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_v2_massive_ca.md new file mode 100644 index 000000000000..8cb4ba391787 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_catalan_v2_massive_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_large_catalan_v2_massive RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_large_catalan_v2_massive +date: 2023-11-30 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_catalan_v2_massive` is a Catalan, Valencian model originally trained by projecte-aina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_catalan_v2_massive_ca_5.2.0_3.0_1701352763614.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_catalan_v2_massive_ca_5.2.0_3.0_1701352763614.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_catalan_v2_massive","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_catalan_v2_massive","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_catalan_v2_massive| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|1.3 GB| + +## References + +https://huggingface.co/projecte-aina/roberta-large-ca-v2-massive \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_cola_howey_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_cola_howey_en.md new file mode 100644 index 000000000000..bbfd8c738c22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_cola_howey_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_cola_howey RoBertaForSequenceClassification from howey +author: John Snow Labs +name: roberta_large_cola_howey +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_cola_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_cola_howey_en_5.2.0_3.0_1701352967174.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_cola_howey_en_5.2.0_3.0_1701352967174.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_cola_howey","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_cola_howey","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_cola_howey| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/howey/roberta-large-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_financial_news_topics_english_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_financial_news_topics_english_en.md new file mode 100644 index 000000000000..843b7b037194 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_financial_news_topics_english_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_financial_news_topics_english RoBertaForSequenceClassification from Jean-Baptiste +author: John Snow Labs +name: roberta_large_financial_news_topics_english +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_financial_news_topics_english` is an English model originally trained by Jean-Baptiste.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_financial_news_topics_english_en_5.2.0_3.0_1701350486318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_financial_news_topics_english_en_5.2.0_3.0_1701350486318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_financial_news_topics_english","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_financial_news_topics_english","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_financial_news_topics_english| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Jean-Baptiste/roberta-large-financial-news-topics-en \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_go_emotions_2_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_go_emotions_2_en.md new file mode 100644 index 000000000000..e1aa15844c1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_go_emotions_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_go_emotions_2 RoBertaForSequenceClassification from tasinhoque +author: John Snow Labs +name: roberta_large_go_emotions_2 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_go_emotions_2` is an English model originally trained by tasinhoque. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_go_emotions_2_en_5.2.0_3.0_1701349707001.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_go_emotions_2_en_5.2.0_3.0_1701349707001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_go_emotions_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_go_emotions_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_go_emotions_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tasinhoque/roberta-large-go-emotions-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_lora_token_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_lora_token_classification_en.md new file mode 100644 index 000000000000..486ac27eb4ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_lora_token_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_lora_token_classification RoBertaForSequenceClassification from szerinted +author: John Snow Labs +name: roberta_large_lora_token_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_lora_token_classification` is an English model originally trained by szerinted. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_lora_token_classification_en_5.2.0_3.0_1701352579582.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_lora_token_classification_en_5.2.0_3.0_1701352579582.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_lora_token_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_lora_token_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_lora_token_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|846.4 MB| + +## References + +https://huggingface.co/szerinted/roberta-large-lora-token-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_mrpc_howey_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_mrpc_howey_en.md new file mode 100644 index 000000000000..3879b3ab8f4d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_mrpc_howey_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_mrpc_howey RoBertaForSequenceClassification from howey +author: John Snow Labs +name: roberta_large_mrpc_howey +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_mrpc_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_mrpc_howey_en_5.2.0_3.0_1701347659360.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_mrpc_howey_en_5.2.0_3.0_1701347659360.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_mrpc_howey","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_mrpc_howey","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_mrpc_howey| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/howey/roberta-large-mrpc \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_qqp_howey_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_qqp_howey_en.md new file mode 100644 index 000000000000..41a505220f43 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_qqp_howey_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_qqp_howey RoBertaForSequenceClassification from howey +author: John Snow Labs +name: roberta_large_qqp_howey +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_qqp_howey` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_qqp_howey_en_5.2.0_3.0_1701348308735.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_qqp_howey_en_5.2.0_3.0_1701348308735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qqp_howey","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qqp_howey","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_qqp_howey| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/howey/roberta-large-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_question_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_question_classifier_en.md new file mode 100644 index 000000000000..7799782110ec --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_question_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_question_classifier RoBertaForSequenceClassification from jantrienes +author: John Snow Labs +name: roberta_large_question_classifier +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_question_classifier` is an English model originally trained by jantrienes. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_question_classifier_en_5.2.0_3.0_1701349376626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_question_classifier_en_5.2.0_3.0_1701349376626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_question_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_question_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_question_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/jantrienes/roberta-large-question-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_rte_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_rte_en.md new file mode 100644 index 000000000000..7fbc23393362 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_rte_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_rte RoBertaForSequenceClassification from howey +author: John Snow Labs +name: roberta_large_rte +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_rte` is an English model originally trained by howey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_rte_en_5.2.0_3.0_1701347091612.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_rte_en_5.2.0_3.0_1701347091612.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_rte","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_rte","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_rte| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/howey/roberta-large-rte \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_squad_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_squad_classification_en.md new file mode 100644 index 000000000000..fffced7d4d9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_squad_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_squad_classification RoBertaForSequenceClassification from aware-ai +author: John Snow Labs +name: roberta_large_squad_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_squad_classification` is an English model originally trained by aware-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_squad_classification_en_5.2.0_3.0_1701348540009.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_squad_classification_en_5.2.0_3.0_1701348540009.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_squad_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_squad_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_squad_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|845.8 MB| + +## References + +https://huggingface.co/aware-ai/roberta-large-squad-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sst_2_64_13_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sst_2_64_13_en.md new file mode 100644 index 000000000000..c13a1c443c14 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sst_2_64_13_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_sst_2_64_13 RoBertaForSequenceClassification from simonycl +author: John Snow Labs +name: roberta_large_sst_2_64_13 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_sst_2_64_13` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_sst_2_64_13_en_5.2.0_3.0_1701353583491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_sst_2_64_13_en_5.2.0_3.0_1701353583491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sst_2_64_13","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sst_2_64_13","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_sst_2_64_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/simonycl/roberta-large-sst-2-64-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sts_b_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sts_b_en.md new file mode 100644 index 000000000000..ea3ffae3ad05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_sts_b_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_sts_b RoBertaForSequenceClassification from SparkBeyond +author: John Snow Labs +name: roberta_large_sts_b +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_sts_b` is an English model originally trained by SparkBeyond. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_sts_b_en_5.2.0_3.0_1701386785089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_sts_b_en_5.2.0_3.0_1701386785089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sts_b","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sts_b","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_sts_b| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/SparkBeyond/roberta-large-sts-b \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_tweet_topic_multi_all_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_tweet_topic_multi_all_en.md new file mode 100644 index 000000000000..0b1c11acafcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_tweet_topic_multi_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_tweet_topic_multi_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_large_tweet_topic_multi_all +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_tweet_topic_multi_all` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_tweet_topic_multi_all_en_5.2.0_3.0_1701350235190.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_tweet_topic_multi_all_en_5.2.0_3.0_1701350235190.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_tweet_topic_multi_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_tweet_topic_multi_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_tweet_topic_multi_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/cardiffnlp/roberta-large-tweet-topic-multi-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_en.md new file mode 100644 index 000000000000..dfda6e91bb6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_vira_intents RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: roberta_large_vira_intents +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_vira_intents` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_en_5.2.0_3.0_1701351358901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_en_5.2.0_3.0_1701351358901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_vira_intents| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ibm/roberta-large-vira-intents \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_en.md new file mode 100644 index 000000000000..e06b50830d6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_vira_intents_mod RoBertaForSequenceClassification from vira-chatbot +author: John Snow Labs +name: roberta_large_vira_intents_mod +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_vira_intents_mod` is an English model originally trained by vira-chatbot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_en_5.2.0_3.0_1701348088180.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_en_5.2.0_3.0_1701348088180.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_vira_intents_mod| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/vira-chatbot/roberta-large-vira-intents-mod \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_gpt4_data_aug_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_gpt4_data_aug_en.md new file mode 100644 index 000000000000..58b67c27fd81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_large_vira_intents_mod_gpt4_data_aug_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_vira_intents_mod_gpt4_data_aug RoBertaForSequenceClassification from vira-chatbot +author: John Snow Labs +name: roberta_large_vira_intents_mod_gpt4_data_aug +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_vira_intents_mod_gpt4_data_aug` is an English model originally trained by vira-chatbot.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_gpt4_data_aug_en_5.2.0_3.0_1701354004993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_gpt4_data_aug_en_5.2.0_3.0_1701354004993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod_gpt4_data_aug","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod_gpt4_data_aug","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_vira_intents_mod_gpt4_data_aug| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/vira-chatbot/roberta-large-vira-intents-mod-gpt4-data-aug \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_news_duplicates_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_news_duplicates_en.md new file mode 100644 index 000000000000..667c20b78da0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_news_duplicates_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_news_duplicates RoBertaForSequenceClassification from vslaykovsky +author: John Snow Labs +name: roberta_news_duplicates +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_news_duplicates` is an English model originally trained by vslaykovsky. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_news_duplicates_en_5.2.0_3.0_1701352082127.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_news_duplicates_en_5.2.0_3.0_1701352082127.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_news_duplicates","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_news_duplicates","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_news_duplicates| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.9 MB| + +## References + +https://huggingface.co/vslaykovsky/roberta-news-duplicates \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_nrc_anger_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_nrc_anger_en.md new file mode 100644 index 000000000000..befa9daa40fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_nrc_anger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_nrc_anger RoBertaForSequenceClassification from neal49 +author: John Snow Labs +name: roberta_nrc_anger +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_nrc_anger` is an English model originally trained by neal49. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_nrc_anger_en_5.2.0_3.0_1701359494303.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_nrc_anger_en_5.2.0_3.0_1701359494303.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nrc_anger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nrc_anger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_nrc_anger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.6 MB| + +## References + +https://huggingface.co/neal49/roberta-nrc-anger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_sentiment_classifier_vinal_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_sentiment_classifier_vinal_en.md new file mode 100644 index 000000000000..aacb6cd93566 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_sentiment_classifier_vinal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_sentiment_classifier_vinal RoBertaForSequenceClassification from VINAL +author: John Snow Labs +name: roberta_sentiment_classifier_vinal +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiment_classifier_vinal` is an English model originally trained by VINAL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_vinal_en_5.2.0_3.0_1701361164675.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_vinal_en_5.2.0_3.0_1701361164675.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_vinal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_vinal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sentiment_classifier_vinal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.2 MB| + +## References + +https://huggingface.co/VINAL/Roberta-Sentiment-Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_spanish_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_spanish_sentiment_en.md new file mode 100644 index 000000000000..c43de4d5e4e4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_spanish_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_spanish_sentiment RoBertaForSequenceClassification from pysentimiento +author: John Snow Labs +name: roberta_spanish_sentiment +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_spanish_sentiment` is an English model originally trained by pysentimiento. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_spanish_sentiment_en_5.2.0_3.0_1701348521613.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_spanish_sentiment_en_5.2.0_3.0_1701348521613.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_spanish_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_spanish_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_spanish_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.6 MB| + +## References + +https://huggingface.co/pysentimiento/roberta-es-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-roberta_swahili_news_classification_sw.md b/docs/_posts/ahmedlone127/2023-11-30-roberta_swahili_news_classification_sw.md new file mode 100644 index 000000000000..d6445431162b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-roberta_swahili_news_classification_sw.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Swahili (macrolanguage) roberta_swahili_news_classification RoBertaForSequenceClassification from flax-community +author: John Snow Labs +name: roberta_swahili_news_classification +date: 2023-11-30 +tags: [roberta, sw, open_source, sequence_classification, onnx] +task: Text Classification +language: sw +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_swahili_news_classification` is a Swahili (macrolanguage) model originally trained by flax-community. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_swahili_news_classification_sw_5.2.0_3.0_1701347684570.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_swahili_news_classification_sw_5.2.0_3.0_1701347684570.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_swahili_news_classification","sw")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_swahili_news_classification","sw") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_swahili_news_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|sw| +|Size:|394.7 MB| + +## References + +https://huggingface.co/flax-community/roberta-swahili-news-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-robertuito_check_worthy_classifier_en.md b/docs/_posts/ahmedlone127/2023-11-30-robertuito_check_worthy_classifier_en.md new file mode 100644 index 000000000000..efbc4fad3610 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-robertuito_check_worthy_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English robertuito_check_worthy_classifier RoBertaForSequenceClassification from Zamoranesis +author: John Snow Labs +name: robertuito_check_worthy_classifier +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robertuito_check_worthy_classifier` is an English model originally trained by Zamoranesis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robertuito_check_worthy_classifier_en_5.2.0_3.0_1701371128824.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robertuito_check_worthy_classifier_en_5.2.0_3.0_1701371128824.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_check_worthy_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_check_worthy_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robertuito_check_worthy_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/Zamoranesis/Robertuito-check-worthy-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_classification_v0_1_en.md b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_classification_v0_1_en.md new file mode 100644 index 000000000000..289c81d575b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_classification_v0_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ruroberta_large_classification_v0_1 RoBertaForSequenceClassification from Data-Lab +author: John Snow Labs +name: ruroberta_large_classification_v0_1 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruroberta_large_classification_v0_1` is an English model originally trained by Data-Lab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruroberta_large_classification_v0_1_en_5.2.0_3.0_1701348991491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruroberta_large_classification_v0_1_en_5.2.0_3.0_1701348991491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_large_classification_v0_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_large_classification_v0_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruroberta_large_classification_v0_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Data-Lab/ruRoberta-large_classification_v0.1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_paraphrase_v1_ru.md b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_paraphrase_v1_ru.md new file mode 100644 index 000000000000..68a13c3db084 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_large_paraphrase_v1_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian ruroberta_large_paraphrase_v1 RoBertaForSequenceClassification from s-nlp +author: John Snow Labs +name: ruroberta_large_paraphrase_v1 +date: 2023-11-30 +tags: [roberta, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruroberta_large_paraphrase_v1` is a Russian model originally trained by s-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruroberta_large_paraphrase_v1_ru_5.2.0_3.0_1701347288037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruroberta_large_paraphrase_v1_ru_5.2.0_3.0_1701347288037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_large_paraphrase_v1","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_large_paraphrase_v1","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruroberta_large_paraphrase_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|1.3 GB| + +## References + +https://huggingface.co/s-nlp/ruRoberta-large-paraphrase-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-ruroberta_russian_rusentitweet_en.md b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_russian_rusentitweet_en.md new file mode 100644 index 000000000000..8ad05e5383b5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-ruroberta_russian_rusentitweet_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ruroberta_russian_rusentitweet RoBertaForSequenceClassification from sismetanin +author: John Snow Labs +name: ruroberta_russian_rusentitweet +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ruroberta_russian_rusentitweet` is an English model originally trained by sismetanin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ruroberta_russian_rusentitweet_en_5.2.0_3.0_1701351986978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ruroberta_russian_rusentitweet_en_5.2.0_3.0_1701351986978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_russian_rusentitweet","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ruroberta_russian_rusentitweet","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ruroberta_russian_rusentitweet| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/sismetanin/ruroberta-ru-rusentitweet \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sarcasm_detection_roberta_base_pos_en.md b/docs/_posts/ahmedlone127/2023-11-30-sarcasm_detection_roberta_base_pos_en.md new file mode 100644 index 000000000000..ce9ee64f6c1d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sarcasm_detection_roberta_base_pos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sarcasm_detection_roberta_base_pos RoBertaForSequenceClassification from jkhan447 +author: John Snow Labs +name: sarcasm_detection_roberta_base_pos +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sarcasm_detection_roberta_base_pos` is an English model originally trained by jkhan447. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sarcasm_detection_roberta_base_pos_en_5.2.0_3.0_1701351413068.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sarcasm_detection_roberta_base_pos_en_5.2.0_3.0_1701351413068.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sarcasm_detection_roberta_base_pos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sarcasm_detection_roberta_base_pos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sarcasm_detection_roberta_base_pos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.4 MB| + +## References + +https://huggingface.co/jkhan447/sarcasm-detection-RoBerta-base-POS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-securebert_en.md b/docs/_posts/ahmedlone127/2023-11-30-securebert_en.md new file mode 100644 index 000000000000..20e56b120f88 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-securebert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English securebert RoBertaForSequenceClassification from ltkw98 +author: John Snow Labs +name: securebert +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `securebert` is an English model originally trained by ltkw98. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/securebert_en_5.2.0_3.0_1701352844931.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/securebert_en_5.2.0_3.0_1701352844931.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("securebert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("securebert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|securebert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/ltkw98/SecureBert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment140_roberta_5e_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment140_roberta_5e_en.md new file mode 100644 index 000000000000..5f0bf36dfa6f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment140_roberta_5e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment140_roberta_5e RoBertaForSequenceClassification from pig4431 +author: John Snow Labs +name: sentiment140_roberta_5e +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment140_roberta_5e` is an English model originally trained by pig4431. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment140_roberta_5e_en_5.2.0_3.0_1701349255463.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment140_roberta_5e_en_5.2.0_3.0_1701349255463.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment140_roberta_5e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment140_roberta_5e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment140_roberta_5e| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.5 MB| + +## References + +https://huggingface.co/pig4431/Sentiment140_roBERTa_5E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_gr8testgad_1_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_gr8testgad_1_en.md new file mode 100644 index 000000000000..04259488863c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_gr8testgad_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_gr8testgad_1 RoBertaForSequenceClassification from gr8testgad-1 +author: John Snow Labs +name: sentiment_analysis_gr8testgad_1 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_gr8testgad_1` is an English model originally trained by gr8testgad-1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_gr8testgad_1_en_5.2.0_3.0_1701353752639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_gr8testgad_1_en_5.2.0_3.0_1701353752639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_gr8testgad_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_gr8testgad_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_gr8testgad_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.2 MB| + +## References + +https://huggingface.co/gr8testgad-1/sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_priyabrat_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_priyabrat_en.md new file mode 100644 index 000000000000..8eef2e5ab1d6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_priyabrat_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_priyabrat RoBertaForSequenceClassification from priyabrat +author: John Snow Labs +name: sentiment_analysis_priyabrat +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_priyabrat` is an English model originally trained by priyabrat. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_priyabrat_en_5.2.0_3.0_1701352538414.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_priyabrat_en_5.2.0_3.0_1701352538414.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_priyabrat","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_priyabrat","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_priyabrat| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/priyabrat/sentiment_analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_using_steam_data_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_using_steam_data_en.md new file mode 100644 index 000000000000..8ab1557d6650 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_analysis_using_steam_data_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_using_steam_data RoBertaForSequenceClassification from PJHinAI +author: John Snow Labs +name: sentiment_analysis_using_steam_data +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_using_steam_data` is an English model originally trained by PJHinAI. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_using_steam_data_en_5.2.0_3.0_1701384269520.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_using_steam_data_en_5.2.0_3.0_1701384269520.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_using_steam_data","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_using_steam_data","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_using_steam_data| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.7 MB| + +## References + +https://huggingface.co/PJHinAI/sentiment-analysis-using-steam-data \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_classfication_roberta_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_classfication_roberta_model_en.md new file mode 100644 index 000000000000..27d87936278c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_classfication_roberta_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_classfication_roberta_model RoBertaForSequenceClassification from aaronayitey +author: John Snow Labs +name: sentiment_classfication_roberta_model +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_classfication_roberta_model` is an English model originally trained by aaronayitey. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_classfication_roberta_model_en_5.2.0_3.0_1701349926352.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_classfication_roberta_model_en_5.2.0_3.0_1701349926352.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_classfication_roberta_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_classfication_roberta_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_classfication_roberta_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|442.0 MB| + +## References + +https://huggingface.co/aaronayitey/Sentiment-classfication-ROBERTA-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_hts2_xlm_roberta_hungarian_hu.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_hts2_xlm_roberta_hungarian_hu.md new file mode 100644 index 000000000000..78e186adbcf7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_hts2_xlm_roberta_hungarian_hu.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hungarian sentiment_hts2_xlm_roberta_hungarian RoBertaForSequenceClassification from NYTK +author: John Snow Labs +name: sentiment_hts2_xlm_roberta_hungarian +date: 2023-11-30 +tags: [roberta, hu, open_source, sequence_classification, onnx] +task: Text Classification +language: hu +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_hts2_xlm_roberta_hungarian` is a Hungarian model originally trained by NYTK. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_hts2_xlm_roberta_hungarian_hu_5.2.0_3.0_1701353941483.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_hts2_xlm_roberta_hungarian_hu_5.2.0_3.0_1701353941483.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_hts2_xlm_roberta_hungarian","hu")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_hts2_xlm_roberta_hungarian","hu") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_hts2_xlm_roberta_hungarian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hu| +|Size:|249.0 MB| + +## References + +https://huggingface.co/NYTK/sentiment-hts2-xlm-roberta-hungarian \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_review_analysis_roberta_3_en.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_review_analysis_roberta_3_en.md new file mode 100644 index 000000000000..8b48ed1b1f95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_review_analysis_roberta_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_review_analysis_roberta_3 RoBertaForSequenceClassification from gyesibiney +author: John Snow Labs +name: sentiment_review_analysis_roberta_3 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_review_analysis_roberta_3` is an English model originally trained by gyesibiney. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_review_analysis_roberta_3_en_5.2.0_3.0_1701346294268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_review_analysis_roberta_3_en_5.2.0_3.0_1701346294268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_review_analysis_roberta_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_review_analysis_roberta_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_review_analysis_roberta_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.8 MB| + +## References + +https://huggingface.co/gyesibiney/Sentiment-review-analysis-roberta-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-sentiment_roberta_indonesian_id.md b/docs/_posts/ahmedlone127/2023-11-30-sentiment_roberta_indonesian_id.md new file mode 100644 index 000000000000..2e11e83d369b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-sentiment_roberta_indonesian_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian sentiment_roberta_indonesian RoBertaForSequenceClassification from arifagustyawan +author: John Snow Labs +name: sentiment_roberta_indonesian +date: 2023-11-30 +tags: [roberta, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_roberta_indonesian` is an Indonesian model originally trained by arifagustyawan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_roberta_indonesian_id_5.2.0_3.0_1701366421087.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_roberta_indonesian_id_5.2.0_3.0_1701366421087.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_indonesian","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_indonesian","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_roberta_indonesian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| + +## References + +https://huggingface.co/arifagustyawan/sentiment-roberta-id \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-simantic_similarity_en.md b/docs/_posts/ahmedlone127/2023-11-30-simantic_similarity_en.md new file mode 100644 index 000000000000..9cfc1a960e53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-simantic_similarity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English simantic_similarity RoBertaForSequenceClassification from aosaf +author: John Snow Labs +name: simantic_similarity +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `simantic_similarity` is an English model originally trained by aosaf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/simantic_similarity_en_5.2.0_3.0_1701350710295.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/simantic_similarity_en_5.2.0_3.0_1701350710295.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("simantic_similarity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("simantic_similarity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|simantic_similarity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.2 MB| + +## References + +https://huggingface.co/aosaf/simantic-similarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-socroberta_social_en.md b/docs/_posts/ahmedlone127/2023-11-30-socroberta_social_en.md new file mode 100644 index 000000000000..6d5f28a6ed34 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-socroberta_social_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English socroberta_social RoBertaForSequenceClassification from ESGBERT +author: John Snow Labs +name: socroberta_social +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `socroberta_social` is an English model originally trained by ESGBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/socroberta_social_en_5.2.0_3.0_1701353784315.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/socroberta_social_en_5.2.0_3.0_1701353784315.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("socroberta_social","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("socroberta_social","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|socroberta_social| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/ESGBERT/SocRoBERTa-social \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-stackoverflow_roberta_base_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-30-stackoverflow_roberta_base_sentiment_en.md new file mode 100644 index 000000000000..f74390ac6e2d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-stackoverflow_roberta_base_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English stackoverflow_roberta_base_sentiment RoBertaForSequenceClassification from Cloudy1225 +author: John Snow Labs +name: stackoverflow_roberta_base_sentiment +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `stackoverflow_roberta_base_sentiment` is an English model originally trained by Cloudy1225. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/stackoverflow_roberta_base_sentiment_en_5.2.0_3.0_1701351635756.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/stackoverflow_roberta_base_sentiment_en_5.2.0_3.0_1701351635756.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("stackoverflow_roberta_base_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("stackoverflow_roberta_base_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|stackoverflow_roberta_base_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/Cloudy1225/stackoverflow-roberta-base-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-tda_roberta_large_english_cola_en.md b/docs/_posts/ahmedlone127/2023-11-30-tda_roberta_large_english_cola_en.md new file mode 100644 index 000000000000..d5bf8267657d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-tda_roberta_large_english_cola_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tda_roberta_large_english_cola RoBertaForSequenceClassification from iproskurina +author: John Snow Labs +name: tda_roberta_large_english_cola +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tda_roberta_large_english_cola` is an English model originally trained by iproskurina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tda_roberta_large_english_cola_en_5.2.0_3.0_1701352839473.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tda_roberta_large_english_cola_en_5.2.0_3.0_1701352839473.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tda_roberta_large_english_cola","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tda_roberta_large_english_cola","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tda_roberta_large_english_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/iproskurina/tda-roberta-large-en-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-tda_ruroberta_large_russian_cola_ru.md b/docs/_posts/ahmedlone127/2023-11-30-tda_ruroberta_large_russian_cola_ru.md new file mode 100644 index 000000000000..d7c6f59669b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-tda_ruroberta_large_russian_cola_ru.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Russian tda_ruroberta_large_russian_cola RoBertaForSequenceClassification from iproskurina +author: John Snow Labs +name: tda_ruroberta_large_russian_cola +date: 2023-11-30 +tags: [roberta, ru, open_source, sequence_classification, onnx] +task: Text Classification +language: ru +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tda_ruroberta_large_russian_cola` is a Russian model originally trained by iproskurina. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tda_ruroberta_large_russian_cola_ru_5.2.0_3.0_1701385018008.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tda_ruroberta_large_russian_cola_ru_5.2.0_3.0_1701385018008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tda_ruroberta_large_russian_cola","ru")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tda_ruroberta_large_russian_cola","ru") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tda_ruroberta_large_russian_cola| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ru| +|Size:|1.3 GB| + +## References + +https://huggingface.co/iproskurina/tda-ruroberta-large-ru-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-te_for_event_extraction_en.md b/docs/_posts/ahmedlone127/2023-11-30-te_for_event_extraction_en.md new file mode 100644 index 000000000000..869a0147272f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-te_for_event_extraction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English te_for_event_extraction RoBertaForSequenceClassification from veronica320 +author: John Snow Labs +name: te_for_event_extraction +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `te_for_event_extraction` is an English model originally trained by veronica320. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/te_for_event_extraction_en_5.2.0_3.0_1701348165259.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/te_for_event_extraction_en_5.2.0_3.0_1701348165259.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("te_for_event_extraction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("te_for_event_extraction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|te_for_event_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/veronica320/TE-for-Event-Extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-techdebtclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-30-techdebtclassifier_en.md new file mode 100644 index 000000000000..73c3f0c43b2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-techdebtclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English techdebtclassifier RoBertaForSequenceClassification from davidgaofc +author: John Snow Labs +name: techdebtclassifier +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `techdebtclassifier` is an English model originally trained by davidgaofc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/techdebtclassifier_en_5.2.0_3.0_1701347255486.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/techdebtclassifier_en_5.2.0_3.0_1701347255486.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("techdebtclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("techdebtclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|techdebtclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|314.0 MB| + +## References + +https://huggingface.co/davidgaofc/TechDebtClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-test_trainer_qwekuaryee_en.md b/docs/_posts/ahmedlone127/2023-11-30-test_trainer_qwekuaryee_en.md new file mode 100644 index 000000000000..103b0be8b015 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-test_trainer_qwekuaryee_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English test_trainer_qwekuaryee RoBertaForSequenceClassification from qwekuaryee +author: John Snow Labs +name: test_trainer_qwekuaryee +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_trainer_qwekuaryee` is an English model originally trained by qwekuaryee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_trainer_qwekuaryee_en_5.2.0_3.0_1701362905041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_trainer_qwekuaryee_en_5.2.0_3.0_1701362905041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("test_trainer_qwekuaryee","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("test_trainer_qwekuaryee","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_trainer_qwekuaryee| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.3 MB| + +## References + +https://huggingface.co/qwekuaryee/test_trainer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-text_classification_goemotions_en.md b/docs/_posts/ahmedlone127/2023-11-30-text_classification_goemotions_en.md new file mode 100644 index 000000000000..70cd44ce6af8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-text_classification_goemotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_classification_goemotions RoBertaForSequenceClassification from tasinhoque +author: John Snow Labs +name: text_classification_goemotions +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_goemotions` is an English model originally trained by tasinhoque. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_goemotions_en_5.2.0_3.0_1701346449751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_goemotions_en_5.2.0_3.0_1701346449751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_classification_goemotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_classification_goemotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_goemotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tasinhoque/text-classification-goemotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-text_message_analyzer_finetuned_en.md b/docs/_posts/ahmedlone127/2023-11-30-text_message_analyzer_finetuned_en.md new file mode 100644 index 000000000000..721ec4cc08e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-text_message_analyzer_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_message_analyzer_finetuned RoBertaForSequenceClassification from matchten +author: John Snow Labs +name: text_message_analyzer_finetuned +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_message_analyzer_finetuned` is an English model originally trained by matchten. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_message_analyzer_finetuned_en_5.2.0_3.0_1701348993804.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_message_analyzer_finetuned_en_5.2.0_3.0_1701348993804.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_message_analyzer_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_message_analyzer_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_message_analyzer_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/matchten/text-message-analyzer-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-textattack_roberta_base_mnli_e_snli_classification_nli_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-textattack_roberta_base_mnli_e_snli_classification_nli_base_en.md new file mode 100644 index 000000000000..f71bea13a92e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-textattack_roberta_base_mnli_e_snli_classification_nli_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English textattack_roberta_base_mnli_e_snli_classification_nli_base RoBertaForSequenceClassification from k4black +author: John Snow Labs +name: textattack_roberta_base_mnli_e_snli_classification_nli_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `textattack_roberta_base_mnli_e_snli_classification_nli_base` is an English model originally trained by k4black. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/textattack_roberta_base_mnli_e_snli_classification_nli_base_en_5.2.0_3.0_1701350608073.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/textattack_roberta_base_mnli_e_snli_classification_nli_base_en_5.2.0_3.0_1701350608073.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("textattack_roberta_base_mnli_e_snli_classification_nli_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("textattack_roberta_base_mnli_e_snli_classification_nli_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|textattack_roberta_base_mnli_e_snli_classification_nli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|463.4 MB| + +## References + +https://huggingface.co/k4black/textattack-roberta-base-MNLI-e-snli-classification-nli-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-title_classification_en.md b/docs/_posts/ahmedlone127/2023-11-30-title_classification_en.md new file mode 100644 index 000000000000..a9c0cfc173d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-title_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English title_classification RoBertaForSequenceClassification from shaoyuyoung +author: John Snow Labs +name: title_classification +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `title_classification` is an English model originally trained by shaoyuyoung. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/title_classification_en_5.2.0_3.0_1701360666841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/title_classification_en_5.2.0_3.0_1701360666841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("title_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("title_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|title_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.0 MB| + +## References + +https://huggingface.co/shaoyuyoung/Title-Classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-toxic_speech_detector_en.md b/docs/_posts/ahmedlone127/2023-11-30-toxic_speech_detector_en.md new file mode 100644 index 000000000000..6354fd283e54 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-toxic_speech_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English toxic_speech_detector RoBertaForSequenceClassification from rb05751 +author: John Snow Labs +name: toxic_speech_detector +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `toxic_speech_detector` is an English model originally trained by rb05751. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/toxic_speech_detector_en_5.2.0_3.0_1701375099476.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/toxic_speech_detector_en_5.2.0_3.0_1701375099476.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxic_speech_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("toxic_speech_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|toxic_speech_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.7 MB| + +## References + +https://huggingface.co/rb05751/toxic_speech_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-traceclassifier_en.md b/docs/_posts/ahmedlone127/2023-11-30-traceclassifier_en.md new file mode 100644 index 000000000000..d7e81f9327c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-traceclassifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English traceclassifier RoBertaForSequenceClassification from icelab +author: John Snow Labs +name: traceclassifier +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `traceclassifier` is an English model originally trained by icelab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/traceclassifier_en_5.2.0_3.0_1701368650521.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/traceclassifier_en_5.2.0_3.0_1701368650521.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("traceclassifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("traceclassifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|traceclassifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/icelab/TraceClassifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-transformationtransformer_en.md b/docs/_posts/ahmedlone127/2023-11-30-transformationtransformer_en.md new file mode 100644 index 000000000000..cab45f512074 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-transformationtransformer_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English transformationtransformer RoBertaForSequenceClassification from simonschoe +author: John Snow Labs +name: transformationtransformer +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `transformationtransformer` is an English model originally trained by simonschoe. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/transformationtransformer_en_5.2.0_3.0_1701351992803.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/transformationtransformer_en_5.2.0_3.0_1701351992803.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("transformationtransformer","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("transformationtransformer","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|transformationtransformer| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.1 MB| + +## References + +https://huggingface.co/simonschoe/TransformationTransformer \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-truefoundary_sentimental_roberta_en.md b/docs/_posts/ahmedlone127/2023-11-30-truefoundary_sentimental_roberta_en.md new file mode 100644 index 000000000000..cdb37b9093ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-truefoundary_sentimental_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English truefoundary_sentimental_roberta RoBertaForSequenceClassification from velvrix +author: John Snow Labs +name: truefoundary_sentimental_roberta +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `truefoundary_sentimental_roberta` is an English model originally trained by velvrix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/truefoundary_sentimental_roberta_en_5.2.0_3.0_1701384917892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/truefoundary_sentimental_roberta_en_5.2.0_3.0_1701384917892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("truefoundary_sentimental_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("truefoundary_sentimental_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|truefoundary_sentimental_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.9 MB| + +## References + +https://huggingface.co/velvrix/truefoundary_sentimental_RoBERTa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-tweet_topic_latest_single_en.md b/docs/_posts/ahmedlone127/2023-11-30-tweet_topic_latest_single_en.md new file mode 100644 index 000000000000..b90a2563dff7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-tweet_topic_latest_single_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweet_topic_latest_single RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: tweet_topic_latest_single +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_topic_latest_single` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_topic_latest_single_en_5.2.0_3.0_1701375099192.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_topic_latest_single_en_5.2.0_3.0_1701375099192.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_topic_latest_single","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_topic_latest_single","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweet_topic_latest_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/tweet-topic-latest-single \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_finetuned_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_finetuned_model_en.md new file mode 100644 index 000000000000..b768fdcf0f31 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_finetuned_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_finetuned_model RoBertaForSequenceClassification from MaryanneMuchai +author: John Snow Labs +name: twitter_finetuned_model +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_finetuned_model` is an English model originally trained by MaryanneMuchai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_finetuned_model_en_5.2.0_3.0_1701348763313.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_finetuned_model_en_5.2.0_3.0_1701348763313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_finetuned_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_finetuned_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_finetuned_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.3 MB| + +## References + +https://huggingface.co/MaryanneMuchai/twitter-finetuned-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_emotion_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_emotion_en.md new file mode 100644 index 000000000000..8ea2a90ef2b8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2021_124m_emotion RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2021_124m_emotion +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_emotion` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_emotion_en_5.2.0_3.0_1701386556013.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_emotion_en_5.2.0_3.0_1701386556013.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2021_124m_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_sentiment_en.md new file mode 100644 index 000000000000..1eaa5f977cbc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_2021_124m_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2021_124m_sentiment RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2021_124m_sentiment +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_sentiment` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_sentiment_en_5.2.0_3.0_1701350662611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_sentiment_en_5.2.0_3.0_1701350662611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2021_124m_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_emoji_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_emoji_en.md new file mode 100644 index 000000000000..f89211d73512 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_emoji_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_emoji RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_emoji +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_emoji` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_emoji_en_5.2.0_3.0_1701349321234.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_emoji_en_5.2.0_3.0_1701349321234.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_emoji","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_emoji","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_emoji| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-emoji \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_sentiment_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_sentiment_en.md new file mode 100644 index 000000000000..0e2879237a15 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_sentiment RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_sentiment +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_sentiment` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_sentiment_en_5.2.0_3.0_1701347892528.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_sentiment_en_5.2.0_3.0_1701347892528.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_tweet_topic_single_all_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_tweet_topic_single_all_en.md new file mode 100644 index 000000000000..f737c40cb4d4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_dec2021_tweet_topic_single_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_tweet_topic_single_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_tweet_topic_single_all +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_tweet_topic_single_all` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_tweet_topic_single_all_en_5.2.0_3.0_1701346138271.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_tweet_topic_single_all_en_5.2.0_3.0_1701346138271.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_tweet_topic_single_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_tweet_topic_single_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_tweet_topic_single_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-tweet-topic-single-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_fear_intensity_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_fear_intensity_en.md new file mode 100644 index 000000000000..6fb9c6002540 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_fear_intensity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_fear_intensity RoBertaForSequenceClassification from garrettbaber +author: John Snow Labs +name: twitter_roberta_base_fear_intensity +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_fear_intensity` is an English model originally trained by garrettbaber.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_fear_intensity_en_5.2.0_3.0_1701370487607.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_fear_intensity_en_5.2.0_3.0_1701370487607.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_fear_intensity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_fear_intensity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_fear_intensity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/garrettbaber/twitter-roberta-base-fear-intensity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentiment_latest_kapiche_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentiment_latest_kapiche_en.md new file mode 100644 index 000000000000..b3874f92958d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentiment_latest_kapiche_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_sentiment_latest_kapiche RoBertaForSequenceClassification from Kapiche +author: John Snow Labs +name: twitter_roberta_base_sentiment_latest_kapiche +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_sentiment_latest_kapiche` is an English model originally trained by Kapiche.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_latest_kapiche_en_5.2.0_3.0_1701346725987.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentiment_latest_kapiche_en_5.2.0_3.0_1701346725987.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_latest_kapiche","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentiment_latest_kapiche","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sentiment_latest_kapiche| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Kapiche/twitter-roberta-base-sentiment-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentimental_analysis_of_covid_tweets_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentimental_analysis_of_covid_tweets_en.md new file mode 100644 index 000000000000..b71b1e4a99e3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_sentimental_analysis_of_covid_tweets_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_sentimental_analysis_of_covid_tweets RoBertaForSequenceClassification from Sonny4Sonnix +author: John Snow Labs +name: twitter_roberta_base_sentimental_analysis_of_covid_tweets +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_sentimental_analysis_of_covid_tweets` is an English model originally trained by Sonny4Sonnix.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentimental_analysis_of_covid_tweets_en_5.2.0_3.0_1701383998268.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentimental_analysis_of_covid_tweets_en_5.2.0_3.0_1701383998268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentimental_analysis_of_covid_tweets","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentimental_analysis_of_covid_tweets","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sentimental_analysis_of_covid_tweets| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/Sonny4Sonnix/twitter-roberta-base-sentimental-analysis-of-covid-tweets \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_stance_climate_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_stance_climate_en.md new file mode 100644 index 000000000000..9646d0ccc4d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_base_stance_climate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_stance_climate RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_stance_climate +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_stance_climate` is an English model originally trained by cardiffnlp.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_climate_en_5.2.0_3.0_1701351414018.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_climate_en_5.2.0_3.0_1701351414018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_climate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_climate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_stance_climate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-stance-climate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_sentiment_model_en.md new file mode 100644 index 000000000000..107abdcfba2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_roberta_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_sentiment_model RoBertaForSequenceClassification from Faith-theAnalyst +author: John Snow Labs +name: twitter_roberta_sentiment_model +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_sentiment_model` is an English model originally trained by Faith-theAnalyst. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_sentiment_model_en_5.2.0_3.0_1701350175745.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_sentiment_model_en_5.2.0_3.0_1701350175745.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Faith-theAnalyst/twitter_roberta_sentiment_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-twitter_sexismo_finetuned_robertuito_exist2021_en.md b/docs/_posts/ahmedlone127/2023-11-30-twitter_sexismo_finetuned_robertuito_exist2021_en.md new file mode 100644 index 000000000000..d738ca562471 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-twitter_sexismo_finetuned_robertuito_exist2021_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_sexismo_finetuned_robertuito_exist2021 RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: twitter_sexismo_finetuned_robertuito_exist2021 +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_sexismo_finetuned_robertuito_exist2021` is an English model originally trained by hackathon-pln-es.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_sexismo_finetuned_robertuito_exist2021_en_5.2.0_3.0_1701348361464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_sexismo_finetuned_robertuito_exist2021_en_5.2.0_3.0_1701348361464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_sexismo_finetuned_robertuito_exist2021","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_sexismo_finetuned_robertuito_exist2021","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_sexismo_finetuned_robertuito_exist2021| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/hackathon-pln-es/twitter_sexismo-finetuned-robertuito-exist2021 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-valence_english_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-11-30-valence_english_distilroberta_base_en.md new file mode 100644 index 000000000000..95a96782ce1c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-valence_english_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English valence_english_distilroberta_base RoBertaForSequenceClassification from samueldomdey +author: John Snow Labs +name: valence_english_distilroberta_base +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `valence_english_distilroberta_base` is an English model originally trained by samueldomdey.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/valence_english_distilroberta_base_en_5.2.0_3.0_1701352964729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/valence_english_distilroberta_base_en_5.2.0_3.0_1701352964729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("valence_english_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("valence_english_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|valence_english_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/samueldomdey/valence-english-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-verb_class_en.md b/docs/_posts/ahmedlone127/2023-11-30-verb_class_en.md new file mode 100644 index 000000000000..8e510435d845 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-verb_class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English verb_class RoBertaForSequenceClassification from silkski +author: John Snow Labs +name: verb_class +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`verb_class` is a English model originally trained by silkski. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/verb_class_en_5.2.0_3.0_1701353523242.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/verb_class_en_5.2.0_3.0_1701353523242.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("verb_class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("verb_class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|verb_class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|417.7 MB| + +## References + +https://huggingface.co/silkski/verb-class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad_en.md b/docs/_posts/ahmedlone127/2023-11-30-vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad_en.md new file mode 100644 index 000000000000..b779ff6246c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad RoBertaForSequenceClassification from hackathon-somos-nlp-2023 +author: John Snow Labs +name: vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad` is a English model originally trained by hackathon-somos-nlp-2023. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad_en_5.2.0_3.0_1701348514415.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad_en_5.2.0_3.0_1701348514415.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|vg055_roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_polaridad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.4 MB| + +## References + +https://huggingface.co/hackathon-somos-nlp-2023/vg055-roberta-base-bne-finetuned-analisis-sentimiento-textos-turisticos-mx-polaridad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-11-30-yelp_roberta_5e_en.md b/docs/_posts/ahmedlone127/2023-11-30-yelp_roberta_5e_en.md new file mode 100644 index 000000000000..10102ec4eb1a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-11-30-yelp_roberta_5e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English yelp_roberta_5e RoBertaForSequenceClassification from pig4431 +author: John Snow Labs +name: yelp_roberta_5e +date: 2023-11-30 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`yelp_roberta_5e` is a English model originally trained by pig4431. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/yelp_roberta_5e_en_5.2.0_3.0_1701364714470.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/yelp_roberta_5e_en_5.2.0_3.0_1701364714470.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("yelp_roberta_5e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("yelp_roberta_5e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|yelp_roberta_5e| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.7 MB| + +## References + +https://huggingface.co/pig4431/YELP_roBERTa_5E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-ai_human_classification_hc3_wiki_en.md b/docs/_posts/ahmedlone127/2023-12-01-ai_human_classification_hc3_wiki_en.md new file mode 100644 index 000000000000..878d0fb539cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-ai_human_classification_hc3_wiki_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ai_human_classification_hc3_wiki RoBertaForSequenceClassification from rajendrabaskota +author: John Snow Labs +name: ai_human_classification_hc3_wiki +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`ai_human_classification_hc3_wiki` is a English model originally trained by rajendrabaskota. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ai_human_classification_hc3_wiki_en_5.2.0_3.0_1701412236636.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ai_human_classification_hc3_wiki_en_5.2.0_3.0_1701412236636.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ai_human_classification_hc3_wiki","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ai_human_classification_hc3_wiki","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ai_human_classification_hc3_wiki| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.0 MB| + +## References + +https://huggingface.co/rajendrabaskota/ai-human-classification-hc3-wiki \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-alberta_base_mnli_v1_en.md b/docs/_posts/ahmedlone127/2023-12-01-alberta_base_mnli_v1_en.md new file mode 100644 index 000000000000..c4d4677d81da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-alberta_base_mnli_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English alberta_base_mnli_v1 RoBertaForSequenceClassification from blackbird +author: John Snow Labs +name: alberta_base_mnli_v1 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`alberta_base_mnli_v1` is a English model originally trained by blackbird. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/alberta_base_mnli_v1_en_5.2.0_3.0_1701470882041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/alberta_base_mnli_v1_en_5.2.0_3.0_1701470882041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("alberta_base_mnli_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("alberta_base_mnli_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|alberta_base_mnli_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.4 MB| + +## References + +https://huggingface.co/blackbird/alberta-base-mnli-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_credit_cards_1_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_credit_cards_1_16_5_en.md new file mode 100644 index 000000000000..8f19eddc1c23 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_credit_cards_1_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_credit_cards_1_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_credit_cards_1_16_5 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_roberta_large_v1_credit_cards_1_16_5` is a English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_credit_cards_1_16_5_en_5.2.0_3.0_1701419078687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_credit_cards_1_16_5_en_5.2.0_3.0_1701419078687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_credit_cards_1_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_credit_cards_1_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_credit_cards_1_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-credit_cards-1-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_kitchen_and_dining_1_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_kitchen_and_dining_1_16_5_en.md new file mode 100644 index 000000000000..5b7d3b07c0ee --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_kitchen_and_dining_1_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_kitchen_and_dining_1_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_kitchen_and_dining_1_16_5 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_roberta_large_v1_kitchen_and_dining_1_16_5` is a English model originally trained by fathyshalab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_kitchen_and_dining_1_16_5_en_5.2.0_3.0_1701472805327.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_kitchen_and_dining_1_16_5_en_5.2.0_3.0_1701472805327.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_kitchen_and_dining_1_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_kitchen_and_dining_1_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_kitchen_and_dining_1_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-kitchen_and_dining-1-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_small_talk_2_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_small_talk_2_16_5_en.md new file mode 100644 index 000000000000..afd48717d8a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-all_roberta_large_v1_small_talk_2_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_small_talk_2_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_small_talk_2_16_5 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`all_roberta_large_v1_small_talk_2_16_5` is a English model originally trained by fathyshalab. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_small_talk_2_16_5_en_5.2.0_3.0_1701450021233.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_small_talk_2_16_5_en_5.2.0_3.0_1701450021233.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_small_talk_2_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_small_talk_2_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_small_talk_2_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-small_talk-2-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-amazonpolarity_roberta_5e_en.md b/docs/_posts/ahmedlone127/2023-12-01-amazonpolarity_roberta_5e_en.md new file mode 100644 index 000000000000..5bda59cc33ce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-amazonpolarity_roberta_5e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English amazonpolarity_roberta_5e RoBertaForSequenceClassification from pig4431 +author: John Snow Labs +name: amazonpolarity_roberta_5e +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`amazonpolarity_roberta_5e` is a English model originally trained by pig4431. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazonpolarity_roberta_5e_en_5.2.0_3.0_1701411656189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazonpolarity_roberta_5e_en_5.2.0_3.0_1701411656189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("amazonpolarity_roberta_5e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("amazonpolarity_roberta_5e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazonpolarity_roberta_5e| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.5 MB| + +## References + +https://huggingface.co/pig4431/amazonPolarity_roBERTa_5E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-analysing_socialmedia_sentiment_on_vaccines_en.md b/docs/_posts/ahmedlone127/2023-12-01-analysing_socialmedia_sentiment_on_vaccines_en.md new file mode 100644 index 000000000000..6de5a527d388 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-analysing_socialmedia_sentiment_on_vaccines_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English analysing_socialmedia_sentiment_on_vaccines RoBertaForSequenceClassification from allevelly +author: John Snow Labs +name: analysing_socialmedia_sentiment_on_vaccines +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP.`analysing_socialmedia_sentiment_on_vaccines` is a English model originally trained by allevelly. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/analysing_socialmedia_sentiment_on_vaccines_en_5.2.0_3.0_1701414213915.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/analysing_socialmedia_sentiment_on_vaccines_en_5.2.0_3.0_1701414213915.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("analysing_socialmedia_sentiment_on_vaccines","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("analysing_socialmedia_sentiment_on_vaccines","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|analysing_socialmedia_sentiment_on_vaccines| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/allevelly/Analysing_socialMedia_sentiment_on_vaccines \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-argument_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-argument_classifier_en.md new file mode 100644 index 000000000000..3304b72365c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-argument_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English argument_classifier RoBertaForSequenceClassification from addy88 +author: John Snow Labs +name: argument_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `argument_classifier` is an English model originally trained by addy88. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/argument_classifier_en_5.2.0_3.0_1701402367183.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/argument_classifier_en_5.2.0_3.0_1701402367183.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("argument_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("argument_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|argument_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.3 MB| + +## References + +https://huggingface.co/addy88/argument-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-argumentmining_cat_ac_vivesdebate_en.md b/docs/_posts/ahmedlone127/2023-12-01-argumentmining_cat_ac_vivesdebate_en.md new file mode 100644 index 000000000000..7f29b0092e9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-argumentmining_cat_ac_vivesdebate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English argumentmining_cat_ac_vivesdebate RoBertaForSequenceClassification from raruidol +author: John Snow Labs +name: argumentmining_cat_ac_vivesdebate +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `argumentmining_cat_ac_vivesdebate` is an English model originally trained by raruidol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/argumentmining_cat_ac_vivesdebate_en_5.2.0_3.0_1701472229118.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/argumentmining_cat_ac_vivesdebate_en_5.2.0_3.0_1701472229118.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("argumentmining_cat_ac_vivesdebate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("argumentmining_cat_ac_vivesdebate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|argumentmining_cat_ac_vivesdebate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|425.0 MB| + +## References + +https://huggingface.co/raruidol/ArgumentMining-CAT-AC-VivesDebate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-article_title_pol_en.md b/docs/_posts/ahmedlone127/2023-12-01-article_title_pol_en.md new file mode 100644 index 000000000000..2d02d82ef5b1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-article_title_pol_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English article_title_pol RoBertaForSequenceClassification from helliun +author: John Snow Labs +name: article_title_pol +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `article_title_pol` is an English model originally trained by helliun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/article_title_pol_en_5.2.0_3.0_1701450708317.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/article_title_pol_en_5.2.0_3.0_1701450708317.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("article_title_pol","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("article_title_pol","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|article_title_pol| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/helliun/article_title_pol \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-autotrain_banking77_distilroberta_44209111546_en.md b/docs/_posts/ahmedlone127/2023-12-01-autotrain_banking77_distilroberta_44209111546_en.md new file mode 100644 index 000000000000..369d9b1d15c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-autotrain_banking77_distilroberta_44209111546_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_banking77_distilroberta_44209111546 RoBertaForSequenceClassification from derek-thomas +author: John Snow Labs +name: autotrain_banking77_distilroberta_44209111546 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_banking77_distilroberta_44209111546` is an English model originally trained by derek-thomas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_banking77_distilroberta_44209111546_en_5.2.0_3.0_1701471504293.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_banking77_distilroberta_44209111546_en_5.2.0_3.0_1701471504293.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_banking77_distilroberta_44209111546","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_banking77_distilroberta_44209111546","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_banking77_distilroberta_44209111546| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/derek-thomas/autotrain-banking77-distilroberta-44209111546 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-autotrain_fake_news_fine_tuned_v4_38998102353_en.md b/docs/_posts/ahmedlone127/2023-12-01-autotrain_fake_news_fine_tuned_v4_38998102353_en.md new file mode 100644 index 000000000000..0de049a59ac9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-autotrain_fake_news_fine_tuned_v4_38998102353_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_fake_news_fine_tuned_v4_38998102353 RoBertaForSequenceClassification from systash +author: John Snow Labs +name: autotrain_fake_news_fine_tuned_v4_38998102353 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_fake_news_fine_tuned_v4_38998102353` is an English model originally trained by systash. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_fake_news_fine_tuned_v4_38998102353_en_5.2.0_3.0_1701412629312.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_fake_news_fine_tuned_v4_38998102353_en_5.2.0_3.0_1701412629312.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_fake_news_fine_tuned_v4_38998102353","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_fake_news_fine_tuned_v4_38998102353","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_fake_news_fine_tuned_v4_38998102353| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.5 MB| + +## References + +https://huggingface.co/systash/autotrain-fake_news_fine_tuned_v4-38998102353 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-autotrain_imdb_textclassification_46471115127_en.md b/docs/_posts/ahmedlone127/2023-12-01-autotrain_imdb_textclassification_46471115127_en.md new file mode 100644 index 000000000000..8a6007053dd3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-autotrain_imdb_textclassification_46471115127_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_imdb_textclassification_46471115127 RoBertaForSequenceClassification from davis901 +author: John Snow Labs +name: autotrain_imdb_textclassification_46471115127 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_imdb_textclassification_46471115127` is an English model originally trained by davis901. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_imdb_textclassification_46471115127_en_5.2.0_3.0_1701469516998.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_imdb_textclassification_46471115127_en_5.2.0_3.0_1701469516998.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_imdb_textclassification_46471115127","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_imdb_textclassification_46471115127","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_imdb_textclassification_46471115127| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/davis901/autotrain-imdb-textclassification-46471115127 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-autotrain_neurips_chanllenge_1287149282_en.md b/docs/_posts/ahmedlone127/2023-12-01-autotrain_neurips_chanllenge_1287149282_en.md new file mode 100644 index 000000000000..a07da45338af --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-autotrain_neurips_chanllenge_1287149282_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autotrain_neurips_chanllenge_1287149282 RoBertaForSequenceClassification from jawadhussein462 +author: John Snow Labs +name: autotrain_neurips_chanllenge_1287149282 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autotrain_neurips_chanllenge_1287149282` is an English model originally trained by jawadhussein462. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autotrain_neurips_chanllenge_1287149282_en_5.2.0_3.0_1701472132170.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autotrain_neurips_chanllenge_1287149282_en_5.2.0_3.0_1701472132170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_neurips_chanllenge_1287149282","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autotrain_neurips_chanllenge_1287149282","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autotrain_neurips_chanllenge_1287149282| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/jawadhussein462/autotrain-neurips_chanllenge-1287149282 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-bertin_base_paws_x_spanish_es.md b/docs/_posts/ahmedlone127/2023-12-01-bertin_base_paws_x_spanish_es.md new file mode 100644 index 000000000000..2cc6f8e40292 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-bertin_base_paws_x_spanish_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish bertin_base_paws_x_spanish RoBertaForSequenceClassification from bertin-project +author: John Snow Labs +name: bertin_base_paws_x_spanish +date: 2023-12-01 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin_base_paws_x_spanish` is a Castilian, Spanish model originally trained by bertin-project. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertin_base_paws_x_spanish_es_5.2.0_3.0_1701405194654.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertin_base_paws_x_spanish_es_5.2.0_3.0_1701405194654.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_base_paws_x_spanish","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_base_paws_x_spanish","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertin_base_paws_x_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|432.3 MB| + +## References + +https://huggingface.co/bertin-project/bertin-base-paws-x-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test_en.md b/docs/_posts/ahmedlone127/2023-12-01-bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test_en.md new file mode 100644 index 000000000000..5a62096458f4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test RoBertaForSequenceClassification from Sleoruiz +author: John Snow Labs +name: bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test` is an English model originally trained by Sleoruiz. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test_en_5.2.0_3.0_1701415209560.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test_en_5.2.0_3.0_1701415209560.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertin_roberta_fine_tuned_text_classification_sl_data_augmentation_test| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.7 MB| + +## References + +https://huggingface.co/Sleoruiz/bertin-roberta-fine-tuned-text-classification-SL-data-augmentation-test \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-bot_selector_en.md b/docs/_posts/ahmedlone127/2023-12-01-bot_selector_en.md new file mode 100644 index 000000000000..70b32fb30cb0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-bot_selector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bot_selector RoBertaForSequenceClassification from GeniusVoice +author: John Snow Labs +name: bot_selector +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bot_selector` is an English model originally trained by GeniusVoice. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bot_selector_en_5.2.0_3.0_1701423615990.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bot_selector_en_5.2.0_3.0_1701423615990.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bot_selector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bot_selector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bot_selector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/GeniusVoice/bot-selector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-bsc_bio_ehr_spanish_cantemist_es.md b/docs/_posts/ahmedlone127/2023-12-01-bsc_bio_ehr_spanish_cantemist_es.md new file mode 100644 index 000000000000..77bb311ce8eb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-bsc_bio_ehr_spanish_cantemist_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish bsc_bio_ehr_spanish_cantemist RoBertaForSequenceClassification from IIC +author: John Snow Labs +name: bsc_bio_ehr_spanish_cantemist +date: 2023-12-01 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bsc_bio_ehr_spanish_cantemist` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_cantemist_es_5.2.0_3.0_1701424837477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bsc_bio_ehr_spanish_cantemist_es_5.2.0_3.0_1701424837477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bsc_bio_ehr_spanish_cantemist","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bsc_bio_ehr_spanish_cantemist","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bsc_bio_ehr_spanish_cantemist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|437.0 MB| + +## References + +https://huggingface.co/IIC/bsc-bio-ehr-es-cantemist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-burmese_awesome_unixcoder_2_en.md b/docs/_posts/ahmedlone127/2023-12-01-burmese_awesome_unixcoder_2_en.md new file mode 100644 index 000000000000..6954d8e55c6b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-burmese_awesome_unixcoder_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_unixcoder_2 RoBertaForSequenceClassification from buelfhood +author: John Snow Labs +name: burmese_awesome_unixcoder_2 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_unixcoder_2` is an English model originally trained by buelfhood. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_unixcoder_2_en_5.2.0_3.0_1701473078861.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_unixcoder_2_en_5.2.0_3.0_1701473078861.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_unixcoder_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_unixcoder_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_unixcoder_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|472.4 MB| + +## References + +https://huggingface.co/buelfhood/my_awesome_unixcoder_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-cbd_en.md b/docs/_posts/ahmedlone127/2023-12-01-cbd_en.md new file mode 100644 index 000000000000..8d5c668e4ea2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-cbd_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cbd RoBertaForSequenceClassification from Kidsshield +author: John Snow Labs +name: cbd +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cbd` is an English model originally trained by Kidsshield. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cbd_en_5.2.0_3.0_1701473324136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cbd_en_5.2.0_3.0_1701473324136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cbd","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cbd","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cbd| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.8 MB| + +## References + +https://huggingface.co/Kidsshield/CBD \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-censor_testing_performance_en.md b/docs/_posts/ahmedlone127/2023-12-01-censor_testing_performance_en.md new file mode 100644 index 000000000000..4b7ce65d6ef7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-censor_testing_performance_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English censor_testing_performance RoBertaForSequenceClassification from Gadmz +author: John Snow Labs +name: censor_testing_performance +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `censor_testing_performance` is an English model originally trained by Gadmz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/censor_testing_performance_en_5.2.0_3.0_1701425153819.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/censor_testing_performance_en_5.2.0_3.0_1701425153819.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("censor_testing_performance","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("censor_testing_performance","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|censor_testing_performance| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.6 MB| + +## References + +https://huggingface.co/Gadmz/censor-testing-performance \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-class_poems_spanish_en.md b/docs/_posts/ahmedlone127/2023-12-01-class_poems_spanish_en.md new file mode 100644 index 000000000000..79b894551cb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-class_poems_spanish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English class_poems_spanish RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: class_poems_spanish +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `class_poems_spanish` is an English model originally trained by hackathon-pln-es. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/class_poems_spanish_en_5.2.0_3.0_1701402002195.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/class_poems_spanish_en_5.2.0_3.0_1701402002195.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("class_poems_spanish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("class_poems_spanish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|class_poems_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| + +## References + +https://huggingface.co/hackathon-pln-es/class-poems-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-climatebert_base_f_climate_evidence_related_en.md b/docs/_posts/ahmedlone127/2023-12-01-climatebert_base_f_climate_evidence_related_en.md new file mode 100644 index 000000000000..ad3a5a882cb1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-climatebert_base_f_climate_evidence_related_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English climatebert_base_f_climate_evidence_related RoBertaForSequenceClassification from mwong +author: John Snow Labs +name: climatebert_base_f_climate_evidence_related +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `climatebert_base_f_climate_evidence_related` is an English model originally trained by mwong. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/climatebert_base_f_climate_evidence_related_en_5.2.0_3.0_1701428443726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/climatebert_base_f_climate_evidence_related_en_5.2.0_3.0_1701428443726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_base_f_climate_evidence_related","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("climatebert_base_f_climate_evidence_related","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|climatebert_base_f_climate_evidence_related| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/mwong/climatebert-base-f-climate-evidence-related \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-codebert_docstring_inconsistency_en.md b/docs/_posts/ahmedlone127/2023-12-01-codebert_docstring_inconsistency_en.md new file mode 100644 index 000000000000..7f981b7a1ec0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-codebert_docstring_inconsistency_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English codebert_docstring_inconsistency RoBertaForSequenceClassification from Fsoft-AIC +author: John Snow Labs +name: codebert_docstring_inconsistency +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `codebert_docstring_inconsistency` is an English model originally trained by Fsoft-AIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/codebert_docstring_inconsistency_en_5.2.0_3.0_1701427199647.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/codebert_docstring_inconsistency_en_5.2.0_3.0_1701427199647.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_docstring_inconsistency","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("codebert_docstring_inconsistency","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|codebert_docstring_inconsistency| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/Fsoft-AIC/Codebert-docstring-inconsistency \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-cohen_en.md b/docs/_posts/ahmedlone127/2023-12-01-cohen_en.md new file mode 100644 index 000000000000..33759921b8e6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-cohen_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cohen RoBertaForSequenceClassification from gngpostalsrvc +author: John Snow Labs +name: cohen +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cohen` is an English model originally trained by gngpostalsrvc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cohen_en_5.2.0_3.0_1701434002610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cohen_en_5.2.0_3.0_1701434002610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cohen","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cohen","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cohen| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|5.4 MB| + +## References + +https://huggingface.co/gngpostalsrvc/COHeN \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-cold_fusion_finetuned_effectiveness_redditcmv_en.md b/docs/_posts/ahmedlone127/2023-12-01-cold_fusion_finetuned_effectiveness_redditcmv_en.md new file mode 100644 index 000000000000..68cab184ab4a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-cold_fusion_finetuned_effectiveness_redditcmv_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_finetuned_effectiveness_redditcmv RoBertaForSequenceClassification from jakub014 +author: John Snow Labs +name: cold_fusion_finetuned_effectiveness_redditcmv +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_finetuned_effectiveness_redditcmv` is an English model originally trained by jakub014. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_finetuned_effectiveness_redditcmv_en_5.2.0_3.0_1701471121499.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_finetuned_effectiveness_redditcmv_en_5.2.0_3.0_1701471121499.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_finetuned_effectiveness_redditcmv","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_finetuned_effectiveness_redditcmv","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_finetuned_effectiveness_redditcmv| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/jakub014/ColD-Fusion-finetuned-effectiveness-redditCMV \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-comparation_2_text_en.md b/docs/_posts/ahmedlone127/2023-12-01-comparation_2_text_en.md new file mode 100644 index 000000000000..252b0c010e3f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-comparation_2_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English comparation_2_text RoBertaForSequenceClassification from luisvidal-lv +author: John Snow Labs +name: comparation_2_text +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comparation_2_text` is an English model originally trained by luisvidal-lv. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comparation_2_text_en_5.2.0_3.0_1701410070376.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comparation_2_text_en_5.2.0_3.0_1701410070376.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("comparation_2_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("comparation_2_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comparation_2_text| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/luisvidal-lv/comparation_2_text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-covid19_tweets_sentimentanaysis_rorbeta_en.md b/docs/_posts/ahmedlone127/2023-12-01-covid19_tweets_sentimentanaysis_rorbeta_en.md new file mode 100644 index 000000000000..66d8710a7ec7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-covid19_tweets_sentimentanaysis_rorbeta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid19_tweets_sentimentanaysis_rorbeta RoBertaForSequenceClassification from einnake +author: John Snow Labs +name: covid19_tweets_sentimentanaysis_rorbeta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid19_tweets_sentimentanaysis_rorbeta` is an English model originally trained by einnake. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid19_tweets_sentimentanaysis_rorbeta_en_5.2.0_3.0_1701422763129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid19_tweets_sentimentanaysis_rorbeta_en_5.2.0_3.0_1701422763129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid19_tweets_sentimentanaysis_rorbeta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid19_tweets_sentimentanaysis_rorbeta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid19_tweets_sentimentanaysis_rorbeta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/einnake/Covid19_Tweets_SentimentAnaysis_Rorbeta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-covid_analysis_model_en.md b/docs/_posts/ahmedlone127/2023-12-01-covid_analysis_model_en.md new file mode 100644 index 000000000000..cccafb1780ea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-covid_analysis_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_analysis_model RoBertaForSequenceClassification from phinm +author: John Snow Labs +name: covid_analysis_model +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_analysis_model` is an English model originally trained by phinm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_analysis_model_en_5.2.0_3.0_1701397881944.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_analysis_model_en_5.2.0_3.0_1701397881944.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_analysis_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_analysis_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_analysis_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/phinm/covid_analysis_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-covid_hatespeech_detection_en.md b/docs/_posts/ahmedlone127/2023-12-01-covid_hatespeech_detection_en.md new file mode 100644 index 000000000000..98b3c330aa13 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-covid_hatespeech_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_hatespeech_detection RoBertaForSequenceClassification from nihaldsouza1 +author: John Snow Labs +name: covid_hatespeech_detection +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_hatespeech_detection` is an English model originally trained by nihaldsouza1. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_hatespeech_detection_en_5.2.0_3.0_1701450180648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_hatespeech_detection_en_5.2.0_3.0_1701450180648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_hatespeech_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_hatespeech_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_hatespeech_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.0 MB| + +## References + +https://huggingface.co/nihaldsouza1/covid-hatespeech-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo_en.md b/docs/_posts/ahmedlone127/2023-12-01-covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo_en.md new file mode 100644 index 000000000000..3ea361a4356f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo RoBertaForSequenceClassification from fantasticrambo +author: John Snow Labs +name: covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo` is an English model originally trained by fantasticrambo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo_en_5.2.0_3.0_1701396982373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo_en_5.2.0_3.0_1701396982373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|covid_tweet_sentiment_analyzer_roberta_latest_fantasticrambo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/fantasticrambo/covid-tweet-sentiment-analyzer-roberta-latest \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-da_roberta_babe_en.md b/docs/_posts/ahmedlone127/2023-12-01-da_roberta_babe_en.md new file mode 100644 index 000000000000..fef93ba87412 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-da_roberta_babe_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English da_roberta_babe RoBertaForSequenceClassification from mediabiasgroup +author: John Snow Labs +name: da_roberta_babe +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `da_roberta_babe` is an English model originally trained by mediabiasgroup. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/da_roberta_babe_en_5.2.0_3.0_1701389515072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/da_roberta_babe_en_5.2.0_3.0_1701389515072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("da_roberta_babe","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("da_roberta_babe","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|da_roberta_babe| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|454.8 MB| + +## References + +https://huggingface.co/mediabiasgroup/DA-RoBERTa-BABE \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-da_roberta_pretrained_en.md b/docs/_posts/ahmedlone127/2023-12-01-da_roberta_pretrained_en.md new file mode 100644 index 000000000000..6e83c5f488ae --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-da_roberta_pretrained_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English da_roberta_pretrained RoBertaForSequenceClassification from mediabiasgroup +author: John Snow Labs +name: da_roberta_pretrained +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `da_roberta_pretrained` is an English model originally trained by mediabiasgroup. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/da_roberta_pretrained_en_5.2.0_3.0_1701416444529.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/da_roberta_pretrained_en_5.2.0_3.0_1701416444529.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("da_roberta_pretrained","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("da_roberta_pretrained","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|da_roberta_pretrained| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.4 MB| + +## References + +https://huggingface.co/mediabiasgroup/DA-RoBERTa-pretrained \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-danish_roberta_botxo_danish_finetuned_hatespeech_en.md b/docs/_posts/ahmedlone127/2023-12-01-danish_roberta_botxo_danish_finetuned_hatespeech_en.md new file mode 100644 index 000000000000..b54acca67e89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-danish_roberta_botxo_danish_finetuned_hatespeech_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English danish_roberta_botxo_danish_finetuned_hatespeech RoBertaForSequenceClassification from emfa +author: John Snow Labs +name: danish_roberta_botxo_danish_finetuned_hatespeech +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `danish_roberta_botxo_danish_finetuned_hatespeech` is an English model originally trained by emfa.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/danish_roberta_botxo_danish_finetuned_hatespeech_en_5.2.0_3.0_1701472994125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/danish_roberta_botxo_danish_finetuned_hatespeech_en_5.2.0_3.0_1701472994125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("danish_roberta_botxo_danish_finetuned_hatespeech","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("danish_roberta_botxo_danish_finetuned_hatespeech","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|danish_roberta_botxo_danish_finetuned_hatespeech| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/emfa/danish-roberta-botxo-danish-finetuned-hatespeech \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-disaster_model_1_en.md b/docs/_posts/ahmedlone127/2023-12-01-disaster_model_1_en.md new file mode 100644 index 000000000000..55a59dd0f007 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-disaster_model_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English disaster_model_1 RoBertaForSequenceClassification from aellxx +author: John Snow Labs +name: disaster_model_1 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `disaster_model_1` is an English model originally trained by aellxx. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/disaster_model_1_en_5.2.0_3.0_1701448586598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/disaster_model_1_en_5.2.0_3.0_1701448586598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("disaster_model_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("disaster_model_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|disaster_model_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/aellxx/disaster-model-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-discourse_classification_using_robrta_base_en.md b/docs/_posts/ahmedlone127/2023-12-01-discourse_classification_using_robrta_base_en.md new file mode 100644 index 000000000000..9935a0bd3581 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-discourse_classification_using_robrta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English discourse_classification_using_robrta_base RoBertaForSequenceClassification from Manishkalra +author: John Snow Labs +name: discourse_classification_using_robrta_base +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `discourse_classification_using_robrta_base` is an English model originally trained by Manishkalra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/discourse_classification_using_robrta_base_en_5.2.0_3.0_1701471540347.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/discourse_classification_using_robrta_base_en_5.2.0_3.0_1701471540347.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("discourse_classification_using_robrta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("discourse_classification_using_robrta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|discourse_classification_using_robrta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.8 MB| + +## References + +https://huggingface.co/Manishkalra/discourse_classification_using_robrta_base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true_en.md new file mode 100644 index 000000000000..7ea1f6939447 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true RoBertaForSequenceClassification from ali2066 +author: John Snow Labs +name: distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true` is an English model originally trained by ali2066.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true_en_5.2.0_3.0_1701407914451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true_en_5.2.0_3.0_1701407914451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbertfinal_ctxsentence_train_all_test_french_second_train_set_null_true| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/ali2066/DistilBERTFINAL_ctxSentence_TRAIN_all_TEST_french_second_train_set_NULL_True \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_ep20_phrase5k_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_ep20_phrase5k_en.md new file mode 100644 index 000000000000..7134a1bc3a78 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_ep20_phrase5k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_ep20_phrase5k RoBertaForSequenceClassification from judy93536 +author: John Snow Labs +name: distilroberta_base_ep20_phrase5k +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_ep20_phrase5k` is an English model originally trained by judy93536.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_ep20_phrase5k_en_5.2.0_3.0_1701433769058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_ep20_phrase5k_en_5.2.0_3.0_1701433769058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_ep20_phrase5k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_ep20_phrase5k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_ep20_phrase5k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/judy93536/distilroberta-base-ep20-phrase5k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_finetuned_fakenews_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_finetuned_fakenews_en.md new file mode 100644 index 000000000000..8212acc9cf05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_finetuned_fakenews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_finetuned_fakenews RoBertaForSequenceClassification from GonzaloA +author: John Snow Labs +name: distilroberta_base_finetuned_fakenews +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_finetuned_fakenews` is an English model originally trained by GonzaloA. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_fakenews_en_5.2.0_3.0_1701470053493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_finetuned_fakenews_en_5.2.0_3.0_1701470053493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_finetuned_fakenews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_finetuned_fakenews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_finetuned_fakenews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/GonzaloA/distilroberta-base-finetuned-fakeNews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_jobs_clasf_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_jobs_clasf_en.md new file mode 100644 index 000000000000..ca6e552599b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_jobs_clasf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_jobs_clasf RoBertaForSequenceClassification from dijon-ai +author: John Snow Labs +name: distilroberta_base_jobs_clasf +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_jobs_clasf` is an English model originally trained by dijon-ai. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_jobs_clasf_en_5.2.0_3.0_1701417310053.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_jobs_clasf_en_5.2.0_3.0_1701417310053.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_jobs_clasf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_jobs_clasf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_jobs_clasf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/dijon-ai/distilroberta-base-jobs-clasf \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_mrpc_glue_ubaldogs_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_mrpc_glue_ubaldogs_en.md new file mode 100644 index 000000000000..76294847a3c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_mrpc_glue_ubaldogs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_mrpc_glue_ubaldogs RoBertaForSequenceClassification from ubas9109 +author: John Snow Labs +name: distilroberta_base_mrpc_glue_ubaldogs +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_mrpc_glue_ubaldogs` is an English model originally trained by ubas9109. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_ubaldogs_en_5.2.0_3.0_1701450805440.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_ubaldogs_en_5.2.0_3.0_1701450805440.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_ubaldogs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_ubaldogs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mrpc_glue_ubaldogs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/ubas9109/distilroberta-base-mrpc-glue-ubaldogs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_uncased_distilled_emotion_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_uncased_distilled_emotion_en.md new file mode 100644 index 000000000000..4660cde42efd --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_base_uncased_distilled_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_uncased_distilled_emotion RoBertaForSequenceClassification from vladkolev +author: John Snow Labs +name: distilroberta_base_uncased_distilled_emotion +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_uncased_distilled_emotion` is an English model originally trained by vladkolev.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_uncased_distilled_emotion_en_5.2.0_3.0_1701472622718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_uncased_distilled_emotion_en_5.2.0_3.0_1701472622718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_uncased_distilled_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_uncased_distilled_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_uncased_distilled_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/vladkolev/distilroberta-base-uncased-distilled-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_csic_anomaly_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_csic_anomaly_en.md new file mode 100644 index 000000000000..77206c94614f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_csic_anomaly_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_csic_anomaly RoBertaForSequenceClassification from EgilKarlsen +author: John Snow Labs +name: distilroberta_csic_anomaly +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_csic_anomaly` is an English model originally trained by EgilKarlsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_csic_anomaly_en_5.2.0_3.0_1701470138121.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_csic_anomaly_en_5.2.0_3.0_1701470138121.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_csic_anomaly","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_csic_anomaly","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_csic_anomaly| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/EgilKarlsen/DistilRoberta_CSIC-Anomaly \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_newsapi121k_phrase5k_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_newsapi121k_phrase5k_en.md new file mode 100644 index 000000000000..584c57b3e969 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_newsapi121k_phrase5k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_newsapi121k_phrase5k RoBertaForSequenceClassification from judy93536 +author: John Snow Labs +name: distilroberta_newsapi121k_phrase5k +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_newsapi121k_phrase5k` is an English model originally trained by judy93536. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_newsapi121k_phrase5k_en_5.2.0_3.0_1701392479431.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_newsapi121k_phrase5k_en_5.2.0_3.0_1701392479431.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_newsapi121k_phrase5k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_newsapi121k_phrase5k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_newsapi121k_phrase5k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/judy93536/distilroberta-newsapi121k-phrase5k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-distilroberta_rbm231k_ep20_op40_phrase5k_en.md b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_rbm231k_ep20_op40_phrase5k_en.md new file mode 100644 index 000000000000..a4701d1ae419 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-distilroberta_rbm231k_ep20_op40_phrase5k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_rbm231k_ep20_op40_phrase5k RoBertaForSequenceClassification from judy93536 +author: John Snow Labs +name: distilroberta_rbm231k_ep20_op40_phrase5k +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_rbm231k_ep20_op40_phrase5k` is an English model originally trained by judy93536.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_rbm231k_ep20_op40_phrase5k_en_5.2.0_3.0_1701415440081.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_rbm231k_ep20_op40_phrase5k_en_5.2.0_3.0_1701415440081.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_rbm231k_ep20_op40_phrase5k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_rbm231k_ep20_op40_phrase5k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_rbm231k_ep20_op40_phrase5k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.5 MB| + +## References + +https://huggingface.co/judy93536/distilroberta-rbm231k-ep20-op40-phrase5k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-div_class_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-div_class_roberta_en.md new file mode 100644 index 000000000000..9278f8b26ebb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-div_class_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English div_class_roberta RoBertaForSequenceClassification from mahesh27 +author: John Snow Labs +name: div_class_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `div_class_roberta` is an English model originally trained by mahesh27. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/div_class_roberta_en_5.2.0_3.0_1701431431762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/div_class_roberta_en_5.2.0_3.0_1701431431762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("div_class_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("div_class_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|div_class_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|380.5 MB| + +## References + +https://huggingface.co/mahesh27/div-class-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-drug_stance_bert_en.md b/docs/_posts/ahmedlone127/2023-12-01-drug_stance_bert_en.md new file mode 100644 index 000000000000..750c5dfe7578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-drug_stance_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English drug_stance_bert RoBertaForSequenceClassification from ningkko +author: John Snow Labs +name: drug_stance_bert +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `drug_stance_bert` is an English model originally trained by ningkko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/drug_stance_bert_en_5.2.0_3.0_1701413148067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/drug_stance_bert_en_5.2.0_3.0_1701413148067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("drug_stance_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("drug_stance_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|drug_stance_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/ningkko/drug-stance-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_category_en.md b/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_category_en.md new file mode 100644 index 000000000000..3fec85f7ba30 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_category_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English edos_2023_baseline_roberta_base_label_category RoBertaForSequenceClassification from lct-rug-2022 +author: John Snow Labs +name: edos_2023_baseline_roberta_base_label_category +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `edos_2023_baseline_roberta_base_label_category` is an English model originally trained by lct-rug-2022.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_roberta_base_label_category_en_5.2.0_3.0_1701390422985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_roberta_base_label_category_en_5.2.0_3.0_1701390422985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("edos_2023_baseline_roberta_base_label_category","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("edos_2023_baseline_roberta_base_label_category","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|edos_2023_baseline_roberta_base_label_category| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|299.7 MB| + +## References + +https://huggingface.co/lct-rug-2022/edos-2023-baseline-roberta-base-label_category \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_sexist_en.md b/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_sexist_en.md new file mode 100644 index 000000000000..3a6eabca7448 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-edos_2023_baseline_roberta_base_label_sexist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English edos_2023_baseline_roberta_base_label_sexist RoBertaForSequenceClassification from lct-rug-2022 +author: John Snow Labs +name: edos_2023_baseline_roberta_base_label_sexist +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `edos_2023_baseline_roberta_base_label_sexist` is an English model originally trained by lct-rug-2022.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_roberta_base_label_sexist_en_5.2.0_3.0_1701392207925.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/edos_2023_baseline_roberta_base_label_sexist_en_5.2.0_3.0_1701392207925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("edos_2023_baseline_roberta_base_label_sexist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("edos_2023_baseline_roberta_base_label_sexist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|edos_2023_baseline_roberta_base_label_sexist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|299.7 MB| + +## References + +https://huggingface.co/lct-rug-2022/edos-2023-baseline-roberta-base-label_sexist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-email_fraud_detector_en.md b/docs/_posts/ahmedlone127/2023-12-01-email_fraud_detector_en.md new file mode 100644 index 000000000000..7796a5566ab3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-email_fraud_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English email_fraud_detector RoBertaForSequenceClassification from tush9905 +author: John Snow Labs +name: email_fraud_detector +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `email_fraud_detector` is an English model originally trained by tush9905. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/email_fraud_detector_en_5.2.0_3.0_1701432768929.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/email_fraud_detector_en_5.2.0_3.0_1701432768929.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("email_fraud_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("email_fraud_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|email_fraud_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.8 MB| + +## References + +https://huggingface.co/tush9905/email_fraud_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-email_spam_detection_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-email_spam_detection_roberta_en.md new file mode 100644 index 000000000000..32f78a6ae6cf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-email_spam_detection_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English email_spam_detection_roberta RoBertaForSequenceClassification from dima806 +author: John Snow Labs +name: email_spam_detection_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `email_spam_detection_roberta` is an English model originally trained by dima806. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/email_spam_detection_roberta_en_5.2.0_3.0_1701436018264.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/email_spam_detection_roberta_en_5.2.0_3.0_1701436018264.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("email_spam_detection_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("email_spam_detection_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|email_spam_detection_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|454.7 MB| + +## References + +https://huggingface.co/dima806/email-spam-detection-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-emotion_classification_en.md b/docs/_posts/ahmedlone127/2023-12-01-emotion_classification_en.md new file mode 100644 index 000000000000..391c7327121e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-emotion_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_classification RoBertaForSequenceClassification from imrazaa +author: John Snow Labs +name: emotion_classification +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_classification` is an English model originally trained by imrazaa. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_classification_en_5.2.0_3.0_1701395571514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_classification_en_5.2.0_3.0_1701395571514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.1 MB| + +## References + +https://huggingface.co/imrazaa/emotion_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-emotion_text_classifier_finetuned_solidarity_en.md b/docs/_posts/ahmedlone127/2023-12-01-emotion_text_classifier_finetuned_solidarity_en.md new file mode 100644 index 000000000000..329a18456215 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-emotion_text_classifier_finetuned_solidarity_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English emotion_text_classifier_finetuned_solidarity RoBertaForSequenceClassification from awwalker +author: John Snow Labs +name: emotion_text_classifier_finetuned_solidarity +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emotion_text_classifier_finetuned_solidarity` is an English model originally trained by awwalker.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emotion_text_classifier_finetuned_solidarity_en_5.2.0_3.0_1701471623052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emotion_text_classifier_finetuned_solidarity_en_5.2.0_3.0_1701471623052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_text_classifier_finetuned_solidarity","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emotion_text_classifier_finetuned_solidarity","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|emotion_text_classifier_finetuned_solidarity| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/awwalker/emotion_text_classifier-finetuned-solidarity \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-enhanced_roberta_sentiments_spanish_en.md b/docs/_posts/ahmedlone127/2023-12-01-enhanced_roberta_sentiments_spanish_en.md new file mode 100644 index 000000000000..dc475904ebd5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-enhanced_roberta_sentiments_spanish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English enhanced_roberta_sentiments_spanish RoBertaForSequenceClassification from Manauu17 +author: John Snow Labs +name: enhanced_roberta_sentiments_spanish +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `enhanced_roberta_sentiments_spanish` is an English model originally trained by Manauu17. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/enhanced_roberta_sentiments_spanish_en_5.2.0_3.0_1701469971565.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/enhanced_roberta_sentiments_spanish_en_5.2.0_3.0_1701469971565.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("enhanced_roberta_sentiments_spanish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("enhanced_roberta_sentiments_spanish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|enhanced_roberta_sentiments_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/Manauu17/enhanced_roberta_sentiments_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-event_detection_xlm_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-event_detection_xlm_roberta_en.md new file mode 100644 index 000000000000..a0d5f5df96bc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-event_detection_xlm_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English event_detection_xlm_roberta RoBertaForSequenceClassification from clarin-knext +author: John Snow Labs +name: event_detection_xlm_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `event_detection_xlm_roberta` is an English model originally trained by clarin-knext. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/event_detection_xlm_roberta_en_5.2.0_3.0_1701423639072.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/event_detection_xlm_roberta_en_5.2.0_3.0_1701423639072.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("event_detection_xlm_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("event_detection_xlm_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|event_detection_xlm_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|415.6 MB| + +## References + +https://huggingface.co/clarin-knext/event-detection-xlm-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-fakenews_roberta_large_en.md b/docs/_posts/ahmedlone127/2023-12-01-fakenews_roberta_large_en.md new file mode 100644 index 000000000000..0cfa3e25c093 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-fakenews_roberta_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fakenews_roberta_large RoBertaForSequenceClassification from Denyol +author: John Snow Labs +name: fakenews_roberta_large +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fakenews_roberta_large` is an English model originally trained by Denyol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fakenews_roberta_large_en_5.2.0_3.0_1701472260571.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fakenews_roberta_large_en_5.2.0_3.0_1701472260571.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fakenews_roberta_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fakenews_roberta_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fakenews_roberta_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Denyol/FakeNews-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-financial_message_classification_en.md b/docs/_posts/ahmedlone127/2023-12-01-financial_message_classification_en.md new file mode 100644 index 000000000000..0488daeb8e3a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-financial_message_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English financial_message_classification RoBertaForSequenceClassification from Budget +author: John Snow Labs +name: financial_message_classification +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `financial_message_classification` is an English model originally trained by Budget. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/financial_message_classification_en_5.2.0_3.0_1701439409617.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/financial_message_classification_en_5.2.0_3.0_1701439409617.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_message_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_message_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|financial_message_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|421.4 MB| + +## References + +https://huggingface.co/Budget/financial_message_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetune_paraphrase_model_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetune_paraphrase_model_en.md new file mode 100644 index 000000000000..9e4af5b24789 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetune_paraphrase_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetune_paraphrase_model RoBertaForSequenceClassification from chitra +author: John Snow Labs +name: finetune_paraphrase_model +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetune_paraphrase_model` is an English model originally trained by chitra. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetune_paraphrase_model_en_5.2.0_3.0_1701421955641.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetune_paraphrase_model_en_5.2.0_3.0_1701421955641.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetune_paraphrase_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetune_paraphrase_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetune_paraphrase_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/chitra/finetune-paraphrase-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuned_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuned_roberta_base_en.md new file mode 100644 index 000000000000..53a3f0f63b1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuned_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuned_roberta_base RoBertaForSequenceClassification from KABANDA18 +author: John Snow Labs +name: finetuned_roberta_base +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_base` is an English model originally trained by KABANDA18. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_en_5.2.0_3.0_1701393750609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_en_5.2.0_3.0_1701393750609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuned_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.3 MB| + +## References + +https://huggingface.co/KABANDA18/FineTuned-Roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuning_customer_sentiment_model_300_samples_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuning_customer_sentiment_model_300_samples_en.md new file mode 100644 index 000000000000..e380018c6555 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuning_customer_sentiment_model_300_samples_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_customer_sentiment_model_300_samples RoBertaForSequenceClassification from Psunrise +author: John Snow Labs +name: finetuning_customer_sentiment_model_300_samples +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_customer_sentiment_model_300_samples` is an English model originally trained by Psunrise. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_customer_sentiment_model_300_samples_en_5.2.0_3.0_1701423717609.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_customer_sentiment_model_300_samples_en_5.2.0_3.0_1701423717609.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_customer_sentiment_model_300_samples","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_customer_sentiment_model_300_samples","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_customer_sentiment_model_300_samples| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|434.4 MB| + +## References + +https://huggingface.co/Psunrise/finetuning-customer-sentiment-model-300-samples \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuning_roberta_base_on_imdb_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuning_roberta_base_on_imdb_en.md new file mode 100644 index 000000000000..45c8c588c20e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuning_roberta_base_on_imdb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_roberta_base_on_imdb RoBertaForSequenceClassification from Ibrahim-Alam +author: John Snow Labs +name: finetuning_roberta_base_on_imdb +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_roberta_base_on_imdb` is an English model originally trained by Ibrahim-Alam. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_imdb_en_5.2.0_3.0_1701450180639.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_roberta_base_on_imdb_en_5.2.0_3.0_1701450180639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_imdb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_roberta_base_on_imdb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_roberta_base_on_imdb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|460.5 MB| + +## References + +https://huggingface.co/Ibrahim-Alam/finetuning-roberta-base-on-imdb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_en.md new file mode 100644 index 000000000000..5d8e1c90cc4f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model RoBertaForSequenceClassification from Saberi +author: John Snow Labs +name: finetuning_sentiment_model +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model` is an English model originally trained by Saberi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_en_5.2.0_3.0_1701472459410.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_en_5.2.0_3.0_1701472459410.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.7 MB| + +## References + +https://huggingface.co/Saberi/finetuning-sentiment-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_spanish_hraul_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_spanish_hraul_en.md new file mode 100644 index 000000000000..c88f4db53195 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_spanish_hraul_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_spanish_hraul RoBertaForSequenceClassification from HRaul +author: John Snow Labs +name: finetuning_sentiment_model_spanish_hraul +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_spanish_hraul` is an English model originally trained by HRaul. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_spanish_hraul_en_5.2.0_3.0_1701393958243.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_spanish_hraul_en_5.2.0_3.0_1701393958243.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_spanish_hraul","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_spanish_hraul","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_spanish_hraul| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.9 MB| + +## References + +https://huggingface.co/HRaul/finetuning-sentiment-model-spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_urdu_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_urdu_roberta_en.md new file mode 100644 index 000000000000..a76b0a7adfb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-finetuning_sentiment_model_urdu_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English finetuning_sentiment_model_urdu_roberta RoBertaForSequenceClassification from maazmikail +author: John Snow Labs +name: finetuning_sentiment_model_urdu_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_urdu_roberta` is an English model originally trained by maazmikail. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_urdu_roberta_en_5.2.0_3.0_1701436369349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_urdu_roberta_en_5.2.0_3.0_1701436369349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_urdu_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_urdu_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|finetuning_sentiment_model_urdu_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|473.2 MB| + +## References + +https://huggingface.co/maazmikail/finetuning-sentiment-model-urdu-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-formality_classification_icebert_is.md b/docs/_posts/ahmedlone127/2023-12-01-formality_classification_icebert_is.md new file mode 100644 index 000000000000..7189a4e2388e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-formality_classification_icebert_is.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Icelandic formality_classification_icebert RoBertaForSequenceClassification from svanhvit +author: John Snow Labs +name: formality_classification_icebert +date: 2023-12-01 +tags: [roberta, is, open_source, sequence_classification, onnx] +task: Text Classification +language: is +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `formality_classification_icebert` is an Icelandic model originally trained by svanhvit. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/formality_classification_icebert_is_5.2.0_3.0_1701435802718.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/formality_classification_icebert_is_5.2.0_3.0_1701435802718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("formality_classification_icebert","is")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("formality_classification_icebert","is") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|formality_classification_icebert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|is| +|Size:|446.7 MB| + +## References + +https://huggingface.co/svanhvit/formality-classification-icebert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-fossilbert_en.md b/docs/_posts/ahmedlone127/2023-12-01-fossilbert_en.md new file mode 100644 index 000000000000..04867f44148b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-fossilbert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English fossilbert RoBertaForSequenceClassification from ljhemmi +author: John Snow Labs +name: fossilbert +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fossilbert` is an English model originally trained by ljhemmi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fossilbert_en_5.2.0_3.0_1701434075983.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fossilbert_en_5.2.0_3.0_1701434075983.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("fossilbert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fossilbert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|fossilbert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/ljhemmi/FossilBERT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-gpt_opinion_model_en.md b/docs/_posts/ahmedlone127/2023-12-01-gpt_opinion_model_en.md new file mode 100644 index 000000000000..2425192bb955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-gpt_opinion_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English gpt_opinion_model RoBertaForSequenceClassification from raydentseng +author: John Snow Labs +name: gpt_opinion_model +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gpt_opinion_model` is an English model originally trained by raydentseng. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gpt_opinion_model_en_5.2.0_3.0_1701423515465.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gpt_opinion_model_en_5.2.0_3.0_1701423515465.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("gpt_opinion_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("gpt_opinion_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|gpt_opinion_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/raydentseng/gpt_opinion_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-hate_hate_random1_seed1_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-01-hate_hate_random1_seed1_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..66946c5d9bc1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-hate_hate_random1_seed1_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_hate_random1_seed1_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: hate_hate_random1_seed1_twitter_roberta_large_2022_154m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random1_seed1_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed1_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701473364086.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed1_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701473364086.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed1_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed1_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random1_seed1_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random1_seed1-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-hate_roberta_hasoc_hindi_hi.md b/docs/_posts/ahmedlone127/2023-12-01-hate_roberta_hasoc_hindi_hi.md new file mode 100644 index 000000000000..e9b395ccc43b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-hate_roberta_hasoc_hindi_hi.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Hindi hate_roberta_hasoc_hindi RoBertaForSequenceClassification from l3cube-pune +author: John Snow Labs +name: hate_roberta_hasoc_hindi +date: 2023-12-01 +tags: [roberta, hi, open_source, sequence_classification, onnx] +task: Text Classification +language: hi +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_roberta_hasoc_hindi` is a Hindi model originally trained by l3cube-pune. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_roberta_hasoc_hindi_hi_5.2.0_3.0_1701409950635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_roberta_hasoc_hindi_hi_5.2.0_3.0_1701409950635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_roberta_hasoc_hindi","hi")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_roberta_hasoc_hindi","hi") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_roberta_hasoc_hindi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|hi| +|Size:|467.1 MB| + +## References + +https://huggingface.co/l3cube-pune/hate-roberta-hasoc-hindi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-hate_speech_detector_en.md b/docs/_posts/ahmedlone127/2023-12-01-hate_speech_detector_en.md new file mode 100644 index 000000000000..bb65e6d97165 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-hate_speech_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_speech_detector RoBertaForSequenceClassification from Elluran +author: John Snow Labs +name: hate_speech_detector +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_speech_detector` is an English model originally trained by Elluran. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_speech_detector_en_5.2.0_3.0_1701396389238.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_speech_detector_en_5.2.0_3.0_1701396389238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_speech_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_speech_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_speech_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.8 MB| + +## References + +https://huggingface.co/Elluran/Hate_speech_detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-hc3_wiki_domain_classification_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-hc3_wiki_domain_classification_roberta_en.md new file mode 100644 index 000000000000..1d70a53aadfa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-hc3_wiki_domain_classification_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hc3_wiki_domain_classification_roberta RoBertaForSequenceClassification from rajendrabaskota +author: John Snow Labs +name: hc3_wiki_domain_classification_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hc3_wiki_domain_classification_roberta` is an English model originally trained by rajendrabaskota. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hc3_wiki_domain_classification_roberta_en_5.2.0_3.0_1701471315156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hc3_wiki_domain_classification_roberta_en_5.2.0_3.0_1701471315156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hc3_wiki_domain_classification_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hc3_wiki_domain_classification_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hc3_wiki_domain_classification_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.4 MB| + +## References + +https://huggingface.co/rajendrabaskota/hc3-wiki-domain-classification-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_en.md b/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_en.md new file mode 100644 index 000000000000..2ed64b2a8c05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English horai_medium_31k_roberta_large_e5 RoBertaForSequenceClassification from stealthwriter +author: John Snow Labs +name: horai_medium_31k_roberta_large_e5 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `horai_medium_31k_roberta_large_e5` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/horai_medium_31k_roberta_large_e5_en_5.2.0_3.0_1701475006136.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/horai_medium_31k_roberta_large_e5_en_5.2.0_3.0_1701475006136.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_medium_31k_roberta_large_e5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_medium_31k_roberta_large_e5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|horai_medium_31k_roberta_large_e5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/stealthwriter/HorAI-medium-31k-roberta-large-e5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_lr3_en.md b/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_lr3_en.md new file mode 100644 index 000000000000..d2f6985fe5d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-horai_medium_31k_roberta_large_e5_lr3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English horai_medium_31k_roberta_large_e5_lr3 RoBertaForSequenceClassification from stealthwriter +author: John Snow Labs +name: horai_medium_31k_roberta_large_e5_lr3 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `horai_medium_31k_roberta_large_e5_lr3` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/horai_medium_31k_roberta_large_e5_lr3_en_5.2.0_3.0_1701436971991.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/horai_medium_31k_roberta_large_e5_lr3_en_5.2.0_3.0_1701436971991.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_medium_31k_roberta_large_e5_lr3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_medium_31k_roberta_large_e5_lr3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|horai_medium_31k_roberta_large_e5_lr3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/stealthwriter/HorAI-medium-31k-roberta-large-e5-lr3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-icebert_finetuned_iec_sentence_bs16_en.md b/docs/_posts/ahmedlone127/2023-12-01-icebert_finetuned_iec_sentence_bs16_en.md new file mode 100644 index 000000000000..d0f93efb343a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-icebert_finetuned_iec_sentence_bs16_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English icebert_finetuned_iec_sentence_bs16 RoBertaForSequenceClassification from vesteinn +author: John Snow Labs +name: icebert_finetuned_iec_sentence_bs16 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `icebert_finetuned_iec_sentence_bs16` is an English model originally trained by vesteinn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/icebert_finetuned_iec_sentence_bs16_en_5.2.0_3.0_1701447262648.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/icebert_finetuned_iec_sentence_bs16_en_5.2.0_3.0_1701447262648.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("icebert_finetuned_iec_sentence_bs16","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("icebert_finetuned_iec_sentence_bs16","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|icebert_finetuned_iec_sentence_bs16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.3 MB| + +## References + +https://huggingface.co/vesteinn/IceBERT-finetuned-iec-sentence-bs16 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-idmgsp_roberta_train_abstract_en.md b/docs/_posts/ahmedlone127/2023-12-01-idmgsp_roberta_train_abstract_en.md new file mode 100644 index 000000000000..f2ff10aa0c93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-idmgsp_roberta_train_abstract_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English idmgsp_roberta_train_abstract RoBertaForSequenceClassification from tum-nlp +author: John Snow Labs +name: idmgsp_roberta_train_abstract +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idmgsp_roberta_train_abstract` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_abstract_en_5.2.0_3.0_1701448975677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_abstract_en_5.2.0_3.0_1701448975677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_abstract","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_abstract","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idmgsp_roberta_train_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.8 MB| + +## References + +https://huggingface.co/tum-nlp/IDMGSP-RoBERTa-TRAIN-ABSTRACT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-imdb_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-imdb_roberta_en.md new file mode 100644 index 000000000000..490e5617d650 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-imdb_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdb_roberta RoBertaForSequenceClassification from smiller324 +author: John Snow Labs +name: imdb_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdb_roberta` is an English model originally trained by smiller324. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdb_roberta_en_5.2.0_3.0_1701389949500.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdb_roberta_en_5.2.0_3.0_1701389949500.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("imdb_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("imdb_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdb_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.0 MB| + +## References + +https://huggingface.co/smiller324/imdb_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-imp_hatred_en.md b/docs/_posts/ahmedlone127/2023-12-01-imp_hatred_en.md new file mode 100644 index 000000000000..78d19eaba3fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-imp_hatred_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imp_hatred RoBertaForSequenceClassification from crcb +author: John Snow Labs +name: imp_hatred +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imp_hatred` is an English model originally trained by crcb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imp_hatred_en_5.2.0_3.0_1701433872403.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imp_hatred_en_5.2.0_3.0_1701433872403.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("imp_hatred","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("imp_hatred","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imp_hatred| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|439.0 MB| + +## References + +https://huggingface.co/crcb/imp_hatred \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-ipc_level1_a_en.md b/docs/_posts/ahmedlone127/2023-12-01-ipc_level1_a_en.md new file mode 100644 index 000000000000..94909ba7ff62 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-ipc_level1_a_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ipc_level1_a RoBertaForSequenceClassification from intelcomp +author: John Snow Labs +name: ipc_level1_a +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ipc_level1_a` is an English model originally trained by intelcomp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ipc_level1_a_en_5.2.0_3.0_1701433244514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ipc_level1_a_en_5.2.0_3.0_1701433244514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ipc_level1_a","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ipc_level1_a","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ipc_level1_a| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/intelcomp/ipc_level1_A \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-kgrouping_roberta_large_en.md b/docs/_posts/ahmedlone127/2023-12-01-kgrouping_roberta_large_en.md new file mode 100644 index 000000000000..81871a073aa2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-kgrouping_roberta_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English kgrouping_roberta_large RoBertaForSequenceClassification from Maunish +author: John Snow Labs +name: kgrouping_roberta_large +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `kgrouping_roberta_large` is an English model originally trained by Maunish. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/kgrouping_roberta_large_en_5.2.0_3.0_1701437793439.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/kgrouping_roberta_large_en_5.2.0_3.0_1701437793439.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("kgrouping_roberta_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("kgrouping_roberta_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|kgrouping_roberta_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Maunish/kgrouping-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-khipu_finetuned_amazon_reviews_multi_osanseviero_en.md b/docs/_posts/ahmedlone127/2023-12-01-khipu_finetuned_amazon_reviews_multi_osanseviero_en.md new file mode 100644 index 000000000000..735c1eed5cf2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-khipu_finetuned_amazon_reviews_multi_osanseviero_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English khipu_finetuned_amazon_reviews_multi_osanseviero RoBertaForSequenceClassification from osanseviero +author: John Snow Labs +name: khipu_finetuned_amazon_reviews_multi_osanseviero +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `khipu_finetuned_amazon_reviews_multi_osanseviero` is an English model originally trained by osanseviero. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_osanseviero_en_5.2.0_3.0_1701426706995.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_osanseviero_en_5.2.0_3.0_1701426706995.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_osanseviero","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_osanseviero","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|khipu_finetuned_amazon_reviews_multi_osanseviero| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.3 MB| + +## References + +https://huggingface.co/osanseviero/khipu-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-khu_text_classification_roberta_base_sept_2022_en.md b/docs/_posts/ahmedlone127/2023-12-01-khu_text_classification_roberta_base_sept_2022_en.md new file mode 100644 index 000000000000..5e40a55339e8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-khu_text_classification_roberta_base_sept_2022_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English khu_text_classification_roberta_base_sept_2022 RoBertaForSequenceClassification from sabhashanki +author: John Snow Labs +name: khu_text_classification_roberta_base_sept_2022 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `khu_text_classification_roberta_base_sept_2022` is an English model originally trained by sabhashanki. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/khu_text_classification_roberta_base_sept_2022_en_5.2.0_3.0_1701471141580.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/khu_text_classification_roberta_base_sept_2022_en_5.2.0_3.0_1701471141580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("khu_text_classification_roberta_base_sept_2022","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("khu_text_classification_roberta_base_sept_2022","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|khu_text_classification_roberta_base_sept_2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/sabhashanki/khu-text-classification-roberta-base-sept-2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-lever_gsm8k_codex_en.md b/docs/_posts/ahmedlone127/2023-12-01-lever_gsm8k_codex_en.md new file mode 100644 index 000000000000..38fa09873774 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-lever_gsm8k_codex_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English lever_gsm8k_codex RoBertaForSequenceClassification from niansong1996 +author: John Snow Labs +name: lever_gsm8k_codex +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `lever_gsm8k_codex` is an English model originally trained by niansong1996. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/lever_gsm8k_codex_en_5.2.0_3.0_1701447346138.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/lever_gsm8k_codex_en_5.2.0_3.0_1701447346138.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("lever_gsm8k_codex","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("lever_gsm8k_codex","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|lever_gsm8k_codex| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/niansong1996/lever-gsm8k-codex \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-mbti_classification_roberta_base_shunian_en.md b/docs/_posts/ahmedlone127/2023-12-01-mbti_classification_roberta_base_shunian_en.md new file mode 100644 index 000000000000..00f3707a4968 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-mbti_classification_roberta_base_shunian_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mbti_classification_roberta_base_shunian RoBertaForSequenceClassification from Shunian +author: John Snow Labs +name: mbti_classification_roberta_base_shunian +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mbti_classification_roberta_base_shunian` is an English model originally trained by Shunian. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mbti_classification_roberta_base_shunian_en_5.2.0_3.0_1701427086959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mbti_classification_roberta_base_shunian_en_5.2.0_3.0_1701427086959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mbti_classification_roberta_base_shunian","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mbti_classification_roberta_base_shunian","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mbti_classification_roberta_base_shunian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.8 MB| + +## References + +https://huggingface.co/Shunian/mbti-classification-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-methane_solution_en.md b/docs/_posts/ahmedlone127/2023-12-01-methane_solution_en.md new file mode 100644 index 000000000000..a6dd10eed2ef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-methane_solution_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English methane_solution RoBertaForSequenceClassification from TingluZ +author: John Snow Labs +name: methane_solution +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `methane_solution` is an English model originally trained by TingluZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/methane_solution_en_5.2.0_3.0_1701431344887.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/methane_solution_en_5.2.0_3.0_1701431344887.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("methane_solution","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("methane_solution","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|methane_solution| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.6 MB| + +## References + +https://huggingface.co/TingluZ/methane-solution \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-monoroberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-monoroberta_en.md new file mode 100644 index 000000000000..8a8bf35e2dd1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-monoroberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English monoroberta RoBertaForSequenceClassification from veneres +author: John Snow Labs +name: monoroberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `monoroberta` is an English model originally trained by veneres. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/monoroberta_en_5.2.0_3.0_1701416665962.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/monoroberta_en_5.2.0_3.0_1701416665962.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("monoroberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("monoroberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|monoroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.6 MB| + +## References + +https://huggingface.co/veneres/monoroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-mood_you_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-mood_you_classifier_en.md new file mode 100644 index 000000000000..96294e0c5438 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-mood_you_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mood_you_classifier RoBertaForSequenceClassification from nicolauduran45 +author: John Snow Labs +name: mood_you_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mood_you_classifier` is an English model originally trained by nicolauduran45. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mood_you_classifier_en_5.2.0_3.0_1701394675449.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mood_you_classifier_en_5.2.0_3.0_1701394675449.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mood_you_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mood_you_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mood_you_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.0 MB| + +## References + +https://huggingface.co/nicolauduran45/mood-you-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-nace2_level1_42_en.md b/docs/_posts/ahmedlone127/2023-12-01-nace2_level1_42_en.md new file mode 100644 index 000000000000..8cf3870f71cc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-nace2_level1_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nace2_level1_42 RoBertaForSequenceClassification from intelcomp +author: John Snow Labs +name: nace2_level1_42 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nace2_level1_42` is an English model originally trained by intelcomp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nace2_level1_42_en_5.2.0_3.0_1701469635185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nace2_level1_42_en_5.2.0_3.0_1701469635185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nace2_level1_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nace2_level1_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nace2_level1_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/intelcomp/nace2_level1_42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m_en.md b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m_en.md new file mode 100644 index 000000000000..516542e9ec1e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701435125490.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701435125490.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random1_seed2_twitter_roberta_base_2019_90m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random1_seed2-twitter-roberta-base-2019-90m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2021_124m_en.md b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2021_124m_en.md new file mode 100644 index 000000000000..2fc62f084d4e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2021_124m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_temporal_twitter_roberta_base_2021_124m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_temporal_twitter_roberta_base_2021_124m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_temporal_twitter_roberta_base_2021_124m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701470843175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701470843175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +from sparknlp.base import * +from sparknlp.annotator import * +from pyspark.ml import Pipeline + +# Assumes an active SparkSession named `spark` +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_base_2021_124m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +import com.johnsnowlabs.nlp.base._ +import com.johnsnowlabs.nlp.annotator._ +import org.apache.spark.ml.Pipeline + +// Assumes an active SparkSession named `spark`; its implicits provide .toDS +import spark.implicits._ + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_base_2021_124m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|nerd_nerd_temporal_twitter_roberta_base_2021_124m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/nerd-nerd_temporal-twitter-roberta-base-2021-124m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2022_154m_en.md
new file mode 100644
index 000000000000..5baece17134c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-nerd_nerd_temporal_twitter_roberta_base_2022_154m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English nerd_nerd_temporal_twitter_roberta_base_2022_154m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: nerd_nerd_temporal_twitter_roberta_base_2022_154m
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_temporal_twitter_roberta_base_2022_154m` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701471136699.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701471136699.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_base_2022_154m","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_base_2022_154m","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|nerd_nerd_temporal_twitter_roberta_base_2022_154m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.2 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/nerd-nerd_temporal-twitter-roberta-base-2022-154m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac_en.md b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac_en.md
new file mode 100644
index 000000000000..c23466ec9fcb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac RoBertaForSequenceClassification from babyalpac
+author: John Snow Labs
+name: nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac` is an English model originally trained by babyalpac.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac_en_5.2.0_3.0_1701414711515.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac_en_5.2.0_3.0_1701414711515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|nli_roberta_base_finetuned_for_amazon_review_ratings_babyalpac|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|466.2 MB|
+
+## References
+
+https://huggingface.co/babyalpac/nli-roberta-base-finetuned-for-amazon-review-ratings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg_en.md b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg_en.md
new file mode 100644
index 000000000000..2f5b6fee34d2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg RoBertaForSequenceClassification from coleperg
+author: John Snow Labs
+name: nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg` is an English model originally trained by coleperg.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg_en_5.2.0_3.0_1701449836166.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg_en_5.2.0_3.0_1701449836166.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|nli_roberta_base_finetuned_for_amazon_review_ratings_coleperg|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|466.2 MB|
+
+## References
+
+https://huggingface.co/coleperg/nli-roberta-base-finetuned-for-amazon-review-ratings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001_en.md b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001_en.md
new file mode 100644
index 000000000000..27458cd188ba
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001 RoBertaForSequenceClassification from mrhalim2001
+author: John Snow Labs
+name: nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001` is an English model originally trained by mrhalim2001.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001_en_5.2.0_3.0_1701471534690.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001_en_5.2.0_3.0_1701471534690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|nli_roberta_base_finetuned_for_amazon_review_ratings_mrhalim2001|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|466.2 MB|
+
+## References
+
+https://huggingface.co/mrhalim2001/nli-roberta-base-finetuned-for-amazon-review-ratings
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_davidson_en.md b/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_davidson_en.md
new file mode 100644
index 000000000000..663445f573f8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_davidson_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English ogbv_gender_twtrobertabase_english_davidson RoBertaForSequenceClassification from Maha
+author: John Snow Labs
+name: ogbv_gender_twtrobertabase_english_davidson
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ogbv_gender_twtrobertabase_english_davidson` is an English model originally trained by Maha.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_davidson_en_5.2.0_3.0_1701393232809.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_davidson_en_5.2.0_3.0_1701393232809.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_davidson","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_davidson","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|ogbv_gender_twtrobertabase_english_davidson|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/Maha/OGBV-gender-twtrobertabase-en-davidson
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_trac1_en.md b/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_trac1_en.md
new file mode 100644
index 000000000000..a4c96a29a867
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-ogbv_gender_twtrobertabase_english_trac1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English ogbv_gender_twtrobertabase_english_trac1 RoBertaForSequenceClassification from Maha
+author: John Snow Labs
+name: ogbv_gender_twtrobertabase_english_trac1
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ogbv_gender_twtrobertabase_english_trac1` is an English model originally trained by Maha.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_trac1_en_5.2.0_3.0_1701395191498.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ogbv_gender_twtrobertabase_english_trac1_en_5.2.0_3.0_1701395191498.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_trac1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ogbv_gender_twtrobertabase_english_trac1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|ogbv_gender_twtrobertabase_english_trac1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/Maha/OGBV-gender-twtrobertabase-en-trac1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-paraphrase_diversity_ranker_en.md b/docs/_posts/ahmedlone127/2023-12-01-paraphrase_diversity_ranker_en.md
new file mode 100644
index 000000000000..bb8102fd50da
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-paraphrase_diversity_ranker_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English paraphrase_diversity_ranker RoBertaForSequenceClassification from Ashishkr
+author: John Snow Labs
+name: paraphrase_diversity_ranker
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `paraphrase_diversity_ranker` is an English model originally trained by Ashishkr.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/paraphrase_diversity_ranker_en_5.2.0_3.0_1701408082018.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/paraphrase_diversity_ranker_en_5.2.0_3.0_1701408082018.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("paraphrase_diversity_ranker","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("paraphrase_diversity_ranker","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|paraphrase_diversity_ranker|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|452.2 MB|
+
+## References
+
+https://huggingface.co/Ashishkr/paraphrase_diversity_ranker
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-pc_6_100__roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-pc_6_100__roberta_en.md
new file mode 100644
index 000000000000..e94a918a582e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-pc_6_100__roberta_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English pc_6_100__roberta RoBertaForSequenceClassification from anony12sub34
+author: John Snow Labs
+name: pc_6_100__roberta
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pc_6_100__roberta` is an English model originally trained by anony12sub34.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pc_6_100__roberta_en_5.2.0_3.0_1701472668170.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pc_6_100__roberta_en_5.2.0_3.0_1701472668170.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("pc_6_100__roberta","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("pc_6_100__roberta","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|pc_6_100__roberta|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|457.2 MB|
+
+## References
+
+https://huggingface.co/anony12sub34/pc_6_100__roberta
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-per_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-per_classifier_en.md
new file mode 100644
index 000000000000..52364995fc56
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-per_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English per_classifier RoBertaForSequenceClassification from soulFoo
+author: John Snow Labs
+name: per_classifier
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `per_classifier` is an English model originally trained by soulFoo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/per_classifier_en_5.2.0_3.0_1701392378873.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/per_classifier_en_5.2.0_3.0_1701392378873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("per_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("per_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|per_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|424.1 MB|
+
+## References
+
+https://huggingface.co/soulFoo/per-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque_en.md b/docs/_posts/ahmedlone127/2023-12-01-platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque_en.md
new file mode 100644
index 000000000000..9c45eb36e73d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque RoBertaForSequenceClassification from platzi
+author: John Snow Labs
+name: platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque` is an English model originally trained by platzi.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque_en_5.2.0_3.0_1701469708075.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque_en_5.2.0_3.0_1701469708075.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val document_assembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue_juan_jose_cano_duque| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-juan-jose-cano-duque \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-pregnancy_model_en.md b/docs/_posts/ahmedlone127/2023-12-01-pregnancy_model_en.md new file mode 100644 index 000000000000..57074d7afc9f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-pregnancy_model_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English pregnancy_model RoBertaForSequenceClassification from skelley +author: John Snow Labs +name: pregnancy_model +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `pregnancy_model` is an English model originally trained by skelley. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/pregnancy_model_en_5.2.0_3.0_1701400297658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/pregnancy_model_en_5.2.0_3.0_1701400297658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("pregnancy_model","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("pregnancy_model","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|pregnancy_model| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.5 MB| + +## References + +https://huggingface.co/skelley/pregnancy_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-puoberta_news_tn.md b/docs/_posts/ahmedlone127/2023-12-01-puoberta_news_tn.md new file mode 100644 index 000000000000..70834020a697 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-puoberta_news_tn.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Tswana puoberta_news RoBertaForSequenceClassification from dsfsi +author: John Snow Labs +name: puoberta_news +date: 2023-12-01 +tags: [roberta, tn, open_source, sequence_classification, onnx] +task: Text Classification +language: tn +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `puoberta_news` is a Tswana model originally trained by dsfsi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/puoberta_news_tn_5.2.0_3.0_1701399775896.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/puoberta_news_tn_5.2.0_3.0_1701399775896.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("puoberta_news","tn")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("puoberta_news","tn") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|puoberta_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|tn| +|Size:|313.9 MB| + +## References + +https://huggingface.co/dsfsi/PuoBERTa-News \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-qa_consolidation_en.md b/docs/_posts/ahmedlone127/2023-12-01-qa_consolidation_en.md new file mode 100644 index 000000000000..cadf5a689ed4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-qa_consolidation_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English qa_consolidation RoBertaForSequenceClassification from Salesforce +author: John Snow Labs +name: qa_consolidation +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `qa_consolidation` is an English model originally trained by Salesforce. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qa_consolidation_en_5.2.0_3.0_1701469780994.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qa_consolidation_en_5.2.0_3.0_1701469780994.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("qa_consolidation","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("qa_consolidation","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|qa_consolidation| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|847.1 MB| + +## References + +https://huggingface.co/Salesforce/qa_consolidation \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-quora_roberta_large_navteca_en.md b/docs/_posts/ahmedlone127/2023-12-01-quora_roberta_large_navteca_en.md new file mode 100644 index 000000000000..123f28ab2669 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-quora_roberta_large_navteca_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English quora_roberta_large_navteca RoBertaForSequenceClassification from navteca +author: John Snow Labs +name: quora_roberta_large_navteca +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `quora_roberta_large_navteca` is an English model originally trained by navteca. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/quora_roberta_large_navteca_en_5.2.0_3.0_1701409742254.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/quora_roberta_large_navteca_en_5.2.0_3.0_1701409742254.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_large_navteca","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("quora_roberta_large_navteca","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|quora_roberta_large_navteca| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/navteca/quora-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-racism_finetuned_detests_29_10_2022_en.md b/docs/_posts/ahmedlone127/2023-12-01-racism_finetuned_detests_29_10_2022_en.md new file mode 100644 index 000000000000..02c7c0688455 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-racism_finetuned_detests_29_10_2022_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English racism_finetuned_detests_29_10_2022 RoBertaForSequenceClassification from Pablo94 +author: John Snow Labs +name: racism_finetuned_detests_29_10_2022 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `racism_finetuned_detests_29_10_2022` is an English model originally trained by Pablo94. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/racism_finetuned_detests_29_10_2022_en_5.2.0_3.0_1701471805618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/racism_finetuned_detests_29_10_2022_en_5.2.0_3.0_1701471805618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism_finetuned_detests_29_10_2022","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("racism_finetuned_detests_29_10_2022","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|racism_finetuned_detests_29_10_2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.6 MB| + +## References + +https://huggingface.co/Pablo94/racism-finetuned-detests-29-10-2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-rate_jokes_bert_en.md b/docs/_posts/ahmedlone127/2023-12-01-rate_jokes_bert_en.md new file mode 100644 index 000000000000..b3e56f0741d9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-rate_jokes_bert_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rate_jokes_bert RoBertaForSequenceClassification from mohameddhiab +author: John Snow Labs +name: rate_jokes_bert +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rate_jokes_bert` is an English model originally trained by mohameddhiab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rate_jokes_bert_en_5.2.0_3.0_1701471755545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rate_jokes_bert_en_5.2.0_3.0_1701471755545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("rate_jokes_bert","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("rate_jokes_bert","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rate_jokes_bert| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|420.8 MB| + +## References + +https://huggingface.co/mohameddhiab/rate-jokes-bert \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-readability_spanish_benchmark_bertin_spanish_paragraphs_2class_en.md b/docs/_posts/ahmedlone127/2023-12-01-readability_spanish_benchmark_bertin_spanish_paragraphs_2class_en.md new file mode 100644 index 000000000000..e4196c42d446 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-readability_spanish_benchmark_bertin_spanish_paragraphs_2class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English readability_spanish_benchmark_bertin_spanish_paragraphs_2class RoBertaForSequenceClassification from lmvasque +author: John Snow Labs +name: readability_spanish_benchmark_bertin_spanish_paragraphs_2class +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `readability_spanish_benchmark_bertin_spanish_paragraphs_2class` is an English model originally trained by lmvasque.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_paragraphs_2class_en_5.2.0_3.0_1701469983298.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_paragraphs_2class_en_5.2.0_3.0_1701469983298.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_paragraphs_2class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_paragraphs_2class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|readability_spanish_benchmark_bertin_spanish_paragraphs_2class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.4 MB| + +## References + +https://huggingface.co/lmvasque/readability-es-benchmark-bertin-es-paragraphs-2class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-results_lazaro97_en.md b/docs/_posts/ahmedlone127/2023-12-01-results_lazaro97_en.md new file mode 100644 index 000000000000..964c0fdd4b35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-results_lazaro97_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English results_lazaro97 RoBertaForSequenceClassification from Lazaro97 +author: John Snow Labs +name: results_lazaro97 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `results_lazaro97` is an English model originally trained by Lazaro97. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/results_lazaro97_en_5.2.0_3.0_1701413057993.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/results_lazaro97_en_5.2.0_3.0_1701413057993.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("results_lazaro97","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("results_lazaro97","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|results_lazaro97| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|432.5 MB| + +## References + +https://huggingface.co/Lazaro97/results \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-reward_model_sandeep12345_en.md b/docs/_posts/ahmedlone127/2023-12-01-reward_model_sandeep12345_en.md new file mode 100644 index 000000000000..356dbfa3965d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-reward_model_sandeep12345_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reward_model_sandeep12345 RoBertaForSequenceClassification from sandeep12345 +author: John Snow Labs +name: reward_model_sandeep12345 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reward_model_sandeep12345` is an English model originally trained by sandeep12345. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reward_model_sandeep12345_en_5.2.0_3.0_1701419945326.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reward_model_sandeep12345_en_5.2.0_3.0_1701419945326.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("reward_model_sandeep12345","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("reward_model_sandeep12345","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reward_model_sandeep12345| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/sandeep12345/reward_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-reward_model_thorirhrafn_en.md b/docs/_posts/ahmedlone127/2023-12-01-reward_model_thorirhrafn_en.md new file mode 100644 index 000000000000..a5f5d478c3c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-reward_model_thorirhrafn_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English reward_model_thorirhrafn RoBertaForSequenceClassification from thorirhrafn +author: John Snow Labs +name: reward_model_thorirhrafn +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `reward_model_thorirhrafn` is an English model originally trained by thorirhrafn. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/reward_model_thorirhrafn_en_5.2.0_3.0_1701438015314.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/reward_model_thorirhrafn_en_5.2.0_3.0_1701438015314.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("reward_model_thorirhrafn","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("reward_model_thorirhrafn","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|reward_model_thorirhrafn| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/thorirhrafn/reward_model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik_en.md b/docs/_posts/ahmedlone127/2023-12-01-rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik_en.md new file mode 100644 index 000000000000..536bc2bcd74e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik RoBertaForSequenceClassification from SudiptoPramanik +author: John Snow Labs +name: rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik` is an English model originally trained by SudiptoPramanik.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik_en_5.2.0_3.0_1701399168754.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik_en_5.2.0_3.0_1701399168754.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rewardmodelsmallerquestionwithtwolabelslengthjustified_sudiptopramanik| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|433.3 MB| + +## References + +https://huggingface.co/SudiptoPramanik/RewardModelSmallerQuestionWithTwoLabelsLengthJustified \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-rlhf_reward_model_cambioml_en.md b/docs/_posts/ahmedlone127/2023-12-01-rlhf_reward_model_cambioml_en.md new file mode 100644 index 000000000000..ccd2618f3482 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-rlhf_reward_model_cambioml_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rlhf_reward_model_cambioml RoBertaForSequenceClassification from cambioml +author: John Snow Labs +name: rlhf_reward_model_cambioml +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rlhf_reward_model_cambioml` is an English model originally trained by cambioml. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rlhf_reward_model_cambioml_en_5.2.0_3.0_1701415818052.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rlhf_reward_model_cambioml_en_5.2.0_3.0_1701415818052.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("rlhf_reward_model_cambioml","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("rlhf_reward_model_cambioml","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rlhf_reward_model_cambioml| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|144.9 MB| + +## References + +https://huggingface.co/cambioml/rlhf-reward-model \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-rm_base_en.md b/docs/_posts/ahmedlone127/2023-12-01-rm_base_en.md new file mode 100644 index 000000000000..eef3d920f5f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-rm_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English rm_base RoBertaForSequenceClassification from davidgaofc +author: John Snow Labs +name: rm_base +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `rm_base` is an English model originally trained by davidgaofc. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/rm_base_en_5.2.0_3.0_1701396389219.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/rm_base_en_5.2.0_3.0_1701396389219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("rm_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("rm_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|rm_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/davidgaofc/RM_base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-robbert_v2_dutch_finetuned_snli_en.md b/docs/_posts/ahmedlone127/2023-12-01-robbert_v2_dutch_finetuned_snli_en.md new file mode 100644 index 000000000000..3435fe006c18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-robbert_v2_dutch_finetuned_snli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English robbert_v2_dutch_finetuned_snli RoBertaForSequenceClassification from LoicDL +author: John Snow Labs +name: robbert_v2_dutch_finetuned_snli +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert_v2_dutch_finetuned_snli` is an English model originally trained by LoicDL. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robbert_v2_dutch_finetuned_snli_en_5.2.0_3.0_1701473173204.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robbert_v2_dutch_finetuned_snli_en_5.2.0_3.0_1701473173204.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_v2_dutch_finetuned_snli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_v2_dutch_finetuned_snli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robbert_v2_dutch_finetuned_snli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/LoicDL/robbert-v2-dutch-finetuned-snli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_ag_news_202310232117_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_ag_news_202310232117_en.md new file mode 100644 index 000000000000..24429c9ce65b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_ag_news_202310232117_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_ag_news_202310232117 RoBertaForSequenceClassification from DaymonQu +author: John Snow Labs +name: roberta_base_ag_news_202310232117 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ag_news_202310232117` is an English model originally trained by DaymonQu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_en_5.2.0_3.0_1701393764964.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ag_news_202310232117_en_5.2.0_3.0_1701393764964.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_202310232117","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ag_news_202310232117","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ag_news_202310232117| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.9 MB| + +## References + +https://huggingface.co/DaymonQu/roberta-base_ag_news_202310232117 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_agnews_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_agnews_en.md new file mode 100644 index 000000000000..59fba088110d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_agnews_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_agnews RoBertaForSequenceClassification from tamhuynh27 +author: John Snow Labs +name: roberta_base_agnews +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_agnews` is an English model originally trained by tamhuynh27. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_agnews_en_5.2.0_3.0_1701403813063.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_agnews_en_5.2.0_3.0_1701403813063.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_agnews","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_agnews","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_agnews| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/tamhuynh27/roberta-base-agnews \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_biomedical_clinical_spanish_finetuned_text_classification_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_biomedical_clinical_spanish_finetuned_text_classification_en.md new file mode 100644 index 000000000000..853d91ed6917 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_biomedical_clinical_spanish_finetuned_text_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_biomedical_clinical_spanish_finetuned_text_classification RoBertaForSequenceClassification from asdc +author: John Snow Labs +name: roberta_base_biomedical_clinical_spanish_finetuned_text_classification +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_biomedical_clinical_spanish_finetuned_text_classification` is an English model originally trained by asdc. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_biomedical_clinical_spanish_finetuned_text_classification_en_5.2.0_3.0_1701405998741.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_biomedical_clinical_spanish_finetuned_text_classification_en_5.2.0_3.0_1701405998741.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_biomedical_clinical_spanish_finetuned_text_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_biomedical_clinical_spanish_finetuned_text_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_biomedical_clinical_spanish_finetuned_text_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.6 MB| + +## References + +https://huggingface.co/asdc/roberta-base-biomedical-clinical-es-finetuned-text_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_clasificacion_german_texto_supervisado_victorayora_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_clasificacion_german_texto_supervisado_victorayora_en.md new file mode 100644 index 000000000000..992955895af1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_clasificacion_german_texto_supervisado_victorayora_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_clasificacion_german_texto_supervisado_victorayora RoBertaForSequenceClassification from VictorAyora +author: John Snow Labs +name: roberta_base_bne_clasificacion_german_texto_supervisado_victorayora +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_clasificacion_german_texto_supervisado_victorayora` is an English model originally trained by VictorAyora. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_clasificacion_german_texto_supervisado_victorayora_en_5.2.0_3.0_1701446986901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_clasificacion_german_texto_supervisado_victorayora_en_5.2.0_3.0_1701446986901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_clasificacion_german_texto_supervisado_victorayora","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_clasificacion_german_texto_supervisado_victorayora","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_clasificacion_german_texto_supervisado_victorayora| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/VictorAyora/roberta-base-bne-clasificacion-de-texto-supervisado \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_csalamea_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_csalamea_en.md new file mode 100644 index 000000000000..eac674f3c9b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_csalamea_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_csalamea RoBertaForSequenceClassification from csalamea +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_csalamea +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_csalamea` is an English model originally trained by csalamea. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_csalamea_en_5.2.0_3.0_1701406743541.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_csalamea_en_5.2.0_3.0_1701406743541.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_csalamea","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_csalamea","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_csalamea| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/csalamea/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_fjluque_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_fjluque_en.md new file mode 100644 index 000000000000..c1d67b27ddb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_fjluque_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_fjluque RoBertaForSequenceClassification from fjluque +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_fjluque +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_fjluque` is an English model originally trained by fjluque. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_fjluque_en_5.2.0_3.0_1701413689941.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_fjluque_en_5.2.0_3.0_1701413689941.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_fjluque","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_fjluque","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_fjluque| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.1 MB| + +## References + +https://huggingface.co/fjluque/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_hackertec_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_hackertec_en.md new file mode 100644 index 000000000000..41fbae0b14fc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_hackertec_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_hackertec RoBertaForSequenceClassification from hackertec +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_hackertec +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_hackertec` is an English model originally trained by hackertec. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_hackertec_en_5.2.0_3.0_1701474576968.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_hackertec_en_5.2.0_3.0_1701474576968.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_hackertec","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_hackertec","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
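The archive name in the Download link above appears to pack the card's metadata into one string: model name, language, Spark NLP version, Spark version, and a 13-digit number that looks like a Unix timestamp in milliseconds. The sketch below splits such a name back into its parts. The layout is inferred from the links in these model cards rather than from a documented contract, and `ModelArchive` and `parse_model_archive` are hypothetical helper names, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArchive(NamedTuple):
    """Fields recovered from a model archive file name (hypothetical helper)."""
    name: str
    language: str
    nlp_version: str
    spark_version: str
    uploaded_at: datetime


# Assumed layout: <name>_<lang>_<Spark NLP version>_<Spark version>_<timestamp>.zip
_ARCHIVE_RE = re.compile(
    r"(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_model_archive(url: str) -> ModelArchive:
    """Parse the last path segment of an https:// or s3:// model URL."""
    file_name = url.rsplit("/", 1)[-1]
    match = _ARCHIVE_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected archive name: {file_name!r}")
    # Assumption: the trailing number is Unix epoch time in milliseconds.
    uploaded_at = datetime.fromtimestamp(int(match["ts"]) / 1000, tz=timezone.utc)
    return ModelArchive(
        name=match["name"],
        language=match["lang"],
        nlp_version=match["nlp"],
        spark_version=match["spark"],
        uploaded_at=uploaded_at,
    )


if __name__ == "__main__":
    info = parse_model_archive(
        "s3://auxdata.johnsnowlabs.com/public/models/"
        "roberta_base_bne_finetuned_amazon_reviews_multi_hackertec"
        "_en_5.2.0_3.0_1701474576968.zip"
    )
    print(info.name, info.language, info.nlp_version, info.spark_version)
    print(info.uploaded_at.date())
```

Under these assumptions, the first two fields are the same two arguments the snippets above pass to `pretrained(...)`, which gives a quick way to check that a downloaded archive matches the card it came from.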
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_hackertec| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.9 MB| + +## References + +https://huggingface.co/hackertec/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz_en.md new file mode 100644 index 000000000000..f628d3eb1d10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz RoBertaForSequenceClassification from IsabellaKarabasz +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz` is an English model originally trained by IsabellaKarabasz.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz_en_5.2.0_3.0_1701401077596.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz_en_5.2.0_3.0_1701401077596.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_isabellakarabasz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|297.3 MB| + +## References + +https://huggingface.co/IsabellaKarabasz/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_lewtun_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_lewtun_en.md new file mode 100644 index 000000000000..62a57a115cea --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_lewtun_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_lewtun RoBertaForSequenceClassification from lewtun +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_lewtun +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_lewtun` is an English model originally trained by lewtun.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_lewtun_en_5.2.0_3.0_1701426600881.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_lewtun_en_5.2.0_3.0_1701426600881.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_lewtun","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_lewtun","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_lewtun| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.7 MB| + +## References + +https://huggingface.co/lewtun/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k_en.md new file mode 100644 index 000000000000..97a70a7a4bc7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k RoBertaForSequenceClassification from MyBad2K +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k` is an English model originally trained by MyBad2K.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k_en_5.2.0_3.0_1701389132693.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k_en_5.2.0_3.0_1701389132693.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_mybad2k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.9 MB| + +## References + +https://huggingface.co/MyBad2K/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_proggleb_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_proggleb_en.md new file mode 100644 index 000000000000..cd37c96d8dba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_proggleb_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_proggleb RoBertaForSequenceClassification from Proggleb +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_proggleb +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_proggleb` is an English model originally trained by Proggleb.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_proggleb_en_5.2.0_3.0_1701420655458.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_proggleb_en_5.2.0_3.0_1701420655458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_proggleb","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_proggleb","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_proggleb| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.5 MB| + +## References + +https://huggingface.co/Proggleb/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_taller_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_taller_en.md new file mode 100644 index 000000000000..f9afa42fc6ba --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_taller_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_taller RoBertaForSequenceClassification from hackertec +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_taller +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_taller` is an English model originally trained by hackertec.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_taller_en_5.2.0_3.0_1701420366037.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_taller_en_5.2.0_3.0_1701420366037.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_taller","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_taller","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_taller| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|431.9 MB| + +## References + +https://huggingface.co/hackertec/roberta-base-bne-finetuned-amazon_reviews_multi-taller \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano_en.md new file mode 100644 index 000000000000..c63be385057f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano RoBertaForSequenceClassification from ValenHumano +author: John Snow Labs +name: roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano` is an English model originally trained by ValenHumano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano_en_5.2.0_3.0_1701472018045.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano_en_5.2.0_3.0_1701472018045.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_amazon_reviews_multi_valenhumano| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/ValenHumano/roberta-base-bne-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais_en.md new file mode 100644 index 000000000000..4f1195189c10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais RoBertaForSequenceClassification from vg055 +author: John Snow Labs +name: roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais` is an English model originally trained by vg055.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais_en_5.2.0_3.0_1701443398214.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais_en_5.2.0_3.0_1701443398214.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_analisis_sentimiento_textos_turisticos_mx_pais| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.4 MB| + +## References + +https://huggingface.co/vg055/roberta-base-bne-finetuned-analisis-sentimiento-textos-turisticos-mx-pais \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_detests_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_detests_en.md new file mode 100644 index 000000000000..543d9abbbee6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_detests_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_detests RoBertaForSequenceClassification from Pablo94 +author: John Snow Labs +name: roberta_base_bne_finetuned_detests +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_detests` is an English model originally trained by Pablo94.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_detests_en_5.2.0_3.0_1701404135690.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_detests_en_5.2.0_3.0_1701404135690.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_detests","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_detests","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_detests| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.7 MB| + +## References + +https://huggingface.co/Pablo94/roberta-base-bne-finetuned-detests \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_personality_multi_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_personality_multi_en.md new file mode 100644 index 000000000000..fcfb04798276 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_bne_finetuned_personality_multi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_personality_multi RoBertaForSequenceClassification from titi7242229 +author: John Snow Labs +name: roberta_base_bne_finetuned_personality_multi +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_personality_multi` is an English model originally trained by titi7242229.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_en_5.2.0_3.0_1701406743450.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_en_5.2.0_3.0_1701406743450.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_personality_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.5 MB| + +## References + +https://huggingface.co/titi7242229/roberta-base-bne-finetuned_personality_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_finetuned_hate_speech_offensive_catalan_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_finetuned_hate_speech_offensive_catalan_en.md new file mode 100644 index 000000000000..945f21de64e9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_finetuned_hate_speech_offensive_catalan_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_catalan_finetuned_hate_speech_offensive_catalan RoBertaForSequenceClassification from JonatanGk +author: John Snow Labs +name: roberta_base_catalan_finetuned_hate_speech_offensive_catalan +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_catalan_finetuned_hate_speech_offensive_catalan` is an English model originally trained by JonatanGk.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_catalan_finetuned_hate_speech_offensive_catalan_en_5.2.0_3.0_1701400732762.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_catalan_finetuned_hate_speech_offensive_catalan_en_5.2.0_3.0_1701400732762.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_catalan_finetuned_hate_speech_offensive_catalan","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_catalan_finetuned_hate_speech_offensive_catalan","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_catalan_finetuned_hate_speech_offensive_catalan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.0 MB| + +## References + +https://huggingface.co/JonatanGk/roberta-base-ca-finetuned-hate-speech-offensive-catalan \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_v2_cased_wikicat_catalan_ca.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_v2_cased_wikicat_catalan_ca.md new file mode 100644 index 000000000000..6df55770d37c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_catalan_v2_cased_wikicat_catalan_ca.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Catalan, Valencian roberta_base_catalan_v2_cased_wikicat_catalan RoBertaForSequenceClassification from projecte-aina +author: John Snow Labs +name: roberta_base_catalan_v2_cased_wikicat_catalan +date: 2023-12-01 +tags: [roberta, ca, open_source, sequence_classification, onnx] +task: Text Classification +language: ca +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_catalan_v2_cased_wikicat_catalan` is a Catalan, Valencian model originally trained by projecte-aina.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_catalan_v2_cased_wikicat_catalan_ca_5.2.0_3.0_1701411384911.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_catalan_v2_cased_wikicat_catalan_ca_5.2.0_3.0_1701411384911.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_catalan_v2_cased_wikicat_catalan","ca")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_catalan_v2_cased_wikicat_catalan","ca") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_catalan_v2_cased_wikicat_catalan| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ca| +|Size:|459.6 MB| + +## References + +https://huggingface.co/projecte-aina/roberta-base-ca-v2-cased-wikicat-ca \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_cola_willheld_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_cola_willheld_en.md new file mode 100644 index 000000000000..ed62f9d45823 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_cola_willheld_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_cola_willheld RoBertaForSequenceClassification from WillHeld +author: John Snow Labs +name: roberta_base_cola_willheld +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_cola_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_cola_willheld_en_5.2.0_3.0_1701418768856.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_cola_willheld_en_5.2.0_3.0_1701418768856.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cola_willheld","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_cola_willheld","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_cola_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.1 MB| + +## References + +https://huggingface.co/WillHeld/roberta-base-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_comma_correction_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_comma_correction_classifier_en.md new file mode 100644 index 000000000000..dc48abeb359d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_comma_correction_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_comma_correction_classifier RoBertaForSequenceClassification from pavlichenko +author: John Snow Labs +name: roberta_base_comma_correction_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_comma_correction_classifier` is an English model originally trained by pavlichenko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_comma_correction_classifier_en_5.2.0_3.0_1701428126452.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_comma_correction_classifier_en_5.2.0_3.0_1701428126452.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_comma_correction_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_comma_correction_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_comma_correction_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.4 MB| + +## References + +https://huggingface.co/pavlichenko/roberta-base-comma-correction-classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_fever_evidence_related_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_fever_evidence_related_en.md new file mode 100644 index 000000000000..6252062dd3d8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_fever_evidence_related_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_fever_evidence_related RoBertaForSequenceClassification from mwong +author: John Snow Labs +name: roberta_base_fever_evidence_related +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_fever_evidence_related` is an English model originally trained by mwong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_fever_evidence_related_en_5.2.0_3.0_1701472734873.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_fever_evidence_related_en_5.2.0_3.0_1701472734873.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_fever_evidence_related","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_fever_evidence_related","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_fever_evidence_related| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|300.4 MB| + +## References + +https://huggingface.co/mwong/roberta-base-fever-evidence-related \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_cola_jxuhf_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_cola_jxuhf_en.md new file mode 100644 index 000000000000..d1b556d848b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_cola_jxuhf_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_cola_jxuhf RoBertaForSequenceClassification from jxuhf +author: John Snow Labs +name: roberta_base_finetuned_cola_jxuhf +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_cola_jxuhf` is an English model originally trained by jxuhf. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_cola_jxuhf_en_5.2.0_3.0_1701441077853.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_cola_jxuhf_en_5.2.0_3.0_1701441077853.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_cola_jxuhf","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_cola_jxuhf","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_cola_jxuhf| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|434.8 MB| + +## References + +https://huggingface.co/jxuhf/roberta-base-finetuned-cola \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_highlight_detection_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_highlight_detection_en.md new file mode 100644 index 000000000000..a263b2946302 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_highlight_detection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_highlight_detection RoBertaForSequenceClassification from Epidot +author: John Snow Labs +name: roberta_base_finetuned_highlight_detection +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_highlight_detection` is an English model originally trained by Epidot.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_highlight_detection_en_5.2.0_3.0_1701432425839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_highlight_detection_en_5.2.0_3.0_1701432425839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_highlight_detection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_highlight_detection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_highlight_detection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.9 MB| + +## References + +https://huggingface.co/Epidot/roberta-base-finetuned-highlight-detection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_jigsaw_toxic_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_jigsaw_toxic_en.md new file mode 100644 index 000000000000..72d45c6bbf2e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_jigsaw_toxic_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_jigsaw_toxic RoBertaForSequenceClassification from affahrizain +author: John Snow Labs +name: roberta_base_finetuned_jigsaw_toxic +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_jigsaw_toxic` is an English model originally trained by affahrizain. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_jigsaw_toxic_en_5.2.0_3.0_1701438436670.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_jigsaw_toxic_en_5.2.0_3.0_1701438436670.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_jigsaw_toxic","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_jigsaw_toxic","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_jigsaw_toxic| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.8 MB| + +## References + +https://huggingface.co/affahrizain/roberta-base-finetuned-jigsaw-toxic \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_restaurant_reviews_sentiment_regression_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_restaurant_reviews_sentiment_regression_en.md new file mode 100644 index 000000000000..842e1c6640b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_finetuned_restaurant_reviews_sentiment_regression_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_restaurant_reviews_sentiment_regression RoBertaForSequenceClassification from tillschwoerer +author: John Snow Labs +name: roberta_base_finetuned_restaurant_reviews_sentiment_regression +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_restaurant_reviews_sentiment_regression` is an English model originally trained by tillschwoerer.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_restaurant_reviews_sentiment_regression_en_5.2.0_3.0_1701440448981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_restaurant_reviews_sentiment_regression_en_5.2.0_3.0_1701440448981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_restaurant_reviews_sentiment_regression","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_restaurant_reviews_sentiment_regression","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_restaurant_reviews_sentiment_regression| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.0 MB| + +## References + +https://huggingface.co/tillschwoerer/roberta-base-finetuned-restaurant-reviews-sentiment-regression \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mnli_2e_5_42_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mnli_2e_5_42_en.md new file mode 100644 index 000000000000..9e07378579b3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mnli_2e_5_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_mnli_2e_5_42 RoBertaForSequenceClassification from TehranNLP-org +author: John Snow Labs +name: roberta_base_mnli_2e_5_42 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_mnli_2e_5_42` is an English model originally trained by TehranNLP-org. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_2e_5_42_en_5.2.0_3.0_1701442412885.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_2e_5_42_en_5.2.0_3.0_1701442412885.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli_2e_5_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli_2e_5_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_mnli_2e_5_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.5 MB| + +## References + +https://huggingface.co/TehranNLP-org/roberta-base-mnli-2e-5-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_motivational_interviewing_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_motivational_interviewing_en.md new file mode 100644 index 000000000000..aa51bfb94aa3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_motivational_interviewing_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_motivational_interviewing RoBertaForSequenceClassification from clulab +author: John Snow Labs +name: roberta_base_motivational_interviewing +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_motivational_interviewing` is an English model originally trained by clulab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_motivational_interviewing_en_5.2.0_3.0_1701413459661.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_motivational_interviewing_en_5.2.0_3.0_1701413459661.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_motivational_interviewing","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_motivational_interviewing","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_motivational_interviewing| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|416.9 MB| + +## References + +https://huggingface.co/clulab/roberta-base-motivational-interviewing \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mrpc_2e_5_42_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mrpc_2e_5_42_en.md new file mode 100644 index 000000000000..681242fcb14a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_mrpc_2e_5_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_mrpc_2e_5_42 RoBertaForSequenceClassification from TehranNLP-org +author: John Snow Labs +name: roberta_base_mrpc_2e_5_42 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_mrpc_2e_5_42` is an English model originally trained by TehranNLP-org. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_mrpc_2e_5_42_en_5.2.0_3.0_1701414468698.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_mrpc_2e_5_42_en_5.2.0_3.0_1701414468698.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mrpc_2e_5_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mrpc_2e_5_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_mrpc_2e_5_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| + +## References + +https://huggingface.co/TehranNLP-org/roberta-base-mrpc-2e-5-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_news_category_top10_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_news_category_top10_en.md new file mode 100644 index 000000000000..2aa280d7bc03 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_news_category_top10_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_news_category_top10 RoBertaForSequenceClassification from heegyu +author: John Snow Labs +name: roberta_base_news_category_top10 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_news_category_top10` is an English model originally trained by heegyu. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_news_category_top10_en_5.2.0_3.0_1701434649660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_news_category_top10_en_5.2.0_3.0_1701434649660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_news_category_top10","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_news_category_top10","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_news_category_top10| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.9 MB| + +## References + +https://huggingface.co/heegyu/roberta-base-news-category-top10 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qnli_willheld_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qnli_willheld_en.md new file mode 100644 index 000000000000..753fcb4a29db --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qnli_willheld_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_qnli_willheld RoBertaForSequenceClassification from WillHeld +author: John Snow Labs +name: roberta_base_qnli_willheld +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_qnli_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_qnli_willheld_en_5.2.0_3.0_1701409867119.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_qnli_willheld_en_5.2.0_3.0_1701409867119.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qnli_willheld","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qnli_willheld","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_qnli_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.6 MB| + +## References + +https://huggingface.co/WillHeld/roberta-base-qnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qqp_willheld_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qqp_willheld_en.md new file mode 100644 index 000000000000..d5afc777ca93 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_qqp_willheld_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_qqp_willheld RoBertaForSequenceClassification from WillHeld +author: John Snow Labs +name: roberta_base_qqp_willheld +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_qqp_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_qqp_willheld_en_5.2.0_3.0_1701421534397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_qqp_willheld_en_5.2.0_3.0_1701421534397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qqp_willheld","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qqp_willheld","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_qqp_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.4 MB| + +## References + +https://huggingface.co/WillHeld/roberta-base-qqp \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_spanish_wikicat_spanish_es.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_spanish_wikicat_spanish_es.md new file mode 100644 index 000000000000..69fcf8d62190 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_spanish_wikicat_spanish_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_base_spanish_wikicat_spanish RoBertaForSequenceClassification from PlanTL-GOB-ES +author: John Snow Labs +name: roberta_base_spanish_wikicat_spanish +date: 2023-12-01 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_spanish_wikicat_spanish` is a Castilian, Spanish model originally trained by PlanTL-GOB-ES. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_spanish_wikicat_spanish_es_5.2.0_3.0_1701399264737.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_spanish_wikicat_spanish_es_5.2.0_3.0_1701399264737.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_spanish_wikicat_spanish","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_spanish_wikicat_spanish","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_spanish_wikicat_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|455.4 MB| + +## References + +https://huggingface.co/PlanTL-GOB-ES/roberta-base-es-wikicat-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_sts_b_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_sts_b_en.md new file mode 100644 index 000000000000..10e351607798 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_sts_b_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_sts_b RoBertaForSequenceClassification from textattack +author: John Snow Labs +name: roberta_base_sts_b +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_sts_b` is an English model originally trained by textattack. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_sts_b_en_5.2.0_3.0_1701404135673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_sts_b_en_5.2.0_3.0_1701404135673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sts_b","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sts_b","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_sts_b| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.5 MB| + +## References + +https://huggingface.co/textattack/roberta-base-STS-B \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_stsb_jeremiahz_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_stsb_jeremiahz_en.md new file mode 100644 index 000000000000..85c0587e029f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_stsb_jeremiahz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_stsb_jeremiahz RoBertaForSequenceClassification from JeremiahZ +author: John Snow Labs +name: roberta_base_stsb_jeremiahz +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_stsb_jeremiahz` is an English model originally trained by JeremiahZ. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_stsb_jeremiahz_en_5.2.0_3.0_1701398502041.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_stsb_jeremiahz_en_5.2.0_3.0_1701398502041.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stsb_jeremiahz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_stsb_jeremiahz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_stsb_jeremiahz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.5 MB| + +## References + +https://huggingface.co/JeremiahZ/roberta-base-stsb \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_topic_single_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_topic_single_en.md new file mode 100644 index 000000000000..48c7e03fdd17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_topic_single_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_topic_single RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_topic_single +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_topic_single` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_topic_single_en_5.2.0_3.0_1701433586288.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_topic_single_en_5.2.0_3.0_1701433586288.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_topic_single","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_topic_single","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_topic_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.5 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-topic-single \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_multi_all_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_multi_all_en.md new file mode 100644 index 000000000000..d19f54119a97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_multi_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_tweet_topic_multi_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_tweet_topic_multi_all +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_tweet_topic_multi_all` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_topic_multi_all_en_5.2.0_3.0_1701417636395.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_topic_multi_all_en_5.2.0_3.0_1701417636395.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_topic_multi_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_topic_multi_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_tweet_topic_multi_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.4 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-tweet-topic-multi-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_single_all_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_single_all_en.md new file mode 100644 index 000000000000..4eb3ec286e4c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_tweet_topic_single_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_tweet_topic_single_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_tweet_topic_single_all +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_tweet_topic_single_all` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_topic_single_all_en_5.2.0_3.0_1701472919128.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_topic_single_all_en_5.2.0_3.0_1701472919128.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_topic_single_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_topic_single_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_tweet_topic_single_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.5 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-tweet-topic-single-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_twitter_pop_binary_sentiment_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_twitter_pop_binary_sentiment_en.md new file mode 100644 index 000000000000..de82b198c7b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_twitter_pop_binary_sentiment_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_twitter_pop_binary_sentiment RoBertaForSequenceClassification from guyhadad01 +author: John Snow Labs +name: roberta_base_twitter_pop_binary_sentiment +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_twitter_pop_binary_sentiment` is an English model originally trained by guyhadad01. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_twitter_pop_binary_sentiment_en_5.2.0_3.0_1701470589125.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_twitter_pop_binary_sentiment_en_5.2.0_3.0_1701470589125.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_twitter_pop_binary_sentiment","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_twitter_pop_binary_sentiment","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_twitter_pop_binary_sentiment| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/guyhadad01/Roberta-base-twitter-pop-binary-sentiment \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_vira_dialog_acts_live_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_vira_dialog_acts_live_en.md new file mode 100644 index 000000000000..82be1fbb36a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_vira_dialog_acts_live_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_vira_dialog_acts_live RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: roberta_base_vira_dialog_acts_live +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_vira_dialog_acts_live` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_vira_dialog_acts_live_en_5.2.0_3.0_1701419032375.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_vira_dialog_acts_live_en_5.2.0_3.0_1701419032375.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_vira_dialog_acts_live","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_vira_dialog_acts_live","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_vira_dialog_acts_live| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|417.0 MB| + +## References + +https://huggingface.co/ibm/roberta-base-vira-dialog-acts-live \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_base_yelp_full_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_yelp_full_en.md new file mode 100644 index 000000000000..4f9d12d22b46 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_base_yelp_full_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_yelp_full RoBertaForSequenceClassification from jjezabek +author: John Snow Labs +name: roberta_base_yelp_full +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_yelp_full` is an English model originally trained by jjezabek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_yelp_full_en_5.2.0_3.0_1701438935852.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_yelp_full_en_5.2.0_3.0_1701438935852.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_yelp_full","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_yelp_full","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_yelp_full| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|462.6 MB| + +## References + +https://huggingface.co/jjezabek/roberta-base-yelp_full \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_crypto_profiling_task1_2_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_crypto_profiling_task1_2_en.md new file mode 100644 index 000000000000..70a69a9a9437 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_crypto_profiling_task1_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_crypto_profiling_task1_2 RoBertaForSequenceClassification from pabagcha +author: John Snow Labs +name: roberta_crypto_profiling_task1_2 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_crypto_profiling_task1_2` is an English model originally trained by pabagcha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_crypto_profiling_task1_2_en_5.2.0_3.0_1701469545332.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_crypto_profiling_task1_2_en_5.2.0_3.0_1701469545332.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_crypto_profiling_task1_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_crypto_profiling_task1_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_crypto_profiling_task1_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/pabagcha/roberta_crypto_profiling_task1_2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_emo_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_emo_en.md new file mode 100644 index 000000000000..1ab5e79a8f71 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_emo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_emo RoBertaForSequenceClassification from gustavecortal +author: John Snow Labs +name: roberta_emo +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_emo` is an English model originally trained by gustavecortal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_emo_en_5.2.0_3.0_1701469815331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_emo_en_5.2.0_3.0_1701469815331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_emo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_emo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_emo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/gustavecortal/roberta_emo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_feature_assamese_text_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_feature_assamese_text_en.md new file mode 100644 index 000000000000..a2fdb3282478 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_feature_assamese_text_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_feature_assamese_text RoBertaForSequenceClassification from SajjadAyoubi +author: John Snow Labs +name: roberta_feature_assamese_text +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_feature_assamese_text` is an English model originally trained by SajjadAyoubi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_feature_assamese_text_en_5.2.0_3.0_1701414936981.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_feature_assamese_text_en_5.2.0_3.0_1701414936981.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_feature_assamese_text","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_feature_assamese_text","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_feature_assamese_text| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|444.2 MB| + +## References + +https://huggingface.co/SajjadAyoubi/roberta-feature-as-text \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_cpv_spanish_oeg_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_cpv_spanish_oeg_en.md new file mode 100644 index 000000000000..49310605ee95 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_cpv_spanish_oeg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_cpv_spanish_oeg RoBertaForSequenceClassification from oeg +author: John Snow Labs +name: roberta_finetuned_cpv_spanish_oeg +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_cpv_spanish_oeg` is an English model originally trained by oeg. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_cpv_spanish_oeg_en_5.2.0_3.0_1701432926763.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_cpv_spanish_oeg_en_5.2.0_3.0_1701432926763.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_cpv_spanish_oeg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_cpv_spanish_oeg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_cpv_spanish_oeg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.6 MB| + +## References + +https://huggingface.co/oeg/roberta-finetuned-CPV_Spanish \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_snips_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_snips_en.md new file mode 100644 index 000000000000..fbad1ce127f7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_snips_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_snips RoBertaForSequenceClassification from benayas +author: John Snow Labs +name: roberta_finetuned_snips +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_snips` is an English model originally trained by benayas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_snips_en_5.2.0_3.0_1701435127510.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_snips_en_5.2.0_3.0_1701435127510.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_snips","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_snips","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_snips| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.2 MB| + +## References + +https://huggingface.co/benayas/roberta-finetuned-snips \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_en.md new file mode 100644 index 000000000000..3ddb1449bc57 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_webclassification RoBertaForSequenceClassification from mnavas +author: John Snow Labs +name: roberta_finetuned_webclassification +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_webclassification` is an English model originally trained by mnavas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_en_5.2.0_3.0_1701390662539.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_en_5.2.0_3.0_1701390662539.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_webclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|850.9 MB| + +## References + +https://huggingface.co/mnavas/roberta-finetuned-WebClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_v2_smalllinguaenv2_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_v2_smalllinguaenv2_en.md new file mode 100644 index 000000000000..ecc459b605b0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_finetuned_webclassification_v2_smalllinguaenv2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_webclassification_v2_smalllinguaenv2 RoBertaForSequenceClassification from mnavas +author: John Snow Labs +name: roberta_finetuned_webclassification_v2_smalllinguaenv2 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_webclassification_v2_smalllinguaenv2` is an English model originally trained by mnavas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_v2_smalllinguaenv2_en_5.2.0_3.0_1701430034216.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_v2_smalllinguaenv2_en_5.2.0_3.0_1701430034216.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification_v2_smalllinguaenv2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification_v2_smalllinguaenv2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_webclassification_v2_smalllinguaenv2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|836.5 MB| + +## References + +https://huggingface.co/mnavas/roberta-finetuned-WebClassification-v2-smalllinguaENv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_for_eyewitness_confidence_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_for_eyewitness_confidence_en.md new file mode 100644 index 000000000000..d77597c4b826 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_for_eyewitness_confidence_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_for_eyewitness_confidence RoBertaForSequenceClassification from psheaton +author: John Snow Labs +name: roberta_for_eyewitness_confidence +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_for_eyewitness_confidence` is an English model originally trained by psheaton. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_for_eyewitness_confidence_en_5.2.0_3.0_1701402568956.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_for_eyewitness_confidence_en_5.2.0_3.0_1701402568956.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_for_eyewitness_confidence","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_for_eyewitness_confidence","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_for_eyewitness_confidence| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|418.0 MB| + +## References + +https://huggingface.co/psheaton/RoBERTa_for_eyewitness_confidence \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_goemotions_6_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_goemotions_6_en.md new file mode 100644 index 000000000000..360f48ec37c2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_goemotions_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_goemotions_6 RoBertaForSequenceClassification from Nakul24 +author: John Snow Labs +name: roberta_goemotions_6 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_goemotions_6` is an English model originally trained by Nakul24. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_goemotions_6_en_5.2.0_3.0_1701470271953.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_goemotions_6_en_5.2.0_3.0_1701470271953.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_goemotions_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_goemotions_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_goemotions_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.0 MB| + +## References + +https://huggingface.co/Nakul24/RoBERTa-Goemotions-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_2_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_2_en.md new file mode 100644 index 000000000000..2ad6f7725bb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_2 RoBertaForSequenceClassification from hagara +author: John Snow Labs +name: roberta_large_2 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_2` is an English model originally trained by hagara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_2_en_5.2.0_3.0_1701407063714.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_2_en_5.2.0_3.0_1701407063714.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/hagara/roberta-large-2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_bne_finetuned_go_emotions_spanish_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_bne_finetuned_go_emotions_spanish_en.md new file mode 100644 index 000000000000..7cce47690369 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_bne_finetuned_go_emotions_spanish_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_bne_finetuned_go_emotions_spanish RoBertaForSequenceClassification from mrm8488 +author: John Snow Labs +name: roberta_large_bne_finetuned_go_emotions_spanish +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_bne_finetuned_go_emotions_spanish` is an English model originally trained by mrm8488. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_bne_finetuned_go_emotions_spanish_en_5.2.0_3.0_1701440822559.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_bne_finetuned_go_emotions_spanish_en_5.2.0_3.0_1701440822559.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bne_finetuned_go_emotions_spanish","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bne_finetuned_go_emotions_spanish","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_bne_finetuned_go_emotions_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/mrm8488/roberta-large-bne-finetuned-go_emotions-es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_boolq_finetuned_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_boolq_finetuned_en.md new file mode 100644 index 000000000000..5100ede1fb6e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_boolq_finetuned_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_boolq_finetuned RoBertaForSequenceClassification from apugachev +author: John Snow Labs +name: roberta_large_boolq_finetuned +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_boolq_finetuned` is an English model originally trained by apugachev. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_boolq_finetuned_en_5.2.0_3.0_1701470492122.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_boolq_finetuned_en_5.2.0_3.0_1701470492122.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_boolq_finetuned","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_boolq_finetuned","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_boolq_finetuned| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/apugachev/roberta-large-boolq-finetuned \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_dominant_culture_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_dominant_culture_en.md new file mode 100644 index 000000000000..7ac0a7eb25b7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_dominant_culture_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_dominant_culture RoBertaForSequenceClassification from CultureBERT +author: John Snow Labs +name: roberta_large_dominant_culture +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_dominant_culture` is an English model originally trained by CultureBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_dominant_culture_en_5.2.0_3.0_1701402367828.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_dominant_culture_en_5.2.0_3.0_1701402367828.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_dominant_culture","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_dominant_culture","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_dominant_culture| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/CultureBERT/roberta-large-dominant-culture \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_finetuned_code_mixed_ds_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_finetuned_code_mixed_ds_en.md new file mode 100644 index 000000000000..b9c89d1a8c39 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_finetuned_code_mixed_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_finetuned_code_mixed_ds RoBertaForSequenceClassification from IIIT-L +author: John Snow Labs +name: roberta_large_finetuned_code_mixed_ds +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_finetuned_code_mixed_ds` is an English model originally trained by IIIT-L. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_code_mixed_ds_en_5.2.0_3.0_1701472977507.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_code_mixed_ds_en_5.2.0_3.0_1701472977507.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_code_mixed_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_code_mixed_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_finetuned_code_mixed_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/IIIT-L/roberta-large-finetuned-code-mixed-DS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_go_emotions_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_go_emotions_en.md new file mode 100644 index 000000000000..aacd778900fb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_go_emotions_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_go_emotions RoBertaForSequenceClassification from tasinhoque +author: John Snow Labs +name: roberta_large_go_emotions +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_go_emotions` is an English model originally trained by tasinhoque. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_go_emotions_en_5.2.0_3.0_1701408713681.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_go_emotions_en_5.2.0_3.0_1701408713681.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_go_emotions","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_go_emotions","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_go_emotions| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tasinhoque/roberta-large-go-emotions \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_hate_offensive_normal_speech_lr_2e_05_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_hate_offensive_normal_speech_lr_2e_05_en.md new file mode 100644 index 000000000000..1a22fcbe32a2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_hate_offensive_normal_speech_lr_2e_05_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_hate_offensive_normal_speech_lr_2e_05 RoBertaForSequenceClassification from DrishtiSharma +author: John Snow Labs +name: roberta_large_hate_offensive_normal_speech_lr_2e_05 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_hate_offensive_normal_speech_lr_2e_05` is an English model originally trained by DrishtiSharma. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_hate_offensive_normal_speech_lr_2e_05_en_5.2.0_3.0_1701475006129.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_hate_offensive_normal_speech_lr_2e_05_en_5.2.0_3.0_1701475006129.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_hate_offensive_normal_speech_lr_2e_05","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_hate_offensive_normal_speech_lr_2e_05","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_hate_offensive_normal_speech_lr_2e_05| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/DrishtiSharma/roberta-large-hate-offensive-normal-speech-lr-2e-05 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_qa_suffix_defteval_t6_st1_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_qa_suffix_defteval_t6_st1_en.md new file mode 100644 index 000000000000..2772d965a47e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_qa_suffix_defteval_t6_st1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_qa_suffix_defteval_t6_st1 RoBertaForSequenceClassification from tobiaslee +author: John Snow Labs +name: roberta_large_qa_suffix_defteval_t6_st1 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_qa_suffix_defteval_t6_st1` is an English model originally trained by tobiaslee. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_qa_suffix_defteval_t6_st1_en_5.2.0_3.0_1701470853958.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_qa_suffix_defteval_t6_st1_en_5.2.0_3.0_1701470853958.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qa_suffix_defteval_t6_st1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qa_suffix_defteval_t6_st1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_qa_suffix_defteval_t6_st1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tobiaslee/roberta-large-qa-suffix-defteval-t6-st1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_large_vira_intents_mod_gpt4_data_aug_ep5_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_vira_intents_mod_gpt4_data_aug_ep5_en.md new file mode 100644 index 000000000000..3603293e0348 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_large_vira_intents_mod_gpt4_data_aug_ep5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_vira_intents_mod_gpt4_data_aug_ep5 RoBertaForSequenceClassification from vira-chatbot +author: John Snow Labs +name: roberta_large_vira_intents_mod_gpt4_data_aug_ep5 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_vira_intents_mod_gpt4_data_aug_ep5` is an English model originally trained by vira-chatbot. 
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_gpt4_data_aug_ep5_en_5.2.0_3.0_1701449193325.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_mod_gpt4_data_aug_ep5_en_5.2.0_3.0_1701449193325.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod_gpt4_data_aug_ep5","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_mod_gpt4_data_aug_ep5","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_large_vira_intents_mod_gpt4_data_aug_ep5|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/vira-chatbot/roberta-large-vira-intents-mod-gpt4-data-aug-ep5
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_mental_health_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_mental_health_en.md
new file mode 100644
index 000000000000..6e955f4bd3a0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_mental_health_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_mental_health RoBertaForSequenceClassification from fzetter
+author: John Snow Labs
+name: roberta_mental_health
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_mental_health` is an English model originally trained by fzetter.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_mental_health_en_5.2.0_3.0_1701451284388.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_mental_health_en_5.2.0_3.0_1701451284388.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mental_health","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_mental_health","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_mental_health|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|459.9 MB|
+
+## References
+
+https://huggingface.co/fzetter/roberta-mental-health
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_movie_review_capstone_2_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_movie_review_capstone_2_en.md
new file mode 100644
index 000000000000..3bcdc7c38c6b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_movie_review_capstone_2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_movie_review_capstone_2 RoBertaForSequenceClassification from gyesibiney
+author: John Snow Labs
+name: roberta_movie_review_capstone_2
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_movie_review_capstone_2` is an English model originally trained by gyesibiney.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_movie_review_capstone_2_en_5.2.0_3.0_1701440449001.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_movie_review_capstone_2_en_5.2.0_3.0_1701440449001.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_movie_review_capstone_2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_movie_review_capstone_2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_movie_review_capstone_2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|454.5 MB|
+
+## References
+
+https://huggingface.co/gyesibiney/Roberta-movie-review-capstone_2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_news_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_news_classifier_en.md
new file mode 100644
index 000000000000..bb041cd1b971
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_news_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_news_classifier RoBertaForSequenceClassification from russellc
+author: John Snow Labs
+name: roberta_news_classifier
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_news_classifier` is an English model originally trained by russellc.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_news_classifier_en_5.2.0_3.0_1701470518313.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_news_classifier_en_5.2.0_3.0_1701470518313.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_news_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_news_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_news_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|463.7 MB|
+
+## References
+
+https://huggingface.co/russellc/roberta-news-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_nrc_fear_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_nrc_fear_en.md
new file mode 100644
index 000000000000..c28cf0d6a06b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_nrc_fear_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_nrc_fear RoBertaForSequenceClassification from neal49
+author: John Snow Labs
+name: roberta_nrc_fear
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_nrc_fear` is an English model originally trained by neal49.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_nrc_fear_en_5.2.0_3.0_1701428126444.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_nrc_fear_en_5.2.0_3.0_1701428126444.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nrc_fear","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nrc_fear","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_nrc_fear|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|431.8 MB|
+
+## References
+
+https://huggingface.co/neal49/roberta-nrc-fear
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_persuade_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_persuade_en.md
new file mode 100644
index 000000000000..eb84191c7027
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_persuade_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_persuade RoBertaForSequenceClassification from paragon-analytics
+author: John Snow Labs
+name: roberta_persuade
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_persuade` is an English model originally trained by paragon-analytics.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_persuade_en_5.2.0_3.0_1701411036976.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_persuade_en_5.2.0_3.0_1701411036976.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_persuade","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_persuade","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_persuade|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|444.5 MB|
+
+## References
+
+https://huggingface.co/paragon-analytics/roberta_persuade
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_psychotherapy_eval_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_psychotherapy_eval_en.md
new file mode 100644
index 000000000000..ae9d92a26519
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_psychotherapy_eval_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_psychotherapy_eval RoBertaForSequenceClassification from margotwagner
+author: John Snow Labs
+name: roberta_psychotherapy_eval
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_psychotherapy_eval` is an English model originally trained by margotwagner.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_psychotherapy_eval_en_5.2.0_3.0_1701437863930.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_psychotherapy_eval_en_5.2.0_3.0_1701437863930.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_psychotherapy_eval","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_psychotherapy_eval","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_psychotherapy_eval|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|444.1 MB|
+
+## References
+
+https://huggingface.co/margotwagner/roberta-psychotherapy-eval
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_scarcasm_discriminator_xsy_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_scarcasm_discriminator_xsy_en.md
new file mode 100644
index 000000000000..1d799dd6c95c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_scarcasm_discriminator_xsy_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_scarcasm_discriminator_xsy RoBertaForSequenceClassification from XSY
+author: John Snow Labs
+name: roberta_scarcasm_discriminator_xsy
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_scarcasm_discriminator_xsy` is an English model originally trained by XSY.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_scarcasm_discriminator_xsy_en_5.2.0_3.0_1701470421172.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_scarcasm_discriminator_xsy_en_5.2.0_3.0_1701470421172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_scarcasm_discriminator_xsy","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_scarcasm_discriminator_xsy","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_scarcasm_discriminator_xsy|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|451.0 MB|
+
+## References
+
+https://huggingface.co/XSY/roberta-scarcasm-discriminator
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_aliyyah_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_aliyyah_en.md
new file mode 100644
index 000000000000..365e752227a8
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_aliyyah_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_sentiment_classifier_aliyyah RoBertaForSequenceClassification from Aliyyah
+author: John Snow Labs
+name: roberta_sentiment_classifier_aliyyah
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiment_classifier_aliyyah` is an English model originally trained by Aliyyah.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_aliyyah_en_5.2.0_3.0_1701407346055.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_aliyyah_en_5.2.0_3.0_1701407346055.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_aliyyah","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_aliyyah","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_sentiment_classifier_aliyyah|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|430.2 MB|
+
+## References
+
+https://huggingface.co/Aliyyah/Roberta-Sentiment-Classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_reginandcrabbe_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_reginandcrabbe_en.md
new file mode 100644
index 000000000000..214fce58fa6b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiment_classifier_reginandcrabbe_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_sentiment_classifier_reginandcrabbe RoBertaForSequenceClassification from reginandcrabbe
+author: John Snow Labs
+name: roberta_sentiment_classifier_reginandcrabbe
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiment_classifier_reginandcrabbe` is an English model originally trained by reginandcrabbe.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_reginandcrabbe_en_5.2.0_3.0_1701395194221.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiment_classifier_reginandcrabbe_en_5.2.0_3.0_1701395194221.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_reginandcrabbe","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_classifier_reginandcrabbe","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_sentiment_classifier_reginandcrabbe|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|430.2 MB|
+
+## References
+
+https://huggingface.co/reginandcrabbe/Roberta-Sentiment-Classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiments_spanish_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiments_spanish_en.md
new file mode 100644
index 000000000000..00737a6bd50f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_sentiments_spanish_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_sentiments_spanish RoBertaForSequenceClassification from Manauu17
+author: John Snow Labs
+name: roberta_sentiments_spanish
+date: 2023-12-01
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiments_spanish` is an English model originally trained by Manauu17.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiments_spanish_en_5.2.0_3.0_1701431431847.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiments_spanish_en_5.2.0_3.0_1701431431847.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiments_spanish","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiments_spanish","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_sentiments_spanish| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/Manauu17/roberta_sentiments_es \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_temp_classifier_bootstrapped_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_temp_classifier_bootstrapped_en.md new file mode 100644 index 000000000000..6b766c8b13d5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_temp_classifier_bootstrapped_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_temp_classifier_bootstrapped RoBertaForSequenceClassification from research-dump +author: John Snow Labs +name: roberta_temp_classifier_bootstrapped +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_temp_classifier_bootstrapped` is an English model originally trained by research-dump. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_temp_classifier_bootstrapped_en_5.2.0_3.0_1701470632393.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_temp_classifier_bootstrapped_en_5.2.0_3.0_1701470632393.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_temp_classifier_bootstrapped","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_temp_classifier_bootstrapped","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_temp_classifier_bootstrapped| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.0 MB| + +## References + +https://huggingface.co/research-dump/roberta_temp_classifier_bootstrapped \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_twitter_sentiment_extraction_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_twitter_sentiment_extraction_en.md new file mode 100644 index 000000000000..4a3502d25a53 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_twitter_sentiment_extraction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_twitter_sentiment_extraction RoBertaForSequenceClassification from cruiser +author: John Snow Labs +name: roberta_twitter_sentiment_extraction +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_twitter_sentiment_extraction` is an English model originally trained by cruiser. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_twitter_sentiment_extraction_en_5.2.0_3.0_1701473340012.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_twitter_sentiment_extraction_en_5.2.0_3.0_1701473340012.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_twitter_sentiment_extraction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_twitter_sentiment_extraction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_twitter_sentiment_extraction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/cruiser/roberta-twitter-sentiment-extraction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-roberta_wiki_detector_en.md b/docs/_posts/ahmedlone127/2023-12-01-roberta_wiki_detector_en.md new file mode 100644 index 000000000000..a0cec11c524b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-roberta_wiki_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_wiki_detector RoBertaForSequenceClassification from andreas122001 +author: John Snow Labs +name: roberta_wiki_detector +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_wiki_detector` is an English model originally trained by andreas122001. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_wiki_detector_en_5.2.0_3.0_1701390700769.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_wiki_detector_en_5.2.0_3.0_1701390700769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_wiki_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_wiki_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_wiki_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.0 MB| + +## References + +https://huggingface.co/andreas122001/roberta-wiki-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-robertuito_base_uncased_emotion_en.md b/docs/_posts/ahmedlone127/2023-12-01-robertuito_base_uncased_emotion_en.md new file mode 100644 index 000000000000..fb6cc8ec82c7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-robertuito_base_uncased_emotion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English robertuito_base_uncased_emotion RoBertaForSequenceClassification from pysentimiento +author: John Snow Labs +name: robertuito_base_uncased_emotion +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robertuito_base_uncased_emotion` is an English model originally trained by pysentimiento. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robertuito_base_uncased_emotion_en_5.2.0_3.0_1701406624841.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robertuito_base_uncased_emotion_en_5.2.0_3.0_1701406624841.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_base_uncased_emotion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_base_uncased_emotion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robertuito_base_uncased_emotion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.4 MB| + +## References + +https://huggingface.co/pysentimiento/robertuito-base-uncased-emotion \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentence_sentiments_analysis_roberta_uholodala_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentence_sentiments_analysis_roberta_uholodala_en.md new file mode 100644 index 000000000000..56736a024cf1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentence_sentiments_analysis_roberta_uholodala_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentence_sentiments_analysis_roberta_uholodala RoBertaForSequenceClassification from UholoDala +author: John Snow Labs +name: sentence_sentiments_analysis_roberta_uholodala +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentence_sentiments_analysis_roberta_uholodala` is an English model originally trained by UholoDala.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentence_sentiments_analysis_roberta_uholodala_en_5.2.0_3.0_1701470068813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentence_sentiments_analysis_roberta_uholodala_en_5.2.0_3.0_1701470068813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentence_sentiments_analysis_roberta_uholodala","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentence_sentiments_analysis_roberta_uholodala","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentence_sentiments_analysis_roberta_uholodala| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|461.1 MB| + +## References + +https://huggingface.co/UholoDala/sentence_sentiments_analysis_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_analysis_sudhanvasp_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_analysis_sudhanvasp_en.md new file mode 100644 index 000000000000..a4179d34d69f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_analysis_sudhanvasp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_analysis_sudhanvasp RoBertaForSequenceClassification from sudhanvasp +author: John Snow Labs +name: sentiment_analysis_sudhanvasp +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_analysis_sudhanvasp` is an English model originally trained by sudhanvasp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_analysis_sudhanvasp_en_5.2.0_3.0_1701390908918.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_analysis_sudhanvasp_en_5.2.0_3.0_1701390908918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_sudhanvasp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_analysis_sudhanvasp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_analysis_sudhanvasp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/sudhanvasp/Sentiment-Analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_detector_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_detector_en.md new file mode 100644 index 000000000000..6cce687c537a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_detector_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_detector RoBertaForSequenceClassification from ishaansharma +author: John Snow Labs +name: sentiment_detector +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_detector` is an English model originally trained by ishaansharma. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_detector_en_5.2.0_3.0_1701474238540.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_detector_en_5.2.0_3.0_1701474238540.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_detector","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_detector","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_detector| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/ishaansharma/sentiment-detector \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_metaverse_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_metaverse_en.md new file mode 100644 index 000000000000..7e3d10f69f18 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_metaverse_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_metaverse RoBertaForSequenceClassification from agnesemi +author: John Snow Labs +name: sentiment_metaverse +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_metaverse` is an English model originally trained by agnesemi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_metaverse_en_5.2.0_3.0_1701473347957.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_metaverse_en_5.2.0_3.0_1701473347957.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_metaverse","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_metaverse","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_metaverse| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/agnesemi/sentiment-metaverse \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m_en.md new file mode 100644 index 000000000000..e9e1dfc5868f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m` is an English model originally trained by tweettemposhift.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701422569491.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701422569491.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_random2_seed0_twitter_roberta_base_2019_90m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random2_seed0-twitter-roberta-base-2019-90m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m_en.md new file mode 100644 index 000000000000..614327687854 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m` is an English model originally trained by tweettemposhift.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701431071551.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701431071551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_random3_seed2_twitter_roberta_base_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random3_seed2-twitter-roberta-base-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020_en.md new file mode 100644 index 000000000000..dcee5f5e55da --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020 RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020` is an English model originally trained by tweettemposhift.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701470938189.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701470938189.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_random3_seed2_twitter_roberta_base_dec2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random3_seed2-twitter-roberta-base-dec2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentimentalroberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentimentalroberta_en.md new file mode 100644 index 000000000000..846823ff111d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentimentalroberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentimentalroberta RoBertaForSequenceClassification from kojoboyoo +author: John Snow Labs +name: sentimentalroberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentimentalroberta` is an English model originally trained by kojoboyoo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentimentalroberta_en_5.2.0_3.0_1701402367154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentimentalroberta_en_5.2.0_3.0_1701402367154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentimentalroberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentimentalroberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentimentalroberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.3 MB| + +## References + +https://huggingface.co/kojoboyoo/sentimentalroberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-sentimentroberta_babe_2epochs_en.md b/docs/_posts/ahmedlone127/2023-12-01-sentimentroberta_babe_2epochs_en.md new file mode 100644 index 000000000000..a311e8c3e31a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-sentimentroberta_babe_2epochs_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentimentroberta_babe_2epochs RoBertaForSequenceClassification from jordankrishnayah +author: John Snow Labs +name: sentimentroberta_babe_2epochs +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentimentroberta_babe_2epochs` is an English model originally trained by jordankrishnayah. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentimentroberta_babe_2epochs_en_5.2.0_3.0_1701391336331.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentimentroberta_babe_2epochs_en_5.2.0_3.0_1701391336331.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentimentroberta_babe_2epochs","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentimentroberta_babe_2epochs","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentimentroberta_babe_2epochs| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/jordankrishnayah/sentimentROBERTA-BABE-2epochs \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-subject_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-subject_classifier_en.md new file mode 100644 index 000000000000..2265a46a152d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-subject_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English subject_classifier RoBertaForSequenceClassification from Jackett +author: John Snow Labs +name: subject_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `subject_classifier` is an English model originally trained by Jackett. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/subject_classifier_en_5.2.0_3.0_1701426046149.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/subject_classifier_en_5.2.0_3.0_1701426046149.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("subject_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("subject_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|subject_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.4 MB| + +## References + +https://huggingface.co/Jackett/subject_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-summary_roberta_content_en.md b/docs/_posts/ahmedlone127/2023-12-01-summary_roberta_content_en.md new file mode 100644 index 000000000000..1bf9e3cf9a8c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-summary_roberta_content_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English summary_roberta_content RoBertaForSequenceClassification from tiedaar +author: John Snow Labs +name: summary_roberta_content +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `summary_roberta_content` is an English model originally trained by tiedaar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/summary_roberta_content_en_5.2.0_3.0_1701471623076.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/summary_roberta_content_en_5.2.0_3.0_1701471623076.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("summary_roberta_content","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("summary_roberta_content","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|summary_roberta_content| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.8 MB| + +## References + +https://huggingface.co/tiedaar/summary-roberta-content \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-tagger_en.md b/docs/_posts/ahmedlone127/2023-12-01-tagger_en.md new file mode 100644 index 000000000000..e931529be87b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-tagger_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tagger RoBertaForSequenceClassification from yo +author: John Snow Labs +name: tagger +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tagger` is an English model originally trained by yo. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tagger_en_5.2.0_3.0_1701428126185.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tagger_en_5.2.0_3.0_1701428126185.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tagger","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tagger","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tagger| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/yo/tagger \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-test_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-test_classifier_en.md new file mode 100644 index 000000000000..5a3a37241a9d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-test_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English test_classifier RoBertaForSequenceClassification from michellejieli +author: John Snow Labs +name: test_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `test_classifier` is an English model originally trained by michellejieli. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/test_classifier_en_5.2.0_3.0_1701472500058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/test_classifier_en_5.2.0_3.0_1701472500058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("test_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("test_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|test_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.8 MB| + +## References + +https://huggingface.co/michellejieli/test_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-text_classification_roberta_base_sept2022_en.md b/docs/_posts/ahmedlone127/2023-12-01-text_classification_roberta_base_sept2022_en.md new file mode 100644 index 000000000000..e0933e8bebc2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-text_classification_roberta_base_sept2022_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English text_classification_roberta_base_sept2022 RoBertaForSequenceClassification from sabhashanki +author: John Snow Labs +name: text_classification_roberta_base_sept2022 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `text_classification_roberta_base_sept2022` is an English model originally trained by sabhashanki. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/text_classification_roberta_base_sept2022_en_5.2.0_3.0_1701470780479.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/text_classification_roberta_base_sept2022_en_5.2.0_3.0_1701470780479.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_classification_roberta_base_sept2022","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("text_classification_roberta_base_sept2022","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|text_classification_roberta_base_sept2022| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.4 MB| + +## References + +https://huggingface.co/sabhashanki/text_classification-roberta_base_sept2022 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-thesis_freeform_en.md b/docs/_posts/ahmedlone127/2023-12-01-thesis_freeform_en.md new file mode 100644 index 000000000000..48cd59198c67 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-thesis_freeform_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English thesis_freeform RoBertaForSequenceClassification from maretamasaeva +author: John Snow Labs +name: thesis_freeform +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thesis_freeform` is an English model originally trained by maretamasaeva. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thesis_freeform_en_5.2.0_3.0_1701473179064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thesis_freeform_en_5.2.0_3.0_1701473179064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("thesis_freeform","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("thesis_freeform","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thesis_freeform| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|456.0 MB| + +## References + +https://huggingface.co/maretamasaeva/thesis-freeform \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-tickettagger_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-01-tickettagger_roberta_en.md new file mode 100644 index 000000000000..c8bc3407a5e7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-tickettagger_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tickettagger_roberta RoBertaForSequenceClassification from rafaelkallis +author: John Snow Labs +name: tickettagger_roberta +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tickettagger_roberta` is an English model originally trained by rafaelkallis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tickettagger_roberta_en_5.2.0_3.0_1701470355999.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tickettagger_roberta_en_5.2.0_3.0_1701470355999.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tickettagger_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tickettagger_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tickettagger_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.2 MB| + +## References + +https://huggingface.co/rafaelkallis/tickettagger-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-title_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-01-title_classifier_en.md new file mode 100644 index 000000000000..8277d306a637 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-title_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English title_classifier RoBertaForSequenceClassification from xqewec +author: John Snow Labs +name: title_classifier +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `title_classifier` is an English model originally trained by xqewec. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/title_classifier_en_5.2.0_3.0_1701417213581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/title_classifier_en_5.2.0_3.0_1701417213581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("title_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("title_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|title_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.1 MB| + +## References + +https://huggingface.co/xqewec/title_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random2_seed2_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random2_seed2_roberta_base_en.md new file mode 100644 index 000000000000..d0cddd86fe0e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random2_seed2_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random2_seed2_roberta_base RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random2_seed2_roberta_base +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random2_seed2_roberta_base` is an English model originally trained by tweettemposhift. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random2_seed2_roberta_base_en_5.2.0_3.0_1701429547664.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random2_seed2_roberta_base_en_5.2.0_3.0_1701429547664.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random2_seed2_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random2_seed2_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_random2_seed2_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.4 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_random2_seed2-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random3_seed2_twitter_roberta_base_dec2020_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random3_seed2_twitter_roberta_base_dec2020_en.md new file mode 100644 index 000000000000..d0fb4c68565f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_random3_seed2_twitter_roberta_base_dec2020_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random3_seed2_twitter_roberta_base_dec2020 RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random3_seed2_twitter_roberta_base_dec2020 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random3_seed2_twitter_roberta_base_dec2020` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random3_seed2_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701429683716.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random3_seed2_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701429683716.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random3_seed2_twitter_roberta_base_dec2020","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random3_seed2_twitter_roberta_base_dec2020","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_random3_seed2_twitter_roberta_base_dec2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_random3_seed2-twitter-roberta-base-dec2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_bertweet_large_en.md new file mode 100644 index 000000000000..ad52f5851660 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_temporal_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_temporal_bertweet_large +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_temporal_bertweet_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_bertweet_large_en_5.2.0_3.0_1701425911687.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_bertweet_large_en_5.2.0_3.0_1701425911687.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_temporal_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_temporal-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_roberta_base_en.md new file mode 100644 index 000000000000..61bb1d78bf7b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_temporal_roberta_base RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_temporal_roberta_base +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_temporal_roberta_base` is an English model originally trained by tweettemposhift. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_roberta_base_en_5.2.0_3.0_1701408488318.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_roberta_base_en_5.2.0_3.0_1701408488318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_temporal_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|441.5 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_temporal-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2021_124m_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2021_124m_en.md new file mode 100644 index 000000000000..1e26d7044770 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2021_124m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_temporal_twitter_roberta_base_2021_124m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_temporal_twitter_roberta_base_2021_124m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_temporal_twitter_roberta_base_2021_124m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701420076130.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701420076130.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_base_2021_124m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_base_2021_124m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_temporal_twitter_roberta_base_2021_124m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_temporal-twitter-roberta-base-2021-124m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2022_154m_en.md new file mode 100644 index 000000000000..618f080f501f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_base_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_temporal_twitter_roberta_base_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_temporal_twitter_roberta_base_2022_154m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_temporal_twitter_roberta_base_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701396982558.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701396982558.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_base_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_base_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_temporal_twitter_roberta_base_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_temporal-twitter-roberta-base-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..fad0e2d54688 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-topic_topic_temporal_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_temporal_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_temporal_twitter_roberta_large_2022_154m +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_temporal_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701411831281.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701411831281.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_temporal_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_temporal_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_temporal-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-tweet_emotion_eval_en.md b/docs/_posts/ahmedlone127/2023-12-01-tweet_emotion_eval_en.md new file mode 100644 index 000000000000..3a5039801166 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-tweet_emotion_eval_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweet_emotion_eval RoBertaForSequenceClassification from elozano +author: John Snow Labs +name: tweet_emotion_eval +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_emotion_eval` is an English model originally trained by elozano. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_emotion_eval_en_5.2.0_3.0_1701448371926.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_emotion_eval_en_5.2.0_3.0_1701448371926.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_emotion_eval","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_emotion_eval","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweet_emotion_eval| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.6 MB| + +## References + +https://huggingface.co/elozano/tweet_emotion_eval \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-tweeteval_roberta_5e_en.md b/docs/_posts/ahmedlone127/2023-12-01-tweeteval_roberta_5e_en.md new file mode 100644 index 000000000000..9b48d68a5797 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-tweeteval_roberta_5e_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English tweeteval_roberta_5e RoBertaForSequenceClassification from pig4431 +author: John Snow Labs +name: tweeteval_roberta_5e +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweeteval_roberta_5e` is an English model originally trained by pig4431. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweeteval_roberta_5e_en_5.2.0_3.0_1701472825752.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweeteval_roberta_5e_en_5.2.0_3.0_1701472825752.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweeteval_roberta_5e","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweeteval_roberta_5e","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|tweeteval_roberta_5e| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|447.4 MB| + +## References + +https://huggingface.co/pig4431/TweetEval_roBERTa_5E \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2019_90m_tweet_topic_single_all_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2019_90m_tweet_topic_single_all_en.md new file mode 100644 index 000000000000..ad59b8b0bf66 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2019_90m_tweet_topic_single_all_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2019_90m_tweet_topic_single_all RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2019_90m_tweet_topic_single_all +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2019_90m_tweet_topic_single_all` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2019_90m_tweet_topic_single_all_en_5.2.0_3.0_1701470450407.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2019_90m_tweet_topic_single_all_en_5.2.0_3.0_1701470450407.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2019_90m_tweet_topic_single_all","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2019_90m_tweet_topic_single_all","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2019_90m_tweet_topic_single_all| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2019-90m-tweet-topic-single-all \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_emoji_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_emoji_en.md new file mode 100644 index 000000000000..30fd3555cead --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_emoji_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2021_124m_emoji RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2021_124m_emoji +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_emoji` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_emoji_en_5.2.0_3.0_1701422456895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_emoji_en_5.2.0_3.0_1701422456895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_emoji","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_emoji","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2021_124m_emoji| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-emoji \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_multi_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_multi_en.md new file mode 100644 index 000000000000..12f239a40f3d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_multi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2021_124m_topic_multi RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2021_124m_topic_multi +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_topic_multi` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_topic_multi_en_5.2.0_3.0_1701445568820.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_topic_multi_en_5.2.0_3.0_1701445568820.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_topic_multi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_topic_multi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2021_124m_topic_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-topic-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_single_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_single_en.md new file mode 100644 index 000000000000..bbc98fe2e2c5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_2021_124m_topic_single_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_2021_124m_topic_single RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_2021_124m_topic_single +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_topic_single` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_topic_single_en_5.2.0_3.0_1701473117880.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_topic_single_en_5.2.0_3.0_1701473117880.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_topic_single","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_topic_single","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_2021_124m_topic_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-topic-single \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_hate_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_hate_en.md new file mode 100644 index 000000000000..3cd406888850 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_hate_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_hate RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_hate +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_hate` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_hate_en_5.2.0_3.0_1701471152089.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_hate_en_5.2.0_3.0_1701471152089.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_hate","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_hate","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_hate| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-hate \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_multi_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_multi_en.md new file mode 100644 index 000000000000..e413b78a32b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_multi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_topic_multi RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_topic_multi +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_topic_multi` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_topic_multi_en_5.2.0_3.0_1701417048726.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_topic_multi_en_5.2.0_3.0_1701417048726.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_topic_multi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_topic_multi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_topic_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-topic-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_single_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_single_en.md new file mode 100644 index 000000000000..5acded977318 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_dec2021_topic_single_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_dec2021_topic_single RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_dec2021_topic_single +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_dec2021_topic_single` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_topic_single_en_5.2.0_3.0_1701472340354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_dec2021_topic_single_en_5.2.0_3.0_1701472340354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_topic_single","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_dec2021_topic_single","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_dec2021_topic_single| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2021-topic-single \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_semeval18_emodetection_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_semeval18_emodetection_en.md new file mode 100644 index 000000000000..6c94ce755dce --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_semeval18_emodetection_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_semeval18_emodetection RoBertaForSequenceClassification from maxpe +author: John Snow Labs +name: twitter_roberta_base_semeval18_emodetection +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_semeval18_emodetection` is an English model originally trained by maxpe. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_semeval18_emodetection_en_5.2.0_3.0_1701421257700.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_semeval18_emodetection_en_5.2.0_3.0_1701421257700.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_semeval18_emodetection","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_semeval18_emodetection","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_semeval18_emodetection| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.3 MB| + +## References + +https://huggingface.co/maxpe/twitter-roberta-base_semeval18_emodetection \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11_en.md new file mode 100644 index 000000000000..57fba3c6827c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11 RoBertaForSequenceClassification from ali2066 +author: John Snow Labs +name: twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11 +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11_en_5.2.0_3.0_1701471726015.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11_en_5.2.0_3.0_1701471726015.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_sentence_itr0_1e_05_all_01_03_2022_13_53_11| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/ali2066/twitter_RoBERTa_base_sentence_itr0_1e-05_all_01_03_2022-13_53_11 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_stance_feminist_en.md b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_stance_feminist_en.md new file mode 100644 index 000000000000..cdd92a16d528 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-twitter_roberta_base_stance_feminist_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_roberta_base_stance_feminist RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: twitter_roberta_base_stance_feminist +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_stance_feminist` is an English model originally trained by cardiffnlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_feminist_en_5.2.0_3.0_1701398502280.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_feminist_en_5.2.0_3.0_1701398502280.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_feminist","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_feminist","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_stance_feminist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-stance-feminist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-university_project_en.md b/docs/_posts/ahmedlone127/2023-12-01-university_project_en.md new file mode 100644 index 000000000000..838a58e6cf81 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-university_project_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English university_project RoBertaForSequenceClassification from AmiraliRezaie +author: John Snow Labs +name: university_project +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `university_project` is an English model originally trained by AmiraliRezaie. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/university_project_en_5.2.0_3.0_1701425375618.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/university_project_en_5.2.0_3.0_1701425375618.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("university_project","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("university_project","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|university_project| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/AmiraliRezaie/university_project \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-unixcoder_java_complexity_prediction_en.md b/docs/_posts/ahmedlone127/2023-12-01-unixcoder_java_complexity_prediction_en.md new file mode 100644 index 000000000000..e8223339d473 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-unixcoder_java_complexity_prediction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English unixcoder_java_complexity_prediction RoBertaForSequenceClassification from codeparrot +author: John Snow Labs +name: unixcoder_java_complexity_prediction +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `unixcoder_java_complexity_prediction` is an English model originally trained by codeparrot. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/unixcoder_java_complexity_prediction_en_5.2.0_3.0_1701418926033.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/unixcoder_java_complexity_prediction_en_5.2.0_3.0_1701418926033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("unixcoder_java_complexity_prediction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("unixcoder_java_complexity_prediction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|unixcoder_java_complexity_prediction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|472.4 MB| + +## References + +https://huggingface.co/codeparrot/unixcoder-java-complexity-prediction \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-urduintentclassification_ur.md b/docs/_posts/ahmedlone127/2023-12-01-urduintentclassification_ur.md new file mode 100644 index 000000000000..b2934120bf28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-urduintentclassification_ur.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Urdu urduintentclassification RoBertaForSequenceClassification from mwz +author: John Snow Labs +name: urduintentclassification +date: 2023-12-01 +tags: [roberta, ur, open_source, sequence_classification, onnx] +task: Text Classification +language: ur +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urduintentclassification` is an Urdu model originally trained by mwz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urduintentclassification_ur_5.2.0_3.0_1701447857456.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urduintentclassification_ur_5.2.0_3.0_1701447857456.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("urduintentclassification","ur")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("urduintentclassification","ur") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|urduintentclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|ur| +|Size:|473.4 MB| + +## References + +https://huggingface.co/mwz/UrduIntentClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-01-urdusentimentclassification_en.md b/docs/_posts/ahmedlone127/2023-12-01-urdusentimentclassification_en.md new file mode 100644 index 000000000000..71cc4aa32fb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-01-urdusentimentclassification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English urdusentimentclassification RoBertaForSequenceClassification from mwz +author: John Snow Labs +name: urdusentimentclassification +date: 2023-12-01 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `urdusentimentclassification` is an English model originally trained by mwz. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/urdusentimentclassification_en_5.2.0_3.0_1701415651590.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/urdusentimentclassification_en_5.2.0_3.0_1701415651590.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("urdusentimentclassification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("urdusentimentclassification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|urdusentimentclassification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|473.2 MB| + +## References + +https://huggingface.co/mwz/UrduSentimentClassification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-3rh_en.md b/docs/_posts/ahmedlone127/2023-12-02-3rh_en.md new file mode 100644 index 000000000000..e5b289668d59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-3rh_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 3rh RoBertaForSequenceClassification from aloxatel +author: John Snow Labs +name: 3rh +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `3rh` is an English model originally trained by aloxatel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/3rh_en_5.2.0_3.0_1701538107263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/3rh_en_5.2.0_3.0_1701538107263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("3rh","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("3rh","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|3rh| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aloxatel/3RH \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-4_way_detection_prop_16_en.md b/docs/_posts/ahmedlone127/2023-12-02-4_way_detection_prop_16_en.md new file mode 100644 index 000000000000..d98b27ef264b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-4_way_detection_prop_16_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English 4_way_detection_prop_16 RoBertaForSequenceClassification from ultra-coder54732 +author: John Snow Labs +name: 4_way_detection_prop_16 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `4_way_detection_prop_16` is an English model originally trained by ultra-coder54732. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_en_5.2.0_3.0_1701498030547.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/4_way_detection_prop_16_en_5.2.0_3.0_1701498030547.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("4_way_detection_prop_16","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("4_way_detection_prop_16","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|4_way_detection_prop_16| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.7 MB| + +## References + +https://huggingface.co/ultra-coder54732/4-way-detection-prop-16 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-a2_en.md b/docs/_posts/ahmedlone127/2023-12-02-a2_en.md new file mode 100644 index 000000000000..b5e31ebec81d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-a2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English a2 RoBertaForSequenceClassification from ethanrom +author: John Snow Labs +name: a2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `a2` is an English model originally trained by ethanrom. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/a2_en_5.2.0_3.0_1701481751360.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/a2_en_5.2.0_3.0_1701481751360.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("a2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("a2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|a2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ethanrom/a2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-adequacymodel_en.md b/docs/_posts/ahmedlone127/2023-12-02-adequacymodel_en.md new file mode 100644 index 000000000000..d49ee25af1ca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-adequacymodel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English adequacymodel RoBertaForSequenceClassification from ashwinpokee +author: John Snow Labs +name: adequacymodel +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `adequacymodel` is an English model originally trained by ashwinpokee. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/adequacymodel_en_5.2.0_3.0_1701517975373.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/adequacymodel_en_5.2.0_3.0_1701517975373.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("adequacymodel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("adequacymodel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|adequacymodel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/ashwinpokee/adequacymodel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_2_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_2_16_5_en.md new file mode 100644 index 000000000000..9ac4cc4f6940 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_2_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_banking_2_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_banking_2_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_banking_2_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_banking_2_16_5_en_5.2.0_3.0_1701512590514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_banking_2_16_5_en_5.2.0_3.0_1701512590514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_banking_2_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_banking_2_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_banking_2_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-banking-2-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_7_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_7_16_5_en.md new file mode 100644 index 000000000000..45444e889424 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_banking_7_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_banking_7_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_banking_7_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_banking_7_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_banking_7_16_5_en_5.2.0_3.0_1701524580211.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_banking_7_16_5_en_5.2.0_3.0_1701524580211.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_banking_7_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_banking_7_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_banking_7_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-banking-7-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_credit_cards_1000_16_5_oos_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_credit_cards_1000_16_5_oos_en.md new file mode 100644 index 000000000000..f57136788440 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_credit_cards_1000_16_5_oos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_credit_cards_1000_16_5_oos RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_credit_cards_1000_16_5_oos +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_credit_cards_1000_16_5_oos` is an English model originally trained by fathyshalab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_credit_cards_1000_16_5_oos_en_5.2.0_3.0_1701534826697.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_credit_cards_1000_16_5_oos_en_5.2.0_3.0_1701534826697.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_credit_cards_1000_16_5_oos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_credit_cards_1000_16_5_oos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_credit_cards_1000_16_5_oos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-credit_cards-1000-16-5-oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_2_16_5_oos_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_2_16_5_oos_en.md new file mode 100644 index 000000000000..6ab7fe2a9dfb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_2_16_5_oos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_home_2_16_5_oos RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_home_2_16_5_oos +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_home_2_16_5_oos` is an English model originally trained by fathyshalab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_2_16_5_oos_en_5.2.0_3.0_1701542287112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_2_16_5_oos_en_5.2.0_3.0_1701542287112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_home_2_16_5_oos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_home_2_16_5_oos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_home_2_16_5_oos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-home-2-16-5-oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_7_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_7_16_5_en.md new file mode 100644 index 000000000000..e8672e3fcd25 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_home_7_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_home_7_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_home_7_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_home_7_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_7_16_5_en_5.2.0_3.0_1701509825003.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_home_7_16_5_en_5.2.0_3.0_1701509825003.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_home_7_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_home_7_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_home_7_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-home-7-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_kitchen_and_dining_9_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_kitchen_and_dining_9_16_5_en.md new file mode 100644 index 000000000000..eca65c9c17f0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_kitchen_and_dining_9_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_kitchen_and_dining_9_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_kitchen_and_dining_9_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_kitchen_and_dining_9_16_5` is an English model originally trained by fathyshalab.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_kitchen_and_dining_9_16_5_en_5.2.0_3.0_1701502036032.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_kitchen_and_dining_9_16_5_en_5.2.0_3.0_1701502036032.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_kitchen_and_dining_9_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_kitchen_and_dining_9_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_kitchen_and_dining_9_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-kitchen_and_dining-9-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_meta_3_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_meta_3_16_5_en.md new file mode 100644 index 000000000000..057c180ba082 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_meta_3_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_meta_3_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_meta_3_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_meta_3_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_meta_3_16_5_en_5.2.0_3.0_1701506633246.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_meta_3_16_5_en_5.2.0_3.0_1701506633246.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_meta_3_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_meta_3_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_meta_3_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-meta-3-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_small_talk_3_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_small_talk_3_16_5_en.md new file mode 100644 index 000000000000..53f6474ea0a7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_small_talk_3_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_small_talk_3_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_small_talk_3_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_small_talk_3_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_small_talk_3_16_5_en_5.2.0_3.0_1701526846882.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_small_talk_3_16_5_en_5.2.0_3.0_1701526846882.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_small_talk_3_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_small_talk_3_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_small_talk_3_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-small_talk-3-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_travel_2_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_travel_2_16_5_en.md new file mode 100644 index 000000000000..5df1eab47813 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_travel_2_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_travel_2_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_travel_2_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_travel_2_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_travel_2_16_5_en_5.2.0_3.0_1701532328286.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_travel_2_16_5_en_5.2.0_3.0_1701532328286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_travel_2_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_travel_2_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_travel_2_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-travel-2-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_2_16_5_oos_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_2_16_5_oos_en.md new file mode 100644 index 000000000000..06ce25ef4480 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_2_16_5_oos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_work_2_16_5_oos RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_work_2_16_5_oos +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_work_2_16_5_oos` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_work_2_16_5_oos_en_5.2.0_3.0_1701538299626.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_work_2_16_5_oos_en_5.2.0_3.0_1701538299626.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_work_2_16_5_oos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_work_2_16_5_oos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_work_2_16_5_oos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-work-2-16-5-oos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_6_16_5_en.md b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_6_16_5_en.md new file mode 100644 index 000000000000..2fdfb0b7014c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-all_roberta_large_v1_work_6_16_5_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English all_roberta_large_v1_work_6_16_5 RoBertaForSequenceClassification from fathyshalab +author: John Snow Labs +name: all_roberta_large_v1_work_6_16_5 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `all_roberta_large_v1_work_6_16_5` is an English model originally trained by fathyshalab. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_work_6_16_5_en_5.2.0_3.0_1701537195156.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/all_roberta_large_v1_work_6_16_5_en_5.2.0_3.0_1701537195156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_work_6_16_5","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("all_roberta_large_v1_work_6_16_5","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|all_roberta_large_v1_work_6_16_5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/fathyshalab/all-roberta-large-v1-work-6-16-5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-amazon_cross_encoder_en.md b/docs/_posts/ahmedlone127/2023-12-02-amazon_cross_encoder_en.md new file mode 100644 index 000000000000..4504eb9de1bb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-amazon_cross_encoder_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English amazon_cross_encoder RoBertaForSequenceClassification from LiYuan +author: John Snow Labs +name: amazon_cross_encoder +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `amazon_cross_encoder` is an English model originally trained by LiYuan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/amazon_cross_encoder_en_5.2.0_3.0_1701496228755.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/amazon_cross_encoder_en_5.2.0_3.0_1701496228755.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("amazon_cross_encoder","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("amazon_cross_encoder","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|amazon_cross_encoder| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/LiYuan/amazon-cross-encoder \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-arif_fake_news_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-02-arif_fake_news_classifier_en.md new file mode 100644 index 000000000000..33e367fc8a7e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-arif_fake_news_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English arif_fake_news_classifier RoBertaForSequenceClassification from Marif +author: John Snow Labs +name: arif_fake_news_classifier +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `arif_fake_news_classifier` is an English model originally trained by Marif. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/arif_fake_news_classifier_en_5.2.0_3.0_1701478194959.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/arif_fake_news_classifier_en_5.2.0_3.0_1701478194959.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("arif_fake_news_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("arif_fake_news_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|arif_fake_news_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|457.5 MB| + +## References + +https://huggingface.co/Marif/Arif_fake_news_classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-autonlp_txc_17923124_en.md b/docs/_posts/ahmedlone127/2023-12-02-autonlp_txc_17923124_en.md new file mode 100644 index 000000000000..0a0f8ad30a0b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-autonlp_txc_17923124_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English autonlp_txc_17923124 RoBertaForSequenceClassification from emekaboris +author: John Snow Labs +name: autonlp_txc_17923124 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `autonlp_txc_17923124` is an English model originally trained by emekaboris. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/autonlp_txc_17923124_en_5.2.0_3.0_1701495211922.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/autonlp_txc_17923124_en_5.2.0_3.0_1701495211922.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("autonlp_txc_17923124","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("autonlp_txc_17923124","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|autonlp_txc_17923124| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.5 MB| + +## References + +https://huggingface.co/emekaboris/autonlp-txc-17923124 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-avg_en.md b/docs/_posts/ahmedlone127/2023-12-02-avg_en.md new file mode 100644 index 000000000000..253be5601757 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-avg_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English avg RoBertaForSequenceClassification from aloxatel +author: John Snow Labs +name: avg +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `avg` is an English model originally trained by aloxatel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/avg_en_5.2.0_3.0_1701506020493.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/avg_en_5.2.0_3.0_1701506020493.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("avg","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("avg","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|avg| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aloxatel/AVG \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-bertovosentneg0_en.md b/docs/_posts/ahmedlone127/2023-12-02-bertovosentneg0_en.md new file mode 100644 index 000000000000..2aa41aafc808 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-bertovosentneg0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bertovosentneg0 RoBertaForSequenceClassification from Tanor +author: John Snow Labs +name: bertovosentneg0 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bertovosentneg0` is an English model originally trained by Tanor. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bertovosentneg0_en_5.2.0_3.0_1701535282535.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bertovosentneg0_en_5.2.0_3.0_1701535282535.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertovosentneg0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bertovosentneg0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bertovosentneg0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|533.0 MB| + +## References + +https://huggingface.co/Tanor/BERTovoSENTNEG0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-bitcoin_sentiment_train_en.md b/docs/_posts/ahmedlone127/2023-12-02-bitcoin_sentiment_train_en.md new file mode 100644 index 000000000000..a4336bdf883f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-bitcoin_sentiment_train_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bitcoin_sentiment_train RoBertaForSequenceClassification from ma2sevich +author: John Snow Labs +name: bitcoin_sentiment_train +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bitcoin_sentiment_train` is an English model originally trained by ma2sevich. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bitcoin_sentiment_train_en_5.2.0_3.0_1701477234024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bitcoin_sentiment_train_en_5.2.0_3.0_1701477234024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bitcoin_sentiment_train","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bitcoin_sentiment_train","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bitcoin_sentiment_train| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/ma2sevich/bitcoin_sentiment_train \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-bmw_en.md b/docs/_posts/ahmedlone127/2023-12-02-bmw_en.md new file mode 100644 index 000000000000..1aa052922c9e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-bmw_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English bmw RoBertaForSequenceClassification from UT +author: John Snow Labs +name: bmw +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bmw` is an English model originally trained by UT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bmw_en_5.2.0_3.0_1701500596815.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bmw_en_5.2.0_3.0_1701500596815.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("bmw","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("bmw","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bmw| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|770.4 MB| + +## References + +https://huggingface.co/UT/BMW \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-brainbert_base_korean_kornli_en.md b/docs/_posts/ahmedlone127/2023-12-02-brainbert_base_korean_kornli_en.md new file mode 100644 index 000000000000..bcd1384f6b99 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-brainbert_base_korean_kornli_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English brainbert_base_korean_kornli RoBertaForSequenceClassification from hyunwoongko +author: John Snow Labs +name: brainbert_base_korean_kornli +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `brainbert_base_korean_kornli` is an English model originally trained by hyunwoongko. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/brainbert_base_korean_kornli_en_5.2.0_3.0_1701485818533.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/brainbert_base_korean_kornli_en_5.2.0_3.0_1701485818533.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("brainbert_base_korean_kornli","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("brainbert_base_korean_kornli","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|brainbert_base_korean_kornli| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|265.5 MB| + +## References + +https://huggingface.co/hyunwoongko/brainbert-base-ko-kornli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-burmese_awesome_model_classification_en.md b/docs/_posts/ahmedlone127/2023-12-02-burmese_awesome_model_classification_en.md new file mode 100644 index 000000000000..dd226cbc91f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-burmese_awesome_model_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English burmese_awesome_model_classification RoBertaForSequenceClassification from ketong3906 +author: John Snow Labs +name: burmese_awesome_model_classification +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `burmese_awesome_model_classification` is an English model originally trained by ketong3906. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_classification_en_5.2.0_3.0_1701521628172.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/burmese_awesome_model_classification_en_5.2.0_3.0_1701521628172.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_model_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("burmese_awesome_model_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|burmese_awesome_model_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| + +## References + +https://huggingface.co/ketong3906/my_awesome_model_classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-carer_5way_en.md b/docs/_posts/ahmedlone127/2023-12-02-carer_5way_en.md new file mode 100644 index 000000000000..dafefba41d44 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-carer_5way_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English carer_5way RoBertaForSequenceClassification from crcb +author: John Snow Labs +name: carer_5way +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `carer_5way` is an English model originally trained by crcb. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/carer_5way_en_5.2.0_3.0_1701541804505.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/carer_5way_en_5.2.0_3.0_1701541804505.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("carer_5way","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("carer_5way","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|carer_5way| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.9 MB| + +## References + +https://huggingface.co/crcb/carer_5way \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_bne_en.md b/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_bne_en.md new file mode 100644 index 000000000000..47bc922cf543 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_bne_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cares_roberta_bne RoBertaForSequenceClassification from chizhikchi +author: John Snow Labs +name: cares_roberta_bne +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cares_roberta_bne` is an English model originally trained by chizhikchi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cares_roberta_bne_en_5.2.0_3.0_1701493017509.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cares_roberta_bne_en_5.2.0_3.0_1701493017509.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cares_roberta_bne","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cares_roberta_bne","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cares_roberta_bne| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.5 MB| + +## References + +https://huggingface.co/chizhikchi/cares-roberta-bne \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_clinical_en.md b/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_clinical_en.md new file mode 100644 index 000000000000..f6e2998f48c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cares_roberta_clinical_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cares_roberta_clinical RoBertaForSequenceClassification from chizhikchi +author: John Snow Labs +name: cares_roberta_clinical +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cares_roberta_clinical` is an English model originally trained by chizhikchi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cares_roberta_clinical_en_5.2.0_3.0_1701512723851.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cares_roberta_clinical_en_5.2.0_3.0_1701512723851.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cares_roberta_clinical","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cares_roberta_clinical","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cares_roberta_clinical| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|445.1 MB| + +## References + +https://huggingface.co/chizhikchi/cares-roberta-clinical \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_3_en.md b/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_3_en.md new file mode 100644 index 000000000000..1d83e9f98711 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English categorizacion_comercios_v_0_0_3 RoBertaForSequenceClassification from salascorp +author: John Snow Labs +name: categorizacion_comercios_v_0_0_3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `categorizacion_comercios_v_0_0_3` is an English model originally trained by salascorp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/categorizacion_comercios_v_0_0_3_en_5.2.0_3.0_1701514168732.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/categorizacion_comercios_v_0_0_3_en_5.2.0_3.0_1701514168732.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("categorizacion_comercios_v_0_0_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("categorizacion_comercios_v_0_0_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|categorizacion_comercios_v_0_0_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/salascorp/categorizacion_comercios_v_0.0.3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_4_en.md b/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_4_en.md new file mode 100644 index 000000000000..f77d4ecbb46c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-categorizacion_comercios_v_0_0_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English categorizacion_comercios_v_0_0_4 RoBertaForSequenceClassification from salascorp +author: John Snow Labs +name: categorizacion_comercios_v_0_0_4 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `categorizacion_comercios_v_0_0_4` is an English model originally trained by salascorp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/categorizacion_comercios_v_0_0_4_en_5.2.0_3.0_1701536069581.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/categorizacion_comercios_v_0_0_4_en_5.2.0_3.0_1701536069581.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("categorizacion_comercios_v_0_0_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("categorizacion_comercios_v_0_0_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|categorizacion_comercios_v_0_0_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/salascorp/categorizacion_comercios_v_0.0.4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-catexp2_en.md b/docs/_posts/ahmedlone127/2023-12-02-catexp2_en.md new file mode 100644 index 000000000000..a98da6b90ed0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-catexp2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English catexp2 RoBertaForSequenceClassification from owen99630 +author: John Snow Labs +name: catexp2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `catexp2` is an English model originally trained by owen99630. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/catexp2_en_5.2.0_3.0_1701543628677.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/catexp2_en_5.2.0_3.0_1701543628677.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("catexp2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("catexp2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|catexp2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|421.6 MB| + +## References + +https://huggingface.co/owen99630/catexp2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cbd_fulldata_en.md b/docs/_posts/ahmedlone127/2023-12-02-cbd_fulldata_en.md new file mode 100644 index 000000000000..0a90ddd21541 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cbd_fulldata_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cbd_fulldata RoBertaForSequenceClassification from Kidsshield +author: John Snow Labs +name: cbd_fulldata +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cbd_fulldata` is an English model originally trained by Kidsshield. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cbd_fulldata_en_5.2.0_3.0_1701500833024.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cbd_fulldata_en_5.2.0_3.0_1701500833024.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cbd_fulldata","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cbd_fulldata","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cbd_fulldata| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.8 MB| + +## References + +https://huggingface.co/Kidsshield/CBD-FullData \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr15_seed1_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr15_seed1_en.md new file mode 100644 index 000000000000..fa353c123bdc --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr15_seed1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr15_seed1 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr15_seed1 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr15_seed1` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr15_seed1_en_5.2.0_3.0_1701511183743.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr15_seed1_en_5.2.0_3.0_1701511183743.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr15_seed1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr15_seed1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr15_seed1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr15-seed1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr17_seed4_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr17_seed4_en.md new file mode 100644 index 000000000000..b7e1cf060acf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr17_seed4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr17_seed4 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr17_seed4 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr17_seed4` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr17_seed4_en_5.2.0_3.0_1701537468895.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr17_seed4_en_5.2.0_3.0_1701537468895.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr17_seed4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr17_seed4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr17_seed4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr17-seed4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr19_seed2_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr19_seed2_en.md new file mode 100644 index 000000000000..66a060935c5f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr19_seed2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr19_seed2 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr19_seed2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr19_seed2` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr19_seed2_en_5.2.0_3.0_1701532927767.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr19_seed2_en_5.2.0_3.0_1701532927767.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr19_seed2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr19_seed2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr19_seed2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr19-seed2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr1_seed3_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr1_seed3_en.md new file mode 100644 index 000000000000..09a5bccae727 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr1_seed3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr1_seed3 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr1_seed3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr1_seed3` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr1_seed3_en_5.2.0_3.0_1701531654362.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr1_seed3_en_5.2.0_3.0_1701531654362.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr1_seed3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr1_seed3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr1_seed3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.7 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr1-seed3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr20_seed3_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr20_seed3_en.md new file mode 100644 index 000000000000..1abc9748f6c8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr20_seed3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr20_seed3 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr20_seed3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr20_seed3` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr20_seed3_en_5.2.0_3.0_1701530988808.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr20_seed3_en_5.2.0_3.0_1701530988808.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr20_seed3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr20_seed3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr20_seed3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr20-seed3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr24_seed2_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr24_seed2_en.md new file mode 100644 index 000000000000..8ac6e14b4332 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr24_seed2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr24_seed2 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr24_seed2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr24_seed2` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr24_seed2_en_5.2.0_3.0_1701508075985.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr24_seed2_en_5.2.0_3.0_1701508075985.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr24_seed2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr24_seed2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr24_seed2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.9 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr24-seed2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr27_seed0_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr27_seed0_en.md new file mode 100644 index 000000000000..ec1d2c963384 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr27_seed0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr27_seed0 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr27_seed0 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr27_seed0` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr27_seed0_en_5.2.0_3.0_1701523004608.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr27_seed0_en_5.2.0_3.0_1701523004608.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr27_seed0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr27_seed0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr27_seed0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.0 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr27-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr4_seed0_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr4_seed0_en.md new file mode 100644 index 000000000000..b4dbb7476803 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr4_seed0_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr4_seed0 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr4_seed0 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr4_seed0` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr4_seed0_en_5.2.0_3.0_1701532309329.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr4_seed0_en_5.2.0_3.0_1701532309329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr4_seed0","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr4_seed0","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr4_seed0| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.8 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr4-seed0 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed2_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed2_en.md new file mode 100644 index 000000000000..857aee82e272 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr6_seed2 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr6_seed2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr6_seed2` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr6_seed2_en_5.2.0_3.0_1701518981610.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr6_seed2_en_5.2.0_3.0_1701518981610.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr6_seed2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr6_seed2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr6_seed2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.8 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr6-seed2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed3_en.md b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed3_en.md new file mode 100644 index 000000000000..201d6aa87e87 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cold_fusion_itr6_seed3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cold_fusion_itr6_seed3 RoBertaForSequenceClassification from ibm +author: John Snow Labs +name: cold_fusion_itr6_seed3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cold_fusion_itr6_seed3` is an English model originally trained by ibm. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cold_fusion_itr6_seed3_en_5.2.0_3.0_1701512590455.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cold_fusion_itr6_seed3_en_5.2.0_3.0_1701512590455.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr6_seed3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cold_fusion_itr6_seed3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cold_fusion_itr6_seed3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.8 MB| + +## References + +https://huggingface.co/ibm/ColD-Fusion-itr6-seed3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-comparador_german_textos_en.md b/docs/_posts/ahmedlone127/2023-12-02-comparador_german_textos_en.md new file mode 100644 index 000000000000..47106988bebe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-comparador_german_textos_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English comparador_german_textos RoBertaForSequenceClassification from ellucas +author: John Snow Labs +name: comparador_german_textos +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `comparador_german_textos` is an English model originally trained by ellucas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/comparador_german_textos_en_5.2.0_3.0_1701486573309.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/comparador_german_textos_en_5.2.0_3.0_1701486573309.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("comparador_german_textos","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("comparador_german_textos","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|comparador_german_textos| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/ellucas/Comparador-de-textos \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-cse244b_hw2_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-02-cse244b_hw2_roberta_en.md new file mode 100644 index 000000000000..5157b2b88edf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-cse244b_hw2_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English cse244b_hw2_roberta RoBertaForSequenceClassification from Brendan +author: John Snow Labs +name: cse244b_hw2_roberta +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `cse244b_hw2_roberta` is an English model originally trained by Brendan. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/cse244b_hw2_roberta_en_5.2.0_3.0_1701493802545.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/cse244b_hw2_roberta_en_5.2.0_3.0_1701493802545.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("cse244b_hw2_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("cse244b_hw2_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|cse244b_hw2_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|438.3 MB| + +## References + +https://huggingface.co/Brendan/cse244b-hw2-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-depression_reddit_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-depression_reddit_distilroberta_base_en.md new file mode 100644 index 000000000000..9e632c34257b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-depression_reddit_distilroberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English depression_reddit_distilroberta_base RoBertaForSequenceClassification from mrjunos +author: John Snow Labs +name: depression_reddit_distilroberta_base +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `depression_reddit_distilroberta_base` is an English model originally trained by mrjunos. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/depression_reddit_distilroberta_base_en_5.2.0_3.0_1701540586236.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/depression_reddit_distilroberta_base_en_5.2.0_3.0_1701540586236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("depression_reddit_distilroberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("depression_reddit_distilroberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|depression_reddit_distilroberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/mrjunos/depression-reddit-distilroberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false_en.md new file mode 100644 index 000000000000..f08c5b8f2c7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false RoBertaForSequenceClassification from ali2066 +author: John Snow Labs +name: distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false` is an English model originally trained by ali2066. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false_en_5.2.0_3.0_1701487708481.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false_en_5.2.0_3.0_1701487708481.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilbertfinal_ctxsentence_train_essays_test_null_second_train_set_null_false| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/ali2066/DistilBERTFINAL_ctxSentence_TRAIN_essays_TEST_NULL_second_train_set_null_False \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mic_sym_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mic_sym_en.md new file mode 100644 index 000000000000..fddbab0aaeaf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mic_sym_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_mic_sym RoBertaForSequenceClassification from agi-css +author: John Snow Labs +name: distilroberta_base_mic_sym +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_mic_sym` is an English model originally trained by agi-css. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mic_sym_en_5.2.0_3.0_1701525955961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mic_sym_en_5.2.0_3.0_1701525955961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mic_sym","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mic_sym","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mic_sym| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/agi-css/distilroberta-base-mic-sym \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mrpc_glue_jovanlopez32_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mrpc_glue_jovanlopez32_en.md new file mode 100644 index 000000000000..cd9a0ff27313 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_base_mrpc_glue_jovanlopez32_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_base_mrpc_glue_jovanlopez32 RoBertaForSequenceClassification from jovanlopez32 +author: John Snow Labs +name: distilroberta_base_mrpc_glue_jovanlopez32 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_base_mrpc_glue_jovanlopez32` is an English model originally trained by jovanlopez32. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_jovanlopez32_en_5.2.0_3.0_1701517262786.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_base_mrpc_glue_jovanlopez32_en_5.2.0_3.0_1701517262786.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_jovanlopez32","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_base_mrpc_glue_jovanlopez32","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_base_mrpc_glue_jovanlopez32| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/jovanlopez32/distilroberta-base-mrpc-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_bbc_news_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_bbc_news_en.md new file mode 100644 index 000000000000..8636984d7809 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_bbc_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_bbc_news RoBertaForSequenceClassification from AyoubChLin +author: John Snow Labs +name: distilroberta_bbc_news +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_bbc_news` is an English model originally trained by AyoubChLin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_bbc_news_en_5.2.0_3.0_1701515710768.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_bbc_news_en_5.2.0_3.0_1701515710768.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_bbc_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_bbc_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|distilroberta_bbc_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|309.0 MB| + +## References + +https://huggingface.co/AyoubChLin/DistilRoberta-bbc_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_baseline_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_baseline_en.md new file mode 100644 index 000000000000..9546909dadef --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_baseline_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English distilroberta_pkdd_anomaly_baseline RoBertaForSequenceClassification from EgilKarlsen +author: John Snow Labs +name: distilroberta_pkdd_anomaly_baseline +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_pkdd_anomaly_baseline` is an English model originally trained by EgilKarlsen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_pkdd_anomaly_baseline_en_5.2.0_3.0_1701489775240.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_pkdd_anomaly_baseline_en_5.2.0_3.0_1701489775240.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pkdd_anomaly_baseline","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pkdd_anomaly_baseline","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilroberta_pkdd_anomaly_baseline|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.5 MB|
+
+## References
+
+https://huggingface.co/EgilKarlsen/DistilRoBERTa_PKDD-Anomaly_Baseline
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_en.md
new file mode 100644
index 000000000000..7374a4397317
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_pkdd_anomaly_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilroberta_pkdd_anomaly RoBertaForSequenceClassification from EgilKarlsen
+author: John Snow Labs
+name: distilroberta_pkdd_anomaly
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_pkdd_anomaly` is an English model originally trained by EgilKarlsen.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_pkdd_anomaly_en_5.2.0_3.0_1701501895657.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_pkdd_anomaly_en_5.2.0_3.0_1701501895657.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pkdd_anomaly","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_pkdd_anomaly","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilroberta_pkdd_anomaly|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|309.0 MB|
+
+## References
+
+https://huggingface.co/EgilKarlsen/DistilRoBERTa_PKDD-Anomaly
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-distilroberta_rb156k_opt15_ep40_phrase5k_en.md b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_rb156k_opt15_ep40_phrase5k_en.md
new file mode 100644
index 000000000000..2e6cbae837c3
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-distilroberta_rb156k_opt15_ep40_phrase5k_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English distilroberta_rb156k_opt15_ep40_phrase5k RoBertaForSequenceClassification from judy93536
+author: John Snow Labs
+name: distilroberta_rb156k_opt15_ep40_phrase5k
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `distilroberta_rb156k_opt15_ep40_phrase5k` is an English model originally trained by judy93536.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilroberta_rb156k_opt15_ep40_phrase5k_en_5.2.0_3.0_1701479466685.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilroberta_rb156k_opt15_ep40_phrase5k_en_5.2.0_3.0_1701479466685.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_rb156k_opt15_ep40_phrase5k","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("distilroberta_rb156k_opt15_ep40_phrase5k","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilroberta_rb156k_opt15_ep40_phrase5k|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.1 MB|
+
+## References
+
+https://huggingface.co/judy93536/distilroberta-rb156k-opt15-ep40-phrase5k
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-dominance_english_distilroberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-dominance_english_distilroberta_base_en.md
new file mode 100644
index 000000000000..2f996009c593
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-dominance_english_distilroberta_base_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English dominance_english_distilroberta_base RoBertaForSequenceClassification from samueldomdey
+author: John Snow Labs
+name: dominance_english_distilroberta_base
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dominance_english_distilroberta_base` is an English model originally trained by samueldomdey.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dominance_english_distilroberta_base_en_5.2.0_3.0_1701494995049.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dominance_english_distilroberta_base_en_5.2.0_3.0_1701494995049.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("dominance_english_distilroberta_base","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("dominance_english_distilroberta_base","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|dominance_english_distilroberta_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|308.6 MB|
+
+## References
+
+https://huggingface.co/samueldomdey/dominance-english-distilroberta-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-emergency_text_classfication2_en.md b/docs/_posts/ahmedlone127/2023-12-02-emergency_text_classfication2_en.md
new file mode 100644
index 000000000000..931a57a92b71
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-emergency_text_classfication2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English emergency_text_classfication2 RoBertaForSequenceClassification from Americo
+author: John Snow Labs
+name: emergency_text_classfication2
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `emergency_text_classfication2` is an English model originally trained by Americo.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/emergency_text_classfication2_en_5.2.0_3.0_1701503098168.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/emergency_text_classfication2_en_5.2.0_3.0_1701503098168.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("emergency_text_classfication2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("emergency_text_classfication2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|emergency_text_classfication2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|437.5 MB|
+
+## References
+
+https://huggingface.co/Americo/emergency-text-classfication2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-financial_phrasebank_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-02-financial_phrasebank_roberta_en.md
new file mode 100644
index 000000000000..7301dd288661
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-financial_phrasebank_roberta_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English financial_phrasebank_roberta RoBertaForSequenceClassification from langecod
+author: John Snow Labs
+name: financial_phrasebank_roberta
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `financial_phrasebank_roberta` is an English model originally trained by langecod.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/financial_phrasebank_roberta_en_5.2.0_3.0_1701489682454.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/financial_phrasebank_roberta_en_5.2.0_3.0_1701489682454.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_phrasebank_roberta","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("financial_phrasebank_roberta","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|financial_phrasebank_roberta|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/langecod/Financial_Phrasebank_RoBERTa
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-fine_tunning_roberta_bne_hate_offensive_en.md b/docs/_posts/ahmedlone127/2023-12-02-fine_tunning_roberta_bne_hate_offensive_en.md
new file mode 100644
index 000000000000..241dd7504562
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-fine_tunning_roberta_bne_hate_offensive_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English fine_tunning_roberta_bne_hate_offensive RoBertaForSequenceClassification from esmarquez17
+author: John Snow Labs
+name: fine_tunning_roberta_bne_hate_offensive
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `fine_tunning_roberta_bne_hate_offensive` is an English model originally trained by esmarquez17.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/fine_tunning_roberta_bne_hate_offensive_en_5.2.0_3.0_1701533732651.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/fine_tunning_roberta_bne_hate_offensive_en_5.2.0_3.0_1701533732651.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("fine_tunning_roberta_bne_hate_offensive","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("fine_tunning_roberta_bne_hate_offensive","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|fine_tunning_roberta_bne_hate_offensive|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|437.3 MB|
+
+## References
+
+https://huggingface.co/esmarquez17/fine-tunning-roberta-bne-hate-offensive
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuned_adversarial_paraphrase_model_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuned_adversarial_paraphrase_model_en.md
new file mode 100644
index 000000000000..d42a7e2e77b5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuned_adversarial_paraphrase_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuned_adversarial_paraphrase_model RoBertaForSequenceClassification from chitra
+author: John Snow Labs
+name: finetuned_adversarial_paraphrase_model
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_adversarial_paraphrase_model` is an English model originally trained by chitra.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_adversarial_paraphrase_model_en_5.2.0_3.0_1701477704954.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_adversarial_paraphrase_model_en_5.2.0_3.0_1701477704954.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_adversarial_paraphrase_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_adversarial_paraphrase_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_adversarial_paraphrase_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/chitra/finetuned-adversarial-paraphrase-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuned_beliefs_sentiment_classifier_experiment1_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuned_beliefs_sentiment_classifier_experiment1_en.md
new file mode 100644
index 000000000000..7526df3c6180
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuned_beliefs_sentiment_classifier_experiment1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuned_beliefs_sentiment_classifier_experiment1 RoBertaForSequenceClassification from hriaz
+author: John Snow Labs
+name: finetuned_beliefs_sentiment_classifier_experiment1
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_beliefs_sentiment_classifier_experiment1` is an English model originally trained by hriaz.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_beliefs_sentiment_classifier_experiment1_en_5.2.0_3.0_1701496132874.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_beliefs_sentiment_classifier_experiment1_en_5.2.0_3.0_1701496132874.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_beliefs_sentiment_classifier_experiment1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_beliefs_sentiment_classifier_experiment1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_beliefs_sentiment_classifier_experiment1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/hriaz/finetuned_beliefs_sentiment_classifier_experiment1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_model_kwasiasomani_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_model_kwasiasomani_en.md
new file mode 100644
index 000000000000..fe28471063c0
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_model_kwasiasomani_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuned_roberta_base_model_kwasiasomani RoBertaForSequenceClassification from Kwasiasomani
+author: John Snow Labs
+name: finetuned_roberta_base_model_kwasiasomani
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_base_model_kwasiasomani` is an English model originally trained by Kwasiasomani.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_model_kwasiasomani_en_5.2.0_3.0_1701494007008.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_model_kwasiasomani_en_5.2.0_3.0_1701494007008.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_model_kwasiasomani","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_model_kwasiasomani","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_roberta_base_model_kwasiasomani|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|436.3 MB|
+
+## References
+
+https://huggingface.co/Kwasiasomani/Finetuned-Roberta-base-model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_on_iemocap_1_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_on_iemocap_1_en.md
new file mode 100644
index 000000000000..8f07a5656b76
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuned_roberta_base_on_iemocap_1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuned_roberta_base_on_iemocap_1 RoBertaForSequenceClassification from minoosh
+author: John Snow Labs
+name: finetuned_roberta_base_on_iemocap_1
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_roberta_base_on_iemocap_1` is an English model originally trained by minoosh.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_on_iemocap_1_en_5.2.0_3.0_1701494200318.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_roberta_base_on_iemocap_1_en_5.2.0_3.0_1701494200318.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_on_iemocap_1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")  # assumes an active SparkSession bound to `spark`
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_roberta_base_on_iemocap_1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") // toDS needs spark.implicits._ in scope
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_roberta_base_on_iemocap_1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/minoosh/finetuned_roberta-base-on-IEMOCAP_1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuned_robertuito_base_cased_v_p_g_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuned_robertuito_base_cased_v_p_g_en.md
new file mode 100644
index 000000000000..392f0ff44a81
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuned_robertuito_base_cased_v_p_g_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuned_robertuito_base_cased_v_p_g RoBertaForSequenceClassification from JosePezantes
+author: John Snow Labs
+name: finetuned_robertuito_base_cased_v_p_g
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuned_robertuito_base_cased_v_p_g` is an English model originally trained by JosePezantes.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuned_robertuito_base_cased_v_p_g_en_5.2.0_3.0_1701488206938.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuned_robertuito_base_cased_v_p_g_en_5.2.0_3.0_1701488206938.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_robertuito_base_cased_v_p_g","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuned_robertuito_base_cased_v_p_g","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuned_robertuito_base_cased_v_p_g|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|408.3 MB|
+
+## References
+
+https://huggingface.co/JosePezantes/finetuned-robertuito-base-cased-V-P-G
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_1_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_1_en.md
new file mode 100644
index 000000000000..85f799663127
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_1_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_model_1 RoBertaForSequenceClassification from pabagcha
+author: John Snow Labs
+name: finetuning_sentiment_model_1
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_1` is an English model originally trained by pabagcha.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_1_en_5.2.0_3.0_1701539885912.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_1_en_5.2.0_3.0_1701539885912.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_1","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_1","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_model_1|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|429.6 MB|
+
+## References
+
+https://huggingface.co/pabagcha/finetuning-sentiment-model-1
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_roberta_base_25000_samples_en.md b/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_roberta_base_25000_samples_en.md
new file mode 100644
index 000000000000..0dc255299e86
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-finetuning_sentiment_model_roberta_base_25000_samples_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English finetuning_sentiment_model_roberta_base_25000_samples RoBertaForSequenceClassification from choidf
+author: John Snow Labs
+name: finetuning_sentiment_model_roberta_base_25000_samples
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `finetuning_sentiment_model_roberta_base_25000_samples` is an English model originally trained by choidf.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_roberta_base_25000_samples_en_5.2.0_3.0_1701487968858.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/finetuning_sentiment_model_roberta_base_25000_samples_en_5.2.0_3.0_1701487968858.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_roberta_base_25000_samples","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("finetuning_sentiment_model_roberta_base_25000_samples","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|finetuning_sentiment_model_roberta_base_25000_samples|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|462.8 MB|
+
+## References
+
+https://huggingface.co/choidf/finetuning-sentiment-model-roberta-base-25000-samples
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-gottbert_base_finetuned_fbi_german_de.md b/docs/_posts/ahmedlone127/2023-12-02-gottbert_base_finetuned_fbi_german_de.md
new file mode 100644
index 000000000000..44509b12f554
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-gottbert_base_finetuned_fbi_german_de.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: German gottbert_base_finetuned_fbi_german RoBertaForSequenceClassification from julius-br
+author: John Snow Labs
+name: gottbert_base_finetuned_fbi_german
+date: 2023-12-02
+tags: [roberta, de, open_source, sequence_classification, onnx]
+task: Text Classification
+language: de
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `gottbert_base_finetuned_fbi_german` is a German model originally trained by julius-br.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/gottbert_base_finetuned_fbi_german_de_5.2.0_3.0_1701481864366.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/gottbert_base_finetuned_fbi_german_de_5.2.0_3.0_1701481864366.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("gottbert_base_finetuned_fbi_german","de")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("gottbert_base_finetuned_fbi_german","de")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|gottbert_base_finetuned_fbi_german|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|de|
+|Size:|472.8 MB|
+
+## References
+
+https://huggingface.co/julius-br/gottbert-base-finetuned-fbi-german
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-grammar_model_en.md b/docs/_posts/ahmedlone127/2023-12-02-grammar_model_en.md
new file mode 100644
index 000000000000..f6d4b4feab0b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-grammar_model_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English grammar_model RoBertaForSequenceClassification from lightcarrieson
+author: John Snow Labs
+name: grammar_model
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `grammar_model` is an English model originally trained by lightcarrieson.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/grammar_model_en_5.2.0_3.0_1701495130580.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/grammar_model_en_5.2.0_3.0_1701495130580.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("grammar_model","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("grammar_model","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|grammar_model|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|424.9 MB|
+
+## References
+
+https://huggingface.co/lightcarrieson/grammar_model
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed0_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed0_twitter_roberta_large_2022_154m_en.md
new file mode 100644
index 000000000000..b58628a29f06
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed0_twitter_roberta_large_2022_154m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English hate_hate_random0_seed0_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: hate_hate_random0_seed0_twitter_roberta_large_2022_154m
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random0_seed0_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random0_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701530374564.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random0_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701530374564.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random0_seed0_twitter_roberta_large_2022_154m","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random0_seed0_twitter_roberta_large_2022_154m","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hate_hate_random0_seed0_twitter_roberta_large_2022_154m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tweettemposhift/hate-hate_random0_seed0-twitter-roberta-large-2022-154m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed2_roberta_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed2_roberta_large_en.md
new file mode 100644
index 000000000000..cfd3372a83de
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random0_seed2_roberta_large_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English hate_hate_random0_seed2_roberta_large RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: hate_hate_random0_seed2_roberta_large
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random0_seed2_roberta_large` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random0_seed2_roberta_large_en_5.2.0_3.0_1701517262735.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random0_seed2_roberta_large_en_5.2.0_3.0_1701517262735.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random0_seed2_roberta_large","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random0_seed2_roberta_large","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hate_hate_random0_seed2_roberta_large|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tweettemposhift/hate-hate_random0_seed2-roberta-large
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_roberta_base_en.md
new file mode 100644
index 000000000000..ab5e7f85ce02
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_roberta_base_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English hate_hate_random1_seed0_roberta_base RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: hate_hate_random1_seed0_roberta_base
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random1_seed0_roberta_base` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed0_roberta_base_en_5.2.0_3.0_1701535031350.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed0_roberta_base_en_5.2.0_3.0_1701535031350.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed0_roberta_base","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed0_roberta_base","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hate_hate_random1_seed0_roberta_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|427.7 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/hate-hate_random1_seed0-roberta-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_twitter_roberta_base_2019_90m_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_twitter_roberta_base_2019_90m_en.md
new file mode 100644
index 000000000000..cb334cd7d98c
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed0_twitter_roberta_base_2019_90m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English hate_hate_random1_seed0_twitter_roberta_base_2019_90m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: hate_hate_random1_seed0_twitter_roberta_base_2019_90m
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random1_seed0_twitter_roberta_base_2019_90m` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed0_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701538524184.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed0_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701538524184.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed0_twitter_roberta_base_2019_90m","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed0_twitter_roberta_base_2019_90m","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|hate_hate_random1_seed0_twitter_roberta_base_2019_90m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/hate-hate_random1_seed0-twitter-roberta-base-2019-90m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed1_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed1_roberta_base_en.md
new file mode 100644
index 000000000000..fdf468df9650
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed1_roberta_base_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English hate_hate_random1_seed1_roberta_base RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: hate_hate_random1_seed1_roberta_base
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random1_seed1_roberta_base` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed1_roberta_base_en_5.2.0_3.0_1701536182817.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed1_roberta_base_en_5.2.0_3.0_1701536182817.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+# Assumes `spark = sparknlp.start()` plus: from sparknlp.base import *; from sparknlp.annotator import *; from pyspark.ml import Pipeline
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed1_roberta_base","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+// Assumes a SparkSession `spark` plus: import com.johnsnowlabs.nlp.base._, com.johnsnowlabs.nlp.annotator._, org.apache.spark.ml.Pipeline, spark.implicits._
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed1_roberta_base","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random1_seed1_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.7 MB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random1_seed1-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed2_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed2_bertweet_large_en.md new file mode 100644 index 000000000000..445d9f522c10 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random1_seed2_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_hate_random1_seed2_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: hate_hate_random1_seed2_bertweet_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random1_seed2_bertweet_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed2_bertweet_large_en_5.2.0_3.0_1701516256850.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random1_seed2_bertweet_large_en_5.2.0_3.0_1701516256850.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed2_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random1_seed2_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random1_seed2_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random1_seed2-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed1_twitter_roberta_base_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed1_twitter_roberta_base_2022_154m_en.md new file mode 100644 index 000000000000..5c24e96b8690 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed1_twitter_roberta_base_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_hate_random2_seed1_twitter_roberta_base_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: hate_hate_random2_seed1_twitter_roberta_base_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random2_seed1_twitter_roberta_base_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random2_seed1_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701520147397.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random2_seed1_twitter_roberta_base_2022_154m_en_5.2.0_3.0_1701520147397.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random2_seed1_twitter_roberta_base_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random2_seed1_twitter_roberta_base_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random2_seed1_twitter_roberta_base_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random2_seed1-twitter-roberta-base-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed2_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed2_bertweet_large_en.md new file mode 100644 index 000000000000..c0dd470c7e11 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random2_seed2_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_hate_random2_seed2_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: hate_hate_random2_seed2_bertweet_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random2_seed2_bertweet_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random2_seed2_bertweet_large_en_5.2.0_3.0_1701524586406.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random2_seed2_bertweet_large_en_5.2.0_3.0_1701524586406.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random2_seed2_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random2_seed2_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random2_seed2_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random2_seed2-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random3_seed0_twitter_roberta_base_2021_124m_en.md b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random3_seed0_twitter_roberta_base_2021_124m_en.md new file mode 100644 index 000000000000..e9d756f6289e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-hate_hate_random3_seed0_twitter_roberta_base_2021_124m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English hate_hate_random3_seed0_twitter_roberta_base_2021_124m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: hate_hate_random3_seed0_twitter_roberta_base_2021_124m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `hate_hate_random3_seed0_twitter_roberta_base_2021_124m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/hate_hate_random3_seed0_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701543250901.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/hate_hate_random3_seed0_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701543250901.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random3_seed0_twitter_roberta_base_2021_124m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("hate_hate_random3_seed0_twitter_roberta_base_2021_124m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|hate_hate_random3_seed0_twitter_roberta_base_2021_124m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/hate-hate_random3_seed0-twitter-roberta-base-2021-124m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-horai_v2_47k_large_e1_lr_6_en.md b/docs/_posts/ahmedlone127/2023-12-02-horai_v2_47k_large_e1_lr_6_en.md new file mode 100644 index 000000000000..1e5ce7738c61 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-horai_v2_47k_large_e1_lr_6_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English horai_v2_47k_large_e1_lr_6 RoBertaForSequenceClassification from stealthwriter +author: John Snow Labs +name: horai_v2_47k_large_e1_lr_6 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `horai_v2_47k_large_e1_lr_6` is an English model originally trained by stealthwriter. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/horai_v2_47k_large_e1_lr_6_en_5.2.0_3.0_1701504139300.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/horai_v2_47k_large_e1_lr_6_en_5.2.0_3.0_1701504139300.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_v2_47k_large_e1_lr_6","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("horai_v2_47k_large_e1_lr_6","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|horai_v2_47k_large_e1_lr_6| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/stealthwriter/HorAI-V2-47k-large-e1-lr-6 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_abstract_en.md b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_abstract_en.md new file mode 100644 index 000000000000..f7660601c777 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_abstract_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English idmgsp_roberta_train_gpt3_abstract RoBertaForSequenceClassification from tum-nlp +author: John Snow Labs +name: idmgsp_roberta_train_gpt3_abstract +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idmgsp_roberta_train_gpt3_abstract` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_gpt3_abstract_en_5.2.0_3.0_1701475644251.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_gpt3_abstract_en_5.2.0_3.0_1701475644251.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_gpt3_abstract","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_gpt3_abstract","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idmgsp_roberta_train_gpt3_abstract| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|449.3 MB| + +## References + +https://huggingface.co/tum-nlp/IDMGSP-RoBERTa-TRAIN_GPT3-ABSTRACT \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_conclusion_en.md b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_conclusion_en.md new file mode 100644 index 000000000000..60781e1f3edf --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_gpt3_conclusion_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English idmgsp_roberta_train_gpt3_conclusion RoBertaForSequenceClassification from tum-nlp +author: John Snow Labs +name: idmgsp_roberta_train_gpt3_conclusion +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idmgsp_roberta_train_gpt3_conclusion` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_gpt3_conclusion_en_5.2.0_3.0_1701514080064.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_gpt3_conclusion_en_5.2.0_3.0_1701514080064.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_gpt3_conclusion","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_gpt3_conclusion","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idmgsp_roberta_train_gpt3_conclusion| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.4 MB| + +## References + +https://huggingface.co/tum-nlp/IDMGSP-RoBERTa-TRAIN_GPT3-CONCLUSION \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_introduction_en.md b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_introduction_en.md new file mode 100644 index 000000000000..5a619d9cf02f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-idmgsp_roberta_train_introduction_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English idmgsp_roberta_train_introduction RoBertaForSequenceClassification from tum-nlp +author: John Snow Labs +name: idmgsp_roberta_train_introduction +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `idmgsp_roberta_train_introduction` is an English model originally trained by tum-nlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_introduction_en_5.2.0_3.0_1701510860914.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/idmgsp_roberta_train_introduction_en_5.2.0_3.0_1701510860914.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_introduction","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("idmgsp_roberta_train_introduction","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|idmgsp_roberta_train_introduction| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.3 MB| + +## References + +https://huggingface.co/tum-nlp/IDMGSP-RoBERTa-TRAIN-INTRODUCTION \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-imdbreviews_classification_roberta_v01_en.md b/docs/_posts/ahmedlone127/2023-12-02-imdbreviews_classification_roberta_v01_en.md new file mode 100644 index 000000000000..fbe26c38ce84 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-imdbreviews_classification_roberta_v01_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English imdbreviews_classification_roberta_v01 RoBertaForSequenceClassification from dfelorza +author: John Snow Labs +name: imdbreviews_classification_roberta_v01 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `imdbreviews_classification_roberta_v01` is an English model originally trained by dfelorza. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_roberta_v01_en_5.2.0_3.0_1701484060951.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/imdbreviews_classification_roberta_v01_en_5.2.0_3.0_1701484060951.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("imdbreviews_classification_roberta_v01","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("imdbreviews_classification_roberta_v01","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|imdbreviews_classification_roberta_v01| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.4 MB| + +## References + +https://huggingface.co/dfelorza/imdbreviews_classification_roberta_v01 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-incremental_semi_supervised_training_500k_equal_en.md b/docs/_posts/ahmedlone127/2023-12-02-incremental_semi_supervised_training_500k_equal_en.md new file mode 100644 index 000000000000..fa7130f47e9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-incremental_semi_supervised_training_500k_equal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English incremental_semi_supervised_training_500k_equal RoBertaForSequenceClassification from bitsanlp +author: John Snow Labs +name: incremental_semi_supervised_training_500k_equal +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `incremental_semi_supervised_training_500k_equal` is an English model originally trained by bitsanlp. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/incremental_semi_supervised_training_500k_equal_en_5.2.0_3.0_1701497783145.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/incremental_semi_supervised_training_500k_equal_en_5.2.0_3.0_1701497783145.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("incremental_semi_supervised_training_500k_equal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("incremental_semi_supervised_training_500k_equal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|incremental_semi_supervised_training_500k_equal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/bitsanlp/incremental-semi-supervised-training-500k-equal \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-indonesian_roberta_base_prdect_indonesian_id.md b/docs/_posts/ahmedlone127/2023-12-02-indonesian_roberta_base_prdect_indonesian_id.md new file mode 100644 index 000000000000..9b622142ed59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-indonesian_roberta_base_prdect_indonesian_id.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Indonesian indonesian_roberta_base_prdect_indonesian RoBertaForSequenceClassification from w11wo +author: John Snow Labs +name: indonesian_roberta_base_prdect_indonesian +date: 2023-12-02 +tags: [roberta, id, open_source, sequence_classification, onnx] +task: Text Classification +language: id +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `indonesian_roberta_base_prdect_indonesian` is an Indonesian model originally trained by w11wo.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/indonesian_roberta_base_prdect_indonesian_id_5.2.0_3.0_1701503098356.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/indonesian_roberta_base_prdect_indonesian_id_5.2.0_3.0_1701503098356.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("indonesian_roberta_base_prdect_indonesian","id")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("indonesian_roberta_base_prdect_indonesian","id") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|indonesian_roberta_base_prdect_indonesian| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|id| +|Size:|467.6 MB| + +## References + +https://huggingface.co/w11wo/indonesian-roberta-base-prdect-id \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-ipc_level1_g_en.md b/docs/_posts/ahmedlone127/2023-12-02-ipc_level1_g_en.md new file mode 100644 index 000000000000..0b7cdd30e1f3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-ipc_level1_g_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English ipc_level1_g RoBertaForSequenceClassification from intelcomp +author: John Snow Labs +name: ipc_level1_g +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `ipc_level1_g` is an English model originally trained by intelcomp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ipc_level1_g_en_5.2.0_3.0_1701485535892.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/ipc_level1_g_en_5.2.0_3.0_1701485535892.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("ipc_level1_g","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("ipc_level1_g","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|ipc_level1_g| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/intelcomp/ipc_level1_G \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-jrtec_distilroberta_base_mrpc_glue_omar_espejel_en.md b/docs/_posts/ahmedlone127/2023-12-02-jrtec_distilroberta_base_mrpc_glue_omar_espejel_en.md new file mode 100644 index 000000000000..fb597a854da3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-jrtec_distilroberta_base_mrpc_glue_omar_espejel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English jrtec_distilroberta_base_mrpc_glue_omar_espejel RoBertaForSequenceClassification from jrtec +author: John Snow Labs +name: jrtec_distilroberta_base_mrpc_glue_omar_espejel +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `jrtec_distilroberta_base_mrpc_glue_omar_espejel` is an English model originally trained by jrtec.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/jrtec_distilroberta_base_mrpc_glue_omar_espejel_en_5.2.0_3.0_1701522616349.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/jrtec_distilroberta_base_mrpc_glue_omar_espejel_en_5.2.0_3.0_1701522616349.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("jrtec_distilroberta_base_mrpc_glue_omar_espejel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("jrtec_distilroberta_base_mrpc_glue_omar_espejel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|jrtec_distilroberta_base_mrpc_glue_omar_espejel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/jrtec/jrtec-distilroberta-base-mrpc-glue-omar-espejel \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_javiteri95_en.md b/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_javiteri95_en.md new file mode 100644 index 000000000000..d2531deb44aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_javiteri95_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English khipu_finetuned_amazon_reviews_multi_javiteri95 RoBertaForSequenceClassification from javiteri95 +author: John Snow Labs +name: khipu_finetuned_amazon_reviews_multi_javiteri95 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `khipu_finetuned_amazon_reviews_multi_javiteri95` is an English model originally trained by javiteri95.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_javiteri95_en_5.2.0_3.0_1701500547660.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_javiteri95_en_5.2.0_3.0_1701500547660.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_javiteri95","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_javiteri95","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|khipu_finetuned_amazon_reviews_multi_javiteri95| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.7 MB| + +## References + +https://huggingface.co/javiteri95/khipu-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_sasha_en.md b/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_sasha_en.md new file mode 100644 index 000000000000..1a76f3b7265c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-khipu_finetuned_amazon_reviews_multi_sasha_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English khipu_finetuned_amazon_reviews_multi_sasha RoBertaForSequenceClassification from sasha +author: John Snow Labs +name: khipu_finetuned_amazon_reviews_multi_sasha +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `khipu_finetuned_amazon_reviews_multi_sasha` is an English model originally trained by sasha.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_sasha_en_5.2.0_3.0_1701497524112.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/khipu_finetuned_amazon_reviews_multi_sasha_en_5.2.0_3.0_1701497524112.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_sasha","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("khipu_finetuned_amazon_reviews_multi_sasha","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|khipu_finetuned_amazon_reviews_multi_sasha| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|428.7 MB| + +## References + +https://huggingface.co/sasha/khipu-finetuned-amazon_reviews_multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-language_detection_robert_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-language_detection_robert_base_en.md new file mode 100644 index 000000000000..3cee8fe9d500 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-language_detection_robert_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English language_detection_robert_base RoBertaForSequenceClassification from jkhan447 +author: John Snow Labs +name: language_detection_robert_base +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `language_detection_robert_base` is an English model originally trained by jkhan447. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/language_detection_robert_base_en_5.2.0_3.0_1701519691402.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/language_detection_robert_base_en_5.2.0_3.0_1701519691402.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("language_detection_robert_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("language_detection_robert_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|language_detection_robert_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|443.2 MB| + +## References + +https://huggingface.co/jkhan447/language-detection-RoBert-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-legal_components_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-02-legal_components_roberta_en.md new file mode 100644 index 000000000000..b13121dc1f05 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-legal_components_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English legal_components_roberta RoBertaForSequenceClassification from nihiluis +author: John Snow Labs +name: legal_components_roberta +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `legal_components_roberta` is an English model originally trained by nihiluis. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/legal_components_roberta_en_5.2.0_3.0_1701487273799.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/legal_components_roberta_en_5.2.0_3.0_1701487273799.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("legal_components_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("legal_components_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|legal_components_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.0 MB| + +## References + +https://huggingface.co/nihiluis/legal-components-roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-me_sentiment_roberta_p1_en.md b/docs/_posts/ahmedlone127/2023-12-02-me_sentiment_roberta_p1_en.md new file mode 100644 index 000000000000..5b321d0439aa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-me_sentiment_roberta_p1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English me_sentiment_roberta_p1 RoBertaForSequenceClassification from afiqlol +author: John Snow Labs +name: me_sentiment_roberta_p1 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `me_sentiment_roberta_p1` is an English model originally trained by afiqlol. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/me_sentiment_roberta_p1_en_5.2.0_3.0_1701512904143.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/me_sentiment_roberta_p1_en_5.2.0_3.0_1701512904143.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("me_sentiment_roberta_p1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("me_sentiment_roberta_p1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|me_sentiment_roberta_p1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/afiqlol/me-sentiment-roberta-p1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-mformer_authority_en.md b/docs/_posts/ahmedlone127/2023-12-02-mformer_authority_en.md new file mode 100644 index 000000000000..3be7ca016f33 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-mformer_authority_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mformer_authority RoBertaForSequenceClassification from joshnguyen +author: John Snow Labs +name: mformer_authority +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mformer_authority` is an English model originally trained by joshnguyen. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mformer_authority_en_5.2.0_3.0_1701475210673.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mformer_authority_en_5.2.0_3.0_1701475210673.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mformer_authority","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mformer_authority","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mformer_authority| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.7 MB| + +## References + +https://huggingface.co/joshnguyen/mformer-authority \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-mymodel_classify_category_news_en.md b/docs/_posts/ahmedlone127/2023-12-02-mymodel_classify_category_news_en.md new file mode 100644 index 000000000000..1d417ee6e3fa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-mymodel_classify_category_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English mymodel_classify_category_news RoBertaForSequenceClassification from duwuonline +author: John Snow Labs +name: mymodel_classify_category_news +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mymodel_classify_category_news` is an English model originally trained by duwuonline. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mymodel_classify_category_news_en_5.2.0_3.0_1701524855992.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mymodel_classify_category_news_en_5.2.0_3.0_1701524855992.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("mymodel_classify_category_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("mymodel_classify_category_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|mymodel_classify_category_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|845.9 MB| + +## References + +https://huggingface.co/duwuonline/mymodel-classify-category-news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed0_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed0_bertweet_large_en.md new file mode 100644 index 000000000000..4d7aacf05c35 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed0_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random0_seed0_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random0_seed0_bertweet_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random0_seed0_bertweet_large` is an English model originally trained by tweettemposhift.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random0_seed0_bertweet_large_en_5.2.0_3.0_1701540181095.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random0_seed0_bertweet_large_en_5.2.0_3.0_1701540181095.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random0_seed0_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random0_seed0_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random0_seed0_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random0_seed0-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m_en.md new file mode 100644 index 000000000000..17d283fa1825 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701518044354.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m_en_5.2.0_3.0_1701518044354.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random0_seed2_twitter_roberta_base_2019_90m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random0_seed2-twitter-roberta-base-2019-90m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random1_seed1_twitter_roberta_base_dec2020_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random1_seed1_twitter_roberta_base_dec2020_en.md new file mode 100644 index 000000000000..a20e1afef32c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random1_seed1_twitter_roberta_base_dec2020_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random1_seed1_twitter_roberta_base_dec2020 RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random1_seed1_twitter_roberta_base_dec2020 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random1_seed1_twitter_roberta_base_dec2020` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random1_seed1_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701507087949.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random1_seed1_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701507087949.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random1_seed1_twitter_roberta_base_dec2020","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random1_seed1_twitter_roberta_base_dec2020","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random1_seed1_twitter_roberta_base_dec2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random1_seed1-twitter-roberta-base-dec2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed0_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed0_bertweet_large_en.md new file mode 100644 index 000000000000..eb616e80a374 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed0_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random3_seed0_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random3_seed0_bertweet_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random3_seed0_bertweet_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed0_bertweet_large_en_5.2.0_3.0_1701532328282.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed0_bertweet_large_en_5.2.0_3.0_1701532328282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed0_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed0_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random3_seed0_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random3_seed0-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed1_twitter_roberta_base_dec2020_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed1_twitter_roberta_base_dec2020_en.md new file mode 100644 index 000000000000..5c432ceb69b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed1_twitter_roberta_base_dec2020_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random3_seed1_twitter_roberta_base_dec2020 RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random3_seed1_twitter_roberta_base_dec2020 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random3_seed1_twitter_roberta_base_dec2020` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed1_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701546051658.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed1_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701546051658.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed1_twitter_roberta_base_dec2020","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed1_twitter_roberta_base_dec2020","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random3_seed1_twitter_roberta_base_dec2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random3_seed1-twitter-roberta-base-dec2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..e575d31c862d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701543333099.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701543333099.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_random3_seed2_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_random3_seed2-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_temporal_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_temporal_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..7d55243a1c56 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nerd_nerd_temporal_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nerd_nerd_temporal_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: nerd_nerd_temporal_twitter_roberta_large_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nerd_nerd_temporal_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701545239451.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nerd_nerd_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701545239451.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nerd_nerd_temporal_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nerd_nerd_temporal_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/nerd-nerd_temporal-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-nosql_identifier_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-nosql_identifier_roberta_base_en.md new file mode 100644 index 000000000000..c3e074ae7b9c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-nosql_identifier_roberta_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English nosql_identifier_roberta_base RoBertaForSequenceClassification from ankush-003 +author: John Snow Labs +name: nosql_identifier_roberta_base +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nosql_identifier_roberta_base` is an English model originally trained by ankush-003. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nosql_identifier_roberta_base_en_5.2.0_3.0_1701507967464.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nosql_identifier_roberta_base_en_5.2.0_3.0_1701507967464.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("nosql_identifier_roberta_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("nosql_identifier_roberta_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|nosql_identifier_roberta_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/ankush-003/nosql-identifier-roberta-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-parts_matching_en.md b/docs/_posts/ahmedlone127/2023-12-02-parts_matching_en.md new file mode 100644 index 000000000000..fc0681e33d27 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-parts_matching_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English parts_matching RoBertaForSequenceClassification from sattensil +author: John Snow Labs +name: parts_matching +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `parts_matching` is an English model originally trained by sattensil. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/parts_matching_en_5.2.0_3.0_1701501090506.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/parts_matching_en_5.2.0_3.0_1701501090506.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("parts_matching","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("parts_matching","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|parts_matching| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|440.0 MB| + +## References + +https://huggingface.co/sattensil/parts_matching \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_elyager_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_elyager_en.md new file mode 100644 index 000000000000..78e1662a4538 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_elyager_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_elyager RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_elyager +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_elyager` is an English model originally trained by platzi. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_elyager_en_5.2.0_3.0_1701545873971.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_elyager_en_5.2.0_3.0_1701545873971.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_elyager","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_elyager","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_elyager| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-elyager \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_en.md new file mode 100644 index 000000000000..9c9d660e7681 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_en.md @@ -0,0 +1,98 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_glue BertForSequenceClassification from eormeno12 +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_glue +date: 2023-12-02 +tags: [bert, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained BertForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue` is an English model originally trained by eormeno12. + +## Predicted Entities + + + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_en_5.2.0_3.0_1701541488950.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_en_5.2.0_3.0_1701541488950.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = BertForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) +``` +```scala +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = BertForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +References + +https://huggingface.co/eormeno12/platzi-distilroberta-base-mrpc-glue \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes_en.md new file mode 100644 index 000000000000..ec57e06a34fe --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes` is an English model originally trained by platzi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes_en_5.2.0_3.0_1701481751383.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes_en_5.2.0_3.0_1701481751383.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue_luis_rogelio_reyes| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-luis-rogelio-reyes \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_pablo_campino1_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_pablo_campino1_en.md new file mode 100644 index 000000000000..b8a805fa6602 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_pablo_campino1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_glue_pablo_campino1 RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_glue_pablo_campino1 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue_pablo_campino1` is an English model originally trained by platzi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_pablo_campino1_en_5.2.0_3.0_1701546546633.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_pablo_campino1_en_5.2.0_3.0_1701546546633.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_pablo_campino1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_pablo_campino1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue_pablo_campino1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-pablo-campino1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_yeder_lvicente_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_yeder_lvicente_en.md new file mode 100644 index 000000000000..f3db90ae705f --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distilroberta_base_mrpc_glue_yeder_lvicente_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distilroberta_base_mrpc_glue_yeder_lvicente RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distilroberta_base_mrpc_glue_yeder_lvicente +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distilroberta_base_mrpc_glue_yeder_lvicente` is an English model originally trained by platzi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_yeder_lvicente_en_5.2.0_3.0_1701483218534.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distilroberta_base_mrpc_glue_yeder_lvicente_en_5.2.0_3.0_1701483218534.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_yeder_lvicente","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distilroberta_base_mrpc_glue_yeder_lvicente","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distilroberta_base_mrpc_glue_yeder_lvicente| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distilroberta-base-mrpc-glue-yeder-lvicente \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_distingroberta_base_mrpc_glue_pixelciosa_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_distingroberta_base_mrpc_glue_pixelciosa_en.md new file mode 100644 index 000000000000..01d80b4790c0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_distingroberta_base_mrpc_glue_pixelciosa_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_distingroberta_base_mrpc_glue_pixelciosa RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_distingroberta_base_mrpc_glue_pixelciosa +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_distingroberta_base_mrpc_glue_pixelciosa` is an English model originally trained by platzi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_distingroberta_base_mrpc_glue_pixelciosa_en_5.2.0_3.0_1701521511751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_distingroberta_base_mrpc_glue_pixelciosa_en_5.2.0_3.0_1701521511751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distingroberta_base_mrpc_glue_pixelciosa","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_distingroberta_base_mrpc_glue_pixelciosa","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_distingroberta_base_mrpc_glue_pixelciosa| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|308.6 MB| + +## References + +https://huggingface.co/platzi/platzi-distingroberta-base-mrpc-glue-pixelciosa \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-platzi_roberta22_base_mrpc_glue_yimmy_cruz_en.md b/docs/_posts/ahmedlone127/2023-12-02-platzi_roberta22_base_mrpc_glue_yimmy_cruz_en.md new file mode 100644 index 000000000000..baa0ccc059d7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-platzi_roberta22_base_mrpc_glue_yimmy_cruz_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English platzi_roberta22_base_mrpc_glue_yimmy_cruz RoBertaForSequenceClassification from platzi +author: John Snow Labs +name: platzi_roberta22_base_mrpc_glue_yimmy_cruz +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `platzi_roberta22_base_mrpc_glue_yimmy_cruz` is an English model originally trained by platzi.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/platzi_roberta22_base_mrpc_glue_yimmy_cruz_en_5.2.0_3.0_1701494426263.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/platzi_roberta22_base_mrpc_glue_yimmy_cruz_en_5.2.0_3.0_1701494426263.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_roberta22_base_mrpc_glue_yimmy_cruz","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("platzi_roberta22_base_mrpc_glue_yimmy_cruz","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|platzi_roberta22_base_mrpc_glue_yimmy_cruz| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|435.1 MB| + +## References + +https://huggingface.co/platzi/platzi-roberta22-base-mrpc-glue-yimmy-cruz \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_binary_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_binary_classifier_en.md new file mode 100644 index 000000000000..06b2b6f8f122 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_binary_classifier_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English politeness_roberta_text_disagreement_binary_classifier RoBertaForSequenceClassification from RuyuanWan +author: John Snow Labs +name: politeness_roberta_text_disagreement_binary_classifier +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `politeness_roberta_text_disagreement_binary_classifier` is an English model originally trained by RuyuanWan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/politeness_roberta_text_disagreement_binary_classifier_en_5.2.0_3.0_1701505472771.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/politeness_roberta_text_disagreement_binary_classifier_en_5.2.0_3.0_1701505472771.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("politeness_roberta_text_disagreement_binary_classifier","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("politeness_roberta_text_disagreement_binary_classifier","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|politeness_roberta_text_disagreement_binary_classifier| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.5 MB| + +## References + +https://huggingface.co/RuyuanWan/Politeness_RoBERTa_Text_Disagreement_Binary_Classifier \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_predictor_en.md b/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_predictor_en.md new file mode 100644 index 000000000000..3854a33156f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-politeness_roberta_text_disagreement_predictor_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English politeness_roberta_text_disagreement_predictor RoBertaForSequenceClassification from RuyuanWan +author: John Snow Labs +name: politeness_roberta_text_disagreement_predictor +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `politeness_roberta_text_disagreement_predictor` is an English model originally trained by RuyuanWan.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/politeness_roberta_text_disagreement_predictor_en_5.2.0_3.0_1701489808198.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/politeness_roberta_text_disagreement_predictor_en_5.2.0_3.0_1701489808198.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("politeness_roberta_text_disagreement_predictor","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("politeness_roberta_text_disagreement_predictor","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|politeness_roberta_text_disagreement_predictor| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|423.9 MB| + +## References + +https://huggingface.co/RuyuanWan/Politeness_RoBERTa_Text_Disagreement_Predictor \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-praxis_sentiminds_youtube_twitter_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-02-praxis_sentiminds_youtube_twitter_roberta_en.md new file mode 100644 index 000000000000..ebe742b33745 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-praxis_sentiminds_youtube_twitter_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English praxis_sentiminds_youtube_twitter_roberta RoBertaForSequenceClassification from RanjithN +author: John Snow Labs +name: praxis_sentiminds_youtube_twitter_roberta +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `praxis_sentiminds_youtube_twitter_roberta` is an English model originally trained by RanjithN.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/praxis_sentiminds_youtube_twitter_roberta_en_5.2.0_3.0_1701516335598.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/praxis_sentiminds_youtube_twitter_roberta_en_5.2.0_3.0_1701516335598.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("praxis_sentiminds_youtube_twitter_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("praxis_sentiminds_youtube_twitter_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|praxis_sentiminds_youtube_twitter_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.2 MB| + +## References + +https://huggingface.co/RanjithN/praxis_sentiminds_youtube_twitter_roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-project_uspto_en.md b/docs/_posts/ahmedlone127/2023-12-02-project_uspto_en.md new file mode 100644 index 000000000000..3796c836e7f8 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-project_uspto_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English project_uspto RoBertaForSequenceClassification from moonahhyun +author: John Snow Labs +name: project_uspto +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `project_uspto` is an English model originally trained by moonahhyun. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/project_uspto_en_5.2.0_3.0_1701489866229.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/project_uspto_en_5.2.0_3.0_1701489866229.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("project_uspto","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("project_uspto","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|project_uspto| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.5 MB| + +## References + +https://huggingface.co/moonahhyun/project-uspto \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_2class_en.md b/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_2class_en.md new file mode 100644 index 000000000000..6a9e5ec5f14a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_2class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English readability_spanish_benchmark_bertin_spanish_sentences_2class RoBertaForSequenceClassification from lmvasque +author: John Snow Labs +name: readability_spanish_benchmark_bertin_spanish_sentences_2class +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `readability_spanish_benchmark_bertin_spanish_sentences_2class` is an English model originally trained by lmvasque.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_sentences_2class_en_5.2.0_3.0_1701483833105.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_sentences_2class_en_5.2.0_3.0_1701483833105.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_sentences_2class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_sentences_2class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|readability_spanish_benchmark_bertin_spanish_sentences_2class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.4 MB| + +## References + +https://huggingface.co/lmvasque/readability-es-benchmark-bertin-es-sentences-2class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_3class_en.md b/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_3class_en.md new file mode 100644 index 000000000000..c157db88bc85 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-readability_spanish_benchmark_bertin_spanish_sentences_3class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English readability_spanish_benchmark_bertin_spanish_sentences_3class RoBertaForSequenceClassification from lmvasque +author: John Snow Labs +name: readability_spanish_benchmark_bertin_spanish_sentences_3class +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `readability_spanish_benchmark_bertin_spanish_sentences_3class` is an English model originally trained by lmvasque.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_sentences_3class_en_5.2.0_3.0_1701511664978.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/readability_spanish_benchmark_bertin_spanish_sentences_3class_en_5.2.0_3.0_1701511664978.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_sentences_3class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("readability_spanish_benchmark_bertin_spanish_sentences_3class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|readability_spanish_benchmark_bertin_spanish_sentences_3class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.5 MB| + +## References + +https://huggingface.co/lmvasque/readability-es-benchmark-bertin-es-sentences-3class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-regression_roberta_en.md b/docs/_posts/ahmedlone127/2023-12-02-regression_roberta_en.md new file mode 100644 index 000000000000..78819d5cc577 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-regression_roberta_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English regression_roberta RoBertaForSequenceClassification from Svetlana0303 +author: John Snow Labs +name: regression_roberta +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `regression_roberta` is an English model originally trained by Svetlana0303. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/regression_roberta_en_5.2.0_3.0_1701476184789.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/regression_roberta_en_5.2.0_3.0_1701476184789.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("regression_roberta","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("regression_roberta","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|regression_roberta| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|424.8 MB| + +## References + +https://huggingface.co/Svetlana0303/Regression_Roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-riskdt_en.md b/docs/_posts/ahmedlone127/2023-12-02-riskdt_en.md new file mode 100644 index 000000000000..309454a1cfcb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-riskdt_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English riskdt RoBertaForSequenceClassification from owen99630 +author: John Snow Labs +name: riskdt +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `riskdt` is an English model originally trained by owen99630. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/riskdt_en_5.2.0_3.0_1701506754788.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/riskdt_en_5.2.0_3.0_1701506754788.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("riskdt","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("riskdt","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|riskdt| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|448.1 MB| + +## References + +https://huggingface.co/owen99630/riskdt \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-robbert_twitter_sentiment_tokenized_en.md b/docs/_posts/ahmedlone127/2023-12-02-robbert_twitter_sentiment_tokenized_en.md new file mode 100644 index 000000000000..9266a338be5e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-robbert_twitter_sentiment_tokenized_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English robbert_twitter_sentiment_tokenized RoBertaForSequenceClassification from btjiong +author: John Snow Labs +name: robbert_twitter_sentiment_tokenized +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert_twitter_sentiment_tokenized` is an English model originally trained by btjiong. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robbert_twitter_sentiment_tokenized_en_5.2.0_3.0_1701514261022.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robbert_twitter_sentiment_tokenized_en_5.2.0_3.0_1701514261022.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_twitter_sentiment_tokenized","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_twitter_sentiment_tokenized","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robbert_twitter_sentiment_tokenized| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.9 MB| + +## References + +https://huggingface.co/btjiong/robbert-twitter-sentiment-tokenized \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-robbert_v2_dutch_base_hebban_reviews5_nl.md b/docs/_posts/ahmedlone127/2023-12-02-robbert_v2_dutch_base_hebban_reviews5_nl.md new file mode 100644 index 000000000000..db40041c500d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-robbert_v2_dutch_base_hebban_reviews5_nl.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Dutch, Flemish robbert_v2_dutch_base_hebban_reviews5 RoBertaForSequenceClassification from BramVanroy +author: John Snow Labs +name: robbert_v2_dutch_base_hebban_reviews5 +date: 2023-12-02 +tags: [roberta, nl, open_source, sequence_classification, onnx] +task: Text Classification +language: nl +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robbert_v2_dutch_base_hebban_reviews5` is a Dutch, Flemish model originally trained by BramVanroy.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robbert_v2_dutch_base_hebban_reviews5_nl_5.2.0_3.0_1701480282067.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robbert_v2_dutch_base_hebban_reviews5_nl_5.2.0_3.0_1701480282067.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_v2_dutch_base_hebban_reviews5","nl")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robbert_v2_dutch_base_hebban_reviews5","nl") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|robbert_v2_dutch_base_hebban_reviews5| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|nl| +|Size:|437.9 MB| + +## References + +https://huggingface.co/BramVanroy/robbert-v2-dutch-base-hebban-reviews5 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel_en.md new file mode 100644 index 000000000000..9e077cf47161 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel RoBertaForSequenceClassification from oskrmiguel +author: John Snow Labs +name: roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel` is an English model originally trained by oskrmiguel.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel_en_5.2.0_3.0_1701527598566.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel_en_5.2.0_3.0_1701527598566.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_clasificacion_german_texto_supervisado_oskrmiguel| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|446.8 MB| + +## References + +https://huggingface.co/oskrmiguel/roberta-base-bne-clasificacion-de-texto-supervisado \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_detector_german_stress_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_detector_german_stress_en.md new file mode 100644 index 000000000000..aad4d7361b4b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_detector_german_stress_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_detector_german_stress_detector_german_stress RoBertaForSequenceClassification from ValenHumano +author: John Snow Labs +name: roberta_base_bne_detector_german_stress_detector_german_stress +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_detector_german_stress_detector_german_stress` is an English model originally trained by ValenHumano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_detector_german_stress_detector_german_stress_en_5.2.0_3.0_1701507778461.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_detector_german_stress_detector_german_stress_en_5.2.0_3.0_1701507778461.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_detector_german_stress_detector_german_stress","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_detector_german_stress_detector_german_stress","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_detector_german_stress_detector_german_stress| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.2 MB| + +## References + +https://huggingface.co/ValenHumano/roberta-base-bne-detector-de-stress-detector-de-stress \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_en.md new file mode 100644 index 000000000000..60d174bdbe89 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_detector_german_stress_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_detector_german_stress RoBertaForSequenceClassification from ValenHumano +author: John Snow Labs +name: roberta_base_bne_detector_german_stress +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_detector_german_stress` is an English model originally trained by ValenHumano.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_detector_german_stress_en_5.2.0_3.0_1701476064374.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_detector_german_stress_en_5.2.0_3.0_1701476064374.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_detector_german_stress","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_detector_german_stress","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_detector_german_stress| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|430.3 MB| + +## References + +https://huggingface.co/ValenHumano/roberta-base-bne-detector-de-stress \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_3_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_3_en.md new file mode 100644 index 000000000000..dc10feae69cb --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_personality_multi_3 RoBertaForSequenceClassification from titi7242229 +author: John Snow Labs +name: roberta_base_bne_finetuned_personality_multi_3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_personality_multi_3` is an English model originally trained by titi7242229.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_3_en_5.2.0_3.0_1701483107175.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_3_en_5.2.0_3.0_1701483107175.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_personality_multi_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|452.1 MB| + +## References + +https://huggingface.co/titi7242229/roberta-base-bne-finetuned_personality_multi_3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_4_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_4_en.md new file mode 100644 index 000000000000..e6995acad781 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_personality_multi_4_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_personality_multi_4 RoBertaForSequenceClassification from titi7242229 +author: John Snow Labs +name: roberta_base_bne_finetuned_personality_multi_4 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_personality_multi_4` is an English model originally trained by titi7242229.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_4_en_5.2.0_3.0_1701487272713.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_personality_multi_4_en_5.2.0_3.0_1701487272713.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi_4","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_personality_multi_4","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_personality_multi_4| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|453.4 MB| + +## References + +https://huggingface.co/titi7242229/roberta-base-bne-finetuned_personality_multi_4 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad_en.md new file mode 100644 index 000000000000..dd0b7a5f14f6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad RoBertaForSequenceClassification from vg055 +author: John Snow Labs +name: roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad` is an English model originally trained by vg055.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad_en_5.2.0_3.0_1701480884165.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad_en_5.2.0_3.0_1701480884165.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_bne_finetuned_tripadvisor_finetuned_analisis_sentimiento_restmex2023_polaridad| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|466.2 MB| + +## References + +https://huggingface.co/vg055/roberta-base-bne-finetuned-tripAdvisor-finetuned-analisis-sentimiento-restmex2023-polaridad \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_finetuned_imdb_spoilers_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_finetuned_imdb_spoilers_en.md new file mode 100644 index 000000000000..47b3384569f5 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_finetuned_imdb_spoilers_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_finetuned_imdb_spoilers RoBertaForSequenceClassification from bhavyagiri +author: John Snow Labs +name: roberta_base_finetuned_imdb_spoilers +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_finetuned_imdb_spoilers` is an English model originally trained by bhavyagiri.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_imdb_spoilers_en_5.2.0_3.0_1701480282796.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_finetuned_imdb_spoilers_en_5.2.0_3.0_1701480282796.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_imdb_spoilers","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_finetuned_imdb_spoilers","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_finetuned_imdb_spoilers| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.6 MB| + +## References + +https://huggingface.co/bhavyagiri/roberta-base-finetuned-imdb-spoilers \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_mnli_willheld_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_mnli_willheld_en.md new file mode 100644 index 000000000000..df404e4108f2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_mnli_willheld_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_mnli_willheld RoBertaForSequenceClassification from WillHeld +author: John Snow Labs +name: roberta_base_mnli_willheld +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_mnli_willheld` is an English model originally trained by WillHeld. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_willheld_en_5.2.0_3.0_1701501806961.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_mnli_willheld_en_5.2.0_3.0_1701501806961.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli_willheld","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_mnli_willheld","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_mnli_willheld| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|465.3 MB| + +## References + +https://huggingface.co/WillHeld/roberta-base-mnli \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_ours_rundi_3_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_ours_rundi_3_en.md new file mode 100644 index 000000000000..29b0dd468d17 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_ours_rundi_3_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_ours_rundi_3 RoBertaForSequenceClassification from SkyR +author: John Snow Labs +name: roberta_base_ours_rundi_3 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_ours_rundi_3` is an English model originally trained by SkyR. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_3_en_5.2.0_3.0_1701530077635.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_ours_rundi_3_en_5.2.0_3.0_1701530077635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ours_rundi_3","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_ours_rundi_3","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_ours_rundi_3| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|429.2 MB| + +## References + +https://huggingface.co/SkyR/roberta-base-ours-run-3 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_qqp_2e_5_42_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_qqp_2e_5_42_en.md new file mode 100644 index 000000000000..57d66ab74c3b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_qqp_2e_5_42_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_qqp_2e_5_42 RoBertaForSequenceClassification from TehranNLP-org +author: John Snow Labs +name: roberta_base_qqp_2e_5_42 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_qqp_2e_5_42` is an English model originally trained by TehranNLP-org. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_qqp_2e_5_42_en_5.2.0_3.0_1701486246389.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_qqp_2e_5_42_en_5.2.0_3.0_1701486246389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qqp_2e_5_42","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_qqp_2e_5_42","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_qqp_2e_5_42| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|464.3 MB| + +## References + +https://huggingface.co/TehranNLP-org/roberta-base-qqp-2e-5-42 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_sst_2_32_13_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_sst_2_32_13_en.md new file mode 100644 index 000000000000..c8185b8fe055 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_sst_2_32_13_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_sst_2_32_13 RoBertaForSequenceClassification from simonycl +author: John Snow Labs +name: roberta_base_sst_2_32_13 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_sst_2_32_13` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_32_13_en_5.2.0_3.0_1701477234002.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_sst_2_32_13_en_5.2.0_3.0_1701477234002.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2_32_13","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_sst_2_32_13","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_sst_2_32_13| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|426.1 MB| + +## References + +https://huggingface.co/simonycl/roberta-base-sst-2-32-13 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_topic_multi_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_topic_multi_en.md new file mode 100644 index 000000000000..90544125cbb3 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_topic_multi_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_topic_multi RoBertaForSequenceClassification from cardiffnlp +author: John Snow Labs +name: roberta_base_topic_multi +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_topic_multi` is an English model originally trained by cardiffnlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_topic_multi_en_5.2.0_3.0_1701489151305.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_topic_multi_en_5.2.0_3.0_1701489151305.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_topic_multi","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_topic_multi","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_topic_multi| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|300.4 MB| + +## References + +https://huggingface.co/cardiffnlp/roberta-base-topic-multi \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_base_tweet_about_disaster_or_not_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_tweet_about_disaster_or_not_en.md new file mode 100644 index 000000000000..d3836b7e7187 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_base_tweet_about_disaster_or_not_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_base_tweet_about_disaster_or_not RoBertaForSequenceClassification from DunnBC22 +author: John Snow Labs +name: roberta_base_tweet_about_disaster_or_not +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_base_tweet_about_disaster_or_not` is an English model originally trained by DunnBC22. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_about_disaster_or_not_en_5.2.0_3.0_1701486137341.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_base_tweet_about_disaster_or_not_en_5.2.0_3.0_1701486137341.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_about_disaster_or_not","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_base_tweet_about_disaster_or_not","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_base_tweet_about_disaster_or_not| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|450.2 MB| + +## References + +https://huggingface.co/DunnBC22/roberta-base-Tweet_About_Disaster_Or_Not \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_corona_class_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_corona_class_en.md new file mode 100644 index 000000000000..cbc37483d846 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_corona_class_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_corona_class RoBertaForSequenceClassification from Peed911 +author: John Snow Labs +name: roberta_corona_class +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_corona_class` is an English model originally trained by Peed911. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_corona_class_en_5.2.0_3.0_1701488673311.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_corona_class_en_5.2.0_3.0_1701488673311.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_corona_class","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_corona_class","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_corona_class| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|427.8 MB| + +## References + +https://huggingface.co/Peed911/Roberta_corona_class \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_crypto_profiling_task1_complete_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_crypto_profiling_task1_complete_en.md new file mode 100644 index 000000000000..4d3d793db361 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_crypto_profiling_task1_complete_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_crypto_profiling_task1_complete RoBertaForSequenceClassification from pabagcha +author: John Snow Labs +name: roberta_crypto_profiling_task1_complete +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_crypto_profiling_task1_complete` is an English model originally trained by pabagcha. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_crypto_profiling_task1_complete_en_5.2.0_3.0_1701483690857.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_crypto_profiling_task1_complete_en_5.2.0_3.0_1701483690857.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_crypto_profiling_task1_complete","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_crypto_profiling_task1_complete","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_crypto_profiling_task1_complete| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/pabagcha/roberta_crypto_profiling_task1_complete \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_fine_tuned_sentiment_financial_news_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_fine_tuned_sentiment_financial_news_en.md new file mode 100644 index 000000000000..23b29ccbecaa --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_fine_tuned_sentiment_financial_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_fine_tuned_sentiment_financial_news RoBertaForSequenceClassification from RogerKam +author: John Snow Labs +name: roberta_fine_tuned_sentiment_financial_news +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_fine_tuned_sentiment_financial_news` is an English model originally trained by RogerKam. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_financial_news_en_5.2.0_3.0_1701484691811.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_fine_tuned_sentiment_financial_news_en_5.2.0_3.0_1701484691811.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_financial_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_fine_tuned_sentiment_financial_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_fine_tuned_sentiment_financial_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.5 MB| + +## References + +https://huggingface.co/RogerKam/roberta_fine_tuned_sentiment_financial_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_da_task_b_100k_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_da_task_b_100k_en.md new file mode 100644 index 000000000000..bd25a140d81a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_da_task_b_100k_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_da_task_b_100k RoBertaForSequenceClassification from bitsanlp +author: John Snow Labs +name: roberta_finetuned_da_task_b_100k +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_da_task_b_100k` is an English model originally trained by bitsanlp. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_da_task_b_100k_en_5.2.0_3.0_1701522616398.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_da_task_b_100k_en_5.2.0_3.0_1701522616398.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_da_task_b_100k","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_da_task_b_100k","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_da_task_b_100k| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/bitsanlp/roberta-finetuned-DA-task-B-100k \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_solvencia_v1_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_solvencia_v1_en.md new file mode 100644 index 000000000000..384d820b9e9a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_solvencia_v1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_solvencia_v1 RoBertaForSequenceClassification from mnavas +author: John Snow Labs +name: roberta_finetuned_solvencia_v1 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_solvencia_v1` is an English model originally trained by mnavas. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_solvencia_v1_en_5.2.0_3.0_1701492808818.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_solvencia_v1_en_5.2.0_3.0_1701492808818.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_solvencia_v1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_solvencia_v1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_solvencia_v1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|849.2 MB| + +## References + +https://huggingface.co/mnavas/roberta-finetuned-solvencia-v1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas_en.md new file mode 100644 index 000000000000..a0669c9f12d1 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas RoBertaForSequenceClassification from mnavas +author: John Snow Labs +name: roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas` is an English model originally trained by mnavas. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas_en_5.2.0_3.0_1701479513299.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas_en_5.2.0_3.0_1701479513299.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_finetuned_webclassification_v2_smalllinguaesv2_mnavas| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|811.7 MB| + +## References + +https://huggingface.co/mnavas/roberta-finetuned-WebClassification-v2-smalllinguaESv2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_adhocracy_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_adhocracy_en.md new file mode 100644 index 000000000000..d860f2d573b9 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_adhocracy_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_adhocracy RoBertaForSequenceClassification from CultureBERT +author: John Snow Labs +name: roberta_large_adhocracy +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_adhocracy` is an English model originally trained by CultureBERT. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_adhocracy_en_5.2.0_3.0_1701491632512.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_adhocracy_en_5.2.0_3.0_1701491632512.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_adhocracy","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_adhocracy","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_adhocracy| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/CultureBERT/roberta-large-adhocracy \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bbc_news_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bbc_news_en.md new file mode 100644 index 000000000000..be07d20c8b22 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bbc_news_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_bbc_news RoBertaForSequenceClassification from AyoubChLin +author: John Snow Labs +name: roberta_large_bbc_news +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_bbc_news` is an English model originally trained by AyoubChLin. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_bbc_news_en_5.2.0_3.0_1701491355960.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_bbc_news_en_5.2.0_3.0_1701491355960.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bbc_news","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bbc_news","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_bbc_news| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/AyoubChLin/roberta-large-bbc_news \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bne_cantemist_es.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bne_cantemist_es.md new file mode 100644 index 000000000000..8af2e8f2c337 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_bne_cantemist_es.md @@ -0,0 +1,97 @@ +--- +layout: model +title: Castilian, Spanish roberta_large_bne_cantemist RoBertaForSequenceClassification from IIC +author: John Snow Labs +name: roberta_large_bne_cantemist +date: 2023-12-02 +tags: [roberta, es, open_source, sequence_classification, onnx] +task: Text Classification +language: es +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_bne_cantemist` is a Castilian, Spanish model originally trained by IIC. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_bne_cantemist_es_5.2.0_3.0_1701536882606.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_bne_cantemist_es_5.2.0_3.0_1701536882606.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bne_cantemist","es")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_bne_cantemist","es") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_bne_cantemist| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|es| +|Size:|1.3 GB| + +## References + +https://huggingface.co/IIC/roberta-large-bne-cantemist \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_depression_classification_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_depression_classification_en.md new file mode 100644 index 000000000000..a35cd749ba28 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_depression_classification_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_depression_classification RoBertaForSequenceClassification from Trong-Nghia +author: John Snow Labs +name: roberta_large_depression_classification +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_depression_classification` is an English model originally trained by Trong-Nghia. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_depression_classification_en_5.2.0_3.0_1701486171616.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_depression_classification_en_5.2.0_3.0_1701486171616.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_depression_classification","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_depression_classification","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_depression_classification| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Trong-Nghia/roberta-large-depression-classification \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_e_snli_classification_nli_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_e_snli_classification_nli_base_en.md new file mode 100644 index 000000000000..917fa64f9667 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_e_snli_classification_nli_base_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_e_snli_classification_nli_base RoBertaForSequenceClassification from k4black +author: John Snow Labs +name: roberta_large_e_snli_classification_nli_base +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_e_snli_classification_nli_base` is an English model originally trained by k4black. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_e_snli_classification_nli_base_en_5.2.0_3.0_1701534194090.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_e_snli_classification_nli_base_en_5.2.0_3.0_1701534194090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_e_snli_classification_nli_base","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_e_snli_classification_nli_base","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_e_snli_classification_nli_base| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/k4black/roberta-large-e-snli-classification-nli-base \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_non_code_mixed_ds_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_non_code_mixed_ds_en.md new file mode 100644 index 000000000000..81c0b96bd8b6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_non_code_mixed_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_finetuned_non_code_mixed_ds RoBertaForSequenceClassification from IIIT-L +author: John Snow Labs +name: roberta_large_finetuned_non_code_mixed_ds +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_finetuned_non_code_mixed_ds` is an English model originally trained by IIIT-L. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_non_code_mixed_ds_en_5.2.0_3.0_1701534194206.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_non_code_mixed_ds_en_5.2.0_3.0_1701534194206.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_non_code_mixed_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_non_code_mixed_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_finetuned_non_code_mixed_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/IIIT-L/roberta-large-finetuned-non-code-mixed-DS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_ours_ds_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_ours_ds_en.md new file mode 100644 index 000000000000..1cef89f5cff7 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_finetuned_ours_ds_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_finetuned_ours_ds RoBertaForSequenceClassification from IIIT-L +author: John Snow Labs +name: roberta_large_finetuned_ours_ds +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_finetuned_ours_ds` is an English model originally trained by IIIT-L. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_ours_ds_en_5.2.0_3.0_1701529612736.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_finetuned_ours_ds_en_5.2.0_3.0_1701529612736.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_ours_ds","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_finetuned_ours_ds","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_finetuned_ours_ds| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/IIIT-L/roberta-large-finetuned-ours-DS \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_legal_v2_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_legal_v2_en.md new file mode 100644 index 000000000000..c5d7a4dbb48c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_legal_v2_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_legal_v2 RoBertaForSequenceClassification from timoneda +author: John Snow Labs +name: roberta_large_legal_v2 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_legal_v2` is an English model originally trained by timoneda. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_legal_v2_en_5.2.0_3.0_1701499737843.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_legal_v2_en_5.2.0_3.0_1701499737843.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_legal_v2","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_legal_v2","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_legal_v2| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/timoneda/roberta-large-legal-v2 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_qqp_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_qqp_en.md new file mode 100644 index 000000000000..cdad862b8a7d --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_qqp_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_qqp RoBertaForSequenceClassification from Shobhank-iiitdwd +author: John Snow Labs +name: roberta_large_qqp +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_qqp` is an English model originally trained by Shobhank-iiitdwd. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_qqp_en_5.2.0_3.0_1701497050751.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_qqp_en_5.2.0_3.0_1701497050751.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qqp","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_qqp","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_qqp| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/Shobhank-iiitdwd/RoBERTa-large-QQP \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_sst_2_64_13_smoothed_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_sst_2_64_13_smoothed_en.md new file mode 100644 index 000000000000..19cb57f3d4c4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_sst_2_64_13_smoothed_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_sst_2_64_13_smoothed RoBertaForSequenceClassification from simonycl +author: John Snow Labs +name: roberta_large_sst_2_64_13_smoothed +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_sst_2_64_13_smoothed` is an English model originally trained by simonycl. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_sst_2_64_13_smoothed_en_5.2.0_3.0_1701523190748.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_sst_2_64_13_smoothed_en_5.2.0_3.0_1701523190748.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sst_2_64_13_smoothed","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_sst_2_64_13_smoothed","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_sst_2_64_13_smoothed| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/simonycl/roberta-large-sst-2-64-13-smoothed \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_large_vira_intents_live_vira_chatbot_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_vira_intents_live_vira_chatbot_en.md new file mode 100644 index 000000000000..67f154becfff --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_large_vira_intents_live_vira_chatbot_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_large_vira_intents_live_vira_chatbot RoBertaForSequenceClassification from vira-chatbot +author: John Snow Labs +name: roberta_large_vira_intents_live_vira_chatbot +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_large_vira_intents_live_vira_chatbot` is an English model originally trained by vira-chatbot.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_live_vira_chatbot_en_5.2.0_3.0_1701521383058.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_large_vira_intents_live_vira_chatbot_en_5.2.0_3.0_1701521383058.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_live_vira_chatbot","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_large_vira_intents_live_vira_chatbot","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_large_vira_intents_live_vira_chatbot| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/vira-chatbot/roberta-large-vira-intents-live \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_nei_fact_check_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_nei_fact_check_en.md new file mode 100644 index 000000000000..77574c32b86a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_nei_fact_check_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_nei_fact_check RoBertaForSequenceClassification from Dzeniks +author: John Snow Labs +name: roberta_nei_fact_check +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_nei_fact_check` is an English model originally trained by Dzeniks. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_nei_fact_check_en_5.2.0_3.0_1701497644934.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_nei_fact_check_en_5.2.0_3.0_1701497644934.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nei_fact_check","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_nei_fact_check","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_nei_fact_check| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.5 MB| + +## References + +https://huggingface.co/Dzeniks/roberta-nei-fact-check \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_pipeiq_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_pipeiq_en.md new file mode 100644 index 000000000000..690d906043b4 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_pipeiq_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_pipeiq RoBertaForSequenceClassification from velvrix +author: John Snow Labs +name: roberta_pipeiq +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_pipeiq` is an English model originally trained by velvrix. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_pipeiq_en_5.2.0_3.0_1701498807503.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_pipeiq_en_5.2.0_3.0_1701498807503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_pipeiq","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_pipeiq","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_pipeiq| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|437.0 MB| + +## References + +https://huggingface.co/velvrix/RoBERTA_pipeiq \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_rakshit122_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_rakshit122_en.md new file mode 100644 index 000000000000..b277eb07faa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_rakshit122_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_rakshit122 RoBertaForSequenceClassification from Rakshit122 +author: John Snow Labs +name: roberta_rakshit122 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_rakshit122` is an English model originally trained by Rakshit122. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_rakshit122_en_5.2.0_3.0_1701484964704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_rakshit122_en_5.2.0_3.0_1701484964704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_rakshit122","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_rakshit122","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_rakshit122| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|455.5 MB| + +## References + +https://huggingface.co/Rakshit122/roberta \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_reman_gustavecortal_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_reman_gustavecortal_en.md new file mode 100644 index 000000000000..c0c600f75fb2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_reman_gustavecortal_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_reman_gustavecortal RoBertaForSequenceClassification from gustavecortal +author: John Snow Labs +name: roberta_reman_gustavecortal +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_reman_gustavecortal` is an English model originally trained by gustavecortal. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_reman_gustavecortal_en_5.2.0_3.0_1701515450308.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_reman_gustavecortal_en_5.2.0_3.0_1701515450308.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_reman_gustavecortal","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_reman_gustavecortal","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|roberta_reman_gustavecortal| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/gustavecortal/roberta-reman \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_sentiment_analysis_finetune_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_sentiment_analysis_finetune_en.md new file mode 100644 index 000000000000..c72bddb2c95c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_sentiment_analysis_finetune_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English roberta_sentiment_analysis_finetune RoBertaForSequenceClassification from siberett +author: John Snow Labs +name: roberta_sentiment_analysis_finetune +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_sentiment_analysis_finetune` is an English model originally trained by siberett. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_sentiment_analysis_finetune_en_5.2.0_3.0_1701476453462.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_sentiment_analysis_finetune_en_5.2.0_3.0_1701476453462.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_analysis_finetune","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val documentAssembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_sentiment_analysis_finetune","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_sentiment_analysis_finetune|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/siberett/roberta-sentiment-analysis-finetune
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_similarity_trained_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_similarity_trained_en.md
new file mode 100644
index 000000000000..071b3c2fc05b
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_similarity_trained_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_similarity_trained RoBertaForSequenceClassification from EducativeCS2023
+author: John Snow Labs
+name: roberta_similarity_trained
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_similarity_trained` is an English model originally trained by EducativeCS2023.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_similarity_trained_en_5.2.0_3.0_1701477585253.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_similarity_trained_en_5.2.0_3.0_1701477585253.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_similarity_trained","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_similarity_trained","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_similarity_trained|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|440.7 MB|
+
+## References
+
+https://huggingface.co/EducativeCS2023/roberta-similarity-trained
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_toxicity_classifier_arsive_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_toxicity_classifier_arsive_en.md
new file mode 100644
index 000000000000..e0a173c1043f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_toxicity_classifier_arsive_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_toxicity_classifier_arsive RoBertaForSequenceClassification from Arsive
+author: John Snow Labs
+name: roberta_toxicity_classifier_arsive
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_toxicity_classifier_arsive` is an English model originally trained by Arsive.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_arsive_en_5.2.0_3.0_1701479405286.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_toxicity_classifier_arsive_en_5.2.0_3.0_1701479405286.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_arsive","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_toxicity_classifier_arsive","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_toxicity_classifier_arsive|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|460.2 MB|
+
+## References
+
+https://huggingface.co/Arsive/roberta-toxicity-classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-roberta_uniacco_en.md b/docs/_posts/ahmedlone127/2023-12-02-roberta_uniacco_en.md
new file mode 100644
index 000000000000..0ffddc1abee2
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-roberta_uniacco_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English roberta_uniacco RoBertaForSequenceClassification from velvrix
+author: John Snow Labs
+name: roberta_uniacco
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `roberta_uniacco` is an English model originally trained by velvrix.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/roberta_uniacco_en_5.2.0_3.0_1701481614722.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/roberta_uniacco_en_5.2.0_3.0_1701481614722.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_uniacco","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("roberta_uniacco","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|roberta_uniacco|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|452.5 MB|
+
+## References
+
+https://huggingface.co/velvrix/roberta_Uniacco
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-robertuito_pfinal_en.md b/docs/_posts/ahmedlone127/2023-12-02-robertuito_pfinal_en.md
new file mode 100644
index 000000000000..c1a1f2706feb
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-robertuito_pfinal_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English robertuito_pfinal RoBertaForSequenceClassification from fredymad
+author: John Snow Labs
+name: robertuito_pfinal
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `robertuito_pfinal` is an English model originally trained by fredymad.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/robertuito_pfinal_en_5.2.0_3.0_1701518714156.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/robertuito_pfinal_en_5.2.0_3.0_1701518714156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_pfinal","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("robertuito_pfinal","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|robertuito_pfinal|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|408.3 MB|
+
+## References
+
+https://huggingface.co/fredymad/robertuito_Pfinal
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sarcasm_detection_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-sarcasm_detection_roberta_base_en.md
new file mode 100644
index 000000000000..b779d057b7f4
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sarcasm_detection_roberta_base_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sarcasm_detection_roberta_base RoBertaForSequenceClassification from jkhan447
+author: John Snow Labs
+name: sarcasm_detection_roberta_base
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sarcasm_detection_roberta_base` is an English model originally trained by jkhan447.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sarcasm_detection_roberta_base_en_5.2.0_3.0_1701479380769.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sarcasm_detection_roberta_base_en_5.2.0_3.0_1701479380769.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sarcasm_detection_roberta_base","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sarcasm_detection_roberta_base","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sarcasm_detection_roberta_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|447.9 MB|
+
+## References
+
+https://huggingface.co/jkhan447/sarcasm-detection-RoBerta-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-scientific_exaggeration_detection_en.md b/docs/_posts/ahmedlone127/2023-12-02-scientific_exaggeration_detection_en.md
new file mode 100644
index 000000000000..a08b7eb1263f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-scientific_exaggeration_detection_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English scientific_exaggeration_detection RoBertaForSequenceClassification from copenlu
+author: John Snow Labs
+name: scientific_exaggeration_detection
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `scientific_exaggeration_detection` is an English model originally trained by copenlu.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/scientific_exaggeration_detection_en_5.2.0_3.0_1701491251659.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/scientific_exaggeration_detection_en_5.2.0_3.0_1701491251659.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("scientific_exaggeration_detection","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("scientific_exaggeration_detection","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|scientific_exaggeration_detection|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|443.8 MB|
+
+## References
+
+https://huggingface.co/copenlu/scientific-exaggeration-detection
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sensitive_spanish_classifier_en.md b/docs/_posts/ahmedlone127/2023-12-02-sensitive_spanish_classifier_en.md
new file mode 100644
index 000000000000..fe91bf9292d9
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sensitive_spanish_classifier_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sensitive_spanish_classifier RoBertaForSequenceClassification from victoriapl01
+author: John Snow Labs
+name: sensitive_spanish_classifier
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sensitive_spanish_classifier` is an English model originally trained by victoriapl01.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sensitive_spanish_classifier_en_5.2.0_3.0_1701507605219.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sensitive_spanish_classifier_en_5.2.0_3.0_1701507605219.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sensitive_spanish_classifier","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sensitive_spanish_classifier","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sensitive_spanish_classifier|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|446.4 MB|
+
+## References
+
+https://huggingface.co/victoriapl01/sensitive_spanish_classifier
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_roberta_latest_e4_b16_data2_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_roberta_latest_e4_b16_data2_en.md
new file mode 100644
index 000000000000..35b79eacbb0d
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_roberta_latest_e4_b16_data2_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sentiment_roberta_latest_e4_b16_data2 RoBertaForSequenceClassification from JerryYanJiang
+author: John Snow Labs
+name: sentiment_roberta_latest_e4_b16_data2
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_roberta_latest_e4_b16_data2` is an English model originally trained by JerryYanJiang.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_roberta_latest_e4_b16_data2_en_5.2.0_3.0_1701520414482.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_roberta_latest_e4_b16_data2_en_5.2.0_3.0_1701520414482.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_latest_e4_b16_data2","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_roberta_latest_e4_b16_data2","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_roberta_latest_e4_b16_data2|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/JerryYanJiang/sentiment-roberta-latest-e4-b16-data2
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random0_seed0_roberta_base_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random0_seed0_roberta_base_en.md
new file mode 100644
index 000000000000..8fc050681485
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random0_seed0_roberta_base_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sentiment_sentiment_small_random0_seed0_roberta_base RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: sentiment_sentiment_small_random0_seed0_roberta_base
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random0_seed0_roberta_base` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random0_seed0_roberta_base_en_5.2.0_3.0_1701516940274.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random0_seed0_roberta_base_en_5.2.0_3.0_1701516940274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random0_seed0_roberta_base","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random0_seed0_roberta_base","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_sentiment_small_random0_seed0_roberta_base|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|430.3 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random0_seed0-roberta-base
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random1_seed1_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random1_seed1_bertweet_large_en.md
new file mode 100644
index 000000000000..1b075c436593
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random1_seed1_bertweet_large_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sentiment_sentiment_small_random1_seed1_bertweet_large RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: sentiment_sentiment_small_random1_seed1_bertweet_large
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random1_seed1_bertweet_large` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random1_seed1_bertweet_large_en_5.2.0_3.0_1701542287163.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random1_seed1_bertweet_large_en_5.2.0_3.0_1701542287163.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random1_seed1_bertweet_large","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random1_seed1_bertweet_large","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sentiment_sentiment_small_random1_seed1_bertweet_large|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random1_seed1-bertweet-large
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m_en.md
new file mode 100644
index 000000000000..82ddaafbcb2e
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift.
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701511062604.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701511062604.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_random2_seed0_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random2_seed0-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..0ea23a7f0eca --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701521802813.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701521802813.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_random2_seed2_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_random2_seed2-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020_en.md new file mode 100644 index 000000000000..11d559272578 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020 RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701518342793.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020_en_5.2.0_3.0_1701518342793.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_temporal_twitter_roberta_base_dec2020| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_temporal-twitter-roberta-base-dec2020 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..5fa3e1a1bcd2 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701544156867.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701544156867.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|sentiment_sentiment_small_temporal_twitter_roberta_large_2022_154m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/sentiment-sentiment_small_temporal-twitter-roberta-large-2022-154m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-summary_roberta_wording_en.md b/docs/_posts/ahmedlone127/2023-12-02-summary_roberta_wording_en.md new file mode 100644 index 000000000000..46725157818a --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-summary_roberta_wording_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English summary_roberta_wording RoBertaForSequenceClassification from tiedaar +author: John Snow Labs +name: summary_roberta_wording +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `summary_roberta_wording` is an English model originally trained by tiedaar. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/summary_roberta_wording_en_5.2.0_3.0_1701496233477.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/summary_roberta_wording_en_5.2.0_3.0_1701496233477.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("summary_roberta_wording","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("summary_roberta_wording","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|summary_roberta_wording| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|436.8 MB| + +## References + +https://huggingface.co/tiedaar/summary-roberta-wording \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-thesis_constructauxsentence_1_en.md b/docs/_posts/ahmedlone127/2023-12-02-thesis_constructauxsentence_1_en.md new file mode 100644 index 000000000000..9836e88be955 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-thesis_constructauxsentence_1_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English thesis_constructauxsentence_1 RoBertaForSequenceClassification from UchihaMadara +author: John Snow Labs +name: thesis_constructauxsentence_1 +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `thesis_constructauxsentence_1` is an English model originally trained by UchihaMadara. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/thesis_constructauxsentence_1_en_5.2.0_3.0_1701510399575.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/thesis_constructauxsentence_1_en_5.2.0_3.0_1701510399575.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("thesis_constructauxsentence_1","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("thesis_constructauxsentence_1","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|thesis_constructauxsentence_1| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/UchihaMadara/Thesis-ConstructAuxSentence-1 \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed1_bertweet_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed1_bertweet_large_en.md new file mode 100644 index 000000000000..241e02814aa6 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed1_bertweet_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random1_seed1_bertweet_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random1_seed1_bertweet_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random1_seed1_bertweet_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed1_bertweet_large_en_5.2.0_3.0_1701520033603.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed1_bertweet_large_en_5.2.0_3.0_1701520033603.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed1_bertweet_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed1_bertweet_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_random1_seed1_bertweet_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_random1_seed1-bertweet-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_roberta_large_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_roberta_large_en.md new file mode 100644 index 000000000000..2974ca28d83b --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_roberta_large_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random1_seed2_roberta_large RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random1_seed2_roberta_large +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random1_seed2_roberta_large` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_roberta_large_en_5.2.0_3.0_1701528513729.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_roberta_large_en_5.2.0_3.0_1701528513729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_roberta_large","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_roberta_large","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_random1_seed2_roberta_large| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_random1_seed2-roberta-large \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_base_2021_124m_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_base_2021_124m_en.md new file mode 100644 index 000000000000..af5cf3130ef0 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_base_2021_124m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random1_seed2_twitter_roberta_base_2021_124m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random1_seed2_twitter_roberta_base_2021_124m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random1_seed2_twitter_roberta_base_2021_124m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701540513611.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701540513611.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_twitter_roberta_base_2021_124m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_twitter_roberta_base_2021_124m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|topic_topic_random1_seed2_twitter_roberta_base_2021_124m| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[documents, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.3 MB| + +## References + +https://huggingface.co/tweettemposhift/topic-topic_random1_seed2-twitter-roberta-base-2021-124m \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_large_2022_154m_en.md new file mode 100644 index 000000000000..4b1fedf3b14e --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random1_seed2_twitter_roberta_large_2022_154m_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English topic_topic_random1_seed2_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift +author: John Snow Labs +name: topic_topic_random1_seed2_twitter_roberta_large_2022_154m +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random1_seed2_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701545562154.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random1_seed2_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701545562154.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_twitter_roberta_large_2022_154m","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random1_seed2_twitter_roberta_large_2022_154m","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|topic_topic_random1_seed2_twitter_roberta_large_2022_154m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tweettemposhift/topic-topic_random1_seed2-twitter-roberta-large-2022-154m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random2_seed0_twitter_roberta_large_2022_154m_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random2_seed0_twitter_roberta_large_2022_154m_en.md
new file mode 100644
index 000000000000..dcdd64bb60ed
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random2_seed0_twitter_roberta_large_2022_154m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English topic_topic_random2_seed0_twitter_roberta_large_2022_154m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: topic_topic_random2_seed0_twitter_roberta_large_2022_154m
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random2_seed0_twitter_roberta_large_2022_154m` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random2_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701526522267.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random2_seed0_twitter_roberta_large_2022_154m_en_5.2.0_3.0_1701526522267.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random2_seed0_twitter_roberta_large_2022_154m","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random2_seed0_twitter_roberta_large_2022_154m","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|topic_topic_random2_seed0_twitter_roberta_large_2022_154m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|1.3 GB|
+
+## References
+
+https://huggingface.co/tweettemposhift/topic-topic_random2_seed0-twitter-roberta-large-2022-154m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random3_seed1_twitter_roberta_base_2021_124m_en.md b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random3_seed1_twitter_roberta_base_2021_124m_en.md
new file mode 100644
index 000000000000..9cc5c91f91ee
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-topic_topic_random3_seed1_twitter_roberta_base_2021_124m_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English topic_topic_random3_seed1_twitter_roberta_base_2021_124m RoBertaForSequenceClassification from tweettemposhift
+author: John Snow Labs
+name: topic_topic_random3_seed1_twitter_roberta_base_2021_124m
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `topic_topic_random3_seed1_twitter_roberta_base_2021_124m` is an English model originally trained by tweettemposhift.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/topic_topic_random3_seed1_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701535183515.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/topic_topic_random3_seed1_twitter_roberta_base_2021_124m_en_5.2.0_3.0_1701535183515.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random3_seed1_twitter_roberta_base_2021_124m","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("topic_topic_random3_seed1_twitter_roberta_base_2021_124m","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|topic_topic_random3_seed1_twitter_roberta_base_2021_124m|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/tweettemposhift/topic-topic_random3_seed1-twitter-roberta-base-2021-124m
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-transformationtransformer3d_en.md b/docs/_posts/ahmedlone127/2023-12-02-transformationtransformer3d_en.md
new file mode 100644
index 000000000000..bc9e083d4280
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-transformationtransformer3d_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English transformationtransformer3d RoBertaForSequenceClassification from simonschoe
+author: John Snow Labs
+name: transformationtransformer3d
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `transformationtransformer3d` is an English model originally trained by simonschoe.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/transformationtransformer3d_en_5.2.0_3.0_1701545562238.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/transformationtransformer3d_en_5.2.0_3.0_1701545562238.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("transformationtransformer3d","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("transformationtransformer3d","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|transformationtransformer3d|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|434.0 MB|
+
+## References
+
+https://huggingface.co/simonschoe/TransformationTransformer3D
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-tweet_colombia_emotions_en.md b/docs/_posts/ahmedlone127/2023-12-02-tweet_colombia_emotions_en.md
new file mode 100644
index 000000000000..a7152bda03d5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-tweet_colombia_emotions_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English tweet_colombia_emotions RoBertaForSequenceClassification from jjiguaran
+author: John Snow Labs
+name: tweet_colombia_emotions
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `tweet_colombia_emotions` is an English model originally trained by jjiguaran.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/tweet_colombia_emotions_en_5.2.0_3.0_1701509007274.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/tweet_colombia_emotions_en_5.2.0_3.0_1701509007274.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_colombia_emotions","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("tweet_colombia_emotions","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|tweet_colombia_emotions|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|408.3 MB|
+
+## References
+
+https://huggingface.co/jjiguaran/tweet_colombia_emotions
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_hate_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_hate_en.md
new file mode 100644
index 000000000000..0a6650e336c5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_hate_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_2021_124m_hate RoBertaForSequenceClassification from cardiffnlp
+author: John Snow Labs
+name: twitter_roberta_base_2021_124m_hate
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_hate` is an English model originally trained by cardiffnlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_hate_en_5.2.0_3.0_1701478274458.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_hate_en_5.2.0_3.0_1701478274458.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_hate","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_hate","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_base_2021_124m_hate|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-hate
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_offensive_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_offensive_en.md
new file mode 100644
index 000000000000..3ab0494d85f5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_2021_124m_offensive_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_2021_124m_offensive RoBertaForSequenceClassification from cardiffnlp
+author: John Snow Labs
+name: twitter_roberta_base_2021_124m_offensive
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_2021_124m_offensive` is an English model originally trained by cardiffnlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_offensive_en_5.2.0_3.0_1701501806833.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_2021_124m_offensive_en_5.2.0_3.0_1701501806833.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_offensive","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_2021_124m_offensive","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_base_2021_124m_offensive|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.3 MB|
+
+## References
+
+https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m-offensive
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_anger_intensity_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_anger_intensity_en.md
new file mode 100644
index 000000000000..235add0575cf
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_anger_intensity_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_anger_intensity RoBertaForSequenceClassification from garrettbaber
+author: John Snow Labs
+name: twitter_roberta_base_anger_intensity
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_anger_intensity` is an English model originally trained by garrettbaber.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_anger_intensity_en_5.2.0_3.0_1701511769328.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_anger_intensity_en_5.2.0_3.0_1701511769328.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_anger_intensity","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_anger_intensity","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_base_anger_intensity|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/garrettbaber/twitter-roberta-base-anger-intensity
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_hate_finetuned_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_hate_finetuned_en.md
new file mode 100644
index 000000000000..20825b1a019f
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_hate_finetuned_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_hate_finetuned RoBertaForSequenceClassification from fahad1247
+author: John Snow Labs
+name: twitter_roberta_base_hate_finetuned
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_hate_finetuned` is an English model originally trained by fahad1247.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_finetuned_en_5.2.0_3.0_1701509298260.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_hate_finetuned_en_5.2.0_3.0_1701509298260.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_finetuned","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_hate_finetuned","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_base_hate_finetuned|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/fahad1247/twitter-roberta-base-hate-finetuned
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_atheism_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_atheism_en.md
new file mode 100644
index 000000000000..ab4a92f3f827
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_atheism_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_stance_atheism RoBertaForSequenceClassification from cardiffnlp
+author: John Snow Labs
+name: twitter_roberta_base_stance_atheism
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_stance_atheism` is an English model originally trained by cardiffnlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_atheism_en_5.2.0_3.0_1701476069353.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_atheism_en_5.2.0_3.0_1701476069353.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_atheism","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_atheism","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|twitter_roberta_base_stance_atheism|
+|Compatibility:|Spark NLP 5.2.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[documents, token]|
+|Output Labels:|[class]|
+|Language:|en|
+|Size:|468.1 MB|
+
+## References
+
+https://huggingface.co/cardiffnlp/twitter-roberta-base-stance-atheism
\ No newline at end of file
diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_hillary_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_hillary_en.md
new file mode 100644
index 000000000000..2d9d91401ec5
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_roberta_base_stance_hillary_en.md
@@ -0,0 +1,97 @@
+---
+layout: model
+title: English twitter_roberta_base_stance_hillary RoBertaForSequenceClassification from cardiffnlp
+author: John Snow Labs
+name: twitter_roberta_base_stance_hillary
+date: 2023-12-02
+tags: [roberta, en, open_source, sequence_classification, onnx]
+task: Text Classification
+language: en
+edition: Spark NLP 5.2.0
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: RoBertaForSequenceClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_roberta_base_stance_hillary` is an English model originally trained by cardiffnlp.
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_hillary_en_5.2.0_3.0_1701492943779.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_roberta_base_stance_hillary_en_5.2.0_3.0_1701492943779.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+
+document_assembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_hillary","en")\
+    .setInputCols(["document","token"])\
+    .setOutputCol("class")
+
+pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier])
+
+data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+
+```
+```scala
+
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_roberta_base_stance_hillary","en")
+    .setInputCols(Array("document","token"))
+    .setOutputCol("class")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
+
+val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+
+
+```
+
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_roberta_base_stance_hillary| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|468.1 MB| + +## References + +https://huggingface.co/cardiffnlp/twitter-roberta-base-stance-hillary \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-twitter_sexismo_finetuned_exist2021_metwo_en.md b/docs/_posts/ahmedlone127/2023-12-02-twitter_sexismo_finetuned_exist2021_metwo_en.md new file mode 100644 index 000000000000..c7f2154add2c --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-twitter_sexismo_finetuned_exist2021_metwo_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twitter_sexismo_finetuned_exist2021_metwo RoBertaForSequenceClassification from hackathon-pln-es +author: John Snow Labs +name: twitter_sexismo_finetuned_exist2021_metwo +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twitter_sexismo_finetuned_exist2021_metwo` is an English model originally trained by hackathon-pln-es. 
+ +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twitter_sexismo_finetuned_exist2021_metwo_en_5.2.0_3.0_1701504191704.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twitter_sexismo_finetuned_exist2021_metwo_en_5.2.0_3.0_1701504191704.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_sexismo_finetuned_exist2021_metwo","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twitter_sexismo_finetuned_exist2021_metwo","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twitter_sexismo_finetuned_exist2021_metwo| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|408.3 MB| + +## References + +https://huggingface.co/hackathon-pln-es/twitter_sexismo-finetuned-exist2021-metwo \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-twittercorona_en.md b/docs/_posts/ahmedlone127/2023-12-02-twittercorona_en.md new file mode 100644 index 000000000000..eb7ee0449234 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-twittercorona_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English twittercorona RoBertaForSequenceClassification from lukxus +author: John Snow Labs +name: twittercorona +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `twittercorona` is an English model originally trained by lukxus. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/twittercorona_en_5.2.0_3.0_1701480394514.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/twittercorona_en_5.2.0_3.0_1701480394514.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("twittercorona","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("twittercorona","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|twittercorona| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|467.7 MB| + +## References + +https://huggingface.co/lukxus/TwitterCorona \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-uzroberta_sentiment_analysis_en.md b/docs/_posts/ahmedlone127/2023-12-02-uzroberta_sentiment_analysis_en.md new file mode 100644 index 000000000000..0f1cd6a03b59 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-uzroberta_sentiment_analysis_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English uzroberta_sentiment_analysis RoBertaForSequenceClassification from murodbek +author: John Snow Labs +name: uzroberta_sentiment_analysis +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `uzroberta_sentiment_analysis` is an English model originally trained by murodbek. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/uzroberta_sentiment_analysis_en_5.2.0_3.0_1701506526839.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/uzroberta_sentiment_analysis_en_5.2.0_3.0_1701506526839.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("uzroberta_sentiment_analysis","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("uzroberta_sentiment_analysis","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|uzroberta_sentiment_analysis| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|314.0 MB| + +## References + +https://huggingface.co/murodbek/uzroberta-sentiment-analysis \ No newline at end of file diff --git a/docs/_posts/ahmedlone127/2023-12-02-w1g_en.md b/docs/_posts/ahmedlone127/2023-12-02-w1g_en.md new file mode 100644 index 000000000000..c5471f847b97 --- /dev/null +++ b/docs/_posts/ahmedlone127/2023-12-02-w1g_en.md @@ -0,0 +1,97 @@ +--- +layout: model +title: English w1g RoBertaForSequenceClassification from aloxatel +author: John Snow Labs +name: w1g +date: 2023-12-02 +tags: [roberta, en, open_source, sequence_classification, onnx] +task: Text Classification +language: en +edition: Spark NLP 5.2.0 +spark_version: 3.0 +supported: true +engine: onnx +annotator: RoBertaForSequenceClassification +article_header: + type: cover +use_language_switcher: "Python-Scala-Java" +--- + +## Description + +Pretrained RoBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `w1g` is an English model originally trained by aloxatel. + +{:.btn-box} + + +[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/w1g_en_5.2.0_3.0_1701526544814.zip){:.button.button-orange.button-orange-trans.arr.button-icon} +[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/w1g_en_5.2.0_3.0_1701526544814.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3} + +## How to use + + + +
+{% include programmingLanguageSelectScalaPythonNLU.html %} +```python + +document_assembler = DocumentAssembler()\ + .setInputCol("text")\ + .setOutputCol("document") + +tokenizer = Tokenizer()\ + .setInputCols("document")\ + .setOutputCol("token") + +sequenceClassifier = RoBertaForSequenceClassification.pretrained("w1g","en")\ + .setInputCols(["document","token"])\ + .setOutputCol("class") + +pipeline = Pipeline().setStages([document_assembler, tokenizer, sequenceClassifier]) + +data = spark.createDataFrame([["PUT YOUR STRING HERE"]]).toDF("text") + +result = pipeline.fit(data).transform(data) + +``` +```scala + +val document_assembler = new DocumentAssembler() + .setInputCol("text") + .setOutputCol("document") + +val tokenizer = new Tokenizer() + .setInputCols("document") + .setOutputCol("token") + +val sequenceClassifier = RoBertaForSequenceClassification.pretrained("w1g","en") + .setInputCols(Array("document","token")) + .setOutputCol("class") + +val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier)) + +val data = Seq("PUT YOUR STRING HERE").toDS.toDF("text") + +val result = pipeline.fit(data).transform(data) + + +``` +
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|w1g| +|Compatibility:|Spark NLP 5.2.0+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[document, token]| +|Output Labels:|[class]| +|Language:|en| +|Size:|1.3 GB| + +## References + +https://huggingface.co/aloxatel/W1G \ No newline at end of file